Alerting

Walking Through a Call From Pingdom Alert to DigitalOcean Managed Kubernetes

SolarWinds® Pingdom® is an external synthetic monitoring agent designed to monitor your systems from the outside in. If you know what clues to look for, it can be a great starting point for triaging where in the system a problem is occurring. So how does a Pingdom call work, and how can you use it to debug what’s happening inside the system?

Correlating Pingdom Alerts With AppOptics and Loggly in DigitalOcean Kubernetes

So SolarWinds® Pingdom® has alerted you to an issue. What do you do now? In this article, I’ll explain the features and capabilities of a full monitoring stack in SolarWinds and how you can use it to get to the bottom of a 3 a.m. Pingdom wake-up call. The Setup: For our web service, we use a simple architecture of a front-end Flask application with a Postgres back end, served behind an edge SSL-terminating NGINX instance on the DigitalOcean Managed Kubernetes service.

OnPage Recognized in Gartner's Market Guide for Emergency Mass Notification Solutions

Gartner’s Market Guide for Emergency Mass Notification Solutions (EMNS) is a trusted report for security and risk management leaders. It provides insight into effective crisis communication procedures and identifies solutions that help perfect emergency management plans. The EMNS Market Guide has a large, loyal readership in several industries, including state and local government, healthcare, IT support, and higher education.

Creating your first health alarm in Netdata

The per-second metrics and interactive visualizations in the Netdata Agent don’t mean much if you don’t know what you should be looking at, or whether anything is going wrong on your node in the first place. That’s why Netdata has a built-in health watchdog to notify you when metrics show an anomaly or a full-blown incident that demands your immediate attention. Every Netdata Agent comes with hundreds of preconfigured charts that you can take advantage of without editing them, but you may want to create your own based on your infrastructure, node, workload, or applications.

Incident Communications With Alina Anderson

Incidents happen. They’re disruptive, they can be stressful, and if they aren’t managed well, they can cause chaos on your team. How your team manages incidents is only half the battle. How you let other stakeholders know what is going on is the other half. Alina Anderson from Smartsheet joined the Community team in our booth this year at PagerDuty Summit to talk about Incident Communications, and we’ve shared that conversation as an episode of our Page It to the Limit podcast.

What's in store for IT Ops in 2021? Top execs from leading enterprises share their predictions

2020 is (finally) over, and it’s safe to say that this very challenging year taught us once again that (as the old Danish proverb says) it’s difficult to make predictions, especially about the future. Who would have imagined in January 2020 that we would find ourselves where we are today… And yet, as Tim Harford once wrote in the Financial Times, predictions are like Pringles: nobody thinks that there’s any great virtue in them, but we find them hard to resist.