Operations | Monitoring | ITSM | DevOps | Cloud

Latest News

Enabling Customer Service With Full Visibility Into Customer-Impacting Issues

We are delighted to announce a new Status Dashboard for the Zendesk Customer Service integration. The dashboard gives customer service agents real-time visibility, within the Zendesk tool suite, into major incidents that are impacting their customers, so they can proactively update customers when an incident occurs.

Strategies to Reduce Alert Fatigue in Your SOC Team

In a security operations center (SOC), alerts originating from hundreds of systems compete for attention. What ensues is a security analyst’s battle to beat alert fatigue while effectively defending their organization from cybersecurity threats. Alert fatigue is a major challenge for SOC teams, and the stakes are even higher because they carry the enormous responsibility of maintaining networks and data systems.

Four Ways to Reduce Patient Churn in Healthcare

Maximum patient satisfaction is achieved through an organization’s ability to provide effective and timely care. Healthcare staff realize that poor clinical care leads to dissatisfaction, frustration and, ultimately, patient churn. To reduce patient churn, hospitals must focus on what matters most: effective care team communication, collaboration and decision making. Patient loyalty and positive word of mouth ensure that an organization continues to generate revenue.

How to configure services in Squadcast: Best practices to reduce MTTR

With the rise of digital platforms, IT infrastructure has grown increasingly complex, to the point where multiple application interdependencies coexist with varied architectures and on-call team types. This blog looks at how you can model your infrastructure in Squadcast to reduce the time it takes to respond to and resolve incidents.

5 AIOps Trends for 2021

Recently, there has been a steep rise in the research and utilization of Artificial Intelligence (AI). While AI once seemed like nothing more than a fantasy from a sci-fi movie, AI technology is now very much a reality in our everyday lives. Artificial intelligence and machine learning are involved in many of our daily tasks, from search engines that finish your thought, to directions in Google Maps, to Facebook and other social feeds that are closely tailored to your interests.

How to Analyze Incidents Better with the Right Metrics

An important SRE best practice is analyzing and learning from incidents. When an incident occurs, you shouldn’t think of it as a setback, but as an opportunity to grow. Good incident analysis involves building an incident retrospective. This document contains everything from incident metrics to the narrative of those involved. These metrics aren’t the whole story, but they can help teams make data-driven decisions. Choosing which metrics to analyze, however, can be difficult.

Optimizing Alert Policies with Dynamic Destinations

Targeted, reliable notifications are the core of any alerting solution. Blasting out emails may be good for quantity, but Enterprise Alert focuses on quality, which means notifying the right people at the right time. We often see monitoring and ticketing solutions create an incident and then rely on the emailed recipient not only to identify and handle the incident but also to close out the ticket that is raised.

Runbooks: What They Are and Why You Need One Yesterday

Let’s talk about The Legend of Zelda: A Link to the Past, and how it relates to DevOps. The game tasks our hero with finding three pendants, which unlock a Master Sword he can use to travel to an alternate realm and ultimately take down the bad guy. The US version of this SNES masterpiece came packaged with a fairly detailed instruction manual that contained an optional guide at the end to help locate the three pendants.