
Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

Improve your on-call experience with Datadog mobile dashboard widgets

Life happens—even when you’re on-call. You can’t take your laptop everywhere, but whether you’re on the train, at dinner, or at the gym, you can count on the Datadog mobile app for access to key data about the status and performance of your applications. Now, you can use Datadog mobile widgets to build an on-call mobile dashboard directly on your phone’s home screen, so it’s even easier to track the data you care about from anywhere.

Customer Service Ops & PagerDuty Zendesk Integration v3 Full Case Ownership Use Case

PagerDuty's Zendesk Integration enhances communication between engineering and support teams by providing visibility to high-impact incidents via the PagerDuty Status Dashboard that is integrated into the Zendesk interface. Automate workflows for a fast-paced support team and provide the right level of information so they can interact knowledgeably with their customers while also reducing time and effort.

PD, Salesforce Service Cloud, Slack: Proactive Case Escalation & Slack-First Intelligent Swarming

Learn about and see how PagerDuty, Salesforce Service Cloud, and Slack empower collaboration across your organization to accelerate time to resolution. Proactively improve customer satisfaction in real time and break down silos to connect customer service teams with engineering teams to address incidents quickly when seconds matter. Enjoy greater control when resolving issues and anticipating customers' needs through an incident command console that gives customer service agents and stakeholders instant updates on critical, customer-impacting issues.

Five steps to better customer communication

When you’re deep into an incident and there are alerts firing, decisions to be made, and people to escalate to, it’s easy for outward communication with your customers to fall off the priority list. In many regards this makes sense; it seems natural to put all of your focus and energy into minimising the impact and getting things back on track as soon as possible.

Announcing our $1.9M round of funding

It is with a great deal of anticipation and excitement that I’m announcing our $1.9M round of funding, led by StartupXSeed Ventures along with participation from marquee enterprise SaaS investors Powerhouse Ventures, Secure Octane fund, Kwaish Ventures, Supermorpheus, Titan Capital, 100X Entrepreneurs, Viral Bajaria (CTO, 6Sense), Premal Shah (SVP, 6Sense), Hitesh Chawla (CEO, SilverPush), Sumit Jain (CTO, BirdEye) and existing investors Anand Chandrasekaran (EVP, Five9), Rajesh Sawhney (GSF), Ashish To

What's New: Extending our Datadog Capabilities With New PagerDuty Widgets

In the last two years, we have seen the rise of remote and hybrid work, and with that, a proliferation of tools and apps needed to support critical communication and collaboration. Finding that app-life balance has become increasingly complex, so simplifying “how” we work is key for every organization.

Strategies to Reduce Hospital Readmission Rates

The Centers for Medicare & Medicaid Services (CMS) scrutinizes hospital readmission rates across the U.S. each year, and it levies financial penalties on organizations that overshoot acceptable hospital readmission rates. As healthcare systems across the country embark on a journey to introduce patient-centric models to their organizations, they must align their resources with ever-changing regulations for them to thrive.

Now Available: Private Slack Channels

Ever heard the saying “Too many cooks”? If you’ve responded to incidents, you’ll likely understand the parallels. There are cases when incident command on a public channel isn’t the best option. Whatever your reason, we’ve got you covered: users can now spin up a private Slack channel for an incident. Read more about how to do this here.

Differences between Site Reliability Engineer Vs. Software Engineer Vs. Cloud Engineer Vs. DevOps Engineer

The evolution of Software Engineering over the last decade has led to the emergence of numerous job roles. So how different are a Software Engineer, a DevOps Engineer, a Site Reliability Engineer, and a Cloud Engineer from each other? In this blog, we drill down and compare these roles and their functions.

SRE and Fighting Games

When learning SRE, you might find its principles a bit unintuitive. For example, it might be difficult to learn why aiming for 100% reliability is wasteful, or how reliability isn’t the same as availability, or why failure ought to be celebrated. Believe it or not, there is a method to these ideas. My goal in this article is to shed light on the principles and to leave you a believer, such that you’ll take steps towards starting SRE practices.