
Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

PagerDuty University in the Era of Virtual Training

Social distancing remains essential as we make our way toward a post-COVID-19 world. Gone are the days of gathering 50 people into your largest boardroom, ordering pizza, and training them until they are fully ramped on a product they will rely on daily. PagerDuty is here to help during this time, when most businesses are having to run a virtual NOC, improve their digital crisis management processes, and speed up digital transformation timelines.

Take a Quick Step to AIOps Success

Selecting the right AIOps platform is just the beginning. It’s crucial for the technology to be implemented quickly and efficiently, and to demonstrate value early. This is true for any major technology investment, but it is particularly true of AIOps. Why? AIOps, and AI in general, has in recent years been the subject of extreme hype. Its promise seems boundless. At the same time, it is poorly understood by those outside the IT community, and even by those inside it.

No room for downtime during lockdown

As of the beginning of June, even though some countries have started to slowly come out of lockdown, one-third of the world’s population is still at home in quarantine – a fact that is truly astounding. Never has reliable access to connected systems been so critical to the ongoing productivity, emotional wellbeing, and even the survival of individuals and companies worldwide.

Best Practices for Effective Incident Management

Incident management is a set of processes used by operations teams to respond to latency or downtime, and return a service to its normal state. Incident management practices have long been well-defined through frameworks such as ITIL, but as software systems become more complex, teams increasingly need to adapt their incident management processes accordingly.

Stay Ahead of Outages With Proactive Incident Response

How would your daily life be impacted if you had a bird’s-eye view of your operations, their dependencies, and the ability to spot indicators that an incident or outage was likely to happen? What would it mean for your business if you were given minutes or hours to get ahead of disruptions instead of reacting to a surprise? For most organizations, enabling proactive incident response translates directly to dollars saved, brand reputation protected, and less burnout within response teams.

LaunchDarkly Improves Incident Response with FireHydrant

Headquartered in Oakland, California, LaunchDarkly is a feature management platform that empowers all teams to safely deliver and control software through feature flags. By separating code deployments from feature releases, LaunchDarkly enables teams to deploy faster, reduce risk, and iterate continuously. Over 1,000 organizations use LaunchDarkly to build, operate, and learn from their software.

PagerDuty Microsoft Teams Integration

Drive real-time ChatOps and empower HybridOps with the PagerDuty and Microsoft Teams integration. With millions of daily users, Microsoft Teams is an essential communication and collaboration tool for many businesses. Many modern IT Ops and DevOps teams count on Teams to keep everyone on the same page when things are running smoothly, and perhaps even more so when they aren’t.

A (Lobster) Tale of Two Systems aka the ServiceNow Chronicles

Hi Yangsters, I hope everyone is staying safe during these unpredictable times. As a fellow professional in technology, I’m guessing your workload has either remained the same or you’re now extra busy since more people are moving everything they do online. I’ve been so busy that I haven’t been able to provide a proper update on the goings-on of Bikini Bottom until now. But don’t worry! We’ll get through this together!