Incident Management

The latest news and information on incident management, on-call, incident response, and related technologies.

Sponsored Post

Advanced Incident Management Strategies for Engineers

The business world is in constant flux, and the way we handle Incident Management (IM) needs to evolve alongside it. Incidents come in all priorities and urgencies, and while some can be addressed with advance planning, others are simply unpredictable. That's why businesses can't afford to be caught off guard. The potential consequences of an incident have never been greater: a single event can disrupt operations, damage reputations, and result in significant financial losses. This is where modern, advanced Incident Management practices come into play.

How ilert Can Help Enhance Your Monitoring With Its VictoriaMetrics Integration

The ilert team have been working on an integration of VictoriaMetrics as part of their offering, and we’re happy to share this news today via this joint blog post. Please read on to learn more about ilert and how this new integration of VictoriaMetrics can help enhance your monitoring.

Introducing VictoriaMetrics Integration: Enhancing Your Monitoring with ilert

Continuity and efficiency are pivotal. The alignment of sophisticated monitoring solutions with responsive alerting systems is crucial for maintaining system integrity and performance. With this vision at its core, ilert is excited to unveil the latest addition to its robust catalog of integrations: VictoriaMetrics. This integration marks a significant advancement for DevOps teams and IT professionals who are striving to improve their monitoring and alerting capabilities.

The Reliability Stories You Won't Hear on LinkedIn

We had the pleasure of meeting Ponmani Palanisamy, a Staff Site Reliability Engineer at LinkedIn, at a recent SRE Meetup in Bangalore. Ponmani gave an insightful talk on "Improving data redundancy and rebalancing data in HDFS." We were captivated by his talk and eager to learn more about his experience in the reliability space. We talked about everything including his journey, experiences, and of course, his most memorable war room stories over a steady career of 17 years. Here's what he had to share.

How to create synthetic monitors in OneUptime?

In this video, we will guide you through the step-by-step process of creating synthetic monitors using OneUptime. Synthetic monitoring is a method to monitor your applications by simulating user behavior. It’s an essential tool for ensuring optimal performance and high availability of your web applications.

Building a DevOps Culture in High-Growth Companies: A Leader's Blueprint

Let's face it, running a high-growth company is exhilarating! You're constantly innovating, customer demand is soaring, and the future feels limitless. But with that growth comes a unique set of challenges you need to navigate to stay ahead of the curve. Let’s say your development team is churning out new features at breakneck speed. That's fantastic! But can your operations team keep up with deploying them to production? What about potential bugs or security vulnerabilities?

Site Reliability Engineer (SRE) Interview Questions

In this article, we will cover the top 25 SRE interview questions to help you prepare for your next SRE interview. As customer demand for reliable and high-performing services continues to grow, so does the importance of Site Reliability Engineers (SREs). Whether you are a seasoned SRE or a recent graduate preparing for an SRE interview, these questions will be invaluable for determining your level of expertise and understanding where you need to grow.

Introducing a Brand New Microsoft Teams Integration

We’ve gotten clear feedback from our customers that we need a strong Microsoft Teams integration. Responders want a full suite of incident management functionality, no matter what chat application their organization uses. We heard you. That’s why we’re proud to announce a brand new MS Teams integration with robust capabilities across the full incident management lifecycle.