The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Collaboratively author retrospectives with our new Google Docs integration

When it comes to learning from incidents, your tools should adapt to the way your organization works. Many of you conduct your retrospectives in rich-text document editing tools, like Google Docs. That’s why we’ve introduced the option to export your retrospectives to Google Docs. Retrospective export to Google Docs can be automated as part of your incident management process with a Runbook step.

Continuous Availability vs. Continuous Change

All companies are going through some form of cloud adoption, whether migrating to the cloud for the first time, adopting hybrid cloud, or extending cloud-native systems with newer microservice architectures. But according to a recent survey by Aptum, only 39% of companies are completely satisfied with their current rate of digital transformation. Cloud adoption projects create a continuous state of change for engineering teams, who must keep things up and running while limiting the impact on customers.

Sponsored Post

Infrastructure monitoring using kube-prometheus operator

Prometheus has emerged as the de facto open source standard for monitoring Kubernetes implementations. In this tutorial from Squadcast, Kristijan Mitevski shows how to install and configure infrastructure monitoring for a Kubernetes cluster using the kube-prometheus operator, display metrics with Grafana, and configure alerting with Alertmanager. It also covers how the Prometheus Alertmanager cluster can route alerts to Slack using webhooks.

Build custom API integrations with incident.io

We’re building incident.io as the single place you turn to when things go wrong. When an issue is disrupting your business-as-usual, the last thing you want is to start opening ten different tools to diagnose and fix it! As your central incident hub, we need to give you two powers: Workflows cover the former. Workflows are like a mini incident.io Zapier.

How to Pick the Best Incident Response Software

With the rising complexity of our digital ecosystems, incidents are occurring at an unprecedented rate. To combat the additional strain, incident responders are looking to software to help them establish a scalable, repeatable incident response process that reduces toil and noise and gets the right people on the scene at the right time. The best incident response software addresses the entire lifecycle of an incident.

AIOps' certainty in an uncertain future

BigPanda’s recent coronation as a Unicorn has prompted its leaders to look to the future of IT Operations and how it relates to artificial intelligence (AI) and machine learning (ML). What is BigPanda’s role in improving IT Ops? How can AIOps contribute to greater achievement in global enterprises? These are questions a VP of Product Marketing like BigPanda’s Mohan Kompella, who has spent 15+ years in IT Operations, has been asking.

Sponsored Post

Your Goals Could Be Holding Your DevOps Teams Back

In the era of Agile, organizations are increasingly moving their IT service management teams toward a DevOps world. There are significant challenges to transforming ITSM to DevOps, but one of the most significant is goal setting. In today's fast-paced business environment, it's important to establish the parameters for measuring success and determine which objectives teams need to meet to accomplish business goals.

Accelerate AIOps Scalability With New Self-Service Incidents API

BigPanda offers a diverse set of APIs to enterprises looking to move faster and scale incident response workflows seamlessly. APIs are core to automating repeated incident response workflows that enable IT Ops to keep up with the pace of change and innovation agile teams need to thrive. In Q4 of 2021, BigPanda announced the general availability of new self-service APIs including an updated Incidents API.

How Well Does Your Infrastructure Support Major Incident Management?

Effective major incident management depends on many things, including planning, precise execution, effective communication, and applying learnings from previous incidents to update those plans. Traditional major incident management wisdom addresses the importance of the remediation process, but it says little about how to configure your IT infrastructure.