The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Why your IT fails at 3 AM (and how to fix it)

In this insightful webinar, Corey Dockendorf and Michael Riley delve into the critical challenges facing IT leaders today. They tackle three main issues: the lack of visibility with traditional hosting providers, scaling infrastructure efficiently without increasing complexity, and keeping IT teams focused, productive, and well-rested. Learn how modern cloud application platforms and PaaS solutions can transform your operations, enhance transparency, and optimize performance. Discover practical insights, real-life examples, and strategies to improve your IT workflow and leverage smarter solutions for better business outcomes.

Opsgenie is shutting down. Here's what that means, and how incident.io can help

Atlassian recently announced they’ll be shutting down Opsgenie, their popular on-call alerting tool. After June 4, 2025, no new Opsgenie accounts will be created, and by April 5, 2027, the service will shut down completely. Users don’t seem happy about it. If you’re currently using Opsgenie, this news is significant. A key part of your incident response process is disappearing, and Atlassian suggests moving to their other products, like Jira Service Management or Compass.

A seven-step framework for running incident debriefs

Ever wrapped up an incident, thought 'Phew, glad that’s over,' only to feel your stomach drop when you see the dreaded "Incident Debrief" on your calendar? We've all been there. Incident debriefs don't need to feel like sitting through your least favorite school subject. They can (and should!) actually be engaging and useful. At incident.io, we've found a simple, repeatable, and blameless framework.

Is Cloud Still King? The Shifting Landscape of Infrastructure

Believe it or not, we are in the middle of one of the biggest cloud repatriation movements of the past decade. More than ever, companies are rushing to find infrastructure solutions that better suit their needs. Over the past decade, hyperscalers have dominated the market, generating trust and, in some cases, overconfidence in software development. Drawn in by promises of reliability, ease of use, and ultimate flexibility, teams turned to providers like AWS, GCP, and Azure.

Effortless observability for Django applications

Observability is critical for web operations: it confirms that the application is working as expected and helps identify potential issues. However, setting it up has traditionally been challenging, because it can take hours to set up all the infrastructure, instrument your code, and enable observability in production. But now there is a better way: native support for Django in Charmcraft and Rockcraft, which has observability built in and ready to go!

Why Monitoring iManage is Critical for Enhancing End-User Experience in Legal Firms

As a Performance Field Technical Consultant working with customers in the legal industry, my primary focus is to ensure that technology enhances productivity rather than hinders it. Legal professionals rely on iManage as a business-critical application for document management, collaboration, and compliance. However, with the increasing shift to the cloud and integration with platforms like O365, ensuring a seamless user experience has become more complex.

How to keep track of what's running in your Gremlin team

Part of the Gremlin Office Hours series: a monthly deep dive with Gremlin experts. Reliability testing is ongoing, and tracking that work can be difficult in large organizations. According to our own product metrics, teams run an average of 200 to 500 tests each day! With so much happening, it’s hard to keep track of everything going on, unless you use Gremlin.