
Incident Management

The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

AlertOps Expert Guidance

At AlertOps, we believe our job doesn’t end when you complete the procurement of our software service; it has just begun. Our support team is only a call, email, or chat away, ready to guide you toward your goal. Our Customer Success team will also stay in constant touch, helping you with the issues at hand, guiding you on your usage patterns, recommending under-leveraged options, and highlighting new features and services that we have rolled out.

What is IT Monitoring?

IT monitoring involves the use of a combination of technologies to simultaneously ensure IT equipment performs as expected and resolve any identified IT problems. The capabilities of IT monitoring technologies vary; some technologies can perform a basic assessment of equipment across an IT environment, while others can automate the identification and remediation of equipment issues. Your business can leverage monitoring technologies, but optimizing their value requires careful evaluation.

AlertOps Automation

AlertOps is built for today’s fast-paced enterprise. Managing an incident can often be chaotic, so a streamlined workflow that automatically routes to the next step in your business process helps resolve the issue at hand quickly. End-to-end automated workflows, aided by a rules engine, allow you to optimize your processes and manage tasks efficiently.

The fault line: How to communicate in a crisis

If there’s one universal constant in the world of business, it’s that things will go wrong, probably at the most inconvenient of times and in the most inconvenient of ways. It’s Murphy’s law or, if you’re from England, the much more fun “Sod’s law”. These moments can define your business more than any other. Unfortunately, far more than the usual day-to-day ever will.

Q&A: Datadog Expands Monitoring Reach with Moogsoft Observability Cloud

Nobody will dispute that a common goal of DevOps pros and SREs, and really any company today, is to delight their customers more by disappointing them less. This was the theme of a recent live webinar focused on announcing a new game-changing partnership between Datadog and Moogsoft. The live session combined remarks by Moogsoft CEO Phil Tee and CTO Dave Casper on bringing together the best of these two technologies with a new seamless integration.

IT Operations Glossary 2021

With increasing complexity and workloads, the world of IT operations is constantly evolving to meet the needs of digital-first organizations. Automation, AI and DevOps are intersecting today like never before. A constant influx of new technologies means new terms. Here's our take on the meaning of leading words and phrases in the space right now.

Getting Started as an SRE? Here are 3 Things You Need to Know.

We live in the era of reliability. The most important feature of a service is how dependable it is in the eyes of a user. Companies are hiring with this in mind. In a 2019 LinkedIn article, site reliability engineers were listed as the second most promising career in the United States. But how do you get started as an SRE? In this blog post, we’ll look at three things you need to know. SRE is a multifaceted role. You will contribute to an organization's code base, policy, culture, and more.

FAANG proofing your Job Applications

There is one thing that hurts more than being rejected by a hiring manager - being rejected because you’re not ex-FAANG. This was not always the case though - FAANG’s combined engineering workforce is currently at 330,000+ and growing at an astounding 20% YoY. This means that at any given point in time, there are tens of thousands of FAANG engineers active in the job market vying for spots in great up-and-coming companies.

AlertOps Flexibility

We believe that our customers should not have to make compromises in their business process to implement and use AlertOps. AlertOps offers total flexibility: it is highly configurable, so it can address your actual pain points. Flexibility is one of our core tenets. From the ideation stage of our product roadmap, we think through each design aspect to give users maximum flexibility to configure the software to their needs.

4 Things you Need to Know about Writing Better Production Readiness Checklists

When we think of reliability tools, we may overlook the humble checklist. While tools like SLOs represent the cutting edge of SRE, checklists have been recommended in many industries such as surgery and aviation for almost a century. Checklists owe this long and widespread adoption to their usefulness. They can also help limit errors when deploying code to production. In this blog post, we’ll cover four things you need to know about writing them. Production checklists should be holistic.