Latest News

Using Event Orchestration to reduce noise and trigger next best action

We often hear from customers that they’re dealing with unmanageable levels of noise and complexity, which makes it harder to pinpoint root cause and get to resolution quickly. All this effort spent on sifting through noise, processing events, and gathering context results in a lot of wasted time. That’s why we’ve launched Event Orchestration, which became generally available to our Event Intelligence and Digital Operations customers on Monday.

Announcing our newest integration: Confluence

Using FireHydrant’s Runbooks, incident and retro data can be automatically sent to Confluence at any point in the incident lifecycle. For example, the moment you’ve resolved an incident, FireHydrant can create a fresh Confluence page with all of the critical incident information stored in FireHydrant. With Runbook conditions, you can choose the perfect moment to send your FireHydrant retro to a Confluence workspace.

Sponsored Post

Five Ways Developers Can Help SREs

Reliability is a team game. The more Developers and SREs collaborate, the greater the success of the product. In this blog, we have listed five best practices that developers can adopt to make the SRE's life easier. It is not easy to be a site reliability engineer. Monitoring system infrastructure and aligning it with the key reliability metrics is quite a daunting task. A software engineer's job, by contrast, is to deliver high-quality software.

Episode 2: Mooving to Remix: Code You Will be Happy With

Episode 2 of Mooving to… dives into a new tool called Remix, a framework to help create front-end code you’ll love. This episode focuses on a new web framework that helps streamline your processes and eliminate downtime to the best of your ability. Thom Duran and Andrew Leonard of Moogsoft are joined by Kent C. Dodds, Director of Developer Experience at Remix.

Introducing CommsFlow for Context-Rich and Timely Updates to All Stakeholders

We’re so excited to announce our latest platform feature, CommsFlow™! This addition to the core Blameless product offering allows teams to keep stakeholders updated as the reliability of services and applications changes. With our new automated and customizable communication flows, on-call, engineering, and business teams feel a sense of accomplishment and, of course, stay informed.

Get Paid to Write About Mattermost Playbooks

Mattermost Playbooks help software engineering teams orchestrate their work across all tools and teams to plan projects and hit milestones by uniting their tech stack through a single point of collaboration. We want to see how our community is leveraging Playbooks in their own tech stacks and share those creations with everyone so the whole community benefits. We’re doing this by launching a new effort to commission original blog articles that show Playbooks in action.

Communicating to Users During Incidents

Imagine you're having a regular day at work, opening up your browser, double checking something for a client in that web app your team built for them, when suddenly you see an error screen. You hit refresh a few times, just to be sure. Nope. Still down. What happens next depends on how well your team has planned for incidents like this (some folks call it unplanned downtime).

Improving your team's on-call experience

Your engineers probably dislike going on-call for your services. Some might even dread it. It doesn't have to be this way. With a few changes to how your team runs on-call and deals with recurring alerts, you might find your team starting to enjoy it (as unimaginable as that sounds). I wrote this article as a follow-up to Getting over on-call anxiety.