
The latest news and information on incident management, on-call, incident response, and related technologies.

RedIron: Unifying Alerts and Notifications in IT

RedIron Canada is a Managed Services Provider (MSP), Retail Integrator, and Solutions Provider that specializes in managing cloud-based systems across AWS, Azure, and Oracle. Their expertise in IT monitoring and managed services makes them a trusted partner for retail businesses across North America. RedIron relied on traditional alert notification methods like email and SMS for their IT monitoring operations.

PagerDuty Runbook Automation 2024 Year in Review

Special guest Jeff Hausman, PagerDuty’s Chief Product Development Officer, kicks off our 2024 recap for PagerDuty Runbook Automation and Rundeck Open Source. Then Jake and Forrest take us through the improvements and new features added to the product, including shout-outs to the folks contributing to the Open Source repos and a customer success story from Ryanair.

January 2025 Product Update - Easier Onboarding, Better User Experience, and Reliability Improvements

For the last two months, we have focused on improving the onboarding experience so that users can get started with monitoring with minimal effort. We have also made several backend improvements to make the service more robust and reliable, and we have improved the dashboard user experience. Some of the usability improvements are driven by user feedback. Others incorporate what we would personally like to see in such a monitoring service.

Enhancing Your Developer Experience: New SDKs for TypeScript, Go, and Terraform and Improved API Documentation

We built FireHydrant to be the kind of platform we’d want to use as developers, giving you the same tools and flexibility we rely on every day. With over 350 publicly accessible API endpoints, we’ve always believed in giving developers the power to customize and extend our platform to meet their exact needs.

What's New: Supercharge workflows with Message Templates

We’re excited to introduce Message Templates, a powerful new feature designed to streamline communication and ensure consistency across teams. With pre-configured templates curated by Enterprise Administrators, OnPage phone app users can now send standardized messages with just a few taps—saving valuable time and reducing the risk of miscommunication in critical situations.

This Month in Datadog: Datadog On-Call is now generally available

Datadog is constantly elevating the approach to cloud monitoring and security. This Month in Datadog updates you on our newest product features, announcements, resources, and events. To learn more about Datadog and start a free 14-day trial, visit the Datadog website. This month, we put the Spotlight on Datadog On-Call.

A Plan to Achieve IT Resilience

Ensuring your organization can continue running critical services, even during unexpected challenges, requires a solid IT resilience plan. An IT resilience plan involves more than just traditional disaster recovery. It focuses on keeping vital applications, data, and business operations intact no matter what happens. In this guide, we’ll explore key components and best practices to help you establish a comprehensive plan for ongoing business continuity.