
The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

DevOps vs. Agile

DevOps is a term for “a cross-disciplinary practice dedicated to the study of building, evolving and operating rapidly-changing resilient systems at scale” (Jez Humble). There is no wall between development and operations, so the two work simultaneously and without silos. The approach focuses on uniting the development and operations teams in a continuous process. Agile is a software development strategy that focuses on responding to change through cross-functional team communication.

Incident Response Alert Routing

You have identified a data breach. Now what? Your Incident Response Playbook is up to date. You have drilled for this, you know who the key players on your team are, and you have their home phone numbers, mobile phone numbers, and email addresses, so you get to work. It is seven o’clock in the evening, so you are sure everyone is available and ready to respond. You begin typing “that” email and making phone calls, one at a time.

7 Ways SRE Is Changing IT Ops And How To Prepare For Those Changes

SRE best practices are disrupting and catalyzing change in the ways organizations approach IT Operations. In this blog we look at 7 ways SRE is bringing about this transition. Site Reliability Engineering, also known as SRE, is a new practice that has been growing in popularity among many businesses. It puts a premium on monitoring, tracking bugs, and creating systems and automations that solve problems for the long term.

4 Major Capabilities of Automated Incident Management

Automated incident management ensures that critical events are detected, addressed and resolved in a fast, efficient manner. Automation allows incident management tools to integrate with each other and fosters instant communication across the systems. Automation tears down barriers across IT operations (ITOps) teams and ensures all departments are on the same page. Teams gain full visibility into incident status to verify that incidents are addressed by the relevant groups.

Pragmatic Incident Response: Lessons learned from failures, by Robert Ross (Failover Conf 2021)

Incident response is overwhelming. So where do you start? There's a lot of advice out there, but most of it is theory that doesn't take reality into account. So how do you get a process in place that actually works and scales? In this session, FireHydrant CEO and Co-Founder Robert Ross will share quick stories from his experience as an SRE and the tips he’s learned along the way.

JFrog and PagerDuty Extend Ecosystem Integration

JFrog and PagerDuty have deepened their technology integration to further boost IT operators’ and developers’ visibility into the software development lifecycle and accelerate incident resolution. The latest integration, which involves the JFrog Pipelines DevOps pipeline automation solution, simplifies and streamlines the identification of faulty builds that impact production environments.

Introducing 2-way REST capabilities with Enterprise Alert 9

The REST API in Enterprise Alert 9 has now been extended with 2-way functionality. This allows webhooks or REST endpoints in third-party systems to be called on alarm status changes (acknowledge, close). As a result, Enterprise Alert 9 makes it child’s play to establish a 2-way integration with almost any REST-enabled third-party system.
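
To make the callback direction concrete, here is a minimal sketch of how a third-party system might process a status-change call on its side. It is not Enterprise Alert's actual payload or API: the JSON field names `alarm_id` and `status`, the ticket states, and the function `handle_status_change` are all assumptions chosen for illustration, and the ticket store is a plain dictionary standing in for a real system.

```python
import json

# Hypothetical mapping from the alarm status changes named in the text
# (acknowledge, close) to ticket states in the receiving system.
STATUS_TO_TICKET_STATE = {
    "acknowledge": "in_progress",
    "close": "resolved",
}


def handle_status_change(raw_body, tickets):
    """Apply one alarm status-change callback to an in-memory ticket store.

    raw_body: JSON string with the assumed fields "alarm_id" and "status".
    tickets:  dict mapping alarm_id -> ticket state (stand-in for a real system).
    Returns an (http_status, message) tuple, as a webhook endpoint might.
    """
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, "invalid JSON"

    if not isinstance(payload, dict):
        return 400, "expected a JSON object"

    alarm_id = payload.get("alarm_id")
    status = payload.get("status")

    if alarm_id is None or status not in STATUS_TO_TICKET_STATE:
        return 400, "missing alarm_id or unsupported status"

    if alarm_id not in tickets:
        return 404, "no ticket for alarm"

    # Mirror the alarm's new status onto the linked ticket.
    tickets[alarm_id] = STATUS_TO_TICKET_STATE[status]
    return 200, "ok"
```

In a real deployment this function would sit behind an HTTP endpoint that the alerting product is configured to call, and the field names would follow whatever the product actually sends.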