PagerDuty

Failure Fridays at PagerDuty

Rich Lafferty, Staff SRE at PagerDuty, and Stevenson Jean-Pierre, Senior Manager, Software Engineering at PagerDuty, join Mandi Walls to talk about PagerDuty’s Failure Friday and Failure Any Day practices. PagerDuty has been using failure injection and chaos engineering methods to maintain the reliability of production services. Rich and SJP joined the PagerDuty live stream to talk about how the process works, how it has evolved, and how failure helps improve PagerDuty’s services.

10 Years of Failure Friday at PagerDuty: Fostering Resilience, Learning and Reliability

In today’s fast-paced and ever-evolving world of technology, failure is inevitable. Organizations should embrace failure as an opportunity to learn how to build and deliver more resilient services. At PagerDuty, we’ve practiced Failure Friday for 10 years now. Failure Friday, a practice inspired by chaos engineering, involves intentionally injecting failures into our systems to improve reliability and foster a proactive engineering culture.

The Unplanned Show, Episode 6: Defining AIOps with Heath Newburn

“AIOps” is a term some love to hate, but what makes it useful? In this episode, Heath Newburn breaks down the three things to look for in an AIOps solution: reduce noise, create context, and reduce toil. He also explains the challenges of domain-specific approaches to AIOps compared with domain-agnostic ones. But even with a domain-agnostic approach, Heath warns of “gotchas”: rules “tech debt”, data formats, and long overall implementation times.

What's New in PagerDuty iOS and Android Mobile Applications

The PagerDuty Operations Cloud is your platform for action in critical moments. Using AI and automation, it can detect and diagnose disruptive incidents, assemble the appropriate team members for prompt response, and optimize your digital operations by streamlining infrastructure and workflows.

Gartner Market Guide: Embedding Automation Into the Enterprise

“Existing workload automation strategies are unable to cope with the expansion in complexity of workload types, volumes and locations driven by evolving business demand, as per Gartner. Digital business is slowed without collaboration and automation inside and outside of IT, leading to siloes of capabilities across business and IT teams. Cost optimization is an evolving challenge, driven by technical debt and requirements to demonstrate business value of investments.”

The Unplanned Show, Episode 5: DataOps with Snowflake

Long gone are the days when data was batch loaded into a data warehouse for business intelligence reports that were looked at periodically, and if something broke, a few internal people would have to wait. Today, data pipelines are “infinitely more complicated”, with more sources, from cloud services to on-premises systems, and they support data applications that are critical parts of a business’s ecosystem.