
Learning MCP with PagerDuty

Join PagerDuty's Software Engineers José Côrte-Real and Manuel Reis, and host Daniel Afonso, Senior Developer Advocate, for a dive into Model Context Protocol (MCP). We'll explore what it is, how it works, and showcase practical use cases in action. Plus, get an exclusive sneak peek at PagerDuty's upcoming open-source MCP server and learn how it can enhance your workflows.

The Case for Intelligent Automation in Network Operations

In the last decade or so, network infrastructure has undergone a massive transformation. With the rise of hybrid cloud, distributed applications, and software-defined everything, managing networks has become exponentially more complex. What used to be a stable, predictable environment is now a constantly evolving system of interconnected services, protocols, and devices, each with its own telemetry, APIs, and failure models.

Fix Vulnerabilities Faster: Puppet's Advanced Patching Solution

Break down patching silos and remediate vulnerabilities faster with Puppet. Most CVEs sit unaddressed for weeks, even after your scanner picks them up. Vulnerability Remediation in Advanced Patching (a Puppet Enterprise Advanced exclusive) gives Security and Ops teams an easy-to-use dashboard for finding, fixing, and reporting on vulnerabilities. No more tossing CVEs over the fence. No more finger-pointing when things go wrong. Just swift, efficient vulnerability management.

Looking beyond dev productivity to increase speed ft. Brian Guthrie of Justworks

Speed isn't just about developer productivity—it's about market dominance. Rob sits down with Brian Guthrie, Director of Engineering at Justworks and former ThoughtWorks consultant, to explore why lead time from conception to production should be your organization's north star metric.

Kubernetes Clusters Break in the Weirdest Ways

If you’ve ever spent hours chasing a weird issue in your Kubernetes cluster, you’re in good company. Reddit’s r/kubernetes is full of hilarious and painful stories about clusters going off the rails for reasons no monitoring dashboard ever predicted. And while it’s easy to laugh after the fact, each of these moments highlights just how important observability is, because these kinds of problems don’t show up on your radar until it’s too late.

How to ensure your AWS workloads are resilient

Part of the Gremlin Office Hours series: A monthly deep dive with Gremlin experts. Cloud providers like AWS give you plenty of tools to make your workloads more resilient, but it’s up to you to apply them. However, considering how complex some of these tools are, where do you start? And how can you be sure your systems are more reliable as a result?

Let Git Find the Bug for You (No Guessing)

Somewhere in your commit history, a bug snuck in. You could scroll. Panic. Guess. Or you could let Git find the exact commit that broke your code. In this episode of Wait… Git Can Do That?, we show you how git bisect binary-searches your history to isolate the problem: fast, clean, and testable. Use git bisect start, git bisect good, and git bisect bad to mark the range, test each step to narrow it down, or automate the whole search with git bisect run.
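To make the binary-search idea concrete, here is a minimal Python sketch of the kind of search git bisect performs over a linear history. It is an illustration only, not Git's actual implementation: the function name first_bad_commit, the commit labels c0 through c7, and the is_bad check are all hypothetical stand-ins for a real history and a real test.

```python
from typing import Callable, List, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in a linear history.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good, and commits[-1] is known bad (the same starting point you give
    git bisect with its good and bad marks).
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")

    good, bad = 0, len(commits) - 1  # indices of the known-good and known-bad ends
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # the bug was introduced at or before mid
        else:
            good = mid  # the bug was introduced after mid
    return commits[bad]


# Hypothetical history: the bug first appears in commit "c5".
history: List[str] = [f"c{i}" for i in range(8)]
tested: List[str] = []


def is_bad(commit: str) -> bool:
    """Stand-in for checking out a commit and running the test suite."""
    tested.append(commit)
    return history.index(commit) >= 5


culprit = first_bad_commit(history, is_bad)
print(culprit, tested)
```

With eight commits, the sketch tests only three of them before landing on the culprit, which is why bisecting stays fast even on long histories. Real histories with merges are not a simple list, so Git's own selection of the next commit to test is more involved than this midpoint rule.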

Overview of Alerts, Real-Time Analysis, & Traceroute

Learn how Uptime.com alerts you the moment a check goes Up or Down, complete with technical details and root cause analysis for API and Transaction checks. Dive into Real-Time Analysis to track outage timelines and get detailed insight into every alert. Plus, see how Traceroute from global or private probe servers helps identify connection issues quickly and accurately. Stay informed. Respond faster. Resolve smarter.