
AIOps and Performance Monitoring: A One-Two Punch for IT Operations

Sugar Ray Robinson and Jake LaMotta. Marvelous Marvin Hagler and Tommy Hearns. Muhammad Ali and Joe Frazier. All were among history’s greatest boxers, but when they met in the ring, each brought out the best in the other. It’s the same in IT management. Some tools and platforms are essential to IT operations on their own, but when paired as an infrastructure management tandem, each complements the other and makes both systems more effective.

PagerDuty Integration Spotlight: InfluxData

InfluxData is an open source platform purpose-built for time series data such as metrics and events. It puts the essential time series toolkit in one place: dashboards, queries, tasks and agents. InfluxDB is even more programmable and performant with a common API across OSS, cloud and enterprise editions. Send events to PagerDuty to keep your teams informed. Check out InfluxData’s integration.

Looking forward to KubeCon

KubeCon + CloudNativeCon North America is just around the corner. I’ve been looking forward to this event for a long time, especially since 2020 was virtual and it looks like there will be an in-person option this year. This should be a great event, with a ton of awesome sessions. Last year was simply enormous, with over 15K attendees joining virtually.

How to mitigate the 0-day Apache path traversal vulnerability with Puppet or Bolt

Apache has disclosed a critical, actively exploited path traversal flaw in version 2.4.49 of the popular Apache web server. This path traversal means that an attacker can trivially read the contents of any file on the server that the Apache process has access to. This could expose highly sensitive information, even material as critical as the server's own SSL private keys. See the Sonatype blog for more technical information on the vulnerability.
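Before applying any mitigation, it helps to know which hosts actually run the affected release. The following is a minimal Python sketch, not the Puppet or Bolt approach the article describes. It assumes you have already collected a version banner in the form printed by `httpd -v` (for example `Server version: Apache/2.4.49 (Unix)`). The helper names `parse_apache_version` and `is_affected` are hypothetical, and the set of affected versions contains only the release named above.

```python
import re

# Assumption: only the release named in the summary above is listed here.
AFFECTED_VERSIONS = {"2.4.49"}

# Matches the "Apache/x.y.z" token found in `httpd -v` output or a Server header.
_VERSION_RE = re.compile(r"Apache/(\d+\.\d+\.\d+)")


def parse_apache_version(banner):
    """Return the 'x.y.z' version found in a banner string, or None."""
    match = _VERSION_RE.search(banner)
    return match.group(1) if match else None


def is_affected(banner):
    """Return True if the banner reports a version in AFFECTED_VERSIONS."""
    return parse_apache_version(banner) in AFFECTED_VERSIONS


if __name__ == "__main__":
    sample = "Server version: Apache/2.4.49 (Unix)"
    print(parse_apache_version(sample), is_affected(sample))
```

A banner that does not contain an `Apache/x.y.z` token is reported as not affected, so hosts that hide their version still need to be checked by other means.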

Facebook, Instagram, and WhatsApp's Outage - Understanding MTTR

Yesterday, the most-used social media platforms in the world were inaccessible for 6 hours straight. Later, in a press release, Facebook revealed that the outage was due to configuration changes in their routers. Facebook no doubt has a rigorous incident response plan, yet a small blind spot resulted in a significant business interruption. So how do we avoid this? The truth is, outages and performance issues are bound to happen in any network.
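The MTTR in the headline is commonly read as mean time to recovery: total downtime divided by the number of incidents. A minimal Python sketch of that arithmetic follows. The function name `mean_time_to_recovery` and the incident timestamps are made up for illustration. One sample incident lasts 6 hours to mirror the outage length mentioned above, but none of the values are real outage data.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average downtime over (start, end) datetime pairs.

    Raises ValueError for an empty list or an incident that ends before it starts.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    durations = []
    for start, end in incidents:
        if end < start:
            raise ValueError("incident ends before it starts")
        durations.append(end - start)
    # Sum the timedeltas, then divide by the incident count.
    return sum(durations, timedelta()) / len(durations)


if __name__ == "__main__":
    # Hypothetical incidents: 6 hours, 1 hour, and 30 minutes of downtime.
    sample_incidents = [
        (datetime(2021, 1, 1, 9, 0), datetime(2021, 1, 1, 15, 0)),
        (datetime(2021, 2, 1, 9, 0), datetime(2021, 2, 1, 10, 0)),
        (datetime(2021, 3, 1, 9, 0), datetime(2021, 3, 1, 9, 30)),
    ]
    print(mean_time_to_recovery(sample_incidents))
```

Because the result is a plain average, a single long outage dominates it, which is one reason to look at individual incident durations as well as the mean.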

PagerDuty Integration Spotlight: HashiCorp Terraform

Manage your PagerDuty account objects with Terraform! Reap all the benefits of infrastructure as code and give your teams the flexibility they need to manage their services in real time. As infrastructure stacks grow increasingly complex and involve an ever-growing number of services and systems, teams have looked to abstract configuration into its own layer of code. This concept of configuring infrastructure as code is gaining traction throughout the industry for a variety of reasons.