
Latest News

Site Reliability Engineering Meets Traditional Operations

Google has effectively made the discipline of site reliability engineering (SRE) a DevOps best practice by publishing two decades’ worth of lessons in keeping alive the most scalable apps on the planet. As more organizations make the shift (or “transformation,” as it were) to becoming IT organizations, the demand for reliability increases substantially for customer-facing services.

Virtual Offsites: A Collaboration Approach For Distributed Teams

Once a year, PagerDuty’s SREs get together for a three-day, in-person offsite. With the team spread across three time zones in the U.S. and Canada, encompassing two offices and three remote members, face time is rare and valuable. We use our offsites for thoughtful discussions on team health, long-term project roadmap planning, refining and updating our team’s mission, and simply spending time together as a team.

Predicting The Next Big Wave of DevOps Cultural Transformation

We read with interest a recent CloudBees article published in The New Stack, “How Culture Will Make or Break Cloud Native DevOps,” and have seen some highly differing views on how far DevOps adoption has progressed. The CloudBees article starts by saying that “Software delivery cycles are becoming faster thanks to DevOps-backed continuous integration/continuous delivery (CI/CD) as production pipelines are increasingly ported to scale with microservices on cloud-native environments.”

ActiveMQ architecture and key metrics

Apache ActiveMQ is message-oriented middleware (MOM), a category of software that sends messages between applications. Using standards-based, asynchronous communication, ActiveMQ allows loose coupling of the elements in an IT environment, which is often foundational to enterprise messaging and distributed applications.

Collecting ActiveMQ metrics

In Part 1 of this series, we looked at how ActiveMQ works, and the key metrics you can monitor to ensure proper performance of your messaging infrastructure. In this post, we’ll show you some of the tools that you can use to collect ActiveMQ metrics. This includes tools that ship with ActiveMQ, and some other tools that make use of Java Management Extensions (JMX) to monitor ActiveMQ brokers and destinations.

How to Diagnose and Fix AWS Lambda Iterator Age

AWS Lambda can use stream-based services as invocation sources, essentially making your Lambda function a consumer of those streams. Stream sources include Kinesis Streams and DynamoDB Streams. When you allow streams to invoke your Lambda function, Lambda emits a CloudWatch metric called IteratorAge. In this post, we discuss what this metric is and how to bring it down if it’s too high.