
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Top 6 EC2 rightsizing recommendations that you can't ignore

Imagine a day at work when you discover that your team's youngest developer forgot to terminate a compute instance: the bill spikes and the budget is breached. Rightsizing recommendations help in such situations by identifying underutilized, overutilized, or mismanaged resources and suggesting corrective actions.
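As a rough illustration of how a recommendation might flag underutilized or overutilized instances, here is a minimal sketch. The thresholds, the `InstanceUsage` type, and the `recommend` function are hypothetical and are not taken from the article or from any AWS API; real tools typically weigh several metrics over a longer window.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen for illustration only.
LOW_CPU_PERCENT = 20.0
HIGH_CPU_PERCENT = 80.0


@dataclass
class InstanceUsage:
    instance_id: str
    avg_cpu_percent: float  # average CPU utilization over some observation window


def recommend(usage: InstanceUsage) -> str:
    """Return a coarse rightsizing action based on average CPU alone."""
    if usage.avg_cpu_percent < LOW_CPU_PERCENT:
        return "downsize"  # underutilized
    if usage.avg_cpu_percent > HIGH_CPU_PERCENT:
        return "upsize"  # overutilized
    return "keep"
```

For example, `recommend(InstanceUsage("i-example", 5.0))` falls under the low threshold and returns `"downsize"`.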

Better CloudWatch Metrics in Honeycomb with the OpenTelemetry Collector

CloudWatch metrics can be a very useful source of information for a number of AWS services that don’t produce telemetry as well as instrumented code does. There are also a number of useful metrics for functions that are not based on web requests, such as metrics on concurrent database requests. We use them at Honeycomb to get statistics on load balancers and RDS instances. Amazon Data Firehose is able to export directly to Honeycomb as well, which makes getting the data into Honeycomb straightforward.

Preventing Alert Storms with InfluxDB 3's Processing Engine Cache

A common problem in monitoring and alerting systems is not just alerting on what you’re seeing but preventing alert storms from overwhelming operators. When a system generates multiple notifications for the same incident, it leads to alert fatigue and can mask other important issues. For time series data, alert fatigue can result in missed anomalies, delayed responses to critical trends, and difficulty distinguishing real performance degradations from noise.
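One common way to avoid repeated notifications for the same incident is to remember when each incident last alerted and suppress repeats within a cooldown window. The sketch below shows that idea with a plain in-memory dictionary. It is not the InfluxDB 3 Processing Engine API; the `AlertSuppressor` class, its method names, and the incident key format are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Hashable


class AlertSuppressor:
    """Suppress repeat notifications for the same incident within a cooldown."""

    def __init__(self, cooldown_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock  # injectable so the behavior can be tested
        self._last_sent: Dict[Hashable, float] = {}  # incident key -> last send time

    def should_notify(self, incident_key: Hashable) -> bool:
        """Return True if a notification should be sent for this incident now."""
        now = self._clock()
        last = self._last_sent.get(incident_key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still inside the cooldown window: suppress
        self._last_sent[incident_key] = now
        return True
```

With a 60-second cooldown, a second call for the same key within that minute returns `False`, while a different key still notifies immediately.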

Dashboard updates: Fewer clicks, more control, faster widget building

You're reviewing your production metrics when suddenly an error spike appears on your dashboard. Your immediate thought isn't "how do I build a new view to investigate this?" but rather "how do I find out the cause quickly?" This is exactly what happened to one of our engineering teams last month when they spotted an unusual pattern in their API response times. Instead of running ad-hoc queries from scratch, they turned to a custom dashboard they had built after a past incident.

Retail digital performance event recap: Key insights from IBM & Catchpoint

We hosted the first IBM and Catchpoint Retail Digital Performance event on Wednesday, March 19, 2025. The sessions offered practical, thought-provoking insights on speed, resilience, and user-centric design—giving attendees fresh strategies to improve digital experiences at scale.

Easiest Way to Monitor Your Java Application Using OpenTelemetry

When you're running a Java application, the JVM is doing a ton of work behind the scenes, but unless you're actively collecting its internal metrics, you're essentially flying blind. Fortunately, the JMX Prometheus Receiver paired with the JMX Java Exporter Agent offers one of the simplest and most effective ways to expose JVM performance data.

A Guide to Logging in React Native

Basic console logging is a good starting point for debugging and understanding an app. For larger, more complex apps, it’s helpful to include additional information and persist logs. In this guide, you’ll learn how to create and view logs in React Native and how to create and save custom logs to a file. We’ll focus on JavaScript logs.

What is a Branch in Git and How to Use It - Ultimate Guide

Developing a website or software isn't easy: one team of developers may be building a new feature, another testing whether that feature works as expected, and another fixing bugs. Managing these different versions of the same codebase can be tricky. This is where the Git concept of a branch comes in: a branch is a pointer to a snapshot of your changes. When we talk about branches in Git, these are the major questions that arise.
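The idea of a branch as a pointer to a snapshot can be sketched with a toy model. This is not how Git is implemented internally; `ToyRepo`, `Commit`, and the string snapshots are hypothetical stand-ins that only show that creating a branch copies a pointer and that committing moves only the current branch.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    id: str
    parent: Optional[str]  # id of the previous commit, if any
    snapshot: str          # stand-in for the full tree of files


class ToyRepo:
    def __init__(self) -> None:
        self.commits: Dict[str, Commit] = {}
        # Each branch name maps to the id of the commit it points at.
        self.branches: Dict[str, Optional[str]] = {"main": None}
        self.current = "main"

    def commit(self, commit_id: str, snapshot: str) -> None:
        """Record a snapshot and advance only the current branch pointer."""
        parent = self.branches[self.current]
        self.commits[commit_id] = Commit(commit_id, parent, snapshot)
        self.branches[self.current] = commit_id

    def branch(self, name: str) -> None:
        """Create a branch pointing at the same commit as the current one."""
        self.branches[name] = self.branches[self.current]

    def switch(self, name: str) -> None:
        if name not in self.branches:
            raise KeyError(name)
        self.current = name
```

In this model, after committing on `main`, creating `feature`, switching to it, and committing again, `main` still points at the first commit while `feature` points at the second.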