The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Now you can use Sentry Insights to trigger alerts and debug issues

Jun 25, 2025 By Ben Coe In Sentry

You deploy a fix late Friday and spend the weekend refreshing dashboards, hoping nothing breaks. You shouldn’t have to babysit a dashboard to know when something’s wrong. With the latest updates to Insights, you can now create alerts directly from any chart. Whether it’s a spike in 4xx errors after a deploy, a jump in P95 latency for an API endpoint, or a drop in throughput for a background job, you can set up alerts with just two clicks.

Trace Distributed Map states for AWS Step Functions with Datadog

Jun 25, 2025 By Abhinav Vedmala In Datadog

AWS Step Functions offers the Distributed Map state, enabling you to coordinate massively parallel workloads within your serverless applications. With this feature, a single Step Functions execution can fan out into up to 10,000 parallel workflows simultaneously, making it possible to efficiently process millions of items in parallel. This capability unlocks new possibilities for large-scale data processing, such as image transformation, log ingestion, or batch analytics.
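
To make the fan-out concrete, here is a minimal sketch of what a Distributed Map state definition might look like, built as a Python dictionary and serialized to Amazon States Language JSON. The bucket, prefix, Lambda function name (`process-item`), and the `MaxConcurrency` value are illustrative assumptions, not values from the post, and the sketch covers only the state itself, not the Datadog tracing setup.

```python
import json


def distributed_map_state(bucket, prefix, max_concurrency=1000):
    """Build an Amazon States Language Map state that runs in DISTRIBUTED mode.

    Each object listed under s3://bucket/prefix becomes one item, and each item
    is handled by its own child workflow execution.
    """
    return {
        "Type": "Map",
        # Cap on how many child workflows run at once (assumed value).
        "MaxConcurrency": max_concurrency,
        # Read the items to process from an S3 object listing.
        "ItemReader": {
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "Parameters": {"Bucket": bucket, "Prefix": prefix},
        },
        "ItemProcessor": {
            # DISTRIBUTED mode is what turns a regular Map into a Distributed Map.
            "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
            "StartAt": "ProcessItem",
            "States": {
                "ProcessItem": {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::lambda:invoke",
                    "Parameters": {
                        "FunctionName": "process-item",  # hypothetical Lambda name
                        "Payload.$": "$",
                    },
                    "End": True,
                }
            },
        },
        "End": True,
    }


if __name__ == "__main__":
    # "example-bucket" and "images/" are placeholders.
    print(json.dumps(distributed_map_state("example-bucket", "images/"), indent=2))
```

The printed JSON can be dropped into the `States` section of a state machine definition; the child workflow here is a single Lambda task, but it could be any sequence of states.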

What is log tagging and how to configure it in Site24x7

Jun 25, 2025 By ManageEngine Site24x7 In Site24x7

In this video, learn what Site24x7's log tags are and how to use them to configure, categorize, filter, and monitor your logs more effectively, so you can create custom log tags that give you full visibility into your logs. Whether you're an IT professional, DevOps engineer, or security analyst, this video will help you create smarter tags to support your monitoring decisions.

Infrastructure monitoring with Site24x7 | Cloud, Kubernetes, and Hybrid Environments

Jun 25, 2025 By ManageEngine Site24x7 In Site24x7

Modern IT environments are dynamic, distributed, and constantly evolving. You need more than traditional monitoring to keep everything running smoothly. Site24x7 is your all-in-one, AI-powered infrastructure monitoring solution. Whether you're overseeing AWS, Azure, GCP, OCI, VMware, or Kubernetes, Site24x7 simplifies it all with a single agent and AI-driven insights.

Grafana Cloud updates: The latest features in Kubernetes Monitoring, Fleet Management, and more

Jun 25, 2025 By Kristin Knapp In Grafana

We consistently roll out helpful updates and fun features in Grafana Cloud, our fully managed observability platform powered by the open source Grafana LGTM Stack (Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics). In case you missed them, here’s our monthly round-up of the latest and greatest Grafana Cloud updates.

How to Configure Docker's Shared Memory Size (/dev/shm)

Jun 25, 2025 By Faiz Shaikh In Last9

Your Node.js app runs fine on your machine. But inside Docker? You start getting weird crashes—ENOSPC: no space left on device. Chrome headless tests fail out of nowhere. PostgreSQL throws shared memory errors under load. The problem? It’s probably /dev/shm, the shared memory volume Docker sets up by default. Most containers get just 64MB of space here.
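
As a rough sketch of how you might check and raise that limit: the helper below reports the size of a mounted filesystem such as `/dev/shm`, and a second helper builds a `docker run` command using the `--shm-size` flag. The image name `node:20` and the `256m` value are assumptions chosen for illustration; the right size depends on your workload.

```python
import os
import shutil


def shm_report(path="/dev/shm"):
    """Return (total_mb, free_mb) for the filesystem mounted at `path`."""
    usage = shutil.disk_usage(path)
    mb = 1024 * 1024
    return usage.total // mb, usage.free // mb


def docker_run_args(image, shm_size="256m"):
    """Build a `docker run` argument list that sets /dev/shm via --shm-size.

    The default of 256m is an arbitrary example, not a recommendation.
    """
    return ["docker", "run", "--rm", f"--shm-size={shm_size}", image]


if __name__ == "__main__":
    # Only meaningful where /dev/shm exists (Linux hosts and containers).
    if os.path.exists("/dev/shm"):
        total_mb, free_mb = shm_report()
        print(f"/dev/shm: {total_mb} MB total, {free_mb} MB free")
    # Print the command rather than running it; "node:20" is a placeholder image.
    print(" ".join(docker_run_args("node:20")))
```

Running `shm_report()` inside a container started without `--shm-size` should show the default allocation the post describes, which makes it a quick way to confirm whether shared memory is the bottleneck.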

Amazon SQS Metrics: Monitor, Debug, and Optimize Your Message Queues

Jun 25, 2025 By Anjali Udasi In Last9

Message queues quietly take care of a lot—buffering workloads, smoothing traffic spikes, and keeping services connected. But they don’t always get much attention until something feels off. Amazon SQS offers a solid set of metrics to help you understand how your queues are doing, whether you’re scaling well or nearing limits. This blog breaks down the key SQS metrics: where to find them, what they mean, and how to respond when things start to shift.
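
To illustrate the "how to respond" idea, here is a small sketch that takes a snapshot of a few CloudWatch SQS metrics and flags common warning signs. The metric names are the ones SQS publishes to CloudWatch, but the choice of metrics, the thresholds, and the `queue_findings` helper are assumptions made for this example and are not taken from the post.

```python
def queue_findings(metrics, max_backlog=1000, max_age_seconds=300):
    """Return a list of human-readable warnings for one SQS queue snapshot.

    `metrics` maps CloudWatch SQS metric names to their latest values.
    Thresholds are illustrative defaults; tune them per queue.
    """
    findings = []

    backlog = metrics.get("ApproximateNumberOfMessagesVisible", 0)
    if backlog > max_backlog:
        findings.append(f"backlog of {backlog} messages exceeds {max_backlog}")

    oldest = metrics.get("ApproximateAgeOfOldestMessage", 0)
    if oldest > max_age_seconds:
        findings.append(f"oldest message is {oldest}s old (limit {max_age_seconds}s)")

    sent = metrics.get("NumberOfMessagesSent", 0)
    deleted = metrics.get("NumberOfMessagesDeleted", 0)
    if sent > deleted:
        findings.append(f"producers outpacing consumers ({sent} sent vs {deleted} deleted)")

    return findings


if __name__ == "__main__":
    # Made-up sample values, for demonstration only.
    snapshot = {
        "ApproximateNumberOfMessagesVisible": 4200,
        "ApproximateAgeOfOldestMessage": 95,
        "NumberOfMessagesSent": 900,
        "NumberOfMessagesDeleted": 650,
    }
    for finding in queue_findings(snapshot):
        print(finding)
```

In practice the snapshot would come from CloudWatch rather than a hard-coded dictionary; keeping the evaluation logic as a pure function makes the thresholds easy to test and adjust.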

Introducing Cause Analysis: Instant Triage for Traffic Changes with Kentik AI

Jun 25, 2025 By Eric Hian-Cheong In Kentik

Introducing Cause Analysis from Kentik, designed to simplify network traffic analysis and rapidly identify the root cause of issues. Learn how this exciting new feature streamlines troubleshooting, makes complex insights accessible, and boosts team efficiency for all users.

Understanding APM and Distributed Tracing in the Observability Stack

Jun 25, 2025 By Pavithra Parthiban In Atatus

To keep modern applications running smoothly, you need more than just basic monitoring. APM (Application Performance Monitoring) gives you a broad overview, tracking metrics like latency, errors, and system health. Distributed Tracing, on the other hand, shows the full journey of each request across services, helping you pinpoint the root cause of slowdowns or failures.

How to Reduce IT Costs on Hardware Refresh Cycles

Jun 25, 2025 By Nexthink In Nexthink

IT budgets are under pressure, and hardware refresh costs continue to climb. For End User Computing (EUC) and IT professionals, the traditional time-based approach to managing device lifecycles is no longer viable. Simply replacing laptops and desktops every three to five years doesn’t reflect actual device performance, usage patterns, or business needs. The solution? A smarter, data-driven hardware refresh strategy that balances performance, cost-efficiency, and employee experience.
