Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Introducing Updog.ai: Real-time provider status from Datadog

When external SaaS providers or cloud services degrade or go down, engineers often find themselves wondering if the issue they're encountering is local or more widespread. The answers they find are usually slow to surface, limited in detail, or entirely dependent on the provider's updates. Vendor-controlled status pages and third-party aggregators don’t provide the timely, independent visibility that's necessary to quickly and accurately identify the root cause of slowdowns.

What Is OpenTelemetry? The Future Is Here

Watch SolarWinds tech evangelist Sascha Giese dive into OpenTelemetry (OTel) and explain why a vendor-agnostic standard is the future of observability and application performance monitoring (APM). If you’ve ever wondered what OpenTelemetry is, Sascha’s presentation is a great place to start, or to dive back into the topic.

What Is an Email Blacklist?

An email blacklist is a database that lists IP addresses or domains suspected of sending spam or malicious emails. Mail servers use these lists to decide whether to deliver or reject incoming messages. Understanding how blacklists work is essential for keeping your messages deliverable and your domain reputation intact.
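As a rough illustration of how a mail server might consult such a list, here is a minimal Python sketch. It assumes a DNS-based blacklist (DNSBL), one common way these databases are published: the server reverses the octets of the sender's IPv4 address, appends the list's zone, and treats any DNS answer as "listed". The zone `dnsbl.example.org` and both function names are hypothetical placeholders, not a real service or API.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # Validate the address; raises ValueError on malformed input.
    addr = ipaddress.IPv4Address(ip)
    # DNSBL convention: reverse the octets, then append the list's zone.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the lookup resolves, i.e. the address appears on the list.

    Note: a resolver failure also raises socket.gaierror, so this sketch
    cannot tell "not listed" apart from "lookup failed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Hypothetical zone; substitute the zone of the list you actually use.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A real mail server would typically check several lists and weigh the results rather than reject on a single hit; the sketch only shows the lookup mechanics.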

AI-Powered Translation Tools: A Hidden Asset for Scaling DevOps Globally

DevOps teams, which bring together development (Dev) and IT operations (Ops), are no longer confined to single geographic locations or language groups. With over 80% of organizations now practicing DevOps (a figure projected to reach 94% in the near future), the challenge of scaling operations globally has never been more critical. Yet one persistent bottleneck continues to slow down even the most sophisticated DevOps workflows: language barriers.

Get started with Grafana Alerting: Route alerts using dynamic labels

In this tutorial, you will learn how to configure notification policies for dynamic routing based on query values. Don't miss the rest of the "Get started with Grafana Alerting" series! Each part dives into a different feature to help you get the most out of alerting in Grafana.

Demo of Raygun's remote MCP

This Raygun remote MCP demo highlights the new depth of context available. The agent isn’t just fetching error lists; it’s reasoning through stack traces to find the issues. Combined with the ability to now view associated deployment versions, browser information, breadcrumbs, customer data and more, the agent becomes far more capable at solving errors. We’ve even heard of some early testers going from having errors in production to having them solved within minutes.

AWS Outage: How do you prepare for the failure of your own safety net?

When AWS’s massive outage struck, it didn’t just take down cloud services, apps, and enterprise platforms. It also knocked out many of the monitoring systems organizations depend on for real-time answers. Observability companies, including Datadog, New Relic, Checkly, Dynatrace, SpeedCurve, and Splunk Observability, lost visibility or functionality precisely when organizations needed them most.

Unreal Engine crash reporting now available on gaming consoles with trace-connected logs

With the first major release of the Sentry Unreal SDK (now on v1.2.0, which you can also explore in our interactive sandbox), we’ve made some important improvements for cross-platform Unreal developers in platform coverage, debugging with user feedback, and performance monitoring. Here’s what’s new.