Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Driving Innovation: A Bias Towards Action with Greg Freeman

AI is changing network operations faster than ever. In the latest episode of Next-Gen Network Heroes, Bob sits down with Greg Freeman of Lumen Technologies to talk about what it takes to innovate across one of the world’s largest telecommunications networks. From deterministic workflows to agentic AI, Greg shares how his team is using automation, analytics, and AI to improve network reliability, customer experience, and operational efficiency at scale.

What kind of correlations become impossible without depth and breadth?

Most teams don’t have a data problem. They have a correlation problem. When visibility is fragmented:
→ Marketing sees conversion drop
→ Engineering sees API latency
So the wrong call gets made. Example: Checkout drops → pricing gets blamed → discounts applied. Reality: a backend API timeout was killing transactions. That’s what happens when you can’t connect user impact (what) to system behavior (why).
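As a rough illustration of the checkout example, the sketch below lines up a user-impact metric with candidate system signals and flags the ones that move against it. Everything here is assumed for illustration: the sample values, the signal names (`api_timeout_rate`, `list_price`), the helper `likely_causes`, and the -0.8 threshold are hypothetical and not taken from any specific product.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Convention chosen for this sketch: a constant series cannot
        # explain a change, so it is treated as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def likely_causes(conversion, signals, threshold=-0.8):
    """Names of signals that rise sharply while conversion falls."""
    return [
        name
        for name, series in signals.items()
        if pearson(conversion, series) <= threshold
    ]


# Hypothetical per-minute samples covering the same eight minutes.
checkout_conversion = [0.042, 0.041, 0.040, 0.022, 0.015, 0.014, 0.039, 0.041]
signals = {
    "api_timeout_rate": [0.01, 0.01, 0.02, 0.35, 0.52, 0.55, 0.03, 0.01],
    "list_price": [49.0] * 8,  # pricing did not change during the drop
}

suspects = likely_causes(checkout_conversion, signals)
print(suspects)
```

With both series on one timeline, the unchanged price drops out as a suspect while the timeout rate stands out. Correlation alone does not prove cause, so a result like this is a pointer for where to look next, not a verdict.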

Powering Autonomous IT with Edwin AI in ServiceNow Now Assist

Edwin AI extends ServiceNow Now Assist with real-time incident intelligence, acting as a context broker between observability data and ServiceNow incidents. Responders get the context they need inside the IT operations workflow they already use. The Edwin AI Agent for ServiceNow brings real-time incident intelligence into Now Assist and Workspace, giving ITOps teams root cause, impact, and recommended next steps directly inside the ServiceNow incident record.

Get Valid TLS Certificates for Icinga Web Despite a Firewall

Lots of big companies lock down their IT infrastructure inside the internal network. Sometimes they even use only locally mirrored repositories. I totally understand this, especially since our CVE-2024-49369. Nowadays, when LLMs find security holes even in OpenBSD, you definitely shouldn’t expose any services to the public unnecessarily.

How to Prevent AI Agents From Deleting Production Data

There’s a new question teams are asking: how can we prevent AI agents from deleting production? When Cursor deleted PocketOS’s entire production database in nine seconds, the agent wasn’t malfunctioning. It had full technical capability, but it was inferring operational authority from static code rather than live environment state. That gap between capability and context is the root cause. This article breaks down exactly how that happens, and what runtime visibility does to stop it.

Inside the .de DNS Outage: Real-World Data from UptimeRobot

On the evening of May 5th, 2026, large parts of the German web briefly went dark. For a few hours, anyone trying to load a .de address through a major DNS resolver got errors instead of websites. Bahn.de, Amazon.de, and Spiegel.de were among the affected sites. Major brands like Telekom, DHL, and Sparkassen felt it too, along with hosting providers Hetzner, Strato, and Ionos.

Navigating the Middleware Maze: How meshIQ 12.1 Redefines Scale and Simplicity with Agentic AI

meshIQ v12.1 transforms middleware management with petabyte-scale data processing and agentic AI. The new intelligent launchpad, simplified onboarding, and context-aware safeguards move teams from reactive monitoring to proactive, AI-driven operations across the enterprise.

Analyze cloud costs with flexible spreadsheets in Datadog Sheets

Cloud cost data is most useful when teams can adapt it to their own reporting and planning needs. In addition to viewing cost breakdowns, FinOps teams often need to calculate forecasts, reshape datasets, and present tailored views to finance and leadership teams. In many workflows, those steps happen outside the observability platform. Once the data is exported, it quickly becomes outdated and requires repeated manual updates.

Datadog for Government achieves FedRAMP High certification

Modern government missions depend on software platforms that can perform under demanding conditions. As agencies update systems that support public safety, benefits delivery, financial operations, and national priorities, they face security and compliance requirements that shape how technology is adopted as well as how it is built, operated, and evolved over time.