The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

OnlineOrNot updates from January 2026

Feb 4, 2026 By Max Rozen In OnlineOrNot

Hopefully this will be one of the last major "behind-the-scenes" updates for a while: OnlineOrNot's frontend now runs on a React framework that's easy to deploy across multiple providers, and it no longer depends on GraphQL, relying instead on its own REST API.

Observing agentic AI workflows with Grafana Cloud, OpenTelemetry, and the OpenAI Agents SDK

Feb 4, 2026 By Adam Quan In Grafana

As agentic AI applications are used more broadly in production, they introduce new operational models, combining multi-step reasoning, tool execution, and autonomous decision-making into a single workflow. SRE teams need visibility into how these agents behave, where they fail, and how they perform over time.

Monitoring Sprawl: Why IT Teams Still Can't Get Actionable Insight Fast

Feb 4, 2026 By LogicMonitor In LogicMonitor

IT teams collect extensive monitoring data but struggle to turn it into fast, confident decisions during incidents. Most IT leaders aren’t worried about whether their environments are monitored—they’re worried about whether their teams can make sense of what they’re seeing quickly enough to actually resolve issues. When something breaks, the problem usually isn’t finding data. Dashboards show activity, alerts indicate changes, and logs capture events across the entire stack.

AI Agent Governance: How to Keep Agentic ITOps Workflows Safe

Feb 4, 2026 By Margo Poda In LogicMonitor

The future of ITOps automation is better control over what AI agents can see, share, and do. AI automation in ITOps is expected to resolve incidents, reduce operational load, and operate with limited human involvement. Those outcomes depend on systems that can take action, not just surface insight. Agentic AI enables that shift. AI agents can correlate signals across tools, update tickets, trigger remediation, and coordinate workflows without waiting for instruction.

Make faster, better product decisions with Datadog Product Analytics

Feb 4, 2026 By Milene Darnis In Datadog

Product managers (PMs) need to make fast, confident decisions about what to build, fix, and improve based on user behavior within their application. But in practice, collecting the user insights they require is rarely straightforward. Recent updates to Datadog Product Analytics address this challenge. Product Analytics adds structure to autocaptured data and makes analysis easier to interpret, reuse, and share, helping PMs move from questions to answers without relying on SQL or engineering.

Surface and remediate runtime posture issues with Workload Protection Findings

Feb 4, 2026 By Danila Ivanov In Datadog

Threat detection and runtime posture monitoring are related but different jobs. Security teams already rely on Datadog Workload Protection to detect threats in real time across hosts and containers. But the actions that lead to those detections (file manipulation, process execution, network calls, or kernel activity) can be indicative of compromise or simply of risky behavior—like running compilers in production containers.

Alert Noise Isn't an Accident - It's a Design Decision

Feb 4, 2026 By James Barnes In StatusCake

In a previous post, The Incident Checklist: Reducing Cognitive Load When It Matters Most, we explored how incidents stop being purely technical problems and become human ones. These are moments where decision-making under pressure and cognitive load matter more than perfect root cause analysis. When systems don’t support people clearly in those moments, teams compensate. They add process. They add people. They add noise. Alerting is one of the most visible places where this shows up.

The Grok-to-AI Evolution: Why Modern SREs Are Moving Beyond Manual Parsing

Feb 4, 2026 By Mezmo In Mezmo

Grok structures logs. Context engineering connects systems. AI explains behavior. For years, Grok patterns have been the workhorse of the SRE world. Built on regular expressions, Grok helps teams extract structure from unstructured logs. As we explored in "Do You Grok It?", Grok is the key to turning messy log lines into usable fields. It's why our Grok Pattern Reference remains one of our most-visited resources — SREs are hungry for structure.

ISO 27K Without the Bloat: An Open Source Approach

Feb 4, 2026 By Tony Ramos In ObservIQ

ISO 27001 is often framed as an enterprise-only exercise: long timelines, expensive tooling, consultants everywhere, and a lot of compliance work that exists mainly to survive an audit. As a ~40-person, engineering-driven SaaS company, we needed the same level of trust and rigor as much larger organizations, but we weren’t willing to accept shelfware, parallel compliance infrastructure, or controls that only exist on paper. We also didn’t stop at ISO 27001.

Tech Talk | Splunk MCP & Agentic AI: Machine Data Without Limits

Feb 4, 2026 By Splunk In Splunk

In this session, we’ll show how MCP empowers autonomous AI agents to retrieve, process, and share live data anywhere it’s needed — breaking barriers and accelerating insight across your organization.
