The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Automate Weekly Rollbar Reports with Zapier + Google Sheets

Product Managers thrive on clarity. But when it comes to understanding application errors and trends, Rollbar’s rich occurrence data can sometimes feel overwhelming. With AI by Zapier + Google Sheets, you can turn this into a completely automated reporting pipeline—one that generates weekly reports of Rollbar occurrences, organizes them in Sheets, and arms PMs with insights they can use to guide roadmap decisions, reduce risk, and improve user experience.

SSL Certificate Management: A Complete Guide to Monitoring SSL Expiry, Validity & Certificate Health

Managing SSL certificates is essential for maintaining trust, security, and uptime across any website or online service. While many people think SSL certificate management refers to renewing or issuing certificates, one of the most critical aspects, often overlooked, is monitoring certificates for expiry, validity, and unexpected changes. That’s the area where monitoring platforms provide their highest value.
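
As a rough illustration of the expiry-monitoring idea above, here is a minimal Python sketch that uses only the standard library. It is not taken from the guide or from any monitoring product: the function names (`days_until_expiry`, `fetch_not_after`, `check_host`) and the 30-day warning threshold are assumptions chosen for the example.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days between `now` and a certificate's notAfter time.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    e.g. "Jan  5 09:34:43 2018 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to host:port over TLS and return the certificate's notAfter string.

    The default context also validates the chain and hostname, so an
    invalid certificate raises ssl.SSLCertVerificationError here.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def check_host(host: str, warn_days: int = 30) -> str:
    """Build a one-line status for a host (warn_days is a hypothetical threshold)."""
    remaining = days_until_expiry(fetch_not_after(host), datetime.now(timezone.utc))
    status = "WARN" if remaining < warn_days else "OK"
    return f"{status} {host}: {remaining:.0f} days left"
```

Calling `check_host("example.com")` from a scheduled job would cover the expiry and validity checks; detecting unexpected certificate changes would additionally require storing and comparing fingerprints, which this sketch does not attempt.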

Kentik in Motion: How AI Transforms Network Chaos to Clarity

Learn how artificial intelligence is transforming network operations through Kentik's AI Advisor platform. Philip Gervasi and Sean McGinley discuss the evolution from traditional network visibility to network intelligence, emphasizing that AI should augment, rather than replace, network engineers. They demonstrate how Kentik's AI Advisor uses natural language interfaces to perform automated root cause analysis, troubleshooting, and cost optimization.

Expose Hidden State Bugs with Sentry Logs

See how Sentry Logs can surface hidden state bugs that stack traces alone can’t explain. In this walkthrough, we debug a React Native app with an Express.js backend where a missing diet value causes a crash. We inspect the issue, pull in the connected logs, and confirm whether the problem comes from an initial render or from real backend data. By combining issues, traces, and logs from the same session, you get the full story—and a faster path to the fix.

Fixing Performance Issues Fast with Logs & Tracing

Learn how to quickly track down performance bottlenecks using Sentry Logs and Tracing. In this video, we walk through identifying a slow screen, jumping into the connected trace, and pinpointing slow backend steps, database calls, and AI/LLM operations. See how logs, issues, and traces work together to show the full picture of what happened in a single session.

Datadog at AWS re:Invent, Bits AI SRE, MCP Server, CloudPrem, and more | This Month in Datadog

Get a closer look at features we announced at AWS re:Invent in the latest episode of This Month in Datadog. Tune in for spotlights of Bits AI SRE, now generally available, and Datadog’s MCP Server, which connects AI agents to our platform by ingesting prompts and mapping them to Datadog resources and data. This Month in Datadog brings you the latest updates on our newest product features, announcements, resources, and events.

How Datadog Manages 50,000 Apache Iceberg Tables at Scale

Think managing a few database tables is hard? Try 50,000 production Iceberg tables storing petabytes of data with 8 million scans per day. In this clip, Datadog's platform team reveals the architecture choices behind their managed Iceberg implementation that serves hundreds of internal engineering teams.

Become a 10x investigator with Cribl Notebooks

Cribl Notebooks aims to streamline the investigation process by bringing everything into a single interactive interface. It functions as a virtual war room where teams can collaborate in real time. You can view AI queries and code alongside charts without switching between scattered tabs or workstations. This persistence makes it easier to document the root cause and share the story behind the data.

Agentic AI by Design: Evolving Our Principles for the Next Chapter of Responsible AI

Join SolarWinds CISO Tim Brown and CTO Sai Krishna for the SolarWinds Day Closing Keynote, where they share how SolarWinds is evolving from Secure by Design to AI by Design—a bold next step in building trusted, intelligent, and future-ready IT operations. As organizations adopt AI-driven systems, embedding trust, transparency, and accountability into product development becomes essential. In this forward-looking discussion, Tim and Sai reveal how the AI by Design framework ensures responsible AI adoption while enhancing performance, reliability, and security.

Runtime Context for AI Agents with Lightrun MCP

Introducing Runtime Context for AI agents: the next evolution in autonomous software development. The Lightrun MCP connects IDEs and AI assistants to real runtime data, giving agents and developers the context they need to write, validate, and debug code with confidence. Reliable, AI-accelerated engineering starts here.