
Using AI + Rollbar's Session Replay to Understand Complex Errors

Front-end bugs are notoriously hard to reproduce. By the time an error shows up in your monitoring tool, the most important context is already gone: *what the user actually did*. Session replay helps, but only if someone has the time and patience to scrub through recordings, correlate events, and form a hypothesis. That's where Rollbar's MCP server, paired with an AI agent like GitHub Copilot, changes the game: teams can move from *"something broke"* to *"here's exactly why it broke"* in minutes, not hours.
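The core move an agent makes here is simple: given an error timestamp, pull just the replay events leading up to it instead of scrubbing the whole recording. A minimal sketch of that idea, with entirely made-up event names and timestamps (this is not Rollbar's actual replay schema or MCP API):

```python
from datetime import datetime, timedelta

# Hypothetical session-replay events: (timestamp, description).
events = [
    (datetime(2024, 1, 1, 12, 0, 0), "click #checkout"),
    (datetime(2024, 1, 1, 12, 0, 2), "input #promo-code"),
    (datetime(2024, 1, 1, 12, 0, 5), "click #apply"),
    (datetime(2024, 1, 1, 12, 0, 6), "xhr POST /api/discount -> 500"),
]
error_at = datetime(2024, 1, 1, 12, 0, 6)

def context_before(events, error_at, window_seconds=10):
    """Keep only the events in the window leading up to the error."""
    start = error_at - timedelta(seconds=window_seconds)
    return [desc for ts, desc in events if start <= ts <= error_at]

print(context_before(events, error_at))
```

The agent then reasons over that short, ordered slice of user actions rather than the full session, which is what makes the "minutes, not hours" claim plausible.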

Agentic AI demands a new data architecture #ai #telemetry

Clint Sharp explains why traditional schema-on-read systems cannot handle the query loads of the future. Agentic telemetry requires a 360-degree view, but structuring data only when you read it is too slow for AI-driven workloads. The solution is using LLMs to drive the cost of building parsers to near zero. Tools like Copilot Editor allow teams to map data to OCSF instantly, effectively building factories of parsers to handle the scale of agentic AI.
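To make the "factories of parsers" idea concrete: an LLM-generated parser is ultimately just a small function mapping a raw log line onto standardized field names. A sketch of what one such parser might look like for a failed SSH login, using an illustrative subset of OCSF-style fields (the log format and field selection here are assumptions, not the talk's actual output):

```python
import re

# Hypothetical generated parser for an sshd "Failed password" line.
LINE = re.compile(
    r"(?P<time>\w{3} +\d+ [\d:]+) \S+ sshd\[\d+\]: "
    r"Failed password for (?P<user>\S+) from (?P<src_ip>[\d.]+)"
)

def to_ocsf(raw: str) -> dict:
    """Map a raw log line to a few OCSF-style Authentication fields."""
    m = LINE.match(raw)
    if not m:
        return {}
    return {
        "class_name": "Authentication",
        "activity_name": "Logon",
        "status": "Failure",
        "user": {"name": m["user"]},
        "src_endpoint": {"ip": m["src_ip"]},
    }

print(to_ocsf("Oct  3 14:02:11 host sshd[4242]: "
              "Failed password for alice from 10.0.0.7"))
```

Writing one of these by hand is cheap; writing thousands, one per log format, is not. That is the cost an LLM drives toward zero.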

How AI Agents automate incident response #ai #cybersecurity #telemetry

Clint Sharp demonstrates how Cribl Search leverages AI to streamline incident investigation. Starting from a Slack channel, the AI builds an interactive notebook, analyzes order-processing logs, and identifies suspicious traffic spikes. It connects high CPU usage to a recent Jenkins deployment, hypothesizes a supply chain attack, and ultimately recommends a rollback. This isn't a far-off concept; it is the future of operations arriving right now.
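The "connect high CPU to a recent deployment" step boils down to a temporal correlation: which deploys landed shortly before a metric spike? A toy sketch of that logic, with invented service names and data (not Cribl's implementation):

```python
from datetime import datetime, timedelta

# Hypothetical deploy log and CPU samples (timestamp, percent).
deploys = [("orders-svc v1.4.2", datetime(2024, 6, 1, 9, 55))]
cpu_samples = [
    (datetime(2024, 6, 1, 9, 50), 35.0),
    (datetime(2024, 6, 1, 10, 0), 92.0),  # spike
]

def suspect_deploys(deploys, cpu_samples, threshold=80.0,
                    window=timedelta(minutes=15)):
    """Flag deployments that landed shortly before a CPU spike."""
    spikes = [ts for ts, pct in cpu_samples if pct >= threshold]
    return [
        name for name, deployed in deploys
        if any(timedelta(0) <= spike - deployed <= window for spike in spikes)
    ]

print(suspect_deploys(deploys, cpu_samples))  # -> ['orders-svc v1.4.2']
```

An agent wraps this kind of check in natural-language reasoning, but the underlying hypothesis test is this simple.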

Why AI agents need a common data model #ai #telemetry

Clint Sharp explains why a common model like OCSF is critical for the future of AI. Agents need standardized data to analyze information effectively on your behalf. He contrasts today's manual workflow (checking Slack, tickets, and wikis while asking colleagues) with a future where AI fuses that human context with machine data. Instead of just search results, AI agents will hand you examined hypotheses, so you know exactly where to take your investigation.

AI Reliability, Part 2: When the Datacenter Becomes the Bottleneck

In Part 1, we talked about all the hidden complexity inside AI systems: the pipelines, GPUs, embeddings, vector databases, orchestration layers, and everything else that quietly determines how reliable an AI-first product really is. But all of that software still rests on something far less glamorous: the physical infrastructure underneath it.

How to use AI to analyze and visualize CAN data with Grafana Assistant

Note: A version of this post originally appeared on the CSS Electronics blog. Martin Falch, co-owner and head of sales and marketing at CSS Electronics, is an expert on CAN bus data. Martin works closely with end users, typically OEM engineers, across diverse industries, including automotive, maritime, and industrial. He is passionate about data visualization and AI—and he’s been working extensively with Grafana Assistant.


Why UX Is the Missing Layer in AI Adoption, and How to Fix It

Most AI programs don’t fail on model quality. They fail because the experience makes people either over-trust or quietly avoid the system. Employees often use AI more than leaders realize, frequently without training or guardrails. Interfaces that just “show an answer” without confidence, provenance, or recourse create two risks: blind reliance and shadow use.