
The latest News and Information on Log Management, Log Analytics and related technologies.

Unlock telemetry value with a well-planned data lake

Your SIEM only holds a slice of your telemetry. Your data lake holds the rest. We'll show you how to use that to your advantage for investigations, threat hunting, and reporting.
Why your data lake beats your SIEM for investigations: Your SIEM keeps a short window of expensive, filtered data. Your data lake keeps everything. When something goes wrong, that difference matters more than you think.
Threat hunting without the handcuffs: Hunting across months of data in a SIEM is painful and costly. We'll show you how a well-planned lake makes broad, deep searches practical and affordable.

The $600 billion wake-up call: New Splunk research reveals downtime is a systemic business crisis

$600 billion annual impact: Aggregate downtime costs for the Global 2000 have soared 50% in two years.
$15,000 per minute: The average cost of downtime for organisations, highlighting the immediate financial impact of service disruptions.
3.4% stock price drop: The average decline in shareholder value following a single downtime incident.

Multiple API Keys Are Here - More Keys, Better Control, Stronger Security

Today we're rolling out a major upgrade to API Keys in Bindplane. You can now create up to 25 API keys per project, give each one a description, set an expiration date, and delete keys you no longer need. Under the hood, every key is now hashed with Argon2, the modern standard for credential storage. If you've been working around the old single-key limit by sharing one key across CI jobs, scripts, and teammates, this release is for you.
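
The key lifecycle described above (a per-project limit, a description, an optional expiration, deletion, and hashed storage) can be sketched roughly as follows. This is an illustration under stated assumptions, not Bindplane's implementation: the names `ProjectKeys`, `StoredKey`, and `hash_secret` are hypothetical, and because Python's standard library does not include Argon2, the sketch swaps in PBKDF2-HMAC-SHA256 as a stand-in key derivation function.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

MAX_KEYS_PER_PROJECT = 25  # limit stated in the announcement
PBKDF2_ROUNDS = 100_000    # assumed work factor, for illustration only


@dataclass
class StoredKey:
    """What the server keeps: a salted digest, never the plaintext secret."""
    description: str
    salt: bytes
    digest: bytes
    expires_at: Optional[datetime]


def hash_secret(secret: str, salt: bytes) -> bytes:
    # Stand-in KDF: PBKDF2-HMAC-SHA256 from the standard library, NOT Argon2.
    return hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, PBKDF2_ROUNDS)


class ProjectKeys:
    """Hypothetical in-memory key store for a single project."""

    def __init__(self) -> None:
        self._keys: Dict[str, StoredKey] = {}

    def create(self, description: str,
               ttl: Optional[timedelta] = None) -> Tuple[str, str]:
        # Enforce the per-project limit before minting a new key.
        if len(self._keys) >= MAX_KEYS_PER_PROJECT:
            raise ValueError("key limit reached for this project")
        key_id = secrets.token_hex(8)
        secret = secrets.token_urlsafe(32)
        salt = secrets.token_bytes(16)
        expires_at = (datetime.now(timezone.utc) + ttl) if ttl is not None else None
        self._keys[key_id] = StoredKey(
            description, salt, hash_secret(secret, salt), expires_at
        )
        return key_id, secret  # the secret is returned once and not stored

    def verify(self, key_id: str, secret: str) -> bool:
        stored = self._keys.get(key_id)
        if stored is None:
            return False  # unknown or deleted key
        if stored.expires_at is not None and datetime.now(timezone.utc) >= stored.expires_at:
            return False  # expired key
        # Constant-time comparison of the stored and recomputed digests.
        return hmac.compare_digest(stored.digest, hash_secret(secret, stored.salt))

    def delete(self, key_id: str) -> None:
        self._keys.pop(key_id, None)
```

With a store like this, each CI job, script, or teammate can hold its own key, for example `keys.create("nightly CI", ttl=timedelta(days=90))`, and any one key can be deleted without affecting the others.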

Why SRE agents need orchestration, not just more tools

Single agents are a useful starting point for SRE workflows. They are not where the architecture should end. The first version is simple enough: connect an LLM to a few tools, give it a system prompt, and point it at your infrastructure. It can summarize an alert, pull logs, answer questions, and draft a useful next step. Then the workflow gets real. You add GitHub for runbooks, Kubernetes for cluster state, PagerDuty for incident context, Prometheus for metrics, and Mezmo for telemetry.

Cribl Notebook templates in Cribl Search

Investigations are time-sensitive, and analysts shouldn’t waste time recreating the same workflows or rewriting familiar queries. Whether troubleshooting infrastructure, investigating suspicious IPs, or analyzing host activity, teams often rely on duplicating old processes and copying query snippets — a slow, inconsistent approach that’s hard to scale.

Action trails: The missing link between AI and human trust

When people talk about trusting AI, they usually focus on the interface. It summarizes and uses confident language with a level of clarity that feels reliable. But that’s all window dressing. None of it builds trust. Trust doesn’t come from what the AI says. A verifiable record of what the AI did makes it trustworthy.

When your agents hallucinate at 2 am, it is not a model problem

The first time an AI assistant suggests "restart the service" during a live incident and nobody on the bridge can tell whether that suggestion came from a current runbook, a stale wiki page, or thin air, you stop caring about model benchmarks. You start caring about what the agent actually knew, where that knowledge came from, and whether you can trust the chain of reasoning behind it.

3 things you need to know about headless observability

If you're building agents and trying to figure out the best way to make them successful in production, you're going to want to know about headless observability. Headless observability means an agent can access information about the health of your system through a CLI instead of clicking around dashboards. It's the data layer that is going to unlock serious autonomy and allow you to scale with agentic workloads.

One Collector, Two Teams: How Bindplane Bridges Security and Observability with OpenTelemetry

Observability engineers will spend weeks tuning instrumentation. Security engineers? They want a collector installed and logs flowing yesterday. And that's actually the magic of OpenTelemetry + Bindplane: from day one you're routing firewall logs, endpoint data, and server logs straight into your SIEM with zero instrumentation lift. One toolchain. Two teams. No silos. Filmed at Google Cloud Next '26 in Las Vegas. bindplane.com #OpenTelemetry

From Phishing to SQL Injection: How Breaches Actually Happen

Critical vulnerabilities are critical because they're easy to exploit — but most breaches don't even need them. Tony explains why phishing remains the dominant attack vector, why strong instrumentation matters for forensics (tracing an API call through a database to see exactly what was leaked), and how observability data becomes security data when something goes wrong. The system is harder to breach than the human. And that's the whole game.