Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on Log Management, Log Analytics and related technologies.

Elevating Security Posture to Maximize Threat Response - Customer Brown Bag - November 21st, 2024

Join us as Marvin, a Technical Account Engineer at Sumo Logic, addresses the following customer questions on how to elevate their security posture and maximize threat response: How can we mature our Sumo Logic SIEM? How can we identify if we have gaps in logs or detections? How can we create or identify custom rules for use cases that are critical to us and that we want to monitor closely?

Grafana Loki 3.3 release: faster query results via Blooms for structured metadata

The Grafana Loki 3.3 release is here, and it brings a fresh wave of enhancements aimed at making your log management experience faster, more efficient, and more scalable. While this update includes the usual round of bug fixes and operational improvements, the standout feature is a shift in how Loki leverages Bloom filters—going from free-text search to harnessing the power of structured metadata.

Leveling up your observability practice - Part 2

Lessons from the front lines: Challenges in your observability maturity journey In our previous blog, we explored the observability maturity spectrum — revealing that while only 7% of organizations consider themselves experts, the majority (43%) are actively working to improve their practices. We saw how mature organizations achieve better outcomes, from faster root cause analysis to reduced user-reported incidents.

Agentic RAG on Dell AI Factory with NVIDIA and Elasticsearch Vector Database

We are excited to collaborate with Dell on the white paper,Agentic RAG on Dell AI Factory with NVIDIA. The white paper is a design reference document for developers outlining strategies and solution components to implement agentic retrieval augmented generation (RAG) applications. It’s a design point for organizations across industries, specifically healthcare, for the agentic RAG framework decision-making with AI-driven data retrieval.

Adding AI to Observability 2.0 for Dynamic Observability

The original premise of observability was to ensure system health, identify issues, and resolve those issues efficiently. As I recently outlined, the legacy approach (sometimes called Observability 1.0 now) relied heavily on metrics and tracing because logs were seen as too noisy or challenging. But, as most forward thinkers have identified now, logs are exactly the telemetry type that we need the most.

Are you ready for the next outage? How a to prepare for any crisis

We live in an “always on” world, so unplanned outages are more than just inconvenient. They can result in lost revenue, damaged reputations, and, more importantly, frustrated customers. While preventing outages is impossible, the most resilient teams must be prepared with a solid plan, a “technical go bag,” so to speak: a collection of tools, plans, and resources ready to activate at the first sign of trouble.

Future-proofing operations with generative AI

NOBODY PANIC! The Elastic AI assistant’s got you! Transform problem identification and resolution, and eliminate manual data chasing across silos with an interactive assistant that delivers context-aware information for SREs. Additional Resources: About Elastic Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale. Elastic’s solutions for search, observability, and security are built on the Elastic Search AI Platform — the development platform used by thousands of companies, including more than 50% of the Fortune 500.

Collecting Windows telemetry with Elastic: An introduction to the ETW Filebeat input

In the world of security, being able to use system telemetry of Windows hosts opens new possibilities for monitoring, troubleshooting, and securing IT environments. Recognizing this, Elastic has introduced new capabilities focused on Event Tracing for Windows (ETW) — a powerful Windows-native mechanism for capturing a vast array of system and application events. With these new additions, Elastic users can capture, analyze, and visualize Windows telemetry using the Elastic Search AI Platform.

Leveling up your observability practice - Part 1

Lessons from the front lines: Moving to observability maturity What separates the observability experts from the novices? It's a question that's been on my mind lately, especially after diving into our recent 2024 State of Observability Survey of over 500 practitioners. In my past roles as a DevOps engineer and a site reliability engineer (SRE), I've seen firsthand how a mature observability practice can be the difference between sleepless nights and smooth sailing.

Mastering Tail Sampling for OpenTelemetry: Cost-Effective Strategies with Cribl

Recently, I have seen a trend of enterprises moving toward OpenTelemetry (OTel) for application tracing. Tail sampling, in particular, has emerged as a preferred approach to gain actionable insights while balancing data volume and cost. OpenTelemetry offers developers and practitioners the ability to instrument their code with open-source tools, moving away from vendor-provided tools for application instrumentation.