The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Elephant Flows: The Hidden Heavyweights of AI Data Center Networks

Elephant flows are no longer rare. They’re foundational to AI workloads. In today’s GPU-heavy data centers, long-lived, high-volume flows can distort ECMP, overflow buffers, and rack up unexpected cloud bills. Kentik helps you see and tame these elephants with real-time flow analytics, automated alerting, and predictive capacity planning.

The Hidden Cost of Downtime: Why IT Leaders Are Prioritizing Resilient Operations

No business sets out to tolerate downtime. And yet, across industries, unexpected service disruptions continue to drain revenue, erode customer trust, and expose operational fragility. For CIOs and IT leaders, the real concern isn’t whether systems will break; it’s whether your team can outpace the fallout. Because in a crisis, speed isn’t just an advantage; it’s survival.

How to Write Logs to a File in Go

When your Go application moves beyond development, you need structured logging that persists. Writing logs to files gives you the control and reliability that stdout can't match, especially when you're debugging production issues or need to meet compliance requirements. This blog walks through the practical approaches, from Go's standard library to structured logging with popular packages.

Logging in Docker Swarm: Visibility Across Distributed Services

Docker Swarm's logging model shifts from individual container logs to service-level aggregation. The docker service logs command batch-retrieves logs present at the time of execution, pulling data from all containers that belong to a service across your cluster. This approach gives you a unified view of distributed applications, but it comes with its own patterns and considerations for effective observability.

VB Transform 2025: The Enterprise AI Revolution Takes Center Stage

The Fabrix.ai team attended VentureBeat’s VB Transform conference, which returned this week as the premier gathering for enterprise AI leaders, showcasing how artificial intelligence has evolved from experimental chatbots to autonomous agents reshaping entire industries.

Beyond ping: How OpManager redefines network discovery for modern IT

Today’s networks aren’t just growing, they’re evolving. Hybrid architectures, cloud-native services, and a never-ending stream of connected devices have made it impossible to keep track of what’s on your network manually. This is exactly where a next-gen network discovery tool becomes a game-changer. ManageEngine OpManager is more than a monitoring solution.

Custom Alerts in Checkly

Learn how to customize your alerts in Checkly to get only the notifications you need. This video walks through account-wide alert settings, managing alert channels, using groups for business-critical checks, and leveraging Monitoring as Code to manage everything from your IDE. Plus, see how to use the Checkly CLI to import existing checks from the UI into code for full version control and automation.

Do you Grok It?

Most people are probably familiar with the word “grok” from Robert A. Heinlein’s novel Stranger in a Strange Land, in which it describes a deep, almost mystical understanding of something. Grok is also the name of a Logstash plugin that lets you parse and analyze log data using a syntax similar to regular expressions, but specialized for various log formats and fields.
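
To give a feel for the idea of turning a raw log line into named fields, here is a small sketch using named capture groups from Go's `regexp` package. This is not the grok plugin or its pattern library; the line format, the `linePattern` expression, and the `parseLine` helper are all assumptions invented for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// linePattern is a hypothetical pattern for lines such as
// "203.0.113.7 GET /index.html 200". Each named group plays the role
// of a field that a grok-style parser would extract.
var linePattern = regexp.MustCompile(
	`^(?P<client>\S+) (?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})$`)

// parseLine returns the named fields of a matching line, or false
// when the line does not fit the pattern.
func parseLine(line string) (map[string]string, bool) {
	m := linePattern.FindStringSubmatch(line)
	if m == nil {
		return nil, false
	}
	fields := make(map[string]string)
	for i, name := range linePattern.SubexpNames() {
		if name != "" { // index 0 and unnamed groups have empty names
			fields[name] = m[i]
		}
	}
	return fields, true
}

func main() {
	fields, ok := parseLine("203.0.113.7 GET /index.html 200")
	fmt.Println(ok, fields["client"], fields["method"], fields["path"], fields["status"])
}
```

A grok-style tool adds a library of reusable, named sub-patterns on top of this, so common formats do not have to be spelled out as raw regular expressions each time.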

Operational Intelligence - the new horizon of observability

Monitoring your systems isn't enough anymore. Neither is “asking questions about your system”. Operational Intelligence embraces observability to proactively deliver business insights, support decision-making, and accelerate innovation. It seems that as the observability market grows and more and more products come into the space, the meaning of the term observability itself becomes more and more nebulous.