Crossing the machine learning pilot to product chasm through MLOps

Companies keep launching AI/ML features, particularly “ChatGPT for XYZ”-style products. Given the buzz around Large Language Models (LLMs), consumers and executives alike increasingly assume that building AI/ML-based products and features is easy. LLMs can seem magical as users experiment with them.

An Expert Guide On GCP Cost Monitoring

Research firm IDC published a study in early February 2024 predicting that Google Cloud Platform (GCP) customers break even after just 10 months. Moreover, the study, The Business Value of Google Cloud IaaS, reported that after migrating to GCP, the participants were on track to achieve a 318% ROI in five years. Here’s the thing: achieving these milestones requires a robust GCP cost monitoring and optimization plan.

How IT administrators can streamline operations using the LogicMonitor API

In today’s fast-paced IT ecosystem, agility and efficiency are not just goals but necessities. So why waste an hour (or more) manually onboarding individual devices when you can leverage the LogicMonitor API to automate the onboarding process for an entire site in just minutes from a simple CSV file? In this article, we’re going to review how LogicMonitor administrators can maximize efficiency and transform their IT operations using LogicMonitor’s REST API and PowerShell.

Better, Faster, Stronger Network Monitoring: Cribl and Model Driven Telemetry

New in Cribl 4.5, the Model Driven Telemetry Source enables you to collect, transform, and route Model Driven Telemetry (MDT) data. In this blog, you’ll learn how to explore the YANG Suite to understand the wide variety of datasets available to transmit, and how to configure the tools to get data flowing from Cisco IOS XE network devices to Cribl Stream.

Monitoring the Health Status of Progress Flowmon Appliances with IT Infrastructure Monitoring Tools

Progress Flowmon is a core network monitoring and security tool. Confirming that it is up and running can mean the difference between responding to a data breach and overlooking such a critical event. As with any other critical system, it is good practice to include the monitoring of Flowmon uptime, resource consumption and health in an IT infrastructure monitoring (ITIM) dashboard, such as Progress WhatsUp Gold.

Where to automate resilience testing in your SDLC

When organizations begin to deploy resilience testing or Chaos Engineering, there’s a natural question: can we integrate this with our CI/CD pipeline or release automation tools? After all, you’re likely running unit, performance, and integration tests already—is resiliency different? The short answer is yes—to both. Integration is possible, but resiliency is different, so automation is a nuanced conversation.

Introducing an OpenTelemetry Collector distribution with built-in Prometheus pipelines: Grafana Alloy

In the opening keynote of GrafanaCON 2024, we announced our newest OSS project: Grafana Alloy, our open source distribution of the OpenTelemetry Collector. Alloy is a telemetry collector that is 100% OTLP compatible and offers native pipelines for OpenTelemetry and Prometheus telemetry formats, supporting metrics, logs, traces, and profiles. Some of you may be thinking: Wait, another collector?

Find your logs data with Explore Logs: No LogQL required!

We are thrilled to announce the preview of Explore Logs, a new way to browse your logs without writing LogQL. In this post, we’ll cover why we built Explore Logs and we’ll dive deeper into some of its features, including at-a-glance breakdowns by label, detected fields, and our new pattern detection. At the end, we’ll tell you how you can try Explore Logs for yourself today. But let’s start from the beginning — with good old LogQL.