
Monitor Databricks with Grafana Cloud for instant visibility into your workloads

If you're running Databricks workloads, you've probably asked yourself these types of questions: How much is this costing me? Why did that job fail last night? Why are my dashboard queries suddenly slow? We've been there, too. Databricks is fantastic for data engineering, ML, and analytics. But once you start running jobs, pipelines, and SQL queries at scale, you need a way to keep tabs on what's happening. That's why we built the Databricks integration for Grafana Cloud.

Grafana Alerting: Respond faster and get situational awareness with alert enrichment in Grafana Cloud

Alerts are meant to help teams respond quickly to problems, but too often they arrive without enough context to be immediately useful. An alert that says “CPU usage is high” still leaves the on-call engineer asking critical follow-up questions: Which service? Which environment? Where do I look next? Validating the alert and triaging the situation is the first step for every engineer. It's a manual step that takes time, extending every potential incident.

A faster way to pinpoint performance bottlenecks: Using Profiles Drilldown with Grafana Cloud Knowledge Graph

When you identify CPU or memory spikes in your services, it’s critical to understand why they’re happening. But switching between tools or crafting complex queries can slow you down when trying to pinpoint a root cause. This is why we’re excited to share that Profiles Drilldown, an application that lets you easily explore profiling data through an intuitive, point-and-click interface (no queries required), is now integrated with Grafana Cloud Knowledge Graph.

Kubernetes Monitoring Helm chart v4: Biggest update ever!

The Kubernetes Monitoring Helm chart is the easiest way to send metrics, logs, traces, and profiles from your Kubernetes clusters to Grafana Cloud (or a self-hosted Grafana stack). And version 4.0 is the biggest update the chart has ever received. Representing nearly six months of planning and development, it's designed to solve real pain points that users have hit as their monitoring setups have grown.

How to manage synthetic monitoring checks as code with Terraform and Grafana Cloud

As teams scale, managing synthetic monitoring checks manually in the UI becomes difficult and error-prone. With dozens of checks spread across multiple environments, teams run into inconsistent configurations, a lack of version control, and difficulty tracking changes.

Business metrics in Grafana Cloud: Get an AI assist to help securely analyze your data

For modern businesses, the data landscape demands security and flexibility. You need to connect your observability platform to rich, proprietary datasets that often reside in private networks, without compromising security or managing complex network infrastructure. Querying and visualizing that data effectively can add another layer of complexity. Luckily, modern artificial intelligence tools have made these previously complicated processes much simpler.

Query fair usage in Grafana Cloud: What it is and how it affects your logs observability practice

In Grafana Cloud, we use a simple yet generous formula that lets you query up to 100x your monthly ingested log volume (measured in gigabytes) for free. This works for the vast majority of our customers, but if you aren’t careful and strategic with your usage, you could find yourself with an overage bill.
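
To make the 100x formula concrete, here is a minimal Python sketch of the arithmetic. The function names are hypothetical, and the sketch only computes the query volume above the free allowance. How overage is actually metered or priced is not covered here, so treat that part as an assumption.

```python
# Multiplier taken from the text: query up to 100x ingested volume for free.
FREE_QUERY_MULTIPLIER = 100


def free_query_allowance_gb(ingested_gb: float) -> float:
    """Free monthly query volume (GB) for a given monthly ingested log volume (GB)."""
    return ingested_gb * FREE_QUERY_MULTIPLIER


def query_volume_over_allowance_gb(ingested_gb: float, queried_gb: float) -> float:
    """Query volume (GB) above the free allowance; 0.0 when usage stays within it.

    Assumption: this is only the volume above the threshold, not a price.
    """
    return max(0.0, queried_gb - free_query_allowance_gb(ingested_gb))


if __name__ == "__main__":
    ingested = 50.0    # GB of logs ingested this month (example value)
    queried = 6000.0   # GB of logs queried this month (example value)
    print(free_query_allowance_gb(ingested))
    print(query_volume_over_allowance_gb(ingested, queried))
```

With 50 GB ingested, the free allowance comes to 5,000 GB of queries, so 6,000 GB queried would leave 1,000 GB above the allowance.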

Finding performance bottlenecks with Pyroscope and Alloy: An example using TON blockchain

Performance optimization often feels like searching for a needle in a haystack. You know your code is slow, but where exactly is the bottleneck? This is where continuous profiling comes in. In this blog post, we’ll explore how continuous profiling with Alloy and Pyroscope can transform the way you approach performance optimization.

From raw data to flame graphs: A deep dive into how the OpenTelemetry eBPF profiler symbolizes Go

Imagine you're troubleshooting a production issue: your application is slow, the CPU is spiking, and users are complaining. You turn to your profiler for answers—after all, this is exactly what it's built for. The profiler runs, collecting thousands of stack samples. eBPF profilers, including the OpenTelemetry eBPF profiler, operate at the kernel level, so they capture raw program counters: memory addresses pointing into your binary.
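
To illustrate why raw program counters need a further step before they are readable, here is a minimal Python sketch of address-to-name lookup. It is a generic illustration, not how the OpenTelemetry eBPF profiler is implemented: the symbol table, its addresses, and the function names are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical symbol table: (start_address, function_name), sorted by address.
SYMBOLS = [
    (0x401000, "main.main"),
    (0x401200, "main.handleRequest"),
    (0x401800, "runtime.mallocgc"),
]
STARTS = [start for start, _ in SYMBOLS]


def symbolize(pc: int) -> str:
    """Map a raw program counter to the function whose start address precedes it.

    Simplification: the table has no end addresses, so any address past the
    last entry is attributed to the last function.
    """
    i = bisect_right(STARTS, pc) - 1
    if i < 0:
        # Address lies before the first known function: leave it unsymbolized.
        return f"0x{pc:x}"
    return SYMBOLS[i][1]


if __name__ == "__main__":
    for pc in (0x401234, 0x401000, 0x400000):
        print(hex(pc), "->", symbolize(pc))
```

The binary search keeps each lookup cheap even when a profile contains thousands of stack samples, which is why a sorted table is a common shape for this kind of mapping.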