Operations | Monitoring | ITSM | DevOps | Cloud

Latest News

How a Production Outage Was Caused Using Kubernetes Pod Priorities

On Friday, July 19, Grafana Cloud experienced a ~30min outage in our Hosted Prometheus service. To our customers who were affected by the incident, I apologize. It’s our job to provide you with the monitoring tools you need, and when they are not available we make your life harder. We take this outage very seriously. This blog post explains what happened, how we responded to it, and what we’re doing to ensure it doesn’t happen again.

Kusto 101 - A Jumpstart Guide to KQL

This blog post is for anyone needing a jumpstart into the world of Kusto. Perhaps you’ve heard about Kusto and are just curious. Maybe you’re just starting to use Azure Monitor for your application monitoring. You might even be getting skilled up in anticipation of the new Squared Up for Azure release that will have KQL at its heart. Whatever your reason, set aside the next 10 minutes and we'll get you up to speed with KQL. Ready? KQL stands for Kusto Query Language.

Using Vagrant to simplify building Virtual Machines

Oracle’s VirtualBox software is a key tool in software and website development, but it can be complicated to configure. Vagrant simplifies the process and lets developers repeatably build and scrap near-identical Virtual Machines (VMs). This post walks through creating an Ubuntu 18.04 Virtual Machine with a local directory mounted on it, which makes it easier to write code for the VM.

The unexpected path to the C-suite

When I was a little girl, I played “business” at my grandmother’s house. She gave me a box of blank payroll checks from a defunct business, along with the heels and fancy clip-on earrings that she wore to work. I stuffed a bunch of blank checks into a purse and strutted down the hall to the back bedroom (y’know, the official boss’s office), where I’d wave my hands around telling everyone to get to work.

Intent-based Capacity Planning and Autoscaling with Kubernetes

Intent-based Capacity Planning is Google's approach of declaring the reliability intent for a service and then dynamically solving for the most efficient resource allocation plan. Learn how you can start using this approach to effectively manage the reliability of the services running on your Kubernetes cluster.

3 Things Finance Teams Should Understand About AWS (Straight from Engineering)

If you’re a CFO or finance leader at a company that uses public cloud services like AWS, chances are you’ve had a bill cross your desk that may seem confusing. You or your team of financial analysts may have frequent conversations with engineering about how AWS services are allocated across different engineering initiatives.

Network Emulation. Bringing real-world conditions to the test environment

Network emulation is a useful technology for testing the behaviour of our platform. Let’s look at these situations: All these situations are part of the day-to-day work of IT managers, and each is addressed by developing the necessary tests. However, when we set out to run these tests, two options arise: network simulation and network emulation. These two concepts are often used interchangeably but are actually very different.

Cloud Security: What It Is and Why It's Different

The principles of data protection are the same whether your data sits in a traditional on-prem data center or in a cloud environment. The way you apply those principles, however, is quite different when it comes to cloud security vs. traditional security. Moving data to the cloud introduces new attack surfaces, threats, and challenges, so you need to approach security in a new way.

Reducing MTTR in the Field: 10 Simple Steps Using Retrace

The last decade has ushered in a golden era of software engineering. The rise of cloud computing freed companies from managing their own data centers and provided on-demand scaling. These services allow for provisioning servers on the fly using configuration and code. Treating that task as just another type of software development led to the advent of DevOps.