
Declare early, declare often: why you shouldn't hesitate to raise an incident

My first incident.io-incident happened in my second week here, when I screwed up the process for requesting extra Slack permissions, which made it impossible to install our app for a few minutes. This was a bit embarrassing, but also simple to resolve for someone more familiar with the process, and declaring an incident meant we got there in just a few minutes. Declaring the first incident when you start a new job can be intimidating, but it really shouldn’t be.

Gartner: tips for improving reliability

In their report titled “IT Resilience — 7 Tips for Improving Reliability, Tolerability and Disaster Recovery”, Gartner presents seven strategies for improving the resilience posture of your critical systems. These recommendations range from how to get started, to identifying IT hazards and risks to reliability, to capturing metrics and translating them into business value. In this blog, we’ll take a high-level look at the report and summarize some of its key findings.

Managing the unseen - getting started with SaaS management

Claiming that SaaS is the future is a broad statement until you take a second and look around you. Be it for your business or everyday needs, every problem has one or more SaaS solutions to solve it. People today are well aware of this growing reality: simpler, cheaper SaaS solutions are available all around them to make their lives easier.

Ask Miss O11y: As a developer, how can I try out observability?

What's the first small thing to do in o11y that would teach me something, bring something valuable, and open the way for something else? Observability doesn’t have to be a big, company-wide project. It can be useful locally and individually. A little playing around can get you some crucial insight into how your software works. Try it as a team, or in a pair, or by yourself. It takes 3 steps: Step 1 is easy. The other two might take ten minutes, or maybe more like a day.

7 tips for knowledge managers to increase self-service

Knowledge managers have an ongoing challenge: They want to encourage customers to self-serve and quickly find answers to their questions. They also want to minimize frustration if customers can’t quickly find an answer. Delivering great self-service is key to customer experience. It’s in everyone’s best interest if a customer doesn’t open a case (or incident). But where is the line between keeping your customer happy and causing undue delays?

Grafana dashboards: A complete guide to all the different types you can build

There is one universal truth about using Grafana: Dashboards are easy to create, but not-so-easy to organize. As organizations scale, there’s a high risk of unchecked dashboard sprawl, where dashboards become an unmanageable mess. As the number of users increases, so does their dashboard output. Our guide to dashboard management gives an overview of features that help with organizing dashboards, but there are still two pain points.

Multi-Step Monitoring: Why it's Essential and How it Works

The term “essential” is thrown around pretty loosely these days. That new show about the hospital (no, not that one… not that one either… yeah that one) is advertised as essential viewing. A newly-released track by a hip hop artist that describes how little they need to release new tracks in order to live much, much better than the rest of us? That’s essential listening.

9 Powerful Ways To Align Engineering And Product Teams

One of the best ways to achieve efficient product development is to make sure everyone involved with a project is on the same page, working toward the same ultimate goals. However, this is harder to do than it sounds. Engineering and product teams bring very different priorities to the table, and establishing which priorities to focus on first often becomes a matter of debate.

Getting Your Clouds Under Control: Part 1-FinOps

Given the strategic importance of the cloud and the size of cloud expenditures, it’s critical for enterprises to have solid controls in place to manage it all. According to our latest research, however, while most organizations agree with that sentiment, very few have put it into practice. There are two distinct but related disciplines that come into play: FinOps and cloud governance. In this two-part series, we explore the current state of each.

How Proactive IT Prevents IT Tickets from Reaching Critical Mass

If an IT ticket is submitted, then it’s already too late. But what if you could get ahead of the pileup, reduce IT tickets, and solve issues before the tickets ever appear? Digital workplace technology is evolving fast, fueled by the post-pandemic realization that IT is vital to the productivity and satisfaction of modern organizations. But it’s not all going to plan.