
Using incidents to level up your teams

I joined GoCardless as a junior engineer. It was one of my first coding jobs, and in my time there I progressed to senior much faster than I had expected. When I reflect on how this happened, one pattern stands out: the big step changes in my understanding, and in my ability to solve larger and more complex engineering problems, came as a result of incidents.

Kubernetes operators - the top 5 things to watch for

Software operators are steadily revolutionising how we deploy and run complex distributed systems. They offer the promise of low-intervention, self-driving software – ideally leading to service reliability gains and better uptime. For an introduction to Kubernetes operators, check out our introductory webinar or download our guide to Kubernetes operators.

Obtaining and Storing Time Series Data with Python

In this tutorial we’ll learn how to use Python to get time series data from the OpenWeatherMap API and convert it to a Pandas DataFrame. Next we’ll write that data to InfluxDB, a time-series data platform, with the InfluxDB Python Client. We’ll convert the JSON response from our API call to a Pandas DataFrame because I find that to be the easiest way to write data to InfluxDB.
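As a rough illustration of the JSON-to-table step described above, here is a minimal sketch that flattens a nested response into flat, timestamped records using only the standard library. The `sample_response` payload, its field names, and the `flatten_hourly` helper are assumptions made for illustration; they are not taken from the tutorial and may not match the real OpenWeatherMap response.

```python
from datetime import datetime, timezone

# Hypothetical payload for illustration only; the real API response may differ.
sample_response = {
    "lat": 51.5,
    "lon": -0.12,
    "hourly": [
        {"dt": 1640995200, "temp": 280.15, "humidity": 81},
        {"dt": 1640998800, "temp": 279.65, "humidity": 84},
    ],
}


def flatten_hourly(response):
    """Turn each nested hourly entry into one flat record with a UTC timestamp."""
    records = []
    for entry in response.get("hourly", []):
        records.append(
            {
                # Convert the Unix epoch seconds into a timezone-aware datetime.
                "time": datetime.fromtimestamp(entry["dt"], tz=timezone.utc),
                # Repeat the location on every row so each record stands alone.
                "lat": response["lat"],
                "lon": response["lon"],
                "temp": entry["temp"],
                "humidity": entry["humidity"],
            }
        )
    return records


records = flatten_hourly(sample_response)
for row in records:
    print(row["time"].isoformat(), row["temp"], row["humidity"])
```

A list of flat dictionaries like this is a shape that `pandas.DataFrame(records)` accepts directly, after which the `time` column can be set as the index. Writing the result to InfluxDB is left to the tutorial itself.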

What is Enterprise Design Thinking?

Since IBM introduced the concept to the IT business landscape, enterprise design thinking has been making huge waves and has become a synonym for innovation. We have thoroughly explored the benefits and inner workings of traditional design thinking, but as with all great ideas and concepts, new and boundary-pushing versions tend to crop up with time. Enterprise design thinking takes elements from experience design and experience management to new levels.

The Basics of Vulnerability Management

Vulnerability management is a proactive and continuous process that seeks to keep networks, systems, and applications as safe as possible from cyberattacks. It is a crucial aspect of security because it can help prevent data breaches that could severely damage an organization. In this article, we'll delve into the definition of vulnerability management, its process, its importance, and some solutions for performing it.

Best Remote System Monitoring Solutions in 2022

In today's competitive setting, companies must monitor their assets and networks effectively, get the best possible results, and react swiftly to problems. However, this is rarely the case for companies that continue to run in a traditional, isolated setting. These companies frequently lack precise procedures for tracking asset performance.

What's New: Updates to PagerDuty Process Automation Software & PagerDuty Runbook Automation, Integrations, and More!

We’re excited to announce a new set of updates and enhancements to the PagerDuty Operations Cloud. Recent development and app updates from the product team include PagerDuty® Process Automation, our Partner Integrations and App Ecosystem, as well as Community & Advocacy Events updates. We continue to help customers automate everywhere to optimize cloud operations and reduce the number of issues escalated to other teams.

OpenTelemetry Logs, OpenTelemetry Go, and the Road Ahead

We’ve got a lot of OpenTelemetry-flavored honey to send your way, ranging from OpenTelemetry SDK distribution updates to protocol support. We now support OpenTelemetry logs, released a new SDK distribution for OpenTelemetry Go, and have some updates around OpenTelemetry + Honeycomb to share. Let’s see what all the buzz is about this time! 🐝🐝

How adding Kubernetes label selectors caused an outage in Grafana Cloud Logs - and how we resolved it

Hello, I’m Callum. I work on Grafana Loki, including the hosted Grafana Cloud Logs offering. Grafana Loki is a distributed multi-tenant system for storing log data — ingestion, querying, all that fun stuff. It also powers Grafana Cloud Logs.