
Latest News

How Stress Affects Our Learning Abilities in Incidents (And What To Do About It)

While retrospectives provide a valuable pathway for learning outside of the flow of work, we also want learning to happen during an incident or unexpected event as it unfolds. This can be challenging due to the negative impact of stress on our ability to learn and navigate difficult situations. In this article, we’ll dig into how stress inhibits our ability to learn and what we can do about it.

Sentry is now Fair Source

Today we’re launching Fair Source, a new approach to software sharing that is safe for companies to adopt and developers to use. Before Fair Source, companies that wanted to engage the developer community with their core products often did not know how to do so while maintaining control over their roadmap and business model. The result is that most software products today are closed-source. With Fair Source, companies have a new option. The Fair Source option is not theoretical.

Enterprise DORA Metrics: Scaling Measurement Across Value Streams and the Organization

DevOps Research and Assessment (DORA) metrics are a ubiquitous measure of DevOps performance. These metrics are used in nearly every enterprise engaged in software development. DORA metrics help measure DevOps maturity, identify bottlenecks, and guide quality and process improvements. Despite their popularity, DORA metrics are generally considered difficult to measure and are primarily used by technical teams within the context of their respective domains.

The 80/20 Rule of Bug Fixing

At BugSplat, we've been supporting applications and video games with crash and error reporting for a long time. Over the years, we've collaborated with a wide range of teams, handling applications of all sizes. From our experience and numerous conversations with users, we've noticed an interesting trend: the distribution of crashes isn't uniform. If your application experiences 100 crashes in a given version, those crashes aren't caused by 100 different defects.
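The uneven distribution described above can be illustrated with a short sketch. It groups crash reports by a signature and finds the smallest set of defects that accounts for a chosen share of all crashes. The function name `top_defects`, the signature strings, and the counts are all made up for illustration; they are not BugSplat data or a BugSplat API.

```python
from collections import Counter


def top_defects(crash_signatures, coverage=0.8):
    """Return the most frequent signatures that together cover `coverage` of all crashes."""
    counts = Counter(crash_signatures)
    total = sum(counts.values())
    selected, covered = [], 0
    # Walk defects from most to least frequent until the target share is reached.
    for signature, n in counts.most_common():
        if covered >= coverage * total:
            break
        selected.append((signature, n))
        covered += n
    return selected


# Hypothetical data: 100 crashes spread unevenly over 5 invented defects.
crashes = (
    ["null_deref"] * 60
    + ["oom"] * 25
    + ["race"] * 10
    + ["bad_cast"] * 3
    + ["timeout"] * 2
)

print(top_defects(crashes))
```

With this invented data set, two of the five defects are enough to cover at least 80% of the crashes, which is the kind of skew the article describes.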

Prometheus data source update: Redefining our big tent philosophy

As we continue adding to our growing catalog of more than 100 plugins for Grafana, we have been focused on developing data sources for Grafana that are more purpose-built for the respective technologies. One example has been the recent update to our core Prometheus data source. We have deprecated AWS authentication from the original Prometheus data source, and we created a new dedicated Amazon Managed Service for Prometheus plugin that will specifically cater to the AWS use case.

Developer's Guide to Getting Started with Pandas Profiling

Exploratory data analysis is a key component of the machine learning pipeline that helps in understanding various aspects of a dataset. For example, you can learn about statistical properties, types of data, the presence of null values, the correlation among different variables, etc. But to get these details, you need to use different types of Python methods and write multiple lines of code.
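As a rough sketch of the manual approach the paragraph refers to, the snippet below gathers those details with several separate pandas calls. The DataFrame contents and the helper name `manual_summary` are invented for illustration and are not taken from the article.

```python
import pandas as pd

# Hypothetical toy data set with some missing values.
df = pd.DataFrame({
    "age": [25, 32, None, 41],
    "income": [40000.0, 52000.0, 61000.0, None],
    "city": ["Oslo", "Lima", None, "Pune"],
})


def manual_summary(frame):
    """Collect basic exploratory statistics one pandas call at a time."""
    numeric = frame.select_dtypes(include="number")
    return {
        "stats": numeric.describe(),   # count, mean, std, quartiles
        "dtypes": frame.dtypes,        # type of each column
        "nulls": frame.isna().sum(),   # missing values per column
        "corr": numeric.corr(),        # pairwise correlation of numeric columns
    }


summary = manual_summary(df)
for name, result in summary.items():
    print(f"--- {name} ---")
    print(result)
```

Each entry needs its own call, and this covers only a small part of a full exploratory analysis.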

Balancing Centralization and Autonomy: The Key to Automation at Scale

The recent global outage reminds us that identifying issues and their impact radius is just the first part of a lengthy process to remediation. Incidents are inevitable; how we prepare for and learn from them is what sets teams up to respond more effectively next time. As we saw from the remediation steps taken by enterprises around the world, implementing a known fix across a large number of environments that are potentially managed by a number of distributed teams can be a gargantuan challenge.

Alerting with Twilio: Connect Your Monitoring with the Top-1 Communications Platform

You might be surprised. Why does ilert, the platform dedicated to alerting and incident management, publish anything about connecting monitoring solutions directly to Twilio, bypassing an incident management tool? Are they taking the bread out of their own mouths, you might think. Having worked on DevOps incident management since 2009, we believe every solution fits specific needs.

The risks - and rewards - of using production data for testing

Data, and the way enterprises use it in areas like development and testing, has not traditionally been a focus for business leaders, but that is now changing. Data is more varied and complicated than ever before: for example, enterprises are using two or more different database platforms, and 40% are using four or more. It is also spread more widely, with enterprises hosting their databases in a combination of cloud and on-premises infrastructures.