The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Slack's New Metrics Storage Engine Challenges Prometheus

Metrics storage engines must be specially engineered to accommodate the quirks of metrics time-series data. Prometheus is probably the most popular metrics storage engine today, powering numerous services including our own Logz.io Infrastructure Monitoring. But Prometheus was not enough for Slack given their web-scale operation. They set out to design a new storage engine that can yield 10x more write throughput, and 3x more read throughput than Prometheus! In February 2022 Suman Karumuri, Sr.

Why Icinga?

We have decided to make some short educational videos about Icinga, and today we will be releasing the first one: Why Icinga? In these videos we want to explain the Whys and Whats and Hows around Icinga in a way that is accessible to anyone who is interested. So why would you want to use Icinga? Monitoring is the foundation you want to build your infrastructure on.

Troubleshoot faster with improved Datadog Events

Datadog Events provides customers with a data feed about their infrastructure and applications, delivering an up-to-the-minute history of activity such as code deployments, configuration changes, and triggered alerts. Events collects data from Datadog products and over 100 third-party integrations—including Docker, Jenkins, Kubernetes, Sentry, AWS CloudWatch, and Azure Service Health.

Bolster network monitoring with root cause analysis

If you run an enterprise, you know the value of a healthy network and how damaging a network outage is to your business. But network issues are inevitable. Heavy dependence on networks to meet ever-changing client and internal usage requirements takes its toll. This makes networks vulnerable to common problems such as sudden unplanned downtime, high resource utilization, and hardware malfunctions.

How We Run Successful Beta Tests with Error Reporting

We’ve recently completed a large beta test for our new product here at Testmo. We build a test management tool, so most of our users are professional software testers. As you can imagine, our customers are a rather critical group of users when it comes to software quality. We’ve learned some important lessons about running a large beta test and we want to share how we benefited from Sentry error reporting to identify, find, and fix issues quickly.

Machine learning for infrastructure monitoring and troubleshooting, explained

Learn what machine learning is and what role it plays in observability, monitoring, and troubleshooting. We'll also cover future ML trends within the industry and how Netdata is staying at the forefront of machine learning development.

The Fellowship of the Stream: Unlock Radical Levels of Choice & Control with Observability Data

Cribl Stream is a vendor-agnostic observability pipeline that gives you the flexibility to collect, reduce, enrich, normalize, and route data from any source to any destination within your existing data infrastructure. You’ll finally achieve full control of your data, empowering you to choose how to treat your data to best support your business goals.

New Honeycomb Whitepaper on Frontend Observability

Big news: I can finally stop pointing anyone who asks about Honeycomb’s story for frontend observability to Emily’s blog post from 2017 on “Instrumenting Page Loads with Honeycomb.” (It was a great post, don’t get me wrong, but I don’t think any of us knew it would bear such weight for so long.) I am ecstatic to announce that we have released a new whitepaper called “Getting Started With Honeycomb Client-Side Instrumentation for Browser Applications,” wri

Modernizing Network Monitoring with InfluxDB and Telegraf

This article was originally published in The New Stack. As the technology landscape continues to change at a rapid pace, enterprise companies are in a rush to catch up and modernize their legacy IT and network infrastructure to capture the benefits of newly developed tools and best practices. By adopting modern DevOps techniques, they can reduce their operational costs, increase the reliability of their services and improve the overall speed and agility at which their IT teams are able to move.