
Monitoring

The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

Where did all my spans go? A guide to diagnosing dropped spans in Jaeger

Nothing is more frustrating than feeling like you’ve finally found the perfect trace only to see that you’re missing critical spans. In fact, a common question for new users and operators of Jaeger, the popular distributed tracing system, is: “Where did all my spans go?” In this post we’ll discuss how to diagnose and correct lost spans in each element of the Jaeger ingestion pipeline.

How to maximize span ingestion while limiting writes per second to Scylla with Jaeger

Jaeger primarily supports two backends: Cassandra and Elasticsearch. Here at Grafana Labs we use Scylla, an open source Cassandra-compatible backend. In this post we’ll look at how we run Scylla at scale and share some techniques to reduce load while ingesting even more spans. We’ll also share some internal metrics about Jaeger load and Scylla backend performance. Special thanks to the Scylla team for spending some time with us to talk about performance and configuration!

How to Optimize Websites for Ad Publishers

As an ad publisher, your revenue depends on two main factors: traffic to your site and ad optimization. Much of the focus goes into driving traffic to your site through SEO, but what if visitors have a less-than-ideal experience once they arrive? All the time and effort spent driving that traffic is wasted if a visitor lands on your page and doesn’t take any action.

Introducing the OpUtils mobile app

Troubleshoot your IP addresses and switch ports faster and smarter! Managing and troubleshooting your network IPs and ports effectively can become difficult if hands-on network monitoring by your IT team is required at all times. Ever wondered if you can monitor your IPs and endpoints on the go? If yes, the solution you need is the OpUtils mobile app.

How Uptime.com can Help Improve Internal Documentation

An acquaintance of mine works for a company that still uses Windows XP to manage some internal applications. The higher-ups refuse to adopt newer versions, citing costs and technical gaps, and that has created something of a Pandora’s box when it comes to employee turnover. With no strong internal reference documentation, each new departure leaves IT wondering two things. This rather amusing conundrum is apparently not an isolated incident.

How to Test Ruby Code That Depends on External APIs

Few things are more frustrating than slow, flaky test suites. You're ready to deploy, wait 20 minutes for CI to run, only to find that a test failure in code you've never touched is blocking you. You dig into the source and find the problem: an external API call. It works (slowly) most of the time. But sometimes the network glitches and it fails. What do you do? In this article, José Manuel shows us several techniques for removing external API dependencies from our tests.

Using Dynamic Thresholding to Monitor Your Cloud Platforms

Whether you are new to the cloud, mid-transition, or experienced with cloud or hybrid systems, no one likes being bothered with useless alerts. The options are simple: if you ignore an alert like a bad cold call, you risk missing a critical one and watching your system crash around you. No one likes to open their inbox to a few hundred alerts they have been ignoring.

Monitor and Optimize Your Rancher Environment with Datadog

Many organizations use Kubernetes to quickly ship new features and improve the reliability of their services. Rancher enables teams to reduce the operational overhead of managing their cloud-native workloads — but getting continuous visibility into these environments can be challenging. In this post, we’ll explore how you can quickly start monitoring orchestrated workloads with Rancher’s built-in support for Prometheus and Grafana.