Latest News

Lessons learned from running Kafka at Datadog

At Datadog, we operate 40+ Kafka and ZooKeeper clusters that process trillions of datapoints across multiple infrastructure platforms, data centers, and regions every day. Over the course of operating and scaling these clusters to support increasingly diverse and demanding workloads, we’ve learned a lot about Kafka—and what happens when its default behavior doesn’t align with expectations.

Troubleshoot Faster with Anomaly Visualization

LogicMonitor is proud to announce anomaly visualization as an addition to our growing AIOps capabilities! With this new functionality, users can visualize anomalies that occur for a monitored resource and compare each anomaly to key historical signals, such as the past 24 hours, 7 days, or 30 days. Anomaly visualization complements LogicMonitor’s existing forecasting functionality and provides another layer of intelligence to better understand resource health.

Take Troubleshooting Up a Notch and Add Context to Your Logs

Today we announced an integration between SolarWinds® AppOptics™ and SolarWinds Papertrail™ to allow you to quickly move from service-level metrics, down to a trace, and then down to the logs specific to that trace. The integration between AppOptics and Papertrail provides the ability to group the log messages from a traced transaction and add trace context to your logs in Papertrail.

Reflections on Monitorama 2019

This year was my third in a row attending (and now speaking at!) Monitorama. Because the organizers do a great job of turning introverts into extroverts for three days straight, it’s always a fun and exhausting time—but one of my favorite parts is how much folks continue talking about and sharing the content, days or weeks after it’s over. So, to continue the drumbeat, here were some of my highlights from this year.

From the First Mile of Infrastructure Performance to the Last Mile of Customer Experience: OpsRamp Synthetic Monitoring Sees What Your Customer Does

OpsRamp delivers real-time observability that IT teams need to understand the performance and availability of business services. Given that modern digital services rely on dynamic and distributed infrastructure, it is critical to pinpoint performance issues that prevent an enterprise from delivering compelling user experiences. So how do you track the end-customer experience as well?

Answer These 3 Questions to Help Find Your MSP Niche

One of the common pieces of advice I hear given to managed service providers (MSPs) is to “go narrow”—find a niche and become a specialist. This is generally sound advice. Specialization typically means your MSP faces less competition and becomes much easier to find in an otherwise crowded marketplace. But finding an area to specialize in is easier said than done. So how do you find a great niche for your MSP?

Monitoring Kubernetes, part 4: the Sensu-native approach

At this point in our series, you’re likely quite familiar with the many opportunities and challenges that Kubernetes presents (especially when it comes to monitoring!). The last couple of posts take a look at Prometheus for monitoring Kubernetes, with a side-by-side comparison with Sensu, and illustrate how they work in tandem.

11 web monitoring mistakes startups need to avoid

As an Internet startup, you have to put out innovative, meaningful solutions for your users. Therefore, no matter what that solution may be, you’ve got to make sure that the solution is available, functioning, and has excellent performance at launch and afterwards. To help you succeed and to avoid common web monitoring mistakes, we’ve put together a list for you.

A Look Inside GitLab's Public Dashboards

There are transparent companies – and then there’s GitLab. “GitLab is a ridiculously transparent company,” said Ben Kochie, a Staff Backend Engineer for Monitoring at GitLab. “When GitLab has a database outage, we live stream the recovery on YouTube.” GitLab takes the same bare-all approach to its metrics. “All of our Prometheus metrics are available on a public Grafana dashboard,” Kochie told the crowd gathered at GrafanaCon.