
The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Monitor Ray applications and clusters with Datadog

Ray is an open source compute framework that simplifies the scaling of AI and Python workloads for on-premises and cloud clusters. Ray integrates with popular libraries, data stores, and tools within the machine learning (ML) ecosystem, including Scikit-learn, PyTorch, and TensorFlow. This gives developers the flexibility to scale complex AI applications without making changes to their existing workflows or AI stack.

How to get your first ten customers

It'll soon be the third anniversary of publicly launching OnlineOrNot on Twitter, and I often get asked what I did to get my first paying customers - so I felt like sharing. I assume when most folks ask this that they're looking for the one thing they can do to finally start getting paid customers. Let me be clear: it's never just one thing.

Unlocking the Power of Real-Time Analytics with InfluxDB

Turn insights into action, in real time, using your time series data. Now, more than ever, businesses generate massive amounts of time-stamped data. To get value from that data, you need to be able to ingest and query it in real time. InfluxDB 3.0, built on innovative open source technology (Apache ecosystem), is the solution startups and enterprises use to achieve real-time insights.

Subnetting - Ultimate Guide - Definition, How & Why?

In computer networking, understanding the concept of subnets and subnetting is crucial for managing and troubleshooting network issues. So, in this ultimate guide, we will explain everything you need to know about subnets, subnet masks, and subnet calculators. Additionally, we will introduce you to popular subnetting tools.
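As a rough illustration of what a subnet calculator does, here is a minimal sketch using Python's standard-library `ipaddress` module. The network `192.168.1.0/24`, the split into `/26` subnets, and the address `192.168.1.70` are assumptions chosen for the example; they do not come from the guide itself.

```python
import ipaddress

# Assumed example network (not taken from the guide).
network = ipaddress.ip_network("192.168.1.0/24")

# The subnet mask and total address count follow from the prefix length.
print("Network:", network)
print("Subnet mask:", network.netmask)
print("Total addresses:", network.num_addresses)

# Split the /24 into four equal /26 subnets.
subnets = list(network.subnets(new_prefix=26))
for subnet in subnets:
    hosts = list(subnet.hosts())  # usable addresses (network and broadcast excluded)
    print(
        f"{subnet}: hosts {hosts[0]} to {hosts[-1]}, "
        f"broadcast {subnet.broadcast_address}"
    )

# Find which subnet an assumed example address belongs to.
address = ipaddress.ip_address("192.168.1.70")
owner = next(s for s in subnets if address in s)
print(f"{address} belongs to {owner}")
```

Each `/26` borrows two host bits from the `/24`, which is why the split yields four subnets of 64 addresses each, 62 of them usable for hosts.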

Why monitoring your application is important

Effective monitoring and observability tools are critical for modern enterprises. Daily operations, digital transformation, moving to a cloud-native architecture, and an ever-evolving tech stack all require ITOps, DevOps, and SRE teams to monitor increasingly complex systems. So what happens if your applications suddenly cease to function? Every moment of downtime translates to lost income, decreased customer satisfaction, and harm to your company’s reputation.

The Advent of Monitoring, Day 11: Testing and Monitoring: Should You Separate or Unite Them?

The two key pillars of building reliable applications are testing and monitoring. With testing, you can verify that each pull request works before it’s merged and deployed to production. Testing alone isn’t enough, though. You also need to make sure that the application continues to work in production. Database rollovers, third-party outages, and unexpected spikes in traffic can all cause issues that need to be detected.

OpenTelemetry best practices: A user's guide to getting started with OpenTelemetry

If you’ve landed on this blog, you’re likely either considering starting your OpenTelemetry journey or you are well on your way. As OpenTelemetry adoption has grown, not only within the observability community but also internally at Grafana Labs and among our users, we frequently get requests around how to best implement an OpenTelemetry strategy.

Rollbar Alternatives: Compare Before You Commit

Rollbar is acclaimed as the top error monitoring tool - with 4.5 out of 5 stars on both Capterra and G2 - amongst a competitive field. That said, we recognize there are alternatives some people consider when also looking at us. Here is our perspective on what these other tools are for, and when to choose Rollbar instead.

The Advent of Monitoring, Day 10: Better Observability Into Your Local Clickhouse Instance With Grafana and Prometheus

Cloud-based database providers often offer great observability out of the box. But what if you’re developing a tricky feature locally and need more details about what your local Clickhouse is doing? There are many options, but if you’re a numbers and graphs person like me, you’ll want to be able to view the inner workings of Clickhouse in something like Grafana.

AWS re:Invent Recap!

Cribl’s usual suspects, Ed Bailey and Jackie McGuire, are joined by Sr Partner Marketing Manager Michelle Zhang to discuss our experiences at AWS re:Invent this past November. It was a great event, and we want to share the top themes and presentations we saw at the show. Michelle will share her experience building and strengthening Cribl’s strategic alliance network and some of the "better together" progress made over the past year for customers.