Latest Posts

Making sure you get a Checkly alert for every detected failure

It’s every ops team’s biggest anxiety: a monitoring system detects a failure, but the notification either isn’t delivered or isn’t noticed by the team. Now we have to wait for users to complain before our team knows about the problem. Checkly sends an alert every time the system detects a failure, but how can you be sure you’re getting those alerts, and that those alerts are going to the right people?

Announcing Checkly Traces: Unified Synthetic Monitoring and Distributed Tracing

Until recently, Checkly was telling you what broke in your app. Now, it can also tell you why it broke. We're excited to announce the general availability of Checkly Traces, a new addition to our synthetic monitoring platform that bridges the gap between frontend monitoring and backend observability. By combining synthetic monitoring with distributed tracing, Checkly Traces empowers development teams to detect, diagnose, and resolve issues faster than ever before.

DOES Cache Rule Everything Around Me? - Using Compression for our Prometheus Cache

Checkly is a key part of a professional developer’s workflow, making it easy to know whether your service is up or down and to measure its performance. Because we integrate with almost any development workflow, we also offer Prometheus endpoints that let you track the status of your site checks with the popular Grafana stack. As large enterprise customers increased their usage, their check performance data grew with it, and our endpoint started returning occasional 429 status codes.

Why Monitoring as Code Is the Future of Application Reliability for Modern Teams... and how it can save you $1 million!

I recently talked to a customer of Checkly and he shared some thoughts about Monitoring as Code. Let’s call him Karl in this article. Karl and I talked about why Monitoring as Code (MaC) is becoming essential for teams operating at scale. As the Head of Platform Engineering at a major e-commerce company processing millions of transactions daily, his experience shows how MaC solves a lot of the messy challenges that come with traditional synthetic monitoring setups.

How LinkedIn Stopped Relying on Users to Report Bugs

When making changes to your production services, it’s important to have a plan for how to detect problems and roll back changes. How many rollout plans would include “if it breaks, don’t worry, the users will tell us!”? But if your monitoring coverage of production services isn’t complete, you’re implicitly relying on your users to tell you when something breaks.

How good is GitHub Copilot at generating Playwright code?

People keep asking us here at Checkly if and how AI can help create solid and maintainable Playwright tests. To answer these questions, we started by looking at ChatGPT and Claude and concluded that AI tools have the potential to help with test generation, but that "normal AI consumer tools" aren't code-focused enough. High-quality results require prompts that are too complex to be a maintainable solution.

Prometheus Blackbox Exporter vs Kuberhealthy for K8s monitoring

We all implement tools to monitor our nodes and keep our entire cluster up and running. But how often do updates, failures, or errors mean that users suffer outages, even though our status boards look green? As Kubernetes has enabled more complex microservice architectures, the gap between the state of the dashboard and the health of services for the user has grown wider.

Are ChatGPT or Claude better than Playwright Codegen?

I'm a bit of an AI skeptic. And even though GitHub Copilot is my daily auto-completion on steroids, I always double-check the code generated by LLMs. If you're using AI for coding, you probably know that the results are sometimes surprisingly good and other times shockingly terrible. Lately, I have seen more and more articles and even docs recommending ChatGPT to generate Playwright tests. Could this be true? Are ChatGPT and friends really that good at generating test code?

Five Playwright CLI features you should know

Thanks to Microsoft's Playwright, running end-to-end tests with real browsers is quick to set up. Initialize a new Playwright project, install all the dependencies, and off you go! Then, any new headless browser test run is only one npx playwright test away. But have you checked all of the test command's CLI options? The playwright test command includes a few real gems to help you create better tests faster. Let me share a mixed bag of my favorite CLI tricks in this post.