

How to detect broken links with Playwright Test

Join Stefan Judis in this Playwright tutorial, where he explores detecting broken links using Playwright, Checkly, or both. Stefan covers essential techniques such as soft assertions, crafting custom error messages for clearer debugging, and using page context-aware requests to check URL status codes. Whether you're dealing with empty links, nonexistent domains, or 404 errors, this video provides the tools you need to strengthen your testing strategy.

Create an EKS Cluster with Ubuntu Pro Using eksctl

Setting up an EKS cluster with Ubuntu Pro has never been easier, thanks to eksctl. In this brief tutorial, we'll walk you through the simplest way to create an EKS cluster with Ubuntu Pro, using the latest version of eksctl. Why choose Ubuntu Pro? Ubuntu Pro on EKS provides enhanced security features tailored for your enterprise needs. From Kernel Livepatch, which keeps your kernel secure without requiring node reboots, to ESM-Apps, ensuring your containers run with fully supported open-source software from Ubuntu's repositories—all backed by Canonical's active security maintenance.

How to Automatically Remediate Incidents with Grafana IRM

Build automatic remediation workflows to preemptively resolve system issues and minimize downtime. With observability-native IRM, you can automate routine tasks, ensure consistent responses, and reduce the manual effort required to manage incidents. Grafana Cloud is the easiest way to get started with Grafana dashboards, metrics, logs, and traces. Our forever-free tier includes access to 10k metrics, 50GB logs, 50GB traces and more.

The Evolution of Engineering and the Role of Observability 2.0 in Shaping the Future

Engineering has come a long way since the days of delivering discrete, point-in-time products that were often packaged on a CD and shipped to customers. The days of physical media and long development cycles are long gone. The advent of cloud computing and the rise of Software-as-a-Service (SaaS) transformed the landscape, creating a new model of continuous development and service delivery. This shift has not only revolutionized how software is developed, but has also redefined the engineer’s role.

How to Use InfluxDB for Real-Time SpringBoot Application Monitoring

Enterprise Java developers understand the frustration of sluggish application performance in production. Diagnosing issues within complex microservice architectures can be a time-consuming nightmare. Thankfully, the popular Java framework Spring Boot provides a robust observability stack to simplify real-time monitoring and analysis. By harnessing libraries and tools such as Spring Boot Actuator, Micrometer with InfluxDB, and Grafana, you can gather meaningful insights easily and quickly.

Welcome to a World of Possibility with Elastic

Activate a world of possibility with Search AI. Elastic powers AI to give you real-time, forward-thinking flexibility. When data turns into action, you don't have to wait for the world to turn. You can drive its motion. With Elastic's Search AI, you can unleash the possibilities of your data and transform your world.

5 ways teams used BigPanda during the CrowdStrike outage

In the weeks since the CrowdStrike outage brought millions of systems to a halt, countless articles have been written about the cause of the outage, its impact, and the costs companies incur during service disruptions. Nearly every large company had hosts offline due to the faulty update in CrowdStrike’s Falcon software. BigPanda customers were no exception. On July 19, between 04:00 and 07:00 UTC, the BigPanda systems logged an increase in shared incidents.

Alert noise reduction: How to cut through the noise

ITOps and AIOps teams often face an overwhelming volume of notifications, many of which are false positives or low-priority alerts. The constant influx creates a chaotic environment in which teams can easily miss critical issues, potentially leading to system failures or prolonged downtime. Spending significant time sifting through irrelevant alerts reduces team efficiency and slows response times. Focus on alert noise reduction to ensure that only meaningful and actionable alerts reach your teams.

Navigating the Incident Management Lifecycle: A Complete Guide

Ever wonder why some IT teams can quickly resolve incidents while others struggle? The secret lies in mastering the Incident Management lifecycle. But don’t worry—this isn’t some dull, complicated process only experts can understand. The Incident Management lifecycle is simply a structured approach to handling incidents efficiently. And the best part? You can quickly get the hang of it.