
What is AIOps? | AIOps Explained | ITOM Made Easy 3/5

AIOps is about more than just IT operations: it encompasses DevOps, IT service management (ITSM), incident management, observability, support, security, and KPIs/priorities from business stakeholders. It’s about providing actionable insights into IT operations data, whether that data is hybrid, cloud-based, or private data center–based. In this video, Stephen Mann explains the concept of AIOps and its applications, and dives deep into how artificial intelligence can be leveraged to improve IT operations in your organization.

How To Build an Escalation Policy for Effective Incident Management

Regardless of your organization’s size, industry, or security measures, you will inevitably face IT incidents. But what do you do if an incident affects a critical system and your on-call responders can’t resolve it? Does your team have a set of clearly outlined next steps they should take to handle the issue? Answering these questions can be complicated, even more so for large organizations that rely on cloud-based services to fuel their IT environment.

Engineering Levels at Honeycomb: Avoiding the Scope Trap

It has been seven years since Rent the Runway posted their engineering ladder, kicking off a veritable trend of engineering teams open-sourcing their ladders. Interestingly, nearly all of them seem to have coalesced around “area of scope” as a useful proxy for level. At first glance, “area of scope” does seem to make sense: senior engineers should be able to work across larger areas of the organization, and your area of influence should expand as you gain experience.

Anomaly Advisor Case Study - K6 Load Test

In this video, our Analytics & ML Lead, Andrew Maguire, walks through an example case study using the K6 load testing platform to run a load test against some of our demo servers running Netdata. Watch in real time as the Anomaly Advisor reacts to the load test and painlessly surfaces the most anomalous metrics, making it easy to just "see" the load test and how it plays out on the servers.