AIOps is the trendy new kid on the block in the IT operations world, no doubt about it. However, with all the buzz surrounding AIOps, it’s easy to skip over some of the basics. How many IT operations professionals can clearly define what AIOps is? Beyond the baseline definition, why should you care? And how do you plug it into your existing automation and analytics ecosystem?
This year is a leap year, and you know what that means: February has 24 more hours of glitches in store for us than usual. Let’s take a look at all the new glitches that February has brought us and learn how to avoid them.
OnPage’s incident alert management platform continues to evolve, providing unique and powerful capabilities to business clients. The latest advancements include live call routing reporting and a sophisticated dashboard for enterprise users. These capabilities enhance team transparency and performance, improving incident management and collaboration in the process. In this blog post, I’ll discuss the benefits of these features and how they improve workflows.
Administrators have long asked for a way to enforce a minimum interval between alert rule evaluations. This is useful for restricting unrealistic user-defined alert rules that evaluate too often and create unnecessary load in the backend. @Uepoch took the initiative and made all the necessary modifications for this configuration in Grafana’s backend, and we finally pushed it forward and introduced the feature in Grafana v6.6.
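To make the idea of a minimum evaluation interval concrete, here is a minimal sketch of one plausible way to enforce it: any rule that asks to run more often than the administrator's floor is raised to that floor. This is an illustration only, not Grafana's actual backend code; `AlertRule`, `effective_interval`, and `MIN_INTERVAL_SECONDS` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    """Hypothetical alert rule with a user-requested evaluation interval."""
    name: str
    interval_seconds: int


def effective_interval(rule: AlertRule, min_interval_seconds: int) -> int:
    """Return the interval actually used, never below the admin minimum."""
    return max(rule.interval_seconds, min_interval_seconds)


# Hypothetical admin-configured floor, in seconds (assumption for this sketch).
MIN_INTERVAL_SECONDS = 10

rules = [AlertRule("cpu_high", 1), AlertRule("disk_full", 60)]

# Rules that evaluate too often are raised to the floor; others are unchanged.
intervals = {r.name: effective_interval(r, MIN_INTERVAL_SECONDS) for r in rules}
print(intervals)  # {'cpu_high': 10, 'disk_full': 60}
```

Raising the interval rather than rejecting the rule keeps existing user-defined rules working while still capping the load they can put on the backend.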
Do you still find yourself visually monitoring dashboards for anomalies? That leaves catching revenue-related issues to chance. It has become humanly impossible to catch incidents in streaming data by eye. This is why many eCommerce and data-driven companies have adopted automated anomaly detection.
Setting and tracking key performance indicators based on the right data can help incident management teams reduce the impact of incidents and strengthen the business. But what exactly is the right data? That can be a deceptively tricky question. Incidents are complex, and no two are exactly the same – and your KPIs must reflect this complexity.