AIOps and Smart Alerting
Smart Alerting is not enough. Effective deployment of AIOps requires an independent platform capable of interacting with all technologies along the path from signal to response.
Critical and sev1 incidents are always a priority, but what about the dozens, often hundreds, of lower-priority incidents that sit in a queue waiting for a first-response engineer to get to them? Do you find that no matter how much effort your team puts into shrinking that queue, it always seems to grow? If this sounds familiar, this blog is for you.
As 2019 comes to an end, OnPage would like to remind MSP teams of the value and importance of offering a 24×7 support service. Around-the-clock support ensures that client issues are quickly resolved by an after-hours support team. Though 24×7 support is a must-have offering, MSPs must first rework their internal workflows and policies to ensure that after-hours servicing is a pain-free venture.
Time for another installment in the series where we explain in detail yet another important metric for tech organizations. After covering MTTD and MTTF, today we answer the question, “What is MTBF?” As the post title makes clear, MTBF stands for “Mean time between failures.” The acronym refers—like the others that came before it—to an important DevOps KPI. But what actually is it? What is it good for? How do I implement it?
Using ChatOps for issue resolution isn’t new, but the benefits of using a single tool for communicating and resolving issues give it lasting power. The ChatOps model enables teams to take action on their day-to-day work directly from collaboration platforms, including Microsoft Teams. Since many Dev and ITOps folks use Microsoft Office 365 for their daily work, it was a natural next step for Opsgenie to align with Microsoft Teams.
Recently, I wrote about an IDC business value study that PagerDuty commissioned and shared some of the results from the research. In summary, after in-depth interviews with eight enterprise customers, IDC applied its proven business value methodology to the aggregated results of those interviews and found that enterprise customers were averaging a three-year return on investment (ROI) of 731% and a payback period (break-even point) of just 4.3 months.
Aquafin is a Belgian company with over 1,000 employees that was established by the Flemish Region in 1990 for the purpose of expanding, operating and pre-financing the wastewater treatment infrastructure in Flanders. Aquafin collects household wastewater from the municipal sewers and transports it to wastewater treatment plants, where it is treated in accordance with European and Flemish standards.
Given the state of critical incidents in 2019, it’s no surprise that, looking ahead to 2020, CISOs have one of the organization’s most challenging and stressful jobs. During the first half of 2019 alone, 4.1 billion records were compromised, and the average cost of a data breach is now estimated at $3.92 million.
Being on-call is a critical duty that many operations and engineering teams must undertake to keep their services reliable and available. However, there are several pitfalls in the organization of on-call rotations and responsibilities that can lead to serious consequences for the services and the teams if not avoided.