How machine learning in AppDynamics Cloud accelerates log analysis and reduces mean time to detect. Site recovery engineers (SREs) need to investigate unknown problems reported in production. The common approach is to search and filter log files to find the root cause, and we all know how painful it is to sift through log contents. It’s like finding a needle in a haystack. A machine learning approach is essential to assist SREs to quickly identify the root cause.
The current tech winter has a number of glaring stories — cyclical as they may be, there’s one truth that’s been gleaned over more than the rest; the money spent on internal software tools to support tech infrastructure is bloated. And there’s nothing cyclical about this infrastructure spending.
A Recap of SRECon 2023 Americas by guest author Sebastian Vietz.
Last week, over five hundred SREs gathered in Santa Clara to share the latest research, tips, tricks, best practices, and more for site reliability engineering. They were joined by some of the biggest names in the reliability space. And, yes, Gremlin was there to answer any and all questions about chaos engineering and proactive reliability. After three days of great conversations and insightful talk, let’s take a look at some of the themes we heard weaving through SRECon.