
Alerting

"TRIBAL KNOWLEDGE" (noun): That thing you should have done, if only someone had told you.

As a former NOC engineer, I clearly remember my onboarding, and especially the deep-rooted fear I felt every time I encountered an alert that was new to me – particularly during a night shift. My only consolation was that I was never alone during training, so there was always someone I could ask that very awkward question: “I’m new here, what do we do with this…?”

Grow your blame-free culture with these postmortem best practices

Bugs will happen from time to time. As our systems grow in complexity, new functionalities mean new risks. What makes or breaks a team is not only how it handles incidents, but also how it learns from them. This is where incident postmortems come into the picture.

You can't shoot for 5-9's without having powerful incident resolution capabilities

With the advent of more and more connected systems and devices, organizations are facing ever greater data security challenges. Moreover, with the accelerating cloudification of apps and even whole infrastructures, security professionals are also faced with the challenge of how to protect critical assets including personal and other sensitive data as well as IP.

Announcing Unified IT Status Notifications from "Big 3" Cloud Providers

StatusCast helps corporations keep their employees happy by providing unified IT status notifications, which let them communicate IT status updates to their employees from a single location. Having to check both a corporate IT status page and a separate one for the organization’s cloud provider to determine the extent of IT issues lowers employee productivity and job satisfaction.

Hrushikesh shares his journey into SRE and his thoughts on the future of this space

Hrushikesh is passionate about turning complex designs into simple and reliable solutions. He is technology and platform agnostic and doesn’t believe in limiting himself to just a few. He started his career in 2006 at a media company, where he was responsible for introducing new technologies and for driving a team to deliver quickly. He does not limit his role to just development and operations and loves exploring everything in the tech space.

Dynamic alerts

The power and value of logs lie in how they reflect the status and behavior of our applications and infrastructure. Often we want to be alerted when an application or one of its components behaves abnormally. Such behavior can show up as the application sending logs at a higher-than-usual volume. Figuring out exactly what ‘higher than usual’ means, or in other words, setting the threshold value at which the alert should trigger, can be a daunting task.
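One common way to pin down "higher than usual" is to derive the threshold from recent history rather than hard-coding it. The sketch below is only an illustration under stated assumptions, not the method of any particular product: it assumes log volume is available as a list of per-interval counts, and the function names `dynamic_threshold` and `should_alert` and the multiplier `k` are hypothetical.

```python
from statistics import mean, pstdev


def dynamic_threshold(history, k=3.0):
    """Return mean + k * standard deviation of past per-interval log counts.

    `history` and `k` are illustrative assumptions: a list of counts from
    recent intervals and a sensitivity multiplier.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical intervals")
    return mean(history) + k * pstdev(history)


def should_alert(history, current, k=3.0):
    """True when the current interval's count exceeds the dynamic threshold."""
    return current > dynamic_threshold(history, k)


# Hypothetical log counts for the last five intervals.
history = [100, 110, 90, 105, 95]
quiet = should_alert(history, 120)
spike = should_alert(history, 150)
```

With this sample history, a count of 150 is flagged while 120 is not. A larger `k` makes the alert less sensitive, and a longer history smooths out short-lived bursts.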

Chatbot integration with Microsoft Teams and Slack

SIGNL4 provides plug-and-play chatbot integrations with Microsoft Teams and Slack, both via certified chatbot apps. Why does it make sense to integrate SIGNL4 with chat tools at all? There are two basic use cases that we address with the integration into Teams and Slack. By default, SIGNL4 notifies by mobile push, text and voice calls, all according to user preference. The focus is clearly on mobile alert notifications. And of course, tracking and escalation of critical alerts are built in.