The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.

Moogsoft and PagerDuty: Boosting DevOps Teams' Productivity and Incident Resolution with AIOps

Today, the customer experience drives IT on all levels. In our digitally transformed world, we do everything online — transact, interact, purchase and more. This mandates constant change and zero downtime. Ironically, as enterprises adopt IT innovations, IT environments get harder to manage and impact the productivity and agility of DevOps and SRE teams — and as a result, the customer experience suffers.

Moogsoft and Atlassian JIRA and Opsgenie: Put the Dev and Ops in DevOps!

DevOps has become the go-to approach for IT teams to meet business requirements faster and ensure the quality of the customer experience. Today’s economy and the customer experience drive IT across the entire stack. Our world has become digitally dependent, which mandates an ever-evolving IT environment that’s on-demand and always available.

Failover Conf Wrapup

Failover Conf was held on April 21, 2020, online. The folks at Gremlin came up with the idea of a virtual conference about reliability after many in-person conferences started being postponed or canceled due to COVID-19. The conference was a lot of fun to attend. I’ll be sharing some of my thoughts on the event and the talks I was able to catch. The videos for the talks haven’t been posted yet, but I’ll update this post with links to them when they are.

How PagerDuty and Partner Rundeck Enable Business Continuity for Digital Operations

At times like these, when the world has been forced to adapt and go almost entirely digital, it’s imperative that our systems and platforms stay up and operational all the time. We are going to great lengths to make sure that the hardware and software in our application stacks are reliable and responsive. Hardware is set up to have redundant backups, and new code is tested and reviewed to make sure it doesn’t introduce any bugs into the system.

Darwin Was Right: Change Will Separate the Strong from the Weak

“It is not the strongest or the most intelligent who will survive, but those who can best manage change,” said Charles Darwin over 150 years ago, and probably every IT Ops engineer out there today would agree with him. According to Gartner (and probably your experience as well), over 80% of service disruptions are caused by changes in infrastructure and software.

Virtualize the NOC: Futureproof Your IT Investment with AIOps

By abruptly forcing most people to work from home, and by triggering an economic crisis, the global pandemic has upended business operations. Not only must business leaders facilitate remote work among their employees, but they must also accommodate new ways of interacting with suppliers, partners and customers. Meanwhile, businesses’ digital channels and infrastructure, already critical prior to the crisis, have become even more essential, and yet harder to monitor and manage.