Latest Posts

What is MTTR? How to measure and improve your Mean Time to Recovery

Complex distributed systems run just about every service imaginable. Healthcare systems that monitor patient health, security systems, and financial systems are all mission-critical. Downtime, or lack of availability, loses money and can even put lives at risk. These systems must be monitored. Many measurements help keep systems running with as little downtime as possible. One of those is Mean Time to Recovery (MTTR).

How to achieve DevOps consensus: The what and how of DevOps

DevOps is a complex, multi-dimensional topic. It is context-sensitive. Those who attempt to learn about and implement DevOps bring their roles and cultural perspectives to the process. Diversity of opinion and expertise can be an important advantage. However, it can also lead to friction and contention in developing DevOps consensus.

What is Real User Monitoring? Definitions, examples and benefits

It sucks to spend a long time building an app, then get complaints about slow-loading pages. You don’t know which pages the problems occur on, let alone the environment. So software performance problems stay elusive and linger in your app, causing havoc for end users and your bottom line.

Why building internal tools could become a costly mistake

Having worked closely with software developers for almost a decade, I’ve noticed some common traits among them. Technically minded people think about problems in different ways. I’m often stunned at how I could have missed such an obvious data point or edge case when discussing product changes with people who have a far more technical mind than I do.