Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

Chaos Engineering and Windows: Mitigating common Windows failure scenarios

Microsoft Windows is a popular operating system for many enterprise applications, such as Microsoft SQL Server clusters and Microsoft Exchange Servers. About 30% of the world’s web application hosting systems are running Windows, making it an important part of every enterprise’s plans to prevent outages and enhance reliability.

CMDB: Your Family Tree of Dependencies

Configuration items (CIs) in the configuration management database (CMDB) store information about the relationships among its assets. IT configuration management is becoming increasingly critical for maintaining service levels and keeping all hardware and software performing at peak levels. To maintain those levels, there are a few things to understand about the CMDB, how those CIs are connected, and how their relationships can be used to improve technology services.

5 Tips to Succeed with Enterprise Service Management from Ivanti Customers

Do more with the same. That’s a phrase many of us hear in our careers when there is an expectation for higher output with no increase in resources or budget. One way many organizations achieve this is by leveraging the functionality of their IT Service Management (ITSM) solution to automate workflows and build more efficient business processes throughout all departments, not just IT. This is often referred to as Enterprise Service Management, or ESM.

Stay Ahead of Outages With Proactive Incident Response

How would your daily life be impacted if you had a bird’s-eye view of your operations and their dependencies, along with the ability to spot indicators that an incident or outage was likely to happen? What would it mean for your business if you were given minutes or hours to get ahead of disruptions instead of reacting to a surprise? For most organizations, enabling proactive incident response translates directly to dollars saved, brand reputation protected, and less burnout within response teams.

How we monitor Checkly

If you follow our very own @HLENKE, you might have seen his recent tweet. Availability and responsiveness are key topics for every SaaS platform. They also happen to be multi-level, complex topics that often span different technology stacks and can be tackled with a variety of approaches. Hannes' tweet actually gives us the perfect segue into a blog post about how our engineering team currently monitors Checkly.

Tame IT Chaos by Leveraging Advancements in Machine Learning and Artificial Intelligence

Information Technology (IT), like many other industries, is tapping into the latest advancements in Machine Learning (ML) and Artificial Intelligence (AI) to solve a decades-old problem in the IT management world. History can teach us many things, and by diving into years of accumulated IT data, we can find meaningful insights and use them to guide the future.

The Raw & Real Approach to Observability

Practicing observability isn’t just about tools. It also means improving how you work together and how you share lessons across the team. Learning from each other helps everyone on your team become better engineers that can create amazing experiences with code, or that make code work at incredible scale (or both!). Writing software and operating it in production is—and must be—a team sport.

Infrastructure as a Competitive Advantage - Tips for Managing Trading Operations

I recently spoke on a panel discussion with the Securities Technology Analysis Center (STAC) on the use of infrastructure as a competitive advantage. The event offered fresh thinking on what it takes to manage high-frequency, low-latency trading environments, so I wanted to share some best practices for organization, monitoring, and making insights operational.

Survivorship Bias in Observability

During World War II, a mathematician named Abraham Wald worked on the problem of where to add armor to planes, based on the bullet puncture patterns of the aircraft that returned from missions. The obvious and accepted thought was that the bullet holes marked the problem areas for the planes. Wald pointed out that these weren’t actually the problem areas, because the planes that carried them had survived.

Elastic Cloud roundup: API support, more regions, and new purchasing options

You can now benefit from even more features and functionality in Elastic Cloud. In case you missed it, we’ve added powerful tools to simplify and automate operations. We’ve added support for more regions. And we’ve even added new ways to pay for Elastic Cloud and to understand your bill. With a cup of tea and five minutes, we’ll recap them for you.