Sometimes, two concepts overlap so much that it’s hard to view them in isolation. Today, incident management and problem management fit this description to a tee. This wasn’t always the case. For a long time, these two ITIL concepts were seen as distinct—with specialized roles overseeing each. Incident management existed in one corner and problem management in the other. Then came the DevOps movement and the lines suddenly became blurred. So where do they stand today?
Incident management is a critical aspect of IT service management (ITSM) that revolves around restoring normal service operations as swiftly as possible after an unplanned interruption or reduction in quality. Also referred to as “incidents,” these interruptions could range from a minor issue like a single user being unable to access a service to a significant problem such as a server crash or network outage affecting many users.
In this article, we will be covering how to monitor Kubernetes using Graphite, and we’ll do the visualization with Grafana. The focus will be on monitoring and plotting essential metrics for monitoring Kubernetes clusters. We will download, implement and monitor custom dashboards for Kubernetes that can be downloaded from the Grafana dashboard resources. These dashboards have variables to allow drilling down into the data at a granular level.
Prometheus is becoming a popular tool for monitoring Python applications despite the fact that it was originally designed for single-process multi-threaded applications, rather than multi-process. Prometheus was developed in the Soundcloud environment and was inspired by Google’s Borgmon. In its original environment, Borgmon relies on straightforward methods of service discovery - where Borg can easily find all jobs running on a cluster.
Before we jump into the specifics of Grafana and Datadog, let's look at the main comparison points. Grafana is a great dashboard that allows you to plug in essentially any data source in the world. Grafana is most commonly paired with Prometheus, Graphite, and Elasticsearch to provide a full APM, time-series, and logs monitoring stack.
Implement the enterprise service management approach to revolutionize how your organization delivers services beyond IT.
Top tips is a weekly column where we highlight what’s trending in the tech world today and list out ways to explore these trends. This week we’re looking at five ways ways you can build upon the basics and start incorporating AI in your everyday. AI technology is now utilized in some form by almost 77% of devices. Nearly every industry has incorporated, or is trying to incorporate, AI in some way or another.