Does Observability Throw You for a Loop? Part Two: Close with Controllability

In part one, we introduced the duality of observability and controllability. As a reminder, observability is the ability to infer the internal state of a “machine” from externally exposed signals. Controllability is the ability to control inputs to direct that internal state toward the desired outcome. So observability is a loop problem, and we need to stop treating it as the end state of our challenge in delivering performant, quality experiences to our users and customers.

Adapting to The New Normal in IT Operations

The waves of change are certainly upon us and businesses are being forced to adapt at a record pace. Current world events have caused a jarring shift in all aspects of our lives, accelerating major changes in how we live and work. An unprecedented number of people are now working from home. Those of us working in IT Operations are no exception. Many companies are implementing a Distributed IT Operations Center (D-NOC) approach to address this new reality.

Monitor Apache Flink with Datadog

Apache Flink is an open source framework, written in Java and Scala, for stateful processing of real-time and batch data streams. Flink offers robust libraries and layered APIs for building scalable, event-driven applications for data analytics, data processing, and more. You can run Flink as a standalone cluster or use infrastructure management technologies such as Mesos and Kubernetes.

Incident Response in the time of Remote Work

The unexpected and sudden shift to remote working introduces a new set of problems within the incident response space. While each organization needs to take its own unique circumstances into account, this post outlines best practices and steps you can take to keep operations both productive and proactive.

Modern shadow IT demands visibility, not control

“Shadow IT” can be a divisive subject depending on how long you’ve been in the IT field. There is a legacy attitude within many IT teams that shadow IT must be controlled – but it can bring significant benefits to an organization. Modern IT teams understand these benefits, and focus on balancing shadow IT’s value and risk. Moving past that legacy attitude and developing a modern IT mentality in your organization can be difficult.

Investing In Our Partnership with AWS

Even with a complete understanding of the benefits that come with running a hybrid environment, companies are still challenged with digital transformation best practices: what to move, when to move it, what’s being spent, how it’s performing, and what’s being overutilized and underutilized. This is why Amazon Web Services (AWS) is one of the most strategic partners in LogicMonitor’s ecosystem.

Top 10 Reasons Why NMS is A Must Have

Nothing’s worse than getting a call from users that the network is down. Too often, IT lacks the visibility it needs to get ahead of performance issues before they arise, meaning you’re in the dark until a user or customer calls to complain. Once an outage happens, the clock is ticking. And the more time you take to understand and resolve the issue, the more it costs you in customer dissatisfaction, staff time, and lost productivity.

Challenges using Prometheus at scale

This article covers the most common challenges you might find when trying to use Prometheus at scale. Prometheus is one of the foundations of the cloud-native environment. It has become the de facto standard for visibility in Kubernetes environments, creating a new category called Prometheus monitoring. The Prometheus journey is usually tied to the Kubernetes journey and the different development stages, from proof of concept to production.