
Latest News

Network Bandwidth vs. Capacity: What's Slowing Down Your Network?

Understanding network performance can be a challenge, especially when certain terms are frequently discussed but often misunderstood. Network bandwidth and capacity are at the forefront of almost any networking conversation, and their significance is often highlighted in performance discussions. However, they are just pieces of a much larger puzzle, one that this article will help you piece together.

Navigating VMware licensing changes with SquaredUp: Insights from a Global IT service provider

The recent changes in VMware's licensing model under Broadcom have introduced new complexities for IT teams worldwide. The shift from perpetual to subscription-based licensing has raised concerns about cost management, compliance, and resource optimization. One global IT service provider has leveraged SquaredUp to navigate these challenges, providing a real-world example of how organizations can use SquaredUp to adapt to these changes and maximize efficiency.

The Evolution of Engineering and the Role of Observability 2.0 in Shaping the Future

Engineering has come a long way since the days of delivering discrete, point-in-time products that were often packaged on a CD and shipped to customers. The days of physical media and long development cycles are long gone. The advent of cloud computing and the rise of Software-as-a-Service (SaaS) transformed the landscape, creating a new model of continuous development and service delivery. This shift has not only revolutionized how software is developed, but has also redefined the engineer’s role.

How to Use InfluxDB for Real-Time SpringBoot Application Monitoring

Enterprise Java developers understand the frustration of sluggish application performance in production. Diagnosing issues within complex microservice architectures can be a time-consuming nightmare. Thankfully, the popular Java framework SpringBoot provides a robust observability stack to simplify real-time monitoring and analysis. By harnessing the power of libraries and tools such as SpringBoot Actuator, Micrometer with InfluxDB, and Grafana, you can gather meaningful insights easily and quickly.

5 ways teams used BigPanda during the CrowdStrike outage

In the weeks since the CrowdStrike outage brought millions of systems to a halt, countless articles have been written about the cause of the outage, its impact, and the costs companies incur during service disruptions. Nearly every large company had hosts offline due to the faulty update in CrowdStrike’s Falcon software. BigPanda customers were no exception. On July 19, between 04:00 and 07:00 UTC, BigPanda systems logged an increase in shared incidents.

Alert noise reduction: How to cut through the noise

ITOps and AIOps teams often face an overwhelming volume of notifications, many of which are false positives or low-priority alerts. The constant influx creates a chaotic environment in which teams can easily miss critical issues, potentially leading to system failures or prolonged downtime. Time spent sifting through irrelevant alerts also reduces team efficiency and slows response. Focus on alert noise reduction to ensure that only meaningful and actionable alerts reach your teams.

Navigating the Incident Management Lifecycle: A Complete Guide

Ever wonder why some IT teams can quickly resolve incidents while others struggle? The secret lies in mastering the Incident Management lifecycle. But don’t worry—this isn’t some dull, complicated process only experts can understand. The Incident Management lifecycle is simply a structured approach to handling incidents efficiently. And the best part? You can quickly get the hang of it.

What is ISO 27001 Incident Management? Definition and Process

Managing incidents is crucial to maintaining the security and integrity of an organization's information systems. ISO 27001 Incident Management provides a structured approach to addressing and resolving incidents in a way that minimizes impact and prevents recurrence. This framework doesn't just help organizations respond to incidents—it helps them create a robust system that anticipates and mitigates risks before they escalate.

Grafana Tempo 2.6 release: performance improvements and new TraceQL features

Grafana Tempo 2.6 is here with performance improvements and buckets of new TraceQL features! Continue reading for a quick overview of the latest updates in Tempo. If you’re looking for something more in-depth, don’t hesitate to jump into the Grafana Tempo 2.6 release notes or the changelog.