The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Lumigo Introduces AI to Simplify Observability Workflows

Lumigo is expanding its troubleshooting and observability platform with cutting-edge AI-powered tooling, now available in beta, which will provide developers and DevOps teams with the fastest and most cost-efficient way to debug and observe complex microservices. AI is quickly reshaping the technology landscape. However, observability tools have been slow to find ways to leverage AI in a fashion that provides tangible value.

With AppNeta, ResultsCX Decreases Network Performance Triage Time by 90%

In order to deliver its differentiated, boutique level of customer care services, the team at ResultsCX has had to navigate some challenges in recent years that teams in many organizations can relate to. The organization relies extensively and constantly on its network connections—and outages and poor performance can be a big problem. This post offers an introduction to the challenges the company was facing, and it reveals how AppNeta by Broadcom delivered the solution they needed.

Introducing a New, Zero-Touch Way to Manage Your DX NetOps Upgrades

For every customer who has an existing DX NetOps solution deployed, an upgrade can be a daunting task. Even for seasoned administrators, the process of logging into each box, running the pre-checks, and then executing the installers can be tedious. With the solution’s support for zero-touch administration (ZTA), the effort becomes easier. Now, you can plan, test, and then finally upgrade your deployment versions in one session.

Strategies for Lowering Observability Costs

Learn how to cut IT observability costs with OpenTelemetry. We'll cover ways to streamline data collection, reduce hidden expenses, and optimize data management. Discover practical tips for handling telemetry data efficiently, avoiding vendor lock-in, and improving system performance. Watch this video for actionable insights and real-world examples of using OpenTelemetry to manage costs effectively.

Understanding Network Traffic Blockages in AWS

In this post, explore the challenges of diagnosing network traffic blockages in AWS, which stem from the complex and dynamic nature of cloud networks. Learn how Kentik addresses these issues by integrating AWS flow data, metrics, and security policies into a single view, allowing engineers to quickly identify the source of blockages, which enhances visibility and speeds up resolution.

Syslog 101: Everything You Need To Know

System logging protocol, abbreviated as Syslog, is a standard protocol used for message logging. Put simply, it is a standard for collecting and storing log information. A Syslog server collects, parses, stores, examines, and dispatches log messages from devices including routers, switches, firewalls, Linux/Unix hosts, and Windows machines.
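
As a rough illustration of the "parses" step, here is a minimal Python sketch of how a Syslog server might decode the priority value at the start of a message. The priority encodes the facility and the severity as `facility * 8 + severity`. The function name `parse_priority` and the simplified pattern are assumptions made for this example, not part of any particular Syslog server, and a real parser would also handle the timestamp, hostname, and tag fields.

```python
import re

# Severity keywords in numeric order 0..7, as defined by the Syslog standard.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# Simplified, hypothetical pattern: "<PRI>" followed by the rest of the message.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_priority(line):
    """Split a raw Syslog line into facility, severity, and remaining text."""
    match = PRI_PATTERN.match(line)
    if match is None:
        raise ValueError("no <PRI> prefix found")
    pri = int(match.group(1))
    if pri > 191:  # 23 facilities * 8 + 7 is the largest valid priority
        raise ValueError("priority value out of range")
    # PRI = facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(pri, 8)
    return {
        "facility": facility,
        "severity": SEVERITIES[severity],
        "message": match.group(2),
    }


sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
parsed = parse_priority(sample)
print(parsed["facility"], parsed["severity"])
```

For the sample line, a priority of 34 decodes to facility 4 (security/authorization messages) and severity 2 (critical).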

Observability vs Monitoring [Understanding the Key Differences in 2024]

When systems fail, it's not just a technical hiccup – it's a business problem. Downtime means unhappy customers and lost revenue. That's why teams need effective ways to spot issues fast and fix them even faster. This is where monitoring and observability come into play. Monitoring and observability are two key approaches to keeping your systems running smoothly. Monitoring is like your system's alarm bell – it tells you when something's wrong.

OpsRamp and HPE-One Year Later: An Analyst's Perspective

In March 2023, Hewlett Packard Enterprise (HPE) announced the acquisition of OpsRamp, closing the deal in May of that year. Founded in 2014, OpsRamp is an award-winning solution that enables IT operations, site reliability engineering (SRE), cloud operations, DevOps teams, and other stakeholders to better detect, remediate, predict, and prevent slowdowns and outages across physical, virtual, and cloud systems.

Incident Template Library

We recently announced a new feature to enhance how you communicate with your users during maintenance, incidents, and general service updates. Status Page Templates allows you to save and re-use status updates. But how do you know which incidents might happen, or which updates you will need to send, before it's too late? We have put together a library of ready-to-use templates designed to keep your users informed with clear, concise, and consistent messaging.