Latest Posts

How to Propagate OpenTelemetry Trace Headers Over AWS Kinesis: Part 2

In the first article of our series, we explored the importance of trace headers and the complexities involved in their propagation. Now, we shift from theory to practice. This second installment will take you through a hands-on baseline scenario and our initial strategy of propagating the OpenTelemetry trace context in AWS Kinesis by using the PartitionKey parameter.

How to Propagate OpenTelemetry Trace Headers Over AWS Kinesis: Part 1

Welcome to our series on navigating the complexities of trace header propagation with OpenTelemetry in AWS Kinesis. In this three-part exploration, we'll dive into the critical role of trace headers in distributed systems, discuss the unique challenges presented by AWS Kinesis, and explore solutions that keep your tracing robust and consistent.

The cost of inaction: A CIO's primer on why investing in Internet Performance Monitoring can't wait

When John Wanamaker famously declared, “When a customer enters my store, forget me. He is king,” he unknowingly coined a mantra that remains as relevant today as it was in the 1900s. This philosophy, rooted in the customer service ideologies of his time, holds true not just for brick-and-mortar stores but also for eCommerce.

Mastering IPM: Key Takeaways from our Best Practices Series

As we conclude our Mastering IPM blog series, it's time to reflect on the wealth of insights we shared. From delving into the critical layers of the Internet Stack to navigating the intricacies of data analysis, each installment has provided valuable perspectives on optimizing digital experiences through Internet Performance Monitoring (IPM). Now, let's distill the key takeaways from the series.

DNS Security: Fortifying the Core of Internet Infrastructure

In an era marked by escalating cyber threats, Domain Name System (DNS) infrastructure security has become a key concern for IT organizations worldwide. Attacks related to DNS infrastructure, such as DNS hijacking, DNS tunneling, and DNS amplification, are on the rise. Many organizations find themselves questioning the robustness of their DNS security protocols.

Mastering IPM: API Monitoring for Digital Resilience

APIs (Application Programming Interfaces) have quietly evolved into the backbone of contemporary business operations, yet most people use them without even realizing it. For instance, when you order your favorite takeaway online and tap the payment button, voilà! Through APIs, your payment information swiftly traverses the digital landscape, and the charge is promptly reflected in your credit card balance.

Mastering IPM: Protecting Revenue through SLA Monitoring

If you’re an SRE, then you already know your SLOs from your SLAs, not to mention your SLIs. But even if you’re not au fait with those acronyms, you’ll soon discover how widespread and applicable these concepts are in this installment of our IPM Best Practices Series. We’ll examine them in detail and show how external monitoring can enhance the tracking of Service Level Objectives (SLOs), leading to positive user experiences and informed decision-making.

Mastering IPM: The Essential Customer Experience Monitoring Framework

In the previous installment of our Internet Performance Monitoring (IPM) Best Practices Series, we explored the importance of monitoring what matters, from where it matters. Now, we pivot to a core aspect of Internet Resilience: Customer Experience (CX). This post explores the critical role of IPM in achieving faster Mean Time to Detect (MTTD) and Mean Time to Recover (MTTR).

Accelerating Detection to Resolution: A Case Study in Internet Resilience

Today, any revenue-generating website is like a house of cards: it has multiple points of failure, any of which can bring it down. The modern service delivery chain relies on intricate multi-step transactions and third-party API integrations, making the system more complex and interconnected. A single point of failure anywhere in that chain can lead to slowdowns and outages with tangible consequences for your bottom line.