
Latest Posts

Data Gravity in Cloud Networks: Distributed Gravity and Network Observability

So far in this series, I’ve outlined how a scaling enterprise’s accumulation of data (data gravity) struggles against three consistent forces: cost, performance, and reliability. This struggle changes an enterprise; this is “digital transformation,” affecting everything from how business domains are represented in IT to software architectures, development and deployment models, and even personnel structures.

Exploring Your Network Data With Kentik Data Explorer

A cornerstone of network observability is the ability to ask any question of your network. That means having an unbounded capacity to explore the tremendous amount and variety of network telemetry you collect. It means seeing trends and patterns at a macro level, but it also means getting very granular to pursue any line of analysis of your data. Collecting information from flow records, SNMP, streaming telemetry, BGP, eBPF, and so on is indeed very important.

The Russification of Ukrainian IP Registration

Last summer we teamed up with the New York Times to analyze the re-routing of internet service to Kherson, a region in southern Ukraine that was, at the time, under Russian occupation. In my accompanying blog post, I described how that development mirrored what took place following Russia’s annexation of Crimea in 2014.

Implementing a Cost-aware Cloud Networking Infrastructure

Cloud networking is the IT infrastructure necessary to host or interact with applications and services in public or private clouds, typically via the internet. It’s an umbrella term for the devices and strategies that connect all variations of on-premises, edge, and cloud-based services.

The Consolidation of Networking Tasks in Engineering

In recent years, the rapid development of cloud-based networking, network abstractions such as SD-WAN, and controller-based campus networking has meant that basic, day-to-day network operations have become easier for non-network engineers. The result we’re starting to see today is a sort of consolidation of networking tasks, leading to a need for only a small number of highly skilled network engineers to handle the less frequent heavy lifting of advanced design and troubleshooting.

Gathering, Understanding, and Using Traffic Telemetry for Network Observability

Traffic telemetry is the data collected from network devices and used for analysis. With traffic telemetry, engineers can gain real-time visibility into traffic patterns, correlate events, and make predictions of future traffic patterns. As a critical input to a network observability platform, this data can help monitor and optimize network performance, troubleshoot issues, and detect security threats. However, traffic telemetry can be difficult to understand.

Data Gravity in Cloud Networks: Achieving Escape Velocity

In an ideal world, organizations can establish a single, citadel-like data center that accumulates data and hosts their applications and all associated services, all while enjoying a customer base that is also geographically close. As this data grows in mass and gravity, it’s okay because all the new services, applications, and customers will continue to be just as close to the data. This is the “have your cake and eat it too” scenario for a scaling business’s IT.

Digging Into the Recent Azure Outage

In the early hours of Wednesday, January 25, Microsoft’s public cloud suffered a major outage that disrupted its cloud-based services and popular applications such as SharePoint, Teams, and Office 365. Microsoft has since blamed the outage on a flawed router command that took down a significant portion of the cloud’s connectivity beginning at 07:09 UTC.

Best Practices for Enriching Network Telemetry to Support Network Observability

Network observability is critical. You need the ability to answer any question about your network—across clouds, on-prem, edge locations, and user devices—quickly and easily. But network observability is not always easy. To be successful, you need to collect network telemetry, and that telemetry needs to be extensive and diverse. And once you have that raw telemetry data, you need to interpret it.