Operations | Monitoring | ITSM | DevOps | Cloud

Latest Videos

Datadog on Kafka

As a company, Datadog ingests trillions of data points per day. Kafka is the messaging persistence layer underlying many of our high-traffic services. Consequently, our Kafka usage is quite high: double-digit gigabytes per second of bandwidth and petabytes of high-performance storage, even for relatively short retention windows. In this episode, we'll speak with two engineers responsible for scaling the Kafka infrastructure within Datadog, Balthazar Rouberol and Jamie Alquiza. They'll share their strategy for scaling Kafka, explain how it's been deployed on Kubernetes, and introduce kafka-kit, our open source toolkit for scaling Kafka clusters. You'll leave with lessons learned while scaling persistent storage on modern orchestrated infrastructure, and actionable insights you can apply at your organization.

Introduction to Service Request Automation

A brief introduction to Service Request Automation using the Kelverion Runbook Suite. The Kelverion Runbook Suite provides a cloud automation platform with a range of automation tools, including a rich graphical design experience, smart integrations, ready-built solutions, and the option of an easy-to-configure self-service automation portal.

Meet Flowmon Packet Investigator

Flowmon Packet Investigator (FPI) is an automated network traffic auditing tool that records and interprets full packet data. Where flow data is not sufficient and more detail is needed, the Investigator captures all the packets of traffic surrounding the event for in-depth troubleshooting. What sets the Investigator apart is its built-in expert knowledge. It not only provides extensive details but also automates the analysis, assessing the captured events, looking for error codes, and providing explanations and suggestions for a remedy.

Integrating Traces and Logs with OpenTelemetry - Stack Doctor

Tracing is a great way to monitor your services, but how does one go about fixing latency issues in a specific service? In this episode of Stack Doctor, Yuri Grinshteyn shows you how to connect traces with logs via OpenTelemetry and Cloud Trace and Logging, enabling you to pinpoint and debug service latency issues in a snap!
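The general idea of connecting traces with logs can be sketched as stamping every log line with the ID of the active trace, so that a slow trace can be matched to the log lines written while it ran. The snippet below is a minimal illustration using only the Python standard library. It does not use the OpenTelemetry SDK or Cloud Trace, and the names `current_trace_id`, `TraceIdFilter`, `build_logger`, and `handle_request` are hypothetical.

```python
import contextvars
import logging
import secrets

# Hypothetical holder for the active trace ID (not an OpenTelemetry API).
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceIdFilter(logging.Filter):
    """Copy the active trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


def build_logger(stream):
    """Return a logger whose lines are prefixed with the active trace ID."""
    logger = logging.getLogger("traced-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("trace_id=%(trace_id)s %(message)s"))
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


def handle_request(logger):
    """Simulate one traced request and return the trace ID it ran under."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    token = current_trace_id.set(trace_id)
    try:
        logger.info("request handled")
    finally:
        current_trace_id.reset(token)
    return trace_id
```

Searching the logs for the returned trace ID then yields exactly the lines written during that request. A real setup would take the ID from the tracing library's current span rather than generating it by hand.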

How to Document Your Medical Staff Can Access EHR Applications Anywhere

Learn how Goliath provides Health IT Pros the ability to document EHR application availability, troubleshoot end-user experience issues quickly, and report to management on end-user productivity and the overall user experience.

Zenduty - Incident Priorities and SLAs

Incident SLAs in Zenduty let you set acknowledgement and resolution SLAs for your incidents. SLAs allow your teams to prioritize incidents and increase transparency among incident stakeholders: support, account managers, and management. Incident priority is the sequence in which an Incident or Problem needs to be resolved, based on Impact and Urgency. Priority also defines the response and resolution targets associated with Service Level Agreements. Each team in Zenduty can define its own priorities, such as P0/P1/P2/P3 or L0/L4/L16.
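One way to picture how impact and urgency combine into a priority, and how that priority then selects acknowledgement and resolution targets, is the small sketch below. It is an illustration only, not Zenduty's API or data model: the `Level` scale, the scoring rule, and every value in `SLA_TARGETS` are assumptions chosen for the example.

```python
from datetime import timedelta
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale used for both impact and urgency."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Hypothetical targets per priority: (acknowledgement, resolution).
SLA_TARGETS = {
    "P0": (timedelta(minutes=5), timedelta(hours=1)),
    "P1": (timedelta(minutes=15), timedelta(hours=4)),
    "P2": (timedelta(hours=1), timedelta(hours=24)),
    "P3": (timedelta(hours=4), timedelta(hours=72)),
}


def priority(impact, urgency):
    """Map impact and urgency to P0..P3; a lower combined score is more severe."""
    score = int(impact) + int(urgency)  # ranges from 2 to 6
    return "P" + str(min(score - 2, 3))


def ack_breached(prio, time_to_ack):
    """Return True if the acknowledgement took longer than the target."""
    ack_target, _resolution_target = SLA_TARGETS[prio]
    return time_to_ack > ack_target
```

For example, `priority(Level.HIGH, Level.MEDIUM)` gives `"P1"` under this scoring rule, and a P0 incident acknowledged after ten minutes would count as breaching the assumed five-minute target.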