Chasing the Rainbow: Towards Unified Service Metrics

As Zendesk migrated from a monolithic application to an ecosystem of hundreds of services, fully unified and standardized observability became a chief concern. In this talk, Senior Principal Engineer Daniel Schierbeck shares how adopting a service mesh has helped Zendesk teams manage the company's growing number of services while standardizing their observability. He also explains how Zendesk’s approach to monitoring service interactions has evolved as it adopted Datadog metrics and Datadog APM.

Manage metrics & logging costs with Grafana Cloud + Log Volume Explorer demo | ObservabilityCON

Are your SRE and platform teams under pressure to ingest fewer metrics and logs in the name of cost savings? Reducing costs does not have to mean reduced observability. This recording walks through the cost management features in Grafana Cloud that allow you to analyze, attribute, monitor, and optimize your metrics and logs usage – and lower costs – without compromising your observability strategy.

How Pipedrive switched its observability stack to OpenTelemetry & LGTM | ObservabilityCON 2023

The cloud-based CRM company Pipedrive has been relentlessly modernising its observability stack. It first adopted Grafana for visualisation and Grafana Mimir for Prometheus metrics, then recently completed two migrations: its distributed tracing moved from a third-party SaaS provider to OpenTelemetry and Grafana Tempo, and its logging stack moved from Graylog to Grafana Loki. Along the way, the team developed its own in-house library to include OpenTelemetry in its roughly 750 microservices.

Grafana SLO Demo: Prioritize critical resources with SLO-driven IRM | ObservabilityCON 2023

A majority of respondents in our Observability Survey said they were using SLOs or moving in that direction. For good reason: By highlighting the most critical error budget burndown, service level objectives (SLOs) can help you prioritize performance issues based on business impact. In this recording, Josh Abreu Mesa and Reem Tariq walk through how Grafana Cloud’s integrated SLO and Incident Response Management (IRM) capabilities can help you identify the most important issues and resolve them quickly.

User-centered observability: load testing, real user monitoring & synthetics | ObservabilityCON 2023

Understanding your end users’ experience with your applications and services is critical, and there are a variety of tools to help. But there are also a number of different use cases: During development or in production? Simulate user behavior or monitor real user behavior? What should you use and when? This recorded session explores when and how to apply load testing, synthetic monitoring, and real user monitoring to gain insights into the end user experience of your critical applications.

Application Observability and Beyla Demo | ObservabilityCON 2023

In cloud native environments, finding and resolving issues across services and between application and infrastructure dependencies can be challenging. In this recording, we provide demos on Grafana Cloud’s latest capabilities for correlating application and infrastructure observability: Application Observability and Beyla — both generally available. You will hear how Grafana unifies and contextualizes service relationships and application and infrastructure dependencies to help you resolve problems faster.

Ship code via Slack approvals

Automate shipping code through Slack, or better yet, skip the manual steps altogether! This video covers three steps to use Sleuth to automate code deployments using Slack. A bonus tip shows how to use Sleuth to promote builds from staging to production automatically and safely, with automatic health checks. Give Sleuth a try and see how we give teams actionable insights on how to improve, with no-code automations to instantly ship improvements and metrics to measure their impact, all in a way that both managers and developers love.

Deploy with Slack

Automate shipping code through Slack, or better yet, skip the manual steps altogether! This preview gives you a taste of what Slack approvals look like and why you'd want to use them with Sleuth. Give Sleuth a try and see how we give teams actionable insights on how to improve, with no-code automations to instantly ship improvements and metrics to measure their impact, all in a way that both managers and developers love.

Status Pages and Incident Management for IT Enterprise

Ready to revolutionize your IT Enterprise? Look no further! Explore the dynamic world of StatusCast.com, where Status Pages and Incident Management come together to redefine how you handle IT disruptions. Why StatusCast.com? StatusCast.com is not just a tool; it's your strategic partner in maintaining the health and performance of your IT systems. Our platform offers a comprehensive solution for creating informative and visually appealing status pages, ensuring your users are always in the loop during incidents.