
Honeycomb + Google Gemini

Today at Google Next, Charity Majors demonstrated how to use Honeycomb to find unexpected problems in our generative AI integration. Software components that integrate with AI products like Google’s Gemini are powerful in their ability to surprise us. Nondeterministic behavior means there is no such thing as “fully tested.” Never has there been more of a need for testing in production!

Setting Up the Latest AWS Observability Solution

This tutorial demonstrates how easy it is to deploy the AWS Observability Solution with the new, quicker CloudFormation template method. The template used in this method sets up automated collection of logs and metrics from AWS and sends them to the Sumo Logic service.

Monitor Complex User Flows with Checkly's Multistep Checks

Learn how Checkly's new multistep checks help you to decrease incident response times with synthetic monitoring. Use multistep checks to chain and manage multiple API requests, run custom code for response validation, and get accurate alerts when incidents occur. This video explains how to create a multistep check to monitor a RESTful API from scratch. Do you have questions? Join our vibrant Checkly community on Slack and explore further!
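To make the idea of chaining requests concrete, here is a minimal sketch in Python. It does not use Checkly's own API; the `Step` class, the `run_multistep_check` function, the `/items` endpoints, and the in-memory `fake_api` transport are all hypothetical names invented for illustration. Each step builds a request from values captured by earlier steps, runs custom validation on the response, and the check stops at the first failing step so that an alert can name it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

Context = Dict[str, Any]
Request = Tuple[str, str, Optional[dict]]          # (method, path, body)
Send = Callable[[str, str, Optional[dict]], dict]  # hypothetical transport


@dataclass
class Step:
    name: str
    build: Callable[[Context], Request]        # builds a request from earlier results
    validate: Callable[[dict, Context], None]  # raises AssertionError on failure


def run_multistep_check(steps: List[Step], send: Send) -> dict:
    """Run steps in order, stopping at the first failed validation."""
    ctx: Context = {}
    for step in steps:
        method, path, body = step.build(ctx)
        response = send(method, path, body)
        try:
            step.validate(response, ctx)
        except AssertionError as exc:
            return {"passed": False, "failed_step": step.name, "reason": str(exc)}
    return {"passed": True, "failed_step": None, "reason": ""}


def fake_api() -> Send:
    """In-memory stand-in for a RESTful API (an assumption, not a real service)."""
    store: Dict[str, dict] = {}

    def send(method: str, path: str, body: Optional[dict]) -> dict:
        if method == "POST" and path == "/items":
            item_id = str(len(store) + 1)
            store[item_id] = dict(body or {})
            return {"status": 201, "json": {"id": item_id, **store[item_id]}}
        if method == "GET" and path.startswith("/items/"):
            item = store.get(path.rsplit("/", 1)[-1])
            if item is None:
                return {"status": 404, "json": {}}
            return {"status": 200, "json": item}
        if method == "DELETE" and path.startswith("/items/"):
            removed = store.pop(path.rsplit("/", 1)[-1], None)
            return {"status": 204 if removed is not None else 404, "json": {}}
        return {"status": 404, "json": {}}

    return send


def check_created(resp: dict, ctx: Context) -> None:
    assert resp["status"] == 201, f"expected 201, got {resp['status']}"
    ctx["item_id"] = resp["json"]["id"]  # captured for the following steps


def check_fetched(resp: dict, ctx: Context) -> None:
    assert resp["status"] == 200, f"expected 200, got {resp['status']}"
    assert resp["json"].get("name") == "widget", "unexpected item name"


def check_deleted(resp: dict, ctx: Context) -> None:
    assert resp["status"] == 204, f"expected 204, got {resp['status']}"


STEPS = [
    Step("create item", lambda ctx: ("POST", "/items", {"name": "widget"}), check_created),
    Step("fetch item", lambda ctx: ("GET", f"/items/{ctx['item_id']}", None), check_fetched),
    Step("delete item", lambda ctx: ("DELETE", f"/items/{ctx['item_id']}", None), check_deleted),
]

if __name__ == "__main__":
    print(run_multistep_check(STEPS, fake_api()))
```

In a real monitor, the `send` function would wrap an HTTP client, and the returned `failed_step` and `reason` would feed the alert message.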

How to standardize resiliency on Kubernetes

There’s more pressure than ever to deliver high-availability Kubernetes systems, but a combination of organizational and technological hurdles makes this easier said than done. Technologically, Kubernetes is complex and ephemeral, with deployments that span infrastructure, cluster, node, and pod layers. And as with any complex and ephemeral system, the large number of constantly changing parts opens the possibility of sudden, unexpected failures.

Stay up to date on the latest incidents with Bits AI

Since the release of ChatGPT, there’s been growing excitement about the potential of generative AI—a class of artificial intelligence trained on pre-existing datasets to generate text, images, videos, and other media—to transform global businesses. Last year, we released our own generative AI-powered DevOps copilot called Bits AI in private beta. Bits AI provides a conversational UI to explore observability data using natural language.

Maximizing Cloud SQL database availability

How does Cloud SQL achieve near-zero downtime? Join Debi Cabrera as she interviews Product Manager Rahul Deshmukh. Rahul discusses the capabilities of Cloud SQL and the best practices for maximizing business continuity for applications. Watch along and hear firsthand from the session speaker about configuring and monitoring Cloud SQL for maximum availability.

Step-by-Step Guide to Monitoring Your SNMP Devices With Telegraf

Monitoring SNMP (Simple Network Management Protocol) devices is crucial for maintaining network health and security, enabling early detection of issues and proactive troubleshooting. Continuous monitoring ensures efficient resource utilization, minimizes downtime, and enhances overall network performance. In this article, we'll detail how to use the Telegraf agent to collect SNMP (MIB) performance statistics that you can forward to a data source.