Google Operations

Better monitoring and logging for Compute Engine VMs

Over the past several months we’ve been focused on improving observability and operations workflows for Compute Engine. Today, we are excited to share that the first wave of these enhancements is now available. These include significantly improved operating system support for the Cloud Monitoring and Cloud Logging agents, and the ability to rapidly deploy, update, and remove agents across groups of VMs, or all of your VMs, by policy, with as little as a single gcloud command.

All together now: Fleet-wide monitoring for your Compute Engine VMs

Cloud Monitoring has always provided comprehensive visibility into, and management of, individual Compute Engine virtual machines (VMs). But many Google Cloud customers have hundreds, thousands, or tens of thousands of VMs that they need to manage. Cloud Monitoring now gives you zero-config, out-of-the-box visibility into your entire Compute Engine VM fleet, with quick access to advanced Monitoring features such as installing the Cloud Monitoring agent and configuring fleet-wide alerts.

Tips and tricks for using new RegEx support in Cloud Logging

One of the most frequent questions customers ask is “how do I find this in my logs?”, often followed by a request to use regular expressions in addition to our logging query language. We’re delighted to announce that we recently added support for regular expressions to our query language, so you can now search through your logs using the same powerful language selectors as you use in your tooling and software!
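As a rough illustration of what a regular-expression query might look like, the sketch below builds a filter string using the `=~` (matches) and `!~` (does not match) comparison operators of the Logging query language, then checks the same pattern locally against a few sample lines. The helper name `build_regex_filter`, the pattern, and the sample log lines are hypothetical and not taken from the announcement. Cloud Logging evaluates patterns with RE2 syntax, whereas the local check uses Python's `re` module, so the two are only assumed to agree for simple patterns like this one. To sidestep string-escaping rules, the helper rejects patterns that contain quotes or backslashes.

```python
import re


def build_regex_filter(field: str, pattern: str, negate: bool = False) -> str:
    """Build a Logging query expression comparing `field` to a regex.

    Hypothetical helper: it only formats a string and sends nothing
    to Cloud Logging.
    """
    # Escaping rules for quoted strings are out of scope for this sketch,
    # so refuse patterns that would need them.
    if '"' in pattern or "\\" in pattern:
        raise ValueError("pattern must not contain quotes or backslashes")
    operator = "!~" if negate else "=~"
    return f'{field} {operator} "{pattern}"'


# Hypothetical pattern: error lines that report a timeout in milliseconds.
PATTERN = "^(ERROR|FATAL): timeout after [0-9]+ms$"

query = build_regex_filter("textPayload", PATTERN)
excluded = build_regex_filter("textPayload", PATTERN, negate=True)

# Local sanity check with Python's `re`; Cloud Logging itself uses RE2.
sample_lines = [
    "ERROR: timeout after 250ms",
    "INFO: timeout after 250ms",
    "FATAL: timeout after 9000ms",
    "ERROR: timeout after ms",
]
matches = [line for line in sample_lines if re.search(PATTERN, line)]

print(query)
print(matches)
```

The resulting `query` string could be pasted into the query box or passed as a filter to a client of your choice; testing the pattern locally first is just a convenient way to catch typos before running it against real logs.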

Analyze your logs quickly with suggested queries beta in Cloud Logging

Cloud Logging is a popular tool to help developers, operators, and other users find the root cause of issues in their infrastructure. With features like the Logs Explorer, you can quickly and efficiently retrieve, view, and analyze logs. To help you get the most out of your logs, we’re excited to introduce suggested queries in Cloud Logging, which highlight important logs so you can start analyzing and troubleshooting issues quickly.

Monitoring as code with Terraform

We try to automate as much as possible in our environments, but we often treat monitoring as an afterthought. In this episode of Stack Doctor, we show you how to automate your monitoring configurations via Terraform. Watch to learn how you can automate the creation of common resources - such as uptime checks, alerting policies, and dashboards - with Terraform!

Extended retention for custom and Prometheus metrics in Cloud Monitoring

Metrics help you understand how your business and applications are performing. Longer metric retention enables quarter-over-quarter or year-over-year analysis and reporting, forecasting seasonal trends, retention for compliance, and much more. We recently announced the general availability (GA) of extended metric retention for custom and Prometheus metrics in Cloud Monitoring, increasing retention from 6 weeks to 24 months. Extended retention for custom and Prometheus metrics is enabled by default.

High-resolution user-defined metrics in Cloud Monitoring

Higher-resolution metrics are critical for monitoring dynamically changing environments and rapidly changing application metrics. Examples include high-volume e-commerce, live streaming, autoscaling bursty workloads on Kubernetes clusters, and more. Higher-resolution custom, Prometheus, and agent metrics are now generally available, and can be written at a granularity of 10 seconds. Previously these metric types could only be written once every 60 seconds.
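To make the difference between the two granularities concrete, here is a minimal client-side sketch. It assumes the 10-second granularity acts as a minimum spacing between points written to the same time series. The `WriteThrottle` class, the series key, and the sample timestamps are all hypothetical; the code does not call Cloud Monitoring and only decides locally which samples would be spaced far enough apart to write.

```python
from dataclasses import dataclass, field
from typing import Dict

MIN_INTERVAL_S = 10.0      # granularity quoted in the text above
PREVIOUS_INTERVAL_S = 60.0  # previous limit quoted in the text above


def points_per_hour(interval_s: float) -> int:
    """Maximum points per hour for one series at a given write interval."""
    return int(3600 // interval_s)


@dataclass
class WriteThrottle:
    """Hypothetical helper that spaces writes per time series."""

    min_interval_s: float = MIN_INTERVAL_S
    _last_write: Dict[str, float] = field(default_factory=dict)

    def should_write(self, series_key: str, now_s: float) -> bool:
        """Return True, and record the write, if enough time has passed."""
        last = self._last_write.get(series_key)
        if last is not None and now_s - last < self.min_interval_s:
            return False  # too soon after the previous point for this series
        self._last_write[series_key] = now_s
        return True


throttle = WriteThrottle()
sample_times_s = [0.0, 4.0, 10.0, 15.0, 20.0, 31.0]
accepted = [
    t for t in sample_times_s
    if throttle.should_write("hypothetical/queue_depth", t)
]

print(accepted)                              # [0.0, 10.0, 20.0, 31.0]
print(points_per_hour(MIN_INTERVAL_S))       # 360
print(points_per_hour(PREVIOUS_INTERVAL_S))  # 60
```

At 10-second spacing a single series can carry up to six times as many points per hour as at 60-second spacing, which is what makes short bursts visible in the data.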

Getting started with Cloud Logging

Want to make sure that your cloud services are free from any vulnerabilities, threats, or errors that can make them unreliable? In this episode of Stack Doctor, we show you the new features in Google Cloud Logging, teach you how to navigate the new and improved Logs Viewer and build log queries, and give you an in-depth analysis of the Log Router. Watch to learn what’s new with Cloud Logging!

Bucket list: Better log storage and management for Cloud Logging

As more organizations move to the cloud, the volume of machine-generated data has grown exponentially and is increasingly important for many teams. Software engineers and SREs rely on logs to develop new applications and troubleshoot existing apps to meet reliability targets. Security operators depend on logs to find and address threats and meet compliance needs. And well-structured logs provide invaluable insight that can fuel business growth.

Debugging, distributed tracing, and profiling for web applications

Google Cloud offers many tools that can help you manage your application services. In this video, we teach you how to set up and use Cloud Trace, Cloud Profiler, and Cloud Debugger to collect latency data across different services, gather memory-allocation information, and inspect application code locations without compromising the performance of your web application.