Configuring Cloud Operations on Google Cloud

Jan 25, 2022 By Google Operations In Google Operations

In this video we help you understand the different services in the Google Cloud Operations Suite with a focus on how they help you. We conclude with a customer success story (Krikey gaming).

View Video

Google Operations

Webhook, Pub/Sub, and Slack Alerting notification channels launched

Jan 19, 2022 By Alisa Goldstein In Google Operations

When an alert fires from your applications, your team needs to know as soon as possible to mitigate any user-facing issues. Customers with complex operating environments rely on incident management or related services to organize and coordinate their responses to issues. They need the flexibility to route alert notifications to platforms or services in the formats that they can accept.

Read Post

Creating custom notifications with Cloud Monitoring and Cloud Run

Jan 19, 2022 By Dong Wang In Google Operations

The uniqueness of each organization in the enterprise IT space creates interesting challenges in how they need to handle alerts. With many commercial tools in the IT Service Management (ITSM) market, and lots of custom internal tools, we equip teams with tools that are both flexible and powerful. This post is for Google Cloud customers who want to deliver Cloud Monitoring alert notifications to third-party services that don’t have supported notification channels.

Read Post

Patterns for better insights and troubleshooting with hybrid cloud logs

Jan 18, 2022 By Meenaxi Gunjati In Google Operations

Hybrid and multi-cloud environments produce a boundless array of logs, including application and server logs, logs from cloud services, APIs, orchestrators, gateways, and just about anything else running in the environment. Because of this volume, logging systems can become slow and unmanageable just when you urgently need them to troubleshoot an issue, and harder still to mine for insights.

Read Post

How to deploy the Google Cloud Ops Agent with Ansible

Jan 12, 2022 By Kyle Benson In Google Operations

Site Reliability Engineering (SRE) and Operations teams responsible for operating virtual machines (VMs) are always looking for ways to provide a more reliable, more scalable environment for their development partners. Part of providing that stable experience is having telemetry data (metrics, logs and traces) from systems and applications so you can monitor and troubleshoot effectively. Many Google Cloud services, including Google Compute Engine, provide basic system metrics out of the box.

Read Post

How to find cloud logs and manage logging costs

Dec 15, 2021 By Google Operations In Google Operations

We covered best practices for ingesting, centralizing, and managing cloud logs in our previous episode. But how can you quickly find the logs you're looking for when troubleshooting? And how can you manage and optimize your logging costs? In this episode, we'll show you how to use advanced log queries to find the exact logs you're looking for and how to manage logging costs.

View Video

Best Practices for Cloud Logging

Dec 1, 2021 By Google Operations In Google Operations

In our last episode, we covered how to best deploy and use Cloud Monitoring. This week, we answer the most important questions about Cloud Logging: what’s the best way to ingest logs? And how do you centralize logs and manage access? Watch this episode of Engineering for Reliability to learn best practices for using Cloud Logging and for keeping your services reliable and your users happy.

View Video

How Sabre is using SRE to lead a successful digital transformation

Nov 22, 2021 By Kenny Kon In Google Operations

Editor’s note: Today we hear from Kenny Kon, an SRE Director at Sabre. Kenny shares how Sabre has successfully adopted Google’s SRE framework by leveraging its partnership with Google Cloud. A leader in the global travel industry, Sabre Corporation is driving innovation and developing solutions that help airlines, hotels, and travel agencies transform the traveler experience and satisfy the ever-evolving needs of its customers.

Read Post

Best Practices for Cloud Monitoring

Nov 17, 2021 By Google Operations In Google Operations

In our last episode, we covered best practices for deploying and using Cloud Operations in an enterprise environment. But we still left some questions unanswered. How should you monitor your services? How should you deal with alerts? And what about managing cost? In this episode of Engineering for Reliability, Yuri discusses best practices for setting up and using Cloud Monitoring and optimizing monitoring costs.

View Video

Get planet-scale monitoring with Managed Service for Prometheus

Nov 15, 2021 By Lee Yanco In Google Operations

Prometheus, the de facto standard for Kubernetes monitoring, works well for many basic deployments, but managing Prometheus infrastructure can become challenging at scale. As Kubernetes deployments continue to play a bigger role in enterprise IT, scaling Prometheus for a large number of metrics across a global footprint has become a pressing need for many organizations.

Read Post
