Latest Posts

Building a more reliable infrastructure with new Stackdriver tools and partners

Oct 11, 2018 By Melody Meckfessel In Google Operations

Every software organization faces challenges in keeping applications available and running reliably. At Google, we’ve developed and practiced a discipline known as Site Reliability Engineering (SRE). Following SRE practices lets us build and operate services reliably for our billions of users. Google has about 2,500 Site Reliability Engineers who support both internal and external services.

Using Stackdriver Workspaces to help manage your hybrid and multicloud environment

Sep 12, 2018 By Charles Baer In Google Operations

At Google, we believe strongly in an open cloud. We’re continually working to bring you tools for understanding how your applications are performing, whether they run in different projects, organizations, clouds, or even on-premises. Monitoring tools like Stackdriver Kubernetes Monitoring, OpenCensus, and Stackdriver APM are designed to give you visibility into your workloads wherever they run: on Google Cloud Platform (GCP), on-premises, or on another cloud platform.

Drilling down into Stackdriver Service Monitoring

Jul 30, 2018 By Jay Judkowitz In Google Operations

If you’re responsible for application performance and availability, you know how hard it can be to see it through the eyes of your customers and end users. We think that’s really going to change with last week’s introduction of Stackdriver Service Monitoring, a new tool that monitors how your customers perceive your applications and then lets you drill down to the underlying infrastructure when there’s a problem.

Transparent SLIs: See Google Cloud the way your application experiences it

Jul 27, 2018 By Jay Judkowitz In Google Operations

Like all good IT organizations, you religiously measure the performance and availability of your services and applications. But if those apps run in the cloud, critical components are often delivered by a third party or the cloud provider. In the case of a service disruption or degraded performance, how do you know what the problem is—your code, the network, or the provider? And, if the problem is with the service provider, how do you convince them to take action as quickly as possible?

SRE fundamentals: SLIs, SLAs and SLOs

Jul 19, 2018 By Jay Judkowitz In Google Operations

Next week at Google Cloud Next ’18, you’ll be hearing about new ways to think about and ensure the availability of your applications. A big part of that is establishing and monitoring service-level metrics, something that our Site Reliability Engineering (SRE) team does day in and day out here at Google.

How to connect Stackdriver to external monitoring

Jun 21, 2018 By Valentin Hamburger In Google Operations

Google Stackdriver lets you track your cloud-powered applications with monitoring, logging and diagnostics. Using Stackdriver to monitor Google Cloud Platform (GCP) or Amazon Web Services (AWS) projects has many advantages—you get detailed performance data and can set up tailored alerts. However, we know from our customers that many businesses are bridging cloud and on-premises environments.

Try full-stack monitoring with Stackdriver on us

Jun 15, 2018 By JD Velásquez In Google Operations

In advance of the new simplified Stackdriver pricing that will go into effect on June 30, we want to make sure everyone gets a chance to try Stackdriver. That’s why we’ve decided to offer the full power of Stackdriver, including premium monitoring, logging and application performance management (APM), to all customers—new and existing—for free until the new pricing goes into effect. This offer will be available starting June 18.

Gain visibility and take control of Stackdriver costs with new metrics and tools

May 29, 2018 By Mary Koes In Google Operations

A few months back, we announced new simplified Stackdriver pricing that will go into effect on June 30. We’re excited to bring this change to our users. To streamline this change, you’ll receive advanced notifications and alerting on the performance and diagnostics data you track for cloud applications, plus flexibility in creating dashboards, without having to opt in to the premium pricing tier.

Stackdriver brings powerful alerting capabilities to the condition editor UI

May 25, 2018 By Amir Hermelin In Google Operations

We are excited to announce the beta version of our new alerting condition configuration UI. In addition to allowing you to define alerting conditions more precisely, this new UI provides an easier, more visual way to find the metrics to alert on. The new UI lets you use the same metrics selector as used in Stackdriver’s Metrics Explorer to define a broader set of conditions. Starting today, you can use that metrics selector to create and edit threshold conditions for alerting policies.

Getting more value from your Stackdriver logs with structured data

May 17, 2018 By Mary Koes In Google Operations

Logs contain some of the most valuable data available to developers, DevOps practitioners, Site Reliability Engineers (SREs) and security teams, particularly when troubleshooting an incident. It’s not always easy to extract and use, though. One common challenge is that many log entries are blobs of unstructured text, making it difficult to extract the relevant information when you need it.
