Operations | Monitoring | ITSM | DevOps | Cloud

How to Ensure Your App Is Online and Working Properly

Your application is deployed. You checked a few endpoints, and they work as expected. You can log in and see the generic home page. There aren’t any exceptions in the logs—or at least any new ones. Great! But what does that mean for your customers and partners? Does everything work for them?

When is a website considered down... as opposed to just slow?

When you visit a webpage that is down, most of the time you'll see an error: a 404 if the page can't be found or a 503 if the server is unavailable. Although this is not what you want to see, it is helpful: you know that the site is down and have a rough idea why. But sometimes you don't see an error, just a spinning wheel.
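
The distinction above can be sketched as a small probe. This is only an illustration, not the linked article's method: the function names, the 2-second "slow" threshold, and the choice to count every 4xx/5xx status as "down" are all assumptions made for the example.

```python
import time
import urllib.error
import urllib.request
from typing import Optional


def classify(status: Optional[int], elapsed_s: float, slow_after_s: float = 2.0) -> str:
    """Label one probe result as 'down', 'slow', or 'up'.

    slow_after_s is an assumed threshold; pick one that fits your own site.
    """
    if status is None:
        # No HTTP response at all (timeout, refused connection, DNS failure).
        return "down"
    if status >= 400:
        # The server answered with an error, e.g. 404 or 503.
        return "down"
    if elapsed_s > slow_after_s:
        # A successful answer that took too long: slow rather than down.
        return "slow"
    return "up"


def probe(url: str, timeout_s: float = 10.0) -> str:
    """Fetch url once and classify the outcome."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # The server responded, but with an error status code.
        status = err.code
    except (urllib.error.URLError, OSError):
        # No usable answer before the timeout: the "spinning wheel" case.
        status = None
    return classify(status, time.monotonic() - start)
```

Calling probe() against a URL you own returns one of the three labels. Running it from several locations on a schedule gives a rough outside-in view of availability.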

How the New Influx Query Engine Was Designed, and How to Use It With Grafana

Flux, the long-awaited new functional query processing engine for InfluxDB, has finally landed. If you’re curious about the hows and whys of its design, check out this GrafanaCon EU session with InfluxData cofounder and CTO Paul Dix. We’d also like to share a recent presentation from David Kaltschmidt, Director, UX for Grafana Labs, on the new Flux support in Grafana.

Logz.io Releases DevOps Pulse 2018 for SysAdmin Day, Revealing Most DevOps Departments Fall Short in Security

Boston and Tel Aviv, July 26, 2018 — Logz.io, the leader in AI-powered log analysis, releases the results of their annual DevOps Pulse survey in honor of SysAdmin Day 2018, a day dedicated to honoring the work of SysAdmins and DevOps professionals across the globe. This year, the survey emphasized security and compliance in light of General Data Protection Regulation (GDPR) enforcement and worldwide concerns over data privacy and the growing threat landscape.

Visualizing Network Topologies and Traffic (Cloud Next '18)

In this session, we will look at which use cases in the field of network monitoring and management are relevant in a cloud environment and which data Google Cloud Platform provides to gain insights. We will then demo how to visualize traffic flows and topologies using a mix of Google and Open Source tools.

Optimizing and Troubleshooting Your Application, the Google Way (Cloud Next '18)

In this session, you’ll learn about the value of these kinds of tools and how you can automatically extract telemetry from your app with OpenCensus, and you’ll see a demonstration of how to solve customer issues in a multi-cloud deployment with Stackdriver APM and other tools supported by OpenCensus.

Improving Reliability with Error Budgets, Metrics, and Tracing in Stackdriver (Cloud Next '18)

Members of the Stackdriver and Customer Reliability Engineering teams will demonstrate how Stackdriver tooling, inspired by the needs of SREs at Google, lets you run services more reliably and with fewer false-positive signals by tracking and alerting on error budgets and by debugging with the exemplar technique during an outage.