Operations | Monitoring | ITSM | DevOps | Cloud

LeadDev Live 2021 - Habits of highly-performing teams

There is a yawning gap opening up between the best and the rest: the elite top few percent of engineering teams are making incredible gains year over year in reliability and in reducing technical drag, while the bottom 50% are losing ground. Take an engineer out of an elite-performing team and place them in the bottom 50%, and they become subpar too; take an engineer out of a mediocre team and embed them in an elite team, and they are pulling their weight within the year. I will share with you everything I know, everything that went into building a high-performing team at Honeycomb.

7 Tips On Building And Maintaining An SRE Team In Your Company

In today's "always on" world, reliability is a primary business KPI. Plant a culture of reliability by implementing these 7 simple tips to build a solid SRE team in your organization. Many of today’s hottest jobs didn’t exist at the turn of the millennium. Social media managers, data scientists, and growth hackers were unheard of back then. Another relatively new role in demand is that of the Site Reliability Engineer, or SRE.

Building powerful tailored dashboards: end users, management, infrastructure

In my position, I get to work with a wide variety of organizations that each have a different level of monitoring maturity. But I’ve noticed an emerging pattern that I’ll call the ‘Critical Service Offering’ or ‘Executive Level Status’ dashboard. At their most basic level, these dashboards should communicate the current health of the application, provide some historical context and, most importantly, not be tied to infrastructure monitoring.

Bringing SCOM Override Sprawl Under Control With PowerBI

Have you ever wondered where all your SCOM overrides are stored? Want to easily find the source and target of each override? In this webinar, we showcase our new PowerBI-based tool, designed to turn your override spaghetti into orderly overrides. Using our Microsoft PowerBI Sankey diagrams, you can easily see the scope and destination of your override MPs, enabling you to visualize and then take control of your overrides with Easy Tune, our free alert tuning solution.

Take the first step toward SRE with Cloud Operations Sandbox

At Google Cloud, we strive to bring Site Reliability Engineering (SRE) culture to our customers not only through training on organizational best practices, but also with the tools you need to run successful cloud services. Part and parcel of that is comprehensive observability tooling—logging, monitoring, tracing, profiling and debugging—which can help you troubleshoot production issues faster, increase release velocity and improve service reliability.

Truly Doubling down on open source #2

Earlier this week, I wrote a blog post stating our intention to fork Kibana and Elasticsearch. This was a huge decision on our end, one that we did not take lightly. A few days have passed since this announcement, and I wanted to share how humbled and excited we are by the responses from companies and individuals who are eager to participate and contribute.