Latest Blogs

How you can take back control over your log analytics with AI

We’ve all been there: you’re on call, fast asleep at 3 AM, when suddenly the alerts come in, in overdrive. Your system is notifying you of some sort of abnormal behavior, but with all the alerts and data coming through, it’s difficult to figure out what your system is trying to tell you. Is there potentially malicious behavior? Did someone write faulty code? Is it an important issue, or can it wait? Is it nothing at all?

Using Skylight to Solve Real-World Performance Problems [Part I: OSEM]

Every single app, large or small, open source or not, has room for improvement when it comes to performance. This is why we created Skylight for Open Source: to give open source contributors the tools they need to find these issues. Over the next week, we'll show you three different open source apps running on Skylight, each with its own unique performance challenges, varying in complexity.

Why building internal tools could become a costly mistake

Having worked closely with software developers for almost a decade, I’ve noticed some common traits among them. Technically minded people think about problems in different ways. I’m often stunned at how I could miss such an obvious data point or edge case when discussing product changes with people who have a far more technical mind than mine.

A holding company for side projects

For as long as I can remember, I have loved building businesses. Ideas have always come naturally to me, and over the years I have honed my skills at actually making those ideas a reality. I recall one of my first businesses, at about 12 or 13 years old: designing nicer-looking versions of property data sheets for real estate agents to give out to prospective buyers. My most recent profitable business is StatusGator, a status page monitoring and alerting service.

7 Tips to Get New Engineers Ready to Be On-Call

Before the DevOps philosophy, developers would build products, services, and infrastructure, but the responsibility for maintaining them would shift to operators, aka system or IT admins. The DevOps philosophy removes the boundary between Operations and Development teams, making system reliability a shared responsibility of all parties.

Getting more value from your Stackdriver logs with structured data

Logs contain some of the most valuable data available to developers, DevOps practitioners, Site Reliability Engineers (SREs) and security teams, particularly when troubleshooting an incident. It’s not always easy to extract and use, though. One common challenge is that many log entries are blobs of unstructured text, making it difficult to extract the relevant information when you need it.
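
To make the contrast concrete, here is a minimal sketch in Python of the same event logged as free text and as structured data. It does not use any Stackdriver API; the log line, the field names (`user_id`, `retries`), and both helper functions are hypothetical and chosen only for illustration.

```python
import json
import re

# Hypothetical unstructured log line; the wording and format are assumptions.
RAW_LINE = "2019-03-01T03:00:12Z ERROR checkout failed for user 42 after 3 retries"


def parse_unstructured(line):
    """Pull fields out of free text with a regex; breaks if the wording changes."""
    match = re.search(r"user (\d+) after (\d+) retries", line)
    if match is None:
        return None
    return {"user_id": int(match.group(1)), "retries": int(match.group(2))}


def format_structured(severity, message, **fields):
    """Emit one JSON object per line so fields can be read back without a regex."""
    entry = {"severity": severity, "message": message}
    entry.update(fields)
    return json.dumps(entry, sort_keys=True)


# The same event as structured data: every field is addressable by name.
structured_line = format_structured("ERROR", "checkout failed", user_id=42, retries=3)
parsed = json.loads(structured_line)
```

With the structured form, a query such as "all errors with more than two retries" becomes a field comparison on `parsed["retries"]` rather than a text search that depends on the exact phrasing of the message.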