Blog

Extending Stackdriver to on-prem with the new BindPlane integration

We introduced our partnership with Blue Medora last year and explained in a blog post how it extends Stackdriver’s capabilities. We’re pleased to announce that you can now join our new Blue Medora offering. If you’re using Stackdriver to monitor your Google Cloud Platform (GCP) or Amazon Web Services (AWS) resources, you can now extend your observability to on-prem infrastructure, Microsoft Azure, databases, hardware devices and more.

Office 365 Suffers Multiple Outages for Start of 2019

Unfortunately, Microsoft and Office 365 suffered their second major outage of the year, and this one was even bigger than the first. We say “unfortunately” because, even though our business is to help monitor cloud and SaaS services and it grows when there are problems, we don’t wish an outage on any cloud provider. Operating a SaaS business at the scale of Microsoft Office 365 is a Herculean task, and that’s why they get paid the big bucks.

How to Identify Orphaned EBS Snapshots to Optimize AWS Costs

So a while back I got an email from our finance team. I was tasked with helping to tag resources in our AWS infrastructure and investigating which items contribute to certain costs. I don’t know about other engineers, but these kinds of tasks are in the same realm of fun as … wiping bird poop off your windshield at a gas station. So I did the sanest thing I could think of.

Escalations and Maintenance Windows Are Critical to Downtime Response

Uptime.com includes several advanced check options to provide the flexibility organizations need in creating a response plan to downtime. Maintenance and planned downtime for patches and updates don’t typically create severe downtime events. With escalations, teams have an automated alert system that contacts designated senior-level personnel with relevant technical data.

Automate Tasks with AWS Systems Manager and Opsgenie Actions: A Use Case

Opsgenie Actions enable you to automate manual, repetitive tasks so that your resources are freed up to concentrate on higher-value work. This blog post is the first in a series of use cases in which we discuss how Opsgenie works with various third-party automation platforms to automate these traditionally manual tasks, right from the Opsgenie console or mobile app, to reduce interruptions for your on-call responders and ultimately help your bottom line.

The 4 Requirements of a Better Digital Experience

Better employee experience drives better business outcomes.¹ The result? Technology is no longer the driving force of IT — instead, the end-users’ digital experience is the key to unlocking business value and driving ROI. The challenge? Mending the gap between traditional metric-based monitoring and the need for real-time, contextual data about the end-user experience. Here’s how to get started.

Windows Management Instrumentation in remote monitoring

Pandora FMS features include decentralized monitoring, which is based on several open, widely used standards and protocols: SNMP (v1 and v2, with v3 supported from version 7.0 NG 727), ICMP and WMI. In this article we will talk about the last of these, WMI, starting from the simplest concepts and with references to each of the related articles published here on this blog.

Coding with Confidence - CloudBees + Honeycomb

DevOps, Observability, Continuous Delivery, Test in Production, Chaos Engineering, and Software Ownership are all major themes in software development today, but why? In an ideal world, we get everything right the first time, nothing breaks, no one DDoSes us, and the weather report is “Cloudy With A Chance of Meatballs.” Reality, of course, is different – and better, to be honest.