
A Day in the Life: Intelligent Observability at Work with a Super SRE

After we’d fixed Aparna’s network issue, James came to see me at my desk. Masks on, socially distanced and all that, but it was nice to have some face-to-face time. James is cool – that dry British humor and not your classic IT Ops dude. He’s been here forever and mentored me when the CIO, Charlie, hired me as the first SRE here a year or so ago. I lucked out really.

AWS CloudWatch alerts vs. Dashbird alerts

In the 21st century, it’s quite easy to work with machines and computers. Our worry is no longer whether something is doable, but whether it can be perfected. As a result, we mostly search for new ideas and ways to make our work impeccable. For example, if you’re using a particular piece of software and you realize that it is excellent but could be improved in ways that would let you work even faster, you’ll explore the alternatives.

What Is the OSI Model?

As an IT professional, chances are you’ve come across the phrase Please Do Not Throw Sausage Pizza Away while hearing about protocols, network design, and implementation issues. How about Please Do Not Touch Steve’s Pet Alligator? If these ring a bell, then you’re on the right track: these are memory aids for the seven layers of the Open Systems Interconnection (OSI) model.
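To make the mnemonics concrete, here is a minimal Python sketch that lists the seven OSI layers from layer 1 up and checks that a phrase's initials match them. The names `OSI_LAYERS` and `mnemonic_matches` are made up for this illustration.

```python
# The seven OSI layers, ordered from layer 1 (bottom) to layer 7 (top).
OSI_LAYERS = [
    (1, "Physical"),
    (2, "Data Link"),
    (3, "Network"),
    (4, "Transport"),
    (5, "Session"),
    (6, "Presentation"),
    (7, "Application"),
]


def mnemonic_matches(mnemonic: str) -> bool:
    """Return True if each word's initial matches the layer names, bottom to top."""
    initials = [word[0].upper() for word in mnemonic.split()]
    expected = [name[0] for _, name in OSI_LAYERS]
    return initials == expected


if __name__ == "__main__":
    for phrase in (
        "Please Do Not Throw Sausage Pizza Away",
        "Please Do Not Touch Steve's Pet Alligator",
    ):
        print(phrase, "->", mnemonic_matches(phrase))
```

Both phrases read from the Physical layer up to the Application layer; a phrase that runs top-down would not match this ordering.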

Introduction to Giraffe

Giraffe is InfluxData’s graphing library, built to use and graph the data coming from InfluxData’s time series database, InfluxDB. Yes, there are other graphing libraries available; but ours is the only one purpose-built to graph line protocol without having to convert it. Plus, we have lots of great features, like legends and colorization, without much configuration. So, how to get started?

Adding Rich Content to Alerts, Work Orders or Service Requests

When you send alerts, work orders or service requests to your workers in the field, on the shop floor or on campus, it is essential to provide them with all the information they need to complete the task. This prevents misunderstandings, avoids wasted work and time spent searching for information, and thus increases productivity and facilitates effective, timely incident resolution.

Import and Export for OnCall Times

On-call planning is one of the most popular features in Enterprise Alert and is widely used by users, team managers and administrators. However, in our discussions we keep finding that planning is not done in five minutes. Scheduling often depends on external systems. This can range from a simple Excel sheet provided to HR all the way to a comprehensive billing system such as SAP. As a result, it takes quite a bit of time to transfer the planned shifts to third-party systems.

Why do I need to switch to Firebase?

Apple announced some time ago that the Apple Push Notification (APN) service will be deactivated for sending push messages as of March 31, 2021. To ensure that push messages continue to reach iOS devices, we have already implemented push delivery via Firebase in Enterprise Alert 2019. Unfortunately, the change could not be made automatically and requires manual intervention.

IAM Policy Basics and Best Practices

One of the most powerful aspects of AWS is its Identity and Access Management (IAM) service. The obvious aspect of its power is that it controls who can do what with all the resources inside your AWS account. But the non-obvious side is how configurable it is. You can encode permissions so finely grained that a Lambda function could, for example, be given just enough permissions to read one attribute from one record for the current user of a DynamoDB table.
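As a rough illustration of that kind of fine-grained permission, the Python sketch below builds an IAM policy document using the DynamoDB condition keys `dynamodb:LeadingKeys`, `dynamodb:Attributes` and `dynamodb:Select`. Several things here are assumptions rather than details from the article: the helper name `build_read_one_attribute_policy`, the table ARN, the attribute names, and the use of a Cognito identity variable to represent the current user. The key attribute is allowed alongside the one readable attribute, since DynamoDB generally needs access to the key to perform the read.

```python
import json


def build_read_one_attribute_policy(table_arn: str, key_attribute: str, attribute: str) -> dict:
    """Build a policy allowing GetItem on the caller's own item, limited to one attribute.

    Hypothetical helper for illustration; adapt the identity variable to your setup.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem"],
                "Resource": [table_arn],
                "Condition": {
                    "ForAllValues:StringEquals": {
                        # Only items whose partition key equals the caller's identity
                        # (assumes Cognito-federated identities).
                        "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"],
                        # Only the key attribute plus the single permitted attribute.
                        "dynamodb:Attributes": [key_attribute, attribute],
                    },
                    # Require requests to name specific attributes when Select applies.
                    "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = build_read_one_attribute_policy(
        "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleUsers",  # placeholder ARN
        "UserId",
        "DisplayName",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON could be attached to a role as an inline policy; the exact identity variable and attribute list would depend on how users are authenticated and how the table is keyed.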

Monitor Microsoft 365 with RapDev's integration in the Datadog Marketplace

Microsoft 365, formerly known as Office 365, is a suite of cloud-based productivity and communication services that is used by more than one million companies worldwide. The applications included in the suite are critical to the daily workflows of subscribers and therefore require careful monitoring in order to minimize the effects of downtime and ensure optimal usage.