Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Maintaining Operational Sanity Across 100+ AWS Accounts | Eric Mann / Ryan Tomac (Vacasa)

At Vacasa, AWS accounts represent the unit of isolation for distinct applications & services in our software ecosystem, providing security benefits and operational autonomy for our teams as we scale. Managing accounts at this scale requires strong DevOps practices to maintain security, operational sanity, and uniform observability across the system. In this talk, we’ll cover the benefits of such an approach, the practices that make it possible, and the important role Datadog plays.

What is AIOps?

AIOps is an approach to managing the exponential growth of IT operations and the complexity of new technology through the application of artificial intelligence (AI). IT infrastructure increasingly relies on complicated deployments, multi-cloud architectures, and huge amounts of data. Traditionally, the tech industry responds to complexity by applying extra brainpower to the problem, bringing in more engineers, developers, and management.

Istio Log Analysis Guide

Istio has quickly become a cornerstone of many Kubernetes clusters. As your container orchestration platform scales, Istio embeds functionality into the fabric of your cluster that makes monitoring, observability, and flexibility much more straightforward. However, that leaves us with our next question – how do we monitor Istio? This Istio log analysis guide will help you get to the bottom of what your Istio platform is doing.

Detailed Insight, Right on Time: Introducing Scheduled Alerts

Logz.io customers, here’s some big product news that we think you’ll be excited to hear. Scheduled Alerts, an altogether new manner of alerting, is coming your way. That’s right, get ready to utilize a whole new world of alerts that weren’t previously available in the Logz.io platform.

How to Restore Databases From Native SQL Server Backups

In my previous post, Native SQL Server Backup Types and How-To Guide, I discussed the main types of native SQL Server backups and various backup options. Backups are critical to restoring databases quickly, but there isn’t much benefit to having backup files sitting around if you aren’t prepared and don’t know when and how to perform the restores.

Reverse Connect for Azure Virtual Desktops (AVD)

AVD and eG Enterprise have something in common. Can you take a wild guess? Listening on open TCP ports is an extremely bad practice for cloud architectures, as it exposes products and services to incoming messages from malicious parties. Avoiding open listening ports is something eG Innovations does in our own products, and it is also a best practice adopted by Microsoft for Azure Virtual Desktops (AVD).

Incident Review - Google Cloud Outage has Widespread Downstream Impact

Outages on the Internet always catch you by surprise, whether you are the end user or the Head of SRE or DevOps trying to keep a clear mind while you execute your incident playbook. As people in charge of ensuring reliable services for our customers, our normal experience of outages involves surfing a deluge of fire alarms and video calls as we work to solve the problem as quickly as we can. We often forget, therefore, what an outage means to the end user.

Ask Miss O11y: Mapping Out Your Observability Journey

Dear Trapped, Thanks for asking the question! Approaching observability as an all-or-nothing problem often leads to the project feeling daunting. But that’s not specific to observability—any project can be overwhelming if you think it needs to be done all at once, perfectly. Such as, erm, writing an entire book on observability! *looks around worriedly*

Tutorial: Build Serverless functions with C#

Serverless computing has revolutionized the world of cloud computing, and it has been an absolute joy for developers to use. Before this innovation, developers had to worry about the resources powering their code. Since the launch of serverless computing, worrying about operating systems and hardware architecture is a thing of the past. The platform handles all the server management while you focus on what you do well: writing good-quality code.