An Introduction to Application Monitoring

Users prefer an application that runs smoothly and without bugs to one that may have an appealing UI and shiny new features but comes with issues. Application monitoring is critical to the health of your application. With application monitoring, you can stay on top of any errors and ensure your application performs as it should. Let’s dive straight in!

Prometheus Alertmanager best practices

Have you ever fallen asleep to the sounds of your on-call team in a Zoom call? If you’ve had the misfortune of living through this experience, you likely understand the problem of Alert Fatigue firsthand. During an active incident, it can be exhausting to tease out the upstream root cause from downstream noise while you’re context switching between your terminal and your alerts. This is where Alertmanager comes in, providing a way to mitigate each of the problems related to Alert Fatigue.

Azure Storage: Compare Prices and Plans - The Ultimate Guide

Azure Storage is a cloud-based storage solution offered by Microsoft. It provides scalable and secure storage for unstructured and structured data, including blobs, files, queues, and tables. With Azure Storage, you can store and access your data from anywhere in the world. The service is flexible and customizable, making it ideal for businesses of all sizes and industries. But, with so many plans and options available, it can be difficult to determine the best plan for your business needs.

50+ DevOps Interview Questions To Ask In 2023

Building quality software is tough. Even automation requires skilled developers to make it work. Yet, good developers are in short supply. You could offer attractive packages, too. Think: hybrid working, breakfast burritos, and organic chicken wings every Wednesday. You'll still need to ask the right DevOps interview questions to find and hire qualified DevOps engineers.

Your non-technical teams should be using incident management tools, too

For many businesses across the world, incident management is something that’s usually left to engineers. These teams are on the front lines, declaring, managing, and resolving all sorts of incidents across the org, regardless of where they originate or what form they take. But there’s a glaring issue with this approach. Outside of technical teams, people across organizations aren’t accustomed to or trained in using the word “incident” whenever an issue comes up.

How Gremlin helps you meet Google's Infrastructure Reliability standards

In January of 2023, Google released its infrastructure reliability guide, which provides guidelines on how to build high-availability applications in Google Cloud. While it's written for Google Cloud, it provides some excellent general-purpose information on how to architect reliable applications on any cloud provider. In this blog, we'll explain the factors the guide covers and how you can use Gremlin to ensure you're meeting your reliability requirements.

How Security Engineers Use Observability Pipelines

In data management, numerous roles rely on and regularly use telemetry data. The security engineer is one of these roles. Security engineers are the vigilant sentries, working diligently to identify and address vulnerabilities in the software applications and systems we use and enjoy today. Whether it’s by building an entirely new system or applying current best practices to enhance an existing one, security engineers ensure that your systems and data are always protected.

Is open-source as secure as proprietary software?

We’re surrounded by news of data breaches and companies being compromised, and the existential threat of ransomware hangs over just about every organisation that uses computers. One of the consequences is that we are hassled by an ever-increasing number of software updates, from phones and computers to vacuum cleaners and cars; download this, restart that, install the updates.

Data Gravity in Cloud Networks: Achieving Escape Velocity

In an ideal world, organizations can establish a single, citadel-like data center that accumulates data and hosts their applications and all associated services, all while enjoying a customer base that is also geographically close. As this data grows in mass and gravity, it’s okay because all the new services, applications, and customers will continue to be just as close to the data. This is the “have your cake and eat it too” scenario for a scaling business’s IT.