Release code confidently with Automatic Faulty Deployment Detection

Modern software development teams use CI/CD tools to ship features quickly and rely on best practices like shift-left testing to find application errors before they become user-facing bugs. But you still face the risk that any code you deploy could contain errors that your testing did not surface. To help you deploy with confidence and mitigate the effects of a bad deployment, Datadog APM now provides Automatic Faulty Deployment Detection.

How to detect security threats in your systems' Linux processes

Almost all tasks within a Linux system, whether it’s an application, system daemon, or certain types of user activity, are executed by one or more processes. This means that monitoring processes is key to detecting potentially malicious activity in your systems, such as the creation of unexpected web shells or other utilities.

How Squadcast Benefits On-call Engineers - Part 1

It is difficult to stay completely reliable in an always-on world, so it's important to choose an Incident Management solution that fits your problems. In this blog, we have highlighted the benefits of Squadcast and why you should adopt it. “Being on-call sucks!” Incident response teams often use this phrase when talking about their on-call experiences. Despite using best practices for managing infrastructure, incidents do occur from time to time.

Distributed Tracing for C++ Applications with OpenTelemetry & Logz.io

Many organizations are moving from monolithic to microservices-based architectures. Microservices allow them to improve their agility and provide features more quickly. Although developing a single microservice is simpler, the complexity of the overall system is much greater. Here, we’ll review how to add distributed tracing to C++ applications with the OpenTelemetry collector and send the trace data to Logz.io. One of the biggest challenges is finding efficient tools to quickly debug and solve production problems.

How we fixed a double-counting Prometheus bug while working on a Grafana Cloud project

In my role as a software engineer at Grafana Labs, I recently worked on a project that involved generating PromQL queries. One of the ways we verified the correctness of the generated queries was with a suite of integration tests. These tests would execute the generated PromQL queries against a local instance of the Prometheus query engine with some test data, and verify the results were as expected.

7 email blocked list prevention practices for businesses

Email blocked listing is a major but sometimes overlooked problem in the conversation around spam and email threat prevention. The impact on a business can be devastating if it isn’t caught in time, especially if the company relies on email marketing for lead generation and customer correspondence. While it’s possible to recover from the effects, it’s a complicated process and may set back productivity in the meantime.

How Martello's Microsoft 365 Solution Supports the Return to the Office

The global COVID-19 pandemic caused a massive and immediate shift to remote work, which was bolstered by video conferencing software such as Microsoft Teams. Although the world is still trying to heal (while simultaneously navigating new and evolving challenges), some organizations have started the process of having their employees return to the office and explore new hybrid workforce environments.