Monitor your NVIDIA GPUs with Datadog

NVIDIA is well known for its computing advancements across a broad range of industries and has become the clear leader in the artificial intelligence (AI) space. Due to their high-performance capabilities, NVIDIA’s discrete graphics processing units (GPUs) now hold approximately 80 percent of the market for production-level AI, gaming, graphics rendering, and other complex data processing tasks.

Query unsampled logs in real time with Live Search

With thousands of logs generated every minute from your infrastructure, applications, services, and devices, retaining all of this data for active search and analysis can be cost-prohibitive. Because log volumes continue to grow rapidly as operations scale, it’s common for organizations to implement log management strategies and store only a limited subset of their logs to minimize costs.

How to Create Powerful, Customized Lua Mailers in HAProxy

SysOps teams know how crucial observability is to maintaining around-the-clock infrastructure health. And while HAProxy already provides reliable logging and stats (for diagnostics and monitoring), what if you could supplement those with a real-time alerting system? That idea inspired our initial inclusion of mailers in HAProxy. Since then, users have been asking for deeper mailer customizability—and we’re thrilled to deliver that with Lua-based email alerts in HAProxy 2.8.

Get Started with Puppet: A Tutorial Guide for First-Timers

So you’re ready to get started with Puppet and you don’t know where to begin. That’s alright – this short, easy Puppet tutorial will help you get started with Puppet Enterprise. This tutorial blog walks through the first few steps you'll need to take to get Puppet Enterprise up and running so you can start automating your organization's IT infrastructure.

Capitalizing on the Cloud: Five Strategic Benefits of ITSM in the Cloud

Cloud computing is no longer the future; it’s our reality. This means that the decision to move ITSM to the cloud isn't just an option. It's a strategic necessity. For business and IT leaders in the early stages of this transformation, understanding the benefits of cloud-based ITSM is pivotal. Here are five advantages that this shift can bring to your organization.

How Automation Can Help Scale An MSP Business

For a Managed Service Provider (MSP) to grow, it must win new customers. However, each new customer brings more devices and users that require management, incident ticket resolution, and provisioning. If this work is handled manually, an MSP needs dedicated staff to manage the new clients’ users and devices and to resolve their tickets and requests, and each staff member can only support a certain number of users and devices.

How to Use Google PageSpeed Insights Correctly: A Technical Guide

PageSpeed Insights is a Google web tool that analyzes web page performance and optimization. It provides insights and recommendations to help website developers improve their websites’ speed and user experience. With this tool, we can better understand how a website performs on different devices and networks. In this post, we’re going to look at how to use it correctly and share some technical tips along the way. Alright, let’s jump in!

A Detailed Guide to Docker Secrets

This post was written by Talha Khalid, a full-stack developer and data scientist who loves to make cold, hard topics exciting and easy to understand. Few would dispute that microservices architecture has proven to be efficient. However, implementing security, particularly in an immutable infrastructure context, has been quite a challenge.

Monitoring edge and fog computing devices

Edge computing and fog computing are technological advancements gaining traction in a hyper-connected world. Being close to the source, edge computing enables data collection and processing at the fastest possible speeds. Instead of sending all the data over the internet to a remote cloud location, which adds latency, edge devices store and process most of it onsite and pass only the heavy lifting to the central cloud to achieve the quickest turnaround.

Optimizing Resource Scheduling and Planning in Healthcare

The pandemic has exacerbated the staff shortage in healthcare, placing a disproportionate burden on the industry and underscoring the significance of effective resource scheduling. While resource scheduling encompasses the allocation of healthcare staff as well as physical resources and assets, our primary focus in this blog will be on healthcare staff. Resource scheduling plays a vital role in ensuring the smooth operation of healthcare facilities.