The Keys to Adopting an Automation Mindset

In today's fast-paced digital world, automation stands as a beacon of efficiency and innovation. The adoption of an automation mindset is not merely a trend, but a paradigm shift in how we approach IT processes. This post provides insights into the critical role of IT automation, highlights potential challenges, offers preparation tips before embarking on the automation journey, presents a systematic plan and wraps up with key takeaways.

Better learning from incidents: A guide to incident post-mortem documents

If you’re just starting out in the world of incident response, then you’ve probably come across the phrase “post-mortem” at least once or twice. And if you’re a seasoned incident responder, the phrase probably invokes mixed feelings. Just to clarify, here, we’re talking about post-mortem documents, not meetings. It’s a distinction we have to make since lots of teams use the phrase to refer to the meeting they have after an incident.

What are web checks and ping checks? Why are they important?

Keeping your websites and web-based applications running smoothly for your users is essential. External users who can't get onto your client-facing portals will quickly turn to competitors. Internal users having difficulty with your network will become frustrated, which can reduce morale and productivity. Network monitoring via web checks and ping checks can improve website uptime and make everyone's lives a little easier.
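
To make the two kinds of check concrete, here is a minimal sketch in Python using only the standard library. The function names `web_check` and `tcp_reachable` are hypothetical, chosen for illustration. Note that a true ping check uses ICMP, which usually needs elevated privileges, so this sketch substitutes a plain TCP connection attempt as a stand-in for reachability.

```python
import socket
import time
import urllib.error
import urllib.request


def web_check(url: str, timeout: float = 5.0) -> dict:
    """Fetch a URL and report whether it answered with a non-error status."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with a 4xx/5xx code
    except (urllib.error.URLError, OSError):
        status = None  # no HTTP answer at all (DNS failure, refused, timeout)
    elapsed_ms = (time.monotonic() - start) * 1000
    return {
        "up": status is not None and status < 400,
        "status": status,
        "elapsed_ms": elapsed_ms,
    }


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Stand-in for a ping check: can we open a TCP connection at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A scheduler would typically call both on an interval and alert when `up` turns false or `elapsed_ms` crosses a threshold; those policies are left out here.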

Monitoring Machine Learning

I used to think my job as a developer was done once I trained and deployed the machine learning model. Little did I know that deployment is only the first step! Making sure my tech baby is doing fine in the real world is equally important. Fortunately, this can be done with machine learning monitoring. In this article, we’ll discuss what can go wrong with our machine-learning model after deployment and how to keep it in check.

Introducing the Datadog Open Source Hub

At Datadog, we have always been deeply involved with open source software—producing it, using it, and contributing to it. Our Agent, tracers, SDKs, and libraries have been open source from the beginning, giving our customers the flexibility to extend our tools for their own needs. The transparency of our open source components also allows them to fully audit the Datadog software that is running on their systems. But our commitment to open source only starts there.

Pick 3 for Your Data Management: Speed, Choice, and Flexibility

Data growth is significantly outpacing budgets, so the products we use have to do more. This is where optimization comes into play. Generally, optimization is associated with reduction, which may be intimidating: what if something important is reduced? How can you identify what should be reduced? Reduction isn't about removing context, but about removing repetitive data, dropping meaningless fields, or even flattening JSON.
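
The three kinds of reduction named above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not any product's pipeline: the helper names (`flatten`, `drop_empty`, `dedupe`) are hypothetical, and "meaningless" is assumed here to mean empty values (`None`, empty strings, empty lists).

```python
import json


def flatten(obj: dict, prefix: str = "") -> dict:
    """Turn nested objects into one level of dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def drop_empty(record: dict) -> dict:
    """Remove fields whose value carries no information (assumed: empties)."""
    return {k: v for k, v in record.items() if v not in (None, "", [], {})}


def dedupe(records: list) -> list:
    """Keep the first copy of each identical record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        fingerprint = json.dumps(record, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


events = [
    {"host": {"name": "web-1", "zone": ""}, "msg": "timeout", "debug": None},
    {"host": {"name": "web-1", "zone": ""}, "msg": "timeout", "debug": None},
]
reduced = dedupe([drop_empty(flatten(e)) for e in events])
print(reduced)
```

The context a responder needs (which host, what happened) survives; only the duplicate event and the empty fields are removed.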

Sumo Logic ahead of the pack in a consolidating market

The observability and cybersecurity sector is chock-full of providers, from startups like StateStack and Coralogix to established organizations like Datadog, Sumo Logic and Splunk, offering solutions of varying depth and breadth that solve the tough problems of application reliability and security.

The Evolution of Data Center Networking for AI Workloads

Traditional data center networking can’t meet the needs of today’s AI workload communication. We need a different networking paradigm to meet these new challenges. In this blog post, learn about the technical changes happening in data center networking from the silicon to the hardware to the cables in between.