
Kubernetes Security Guide: Risks, Strategies, And Tools

In 2018, attackers gained access to Tesla’s AWS cloud environment through an unprotected Kubernetes admin console. Because the console lacked proper authentication, the attackers could see and control cluster resources. Once inside, they deployed new pods running cryptocurrency mining software, using Tesla’s compute power for profit. During the breach, the attackers also uncovered credentials stored in the cluster.

SharePoint Server Monitoring: Uptime, Performance & SLAs

SharePoint is the backbone of internal collaboration for countless organizations. It hosts documents, drives workflows, powers intranets, and underpins team communication across departments. But when it slows down—or worse, goes dark—productivity grinds to a halt. The problem is that most monitoring approaches treat SharePoint like a static website. They check availability, not experience.

Onboarding Microsoft Sentinel data lake with DataStream

Modern security operations teams face an overwhelming challenge: a rapidly growing volume of logs, alerts, and telemetry from cloud services, on-premises infrastructure, and third-party security tools. Traditional SIEM platforms often struggle to scale cost-effectively and provide the agility needed for advanced analytics and threat hunting.

Track, debug, and roll back changes with Version History for Synthetic Monitoring tests

A synthetic test is only useful if you can trust what it’s telling you. When one fails, the reason may not be obvious. Was the application updated? Did the test change? Or both? As more people contribute and refine the same test, it becomes harder to understand what changed or restore a working version. Without clear visibility into those updates, teams can spend more time tracking down the cause of a failure than resolving it.

A deep dive into Java garbage collectors

Historically, developers have relied on languages like C and C++ for explicit control over memory allocation and deallocation. This approach can yield very low overhead and tight control over performance, but it also increases complexity and risk (e.g., memory leaks, dangling pointers, and double frees). This often results in runtime issues that are difficult to diagnose, which can become a drag on team velocity.

Ingest OTLP metrics directly into Datadog with the new OTLP Metrics API

Many organizations rely on OpenTelemetry (OTel) to standardize observability across distributed systems. These organizations are at varying stages of adoption and are implementing OTel in complex environments with diverse configurations. To support this range of use cases, Datadog offers many ways to use OpenTelemetry with Datadog.

7 Ways Your Incident Management Just Got a Boost (New Feature Rundown)

All the things you may have missed that will make your incident management smarter, faster, and simply easier. We ship updates every week because we want you to get the most out of FireHydrant. But we also know it's hard to stay up to date and read every week's changelog (even though we know reading changelogs is the highlight of your week).

Experimenting With Different Scripts

It all began when I spun up an AWS t4g.small burstable instance for a side project. Nothing unusual, just another day in the cloud. But the moment I connected through SSH, something caught my eye. The system greeted me with a temperature reading of -273.5°C. Wait… what? That’s 0 Kelvin, the point where atomic motion completely stops. In other words, absolute zero, a state that’s theoretically impossible for anything to operate in.

What is autonomous validation? The future of CI/CD in the AI era

Over the past decade, CI/CD has redefined how modern software is built and shipped. CircleCI has been a leader in that transformation, working alongside the world’s best engineering teams to build a reliable foundation for continuous delivery at scale. Today, those foundations are under new pressure as AI reshapes every aspect of the delivery cycle. Developers are producing more change with less certainty about what those changes touch.

IaC is Great, But Have You Met IaCM?

This blog highlights the critical role of Infrastructure as Code Management (IaCM) in enhancing IaC practices, ensuring security, compliance, and efficiency in managing complex infrastructure at scale. Managing infrastructure efficiently and reliably is more critical than ever. Infrastructure as Code (IaC) has emerged as a key practice, enabling teams to define, deploy, and manage infrastructure using code.