The latest news and information on DevOps, CI/CD, automation, and related technologies.

Save Hours on Troubleshooting with Automated Investigations

How many times has your team stared at a dashboard, pointed to a spike, and asked a question that charts alone can’t answer? “What was the real impact of that deployment?” “Why are our Kubernetes pods in the us-east-1 cluster suddenly crashing?” “Are we wasting money on overprovisioned servers?” Answering these questions is the real work of operations and SRE.

Tutorial: How to Remediate Vulnerabilities with Puppet Enterprise Advanced Patching

The rate at which vulnerabilities are being exploited is on the rise. VulnCheck, a company that specializes in vulnerability intelligence, found that in Q1 2025, 28.3% of vulnerabilities were exploited within 1 day of CVE disclosure. Keeping your systems up to date is more important than ever. The reality is that many security teams run scans and then export the results to giant spreadsheets, which are “tossed over the wall” to the Operations team with little context.

How to Block Apps on Android Business Devices?

Are you an IT administrator looking for an efficient way to manage company-owned Android devices? This video provides a step-by-step guide on how to block apps on Android devices to boost employee productivity and maintain security. In a business environment, a clear app usage policy is essential for compliance and focus. We'll show you how to easily set up an App Blocklist using the AirDroid Business MDM solution.

Product Klip: Istio Developer Dashboard

Troubleshooting issues in a complex service mesh environment, such as traffic failures or authorization problems, often requires the expertise of an SRE or DevOps professional. However, Komodor simplifies this process. Komodor provides developers with the necessary visibility to diagnose service mesh issues on their own. It helps developers easily identify blocked connections and understand the root cause without having to review logs or configuration files.

Netdata Now Troubleshoots Your Alerts for You

The 2 AM pager alert. For anyone in Ops, SRE, or IT administration, those words trigger a familiar sense of dread. An alert has fired. Is it a real fire, or another false alarm waking you from a dead sleep? The pressure is on. Every minute of downtime costs money and reputation, but troubleshooting a complex system when you’re sleep-deprived is a Herculean task.

AI Agent Is Hitting Your APIs - Are You Ready?

It’s no longer theoretical: artificial intelligence has left research labs and entered production systems, creating a new breed of consumer, the autonomous agent. These AI agents increasingly interact with real-world APIs (application programming interfaces), the sets of protocols and tools used to build and integrate software applications.

Building your AI infra, our tips

Modular architecture: Decouple compute from storage so each can scale independently. This makes it easier to adapt to growing or shifting workloads over time.
Future-ready hardware: Select GPUs and CPUs not just for current workloads but with an eye on scalability, including support for newer accelerator types.
Scalable design: Ensure the system allows seamless addition of compute nodes or storage without a full redesign.

Running AI without blowing up your storage

Storage is often underestimated: In infrastructure discussions, compute and networking get most of the attention, while storage is treated as secondary. For AI workloads, that can be a costly oversight.
Data throughput for specialized hardware: AI infrastructure powered by GPUs can process massive volumes of data at unprecedented speeds. This puts immense pressure on the storage system to keep up.
Scale-out performance: An on-prem, scale-out, software-defined storage setup allows you to meet high performance demands, grow capacity as needed, and stay in control of infrastructure costs.
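
To make the throughput point concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the article. The GPU count, the per-GPU ingest rate, and the per-node storage throughput are all hypothetical placeholders, and the model assumes throughput scales linearly with node count, which real scale-out systems only approximate.

```python
import math


def required_read_throughput_gbps(num_gpus: int, per_gpu_gbps: float) -> float:
    """Aggregate read throughput (GB/s) needed so that no GPU waits on data."""
    return num_gpus * per_gpu_gbps


def storage_nodes_needed(required_gbps: float, per_node_gbps: float) -> int:
    """Number of scale-out storage nodes, rounded up, to meet the demand.

    Assumes (simplistically) that throughput adds up linearly across nodes.
    """
    return math.ceil(required_gbps / per_node_gbps)


# Hypothetical inputs, chosen only for illustration.
demand = required_read_throughput_gbps(num_gpus=16, per_gpu_gbps=2.0)
nodes = storage_nodes_needed(demand, per_node_gbps=5.0)
print(f"{demand:.1f} GB/s needed, {nodes} storage nodes")
```

Doubling the GPU count in this toy model doubles the storage demand, which is the kind of pressure a scale-out design is meant to absorb by adding nodes rather than redesigning the system.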

Bridging the Gap: 3 Practical Strategies to Align Security and Operations in DevOps

The gap between security operations and IT operations poses significant risk. It’s increasingly clear that DevOps leaders, IT managers, and enterprise teams face an uphill battle to manage growing threat complexity, endless patches, and compliance requirements while operating in silos. Bridging this gap is essential to effectively manage risks and enhance operational efficiency.