The latest News and Information on DevOps, CI/CD, Automation and related technologies.

GPU cloud for AI inference in production: How infrastructure requirements change after training

Training a model is a project with an end date. Inference is what happens for the rest of the model's working life. The two workloads share GPUs, frameworks, and a lot of vocabulary, but the infrastructure decisions that make sense during training are usually the wrong ones in production. Teams that treat inference as "training, but smaller" tend to discover the gap somewhere around their first traffic spike.

Beyond tokens per watt - using Ubuntu 26.04 LTS for AI

Tokens per watt (TpW) – the measure of useful AI work produced per watt of energy consumed – is the metric that is top of mind for CEOs, heads of AI, and infrastructure teams alike. With the tremendous cost of GPU clusters, extracting as much value as possible from the expense is critical. But in the pursuit of tokens, it’s important to remember that hardware efficiency isn’t the only factor influencing data center operating costs, or the output of useful, revenue-generating AI work.

[Webinar] Building Regulated Infrastructure: How Lucis Standardized Security for Global Care

In Healthtech, downtime is more than a loss of revenue; it is a disruption to patient care. Whether supporting digital health platforms or AI-driven healthcare applications, infrastructure must remain secure, compliant, and highly available. Join Lucis and Qovery for a technical breakdown of building compliant and secure infrastructure that scales AI and healthcare workloads, handles traffic peaks, and maintains SOC 2, HDS, and HIPAA standards.

Enforce your team's database standards automatically with Custom Policy Checks in Redgate Flyway Enterprise

Every engineering team has a list of “things we don’t do”. No TRUNCATE TABLE in production. Every audit table must end in _audit. Foreign keys follow a naming convention. But until now, enforcing those standards has meant relying on pull request checklists, tribal knowledge, or a separate linting tool bolted onto the pipeline.

TikTok Challenges: Trend or Danger? What Every Parent and Teen Should Know

Everyone seems to be doing TikTok challenges, but are they always harmless? From positive movements like the Ice Bucket Challenge to risky viral trends that have led to serious injuries, social media challenges can influence how teens think, behave, and seek attention online. In this video, we'll explore why teens are drawn to TikTok challenges and the hidden pressure of fitting in and going viral.

AI Agent Governance: The Missing Piece of Autonomous IT

AI agents are making decisions, accessing systems, and resolving issues autonomously. But as organizations deploy more agents, one challenge becomes impossible to ignore: governance. Who has access? What changed? Who is accountable? The future of Autonomous IT requires autonomy with accountability.

A package manager for AI assets (and why the lock file is per-user)

Sometime in the last two years your repos quietly filled up with a new category of file. Not code, not config exactly: prompts. A .claude/skills/ directory here. A .cursor/rules/ folder there. A CLAUDE.md at the root, an AGENTS.md next to it, a .mcp.json listing the servers your agent is allowed to call. These are the things that make a coding agent useful on your codebase, and they're sprawling.