
Building LLM agents to validate LangGraph tool use and structured API responses

Transitioning LLM agents from intriguing prototypes to reliable, production-grade solutions introduces a unique and significant challenge: the inherent stochasticity of LLMs. Unlike conventional software, where inputs predictably yield precise outputs, an LLM’s response can exhibit variability even when presented with identical prompts. To ensure the dependability of your LLM agent, you will need a rigorous validation strategy.
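One building block of such a validation strategy is schema-checking the agent's structured output before acting on it. The sketch below is framework-agnostic and minimal; the tool name and argument schema are hypothetical, stand-ins for whatever contract your own agent's tools define:

```python
import json

# Hypothetical contract for a tool call the agent is expected to emit.
# The tool name and argument schema here are illustrative only.
EXPECTED_TOOL = "get_weather"
REQUIRED_ARGS = {"city": str, "units": str}

def validate_tool_call(raw_response: str) -> list[str]:
    """Return a list of validation errors; an empty list means the call is valid."""
    try:
        call = json.loads(raw_response)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    errors = []
    if call.get("tool") != EXPECTED_TOOL:
        errors.append(f"unexpected tool: {call.get('tool')!r}")
    args = call.get("args", {})
    for name, expected_type in REQUIRED_ARGS.items():
        if name not in args:
            errors.append(f"missing argument: {name}")
        elif not isinstance(args[name], expected_type):
            errors.append(f"argument {name} has wrong type")
    return errors
```

Because the model can produce malformed JSON, the wrong tool, or missing arguments on any given run, checks like these belong in automated tests that run against many sampled responses, not just one.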

Navigating AI transformation ft. Meg Adams, Senior Director of Engineering at The New York Times

In this episode of The Confident Commit, Rob Zuber sits down with Meg Adams, Senior Director of Engineering at The New York Times, for a deep dive into leading engineering teams through the AI revolution while staying true to organizational mission. Meg shares how the Times approaches AI adoption with a "measured but focused" strategy, emphasizing experimentation and opinion-formation over mandates, and why she believes AI serves as a force multiplier for what already exists in your organization and workflows.

The new AI-driven SDLC

For decades, the software development life cycle (SDLC) has been the framework teams use to understand how software moves from idea to production. It breaks complex work into familiar phases: planning, design, development, testing, deployment, and maintenance. This structure gave organizations a shared way to coordinate teams, track progress, and build with confidence.

Automating Expo app build delivery to QA with CircleCI and EAS webhooks

Manually sharing mobile app builds with Quality Assurance (QA) engineers can be a tedious and error-prone process. Developers often find themselves exporting .apk or .ipa files, uploading them to Google Drive or Dropbox, and then pinging the QA team on Slack to announce the upload, all while juggling deadlines and code reviews. This manual process not only slows down feedback cycles but also leaves room for human error, miscommunication, or outdated builds being tested.
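The webhook half of that automation hinges on confirming that an incoming request really came from EAS. EAS signs webhook payloads with an HMAC of the raw request body; the sketch below assumes the SHA-1 digest and `sha1=` prefix shown in Expo's documentation, so check the current docs before relying on it:

```python
import hashlib
import hmac

def is_valid_eas_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Recompute the HMAC of the raw request body and compare it, in constant
    time, against the signature header sent with the webhook request."""
    expected = "sha1=" + hmac.new(secret.encode(), body, hashlib.sha1).hexdigest()
    return hmac.compare_digest(expected, signature_header)
```

A webhook handler would reject any request for which this returns False before notifying QA, so a forged payload can never trigger a Slack announcement.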

Building and deploying a Python MCP server with FastMCP and CircleCI

Extending Large Language Models (LLMs) with custom tools has become increasingly valuable in today’s AI landscape. Model Context Protocol (MCP) servers provide a standardized way to connect external tools and resources to LLMs, enhancing their capabilities beyond basic text generation. While thousands of pre-built MCP servers exist, creating your own lets you address specific workflows and implement use cases that off-the-shelf solutions cannot handle.
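The core pattern an MCP server standardizes is registering functions so a connected model can discover and invoke them. The plain-Python stub below illustrates that decorator-based registration shape; it is not FastMCP and does not speak the MCP protocol, and the `word_count` tool is only an example:

```python
from typing import Callable

# Registry of tools a connected LLM could discover and call.
TOOLS: dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Decorator that registers a function as a callable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text, a task base LLMs often fumble."""
    return len(text.split())

def dispatch(name: str, **kwargs):
    """Route a model's tool call to the registered implementation."""
    return TOOLS[name](**kwargs)
```

FastMCP exposes a similar decorator API on a server object and handles the protocol plumbing (discovery, schemas, transport) that this sketch omits.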

Automated RAG pipeline evaluation and benchmarking with RAGAS

Retrieval-Augmented Generation (RAG) pipelines have become an integral part of how Large Language Models (LLMs) access information beyond their training cutoff. These pipelines enable LLMs to deliver current, accurate, and grounded responses. By fetching relevant external documents, RAG mitigates common LLM challenges like factual inaccuracies and hallucinations. However, this methodology introduces a new complexity: evaluating RAG pipeline performance is particularly challenging.
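To make the evaluation problem concrete: one question RAGAS-style benchmarks answer is how much of what the retriever fetched was actually relevant. The function below is a stripped-down, label-based cousin of that idea, not the RAGAS implementation, which automates such judgments with LLM judges:

```python
def context_precision(retrieved: list[str], relevant: set[str]) -> float:
    """Fraction of retrieved chunks that are relevant to the query.
    A simplified, human-labeled stand-in for automated RAG metrics."""
    if not retrieved:
        return 0.0
    hits = sum(1 for chunk in retrieved if chunk in relevant)
    return hits / len(retrieved)
```

Scoring a pipeline this way per query, then tracking the average across a benchmark set, turns "is retrieval getting better or worse?" into a number a CI job can gate on.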

7 ways AI agents are transforming software delivery

For most teams, the slowest part of delivery isn’t writing code; it’s everything that happens after: automated tests, manual reviews, bug fixes, final approvals, and the long wait for deployment. The longer these phases run, the more expensive and painful late fixes become. As AI makes it easier to generate code at scale, those bottlenecks only get bigger.

Code coverage standards for a Next.js project using CircleCI and Coveralls

An essential part of software development, testing helps catch bugs and errors early, improves software quality, and ultimately prevents costly issues from being deployed to production. The effectiveness of software testing remains uncertain until it can be measured, and that is where code coverage comes in. Code coverage is a metric that tells developers what portion of their codebase is executed when specific tests are run.
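The arithmetic behind that metric is simple. Line coverage, the variant most tools report by default, is just the share of executable lines a test run touched, as this small sketch shows:

```python
def line_coverage(executed: set[int], executable: set[int]) -> float:
    """Percentage of executable line numbers hit by the test run."""
    if not executable:
        return 100.0  # nothing to cover
    return 100.0 * len(executed & executable) / len(executable)
```

Tools like Coveralls collect these per-file numbers from an instrumented test run and aggregate them, so a CI job can fail when coverage drops below a chosen threshold.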

AI-powered email automation with CI/CD pipelines

Email automation allows you to send emails automatically based on certain triggers or schedules, so you don’t have to click the Send button every time. This includes things like welcome messages, drip campaigns, and regular newsletters. In this tutorial, you will create a simple system that automatically welcomes new subscribers and sends them updates about technology, all with the help of AI.

Deploying a multimodal RAG application with Gemma 3 and CircleCI on GKE

Retrieval-Augmented Generation (RAG) has transformed how applications interact with Large Language Models (LLMs). RAG grounds LLM responses in external knowledge, improving accuracy and reducing hallucinations. But traditional RAG systems have a significant limitation: they only process text. Multimodal RAG addresses this limitation by processing and understanding multiple data types (text, images, and potentially audio).