
FrogML SDK: the Gateway to Model Governance

Data-driven decisions are critical. To support high-stakes decision-making, from fraud detection in credit card transactions to demand forecasting in retail, organizations increasingly rely on complex models. According to McKinsey, 78% of organizations report using AI in at least one business function, highlighting just how embedded AI and ML models have become in operational and strategic decision-making.

Amazon SageMaker Pricing Guide: 2025 Costs (And Savings)

Amazon SageMaker makes it easy to prepare data for machine learning (ML) and then train, deploy, and modify ML models. SageMaker is a fully managed service that automates much of the ML lifecycle. So, if you want a single partner to help you through all stages of your Artificial Intelligence (AI) lifecycle, SageMaker might be the answer. Perhaps more important for this post is the promise that Amazon SageMaker can reduce your machine learning model costs. But does SageMaker pricing reflect this?

Optimizing Legacy ML Systems with Real-World DevOps Practices

We chose to feature this article because it reflects exactly what OpsMatters stands for: practitioners solving real problems with practical DevOps thinking. When we came across Ashish's detailed breakdown of his experience modernizing a complex ML environment, it stood out for its clarity and actionable insights. We reached out to him to learn more about the work behind this case study, and with his permission, we are sharing it here so the broader community can benefit from these lessons in observability, cost optimization, and real-world DevOps execution.

Canonical announces Charmed Feast: A production-grade feature store for your open source MLOps stack

July 10, 2025: Today, Canonical announced the release of Charmed Feast, an enterprise solution for feature management that integrates seamlessly with Charmed Kubeflow, Canonical’s distribution of the popular open source MLOps platform. Charmed Feast delivers the full breadth of upstream Feast capabilities, adding multi-cloud deployment options and comprehensive commercial support.

Automating machine learning security checks using CI/CD

Machine learning (ML) pipelines are increasingly treated like software: built, tested, deployed, and monitored with automated tooling. But while infrastructure as code and microservices have matured alongside security best practices, ML systems often lag behind. The truth is, your ML pipeline is part of your software supply chain, and it is vulnerable.
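The article's specific tooling isn't reproduced here, but one concrete check of this kind is scanning serialized model artifacts before deployment. Below is a minimal sketch (the `scan_pickle` helper and its opcode list are illustrative assumptions, not from the article) that statically inspects a pickle payload for opcodes capable of importing modules or invoking callables on load, without ever unpickling it:

```python
import pickletools

# Pickle opcodes that can import names or invoke callables when the
# payload is loaded -- the vectors behind pickle deserialization attacks.
SUSPICIOUS_OPS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}

def scan_pickle(payload: bytes) -> list:
    """Statically list suspicious opcodes as (name, byte offset) pairs."""
    return [
        (op.name, pos)
        for op, _arg, pos in pickletools.genops(payload)
        if op.name in SUSPICIOUS_OPS
    ]
```

A CI job could run this over each model artifact and fail the build when findings are non-empty. Note that legitimate models routinely use `REDUCE`/`NEWOBJ`, so in practice a gate like this is usually paired with an allow-list of expected imports rather than a blanket rejection.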

Harnessing Machine Learning for Advanced Threat Detection with Observo AI

Cyber threats are growing more cunning every day, with attackers even tapping into artificial intelligence to outsmart traditional defenses. Organizations face a flood of security data—logs, alerts, and telemetry—making it nearly impossible to sift through. How do you spot the real dangers amid all that noise? Observo AI’s ML-Powered Threat Insights offers a game-changing answer.

Taming Telemetry Data Sprawl: How ML Reduces Data 2X Better

Security and DevOps teams are drowning in data. Fueled by the explosion of cloud-native architectures, microservices, and accelerated software development cycles driven by AI, telemetry volumes are growing faster than ever. For most organizations, security and observability data is now doubling every 2–3 years. At the same time, most of the tools used to analyze that data—SIEMs, log analytics platforms, and cloud-native observability tools—charge based on ingestion volume.

Forecasting with InfluxDB 3 and HuggingFace

Machine learning models must do more than make accurate predictions; they also need to adapt as the world around them changes. In real-world systems, data distributions shift due to seasonality, equipment wear, user behavior changes, or other external forces. If your models can’t keep up, the result is poor predictions. This can lead to outages, inefficiencies, or missed opportunities. That’s why forecasting systems need to be monitored and resilient, not just accurate.
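The teaser above covers why forecasts need monitoring rather than how; as one concrete way to watch for the distribution shift it describes, here is a minimal Population Stability Index (PSI) sketch. The `psi` function and the commonly cited 0.25 alert threshold are illustrative assumptions, not taken from the article:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and fresh data.

    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate shift,
    > 0.25 significant shift worth an alert or a retrain.
    """
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        # Smooth empty bins so the log ratio stays finite.
        return [(c if c else 0.5) / len(values) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Feeding the model's training-window feature values in as `expected` and each new batch as `actual`, a scheduled job can flag drift (or trigger retraining) whenever PSI crosses the chosen threshold.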

Best VPS for Machine Learning

When it comes to machine learning, choosing the right Virtual Private Server (VPS) can significantly impact your project's performance, scalability, and cost-effectiveness. A VPS provides dedicated resources that can handle the intensive computational tasks associated with training machine learning models. Here, we explore the best cheap GPU VPS options for machine learning, highlighting the importance of GPUs and introducing Cloudzy as a leading choice.

Announcing Charmed Kubeflow 1.10

We are thrilled to announce the release of Charmed Kubeflow 1.10, Canonical’s latest update to the widely-adopted open source MLOps platform. This release integrates significant improvements from the upstream Kubeflow 1.10 project, while also bringing a suite of additional capabilities targeted towards enterprise deployments. Charmed Kubeflow 1.10 empowers machine learning practitioners and teams to operationalize machine learning workflows more efficiently, securely, and seamlessly than ever.