
The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Charmed Kubeflow 1.7 Beta is here. Try it now!

Canonical is happy to announce that Charmed Kubeflow 1.7 is now available in Beta. Kubeflow is a foundational part of the MLOps ecosystem that has been evolving over the years. With Charmed Kubeflow 1.7, users benefit from the ability to run serverless workloads and perform model inference regardless of the machine learning framework they use.

The Rise of the Cognitive NOC and the Role of IT Process Automation

Today’s Cognitive Network Operations Center (Cognitive NOC) is a significant advancement that employs artificial intelligence (AI) and machine learning (ML) to dramatically modernize and improve network management and operations. Working together, the NOC and IT Process Automation (ITPA) improve the efficiency and effectiveness of network operations, minimize downtime, lower operational costs, and overcome additional challenges in optimizing network performance.

Data & Traffic Are Key to Kubernetes Preview Environments

Preview environments are temporary environments where developers can test code changes before deploying them to production. Also called ephemeral environments, they should be discarded once the changes have been tested. Carrying out tests using accurate data is a major challenge when environments are constantly created and destroyed. Put differently, you need realistic data and traffic in the preview environment to reflect how code changes will perform in production.

How low-code platforms can help eliminate shadow IT in your organization

Businesses are constantly trying to up their game in the digital transformation era. Modernizing legacy systems, keeping on top of software updates, and building business applications are not easy feats. As organizations contemplate the path ahead, a large strain is put on the constantly constrained IT department. When the relevant stakeholders feel that the IT team will take too long to provide a solution, they turn to alternatives that bypass the IT team.

Datadog On Reliability Engineering

There are many different ways to implement Site Reliability Engineering (SRE). From team structures to roles and responsibilities to planning and prioritization flows, there’s no golden path for how to organize things. As Datadog has shifted from a startup to a quickly growing public company, we’ve seen our own SRE practice evolve. With over 22,000 customers sending trillions of data points each day, keeping Datadog reliable is critical to our business.