The latest News and Information on DevOps, CI/CD, Automation and related technologies.

5 Ways to Make Kubernetes Auditing an Effective Habit

Kubernetes has several components that produce logs and events containing information on everything that has happened in a Kubernetes cluster. Keeping track of all this data becomes extremely challenging when you run Kubernetes at a very large scale. With so many components generating logs, organizations need a centralized place to see it all. But this is only half your problem. You also need to correlate logs coming from different components to draw the right conclusions and take effective actions.

Incident Response Automation: How It Works & Best Practices

It's 2 a.m. and your engineering team is sound asleep when suddenly a barrage of alerts starts flooding in. A critical service is down and customers are complaining. Your developers scramble to sift through the noise, identify the root cause, and fix the issue, all while racing against the clock to meet tight SLOs.

Beyond the Horizon: Navigating the Future of AI and ML Innovation Panel

In this Navigate Local panel discussion, industry experts Josh Mesout, James Gress, Brandon Dey, and Cate Gutowski explore the future of AI and machine learning. They discuss the shift from augmentation to automation in software development, the impact of open-source vs. proprietary models, and AI's role in democratizing access to technology. The panel also addresses concerns about AI's influence on human cognition and the importance of human oversight.

Tailored Azure Cost Savings Notifications and Alerts

Guided by expert Michael Stephenson, discover how to optimize your Azure spending with Turbo360's tailored cost savings notifications and alerts. In this detailed walkthrough, you'll learn to configure personalized notifications that identify potential cost-saving opportunities and resource optimizations within your Azure environment. Understand how weekly and monthly email alerts can keep you informed about underutilized resources and right-sizing recommendations, ensuring you make the most of your cloud investment.

Round Robin escalation policies: do's and don'ts

The concept of Round Robin comes from sports. The name has nothing to do with anyone called Robin; it comes from the French word ruban (ribbon). In a Round Robin tournament, all participants face each other by taking turns. When applied to on-call schedules, a Round Robin escalation policy means that responders assigned to a level will take turns responding to alerts. When is this strategy useful, and when isn’t it?
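The turn-taking idea can be sketched in a few lines of Python. This is only an illustration, not how any particular on-call product implements it: the class name `RoundRobinLevel`, the `assign` method, and the responder and alert names are all hypothetical.

```python
from itertools import cycle


class RoundRobinLevel:
    """Hypothetical escalation level that hands alerts to responders in turn."""

    def __init__(self, responders):
        if not responders:
            raise ValueError("a level needs at least one responder")
        # cycle() repeats the responder list endlessly, in order.
        self._rotation = cycle(responders)

    def assign(self, alert):
        # Each new alert goes to whoever is next in the rotation.
        responder = next(self._rotation)
        return responder, alert


# Assumed example data: three responders on one level, four incoming alerts.
level = RoundRobinLevel(["alice", "bob", "carol"])
assignments = [level.assign(f"alert-{i}")[0] for i in range(1, 5)]
print(assignments)  # ['alice', 'bob', 'carol', 'alice']
```

With three responders and four alerts, the fourth alert wraps around to the first responder, so the load is spread evenly instead of always landing on the same person.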

Spot Instances Explained: How They Can Lower Cloud Costs

According to Amazon, EC2 Spot Instances can save you up to 90% compared with On-Demand Instances. Reserved EC2 Instances are already known to be cheaper than On-Demand ones, but Spot Instances can offer even bigger discounts. Despite these cost benefits, Spot Instances come with caveats that make them unsuitable for some operations and processes, as even Amazon emphasizes. Although Spot Instances can save you a lot of money, some workloads are better suited to On-Demand or Reserved Instances.

Gremlin's API makes it easy to integrate testing in your CI/CD pipeline

Thinking about integrating Gremlin into your existing pipeline? Look no further than the Gremlin API. "The next step then was to build the right tooling such that the resiliency tests can be run from a pipeline. Gremlin's API-first approach made it possible to do this in a very easy manner because everything that we could do from the UI and manually, we could replicate all of that through the API as well."

Three Common Ways Cycle Pays for Itself

In today's competitive and uncertain tech landscape, engineering organizations are constantly seeking ways to optimize costs without compromising on performance. Efficient resource management and cost reduction have become crucial for businesses aiming to stay alive and ahead. At Cycle, our goal is to offer a robust solution that enhances efficiency while delivering significant cost savings to our users.

Behind the scenes: Launching On-call

March 5th was a big day for incident.io as we released our on-call product to the world. Nine months of listening to our customers, coding, fixing, testing, and polishing came together for our biggest product launch to date. Releasing On-call was a huge milestone and represented the next step in our journey as a company.