Latest News

Essential Kafka Security Best Practices for 2024

Ah, Kafka—the powerhouse behind real-time data streaming in today’s world. It’s efficient, scalable, and handles vast amounts of data with ease. But with great power comes great responsibility, right? And in 2024, with cyber threats more sophisticated than ever, securing your Kafka environment is no longer just a good idea—it’s non-negotiable.

Achieving your AI Strategy with Automation

AI must be the hottest topic in business news at the moment, whether it’s public offerings like ChatGPT or the plethora of business offerings starting to appear. In a recent Gartner survey, 92% of CIOs believe AI will be implemented in their organizations by 2025. But turning the promise of AI into reality is not easy: in the same survey, 49% of leaders highly involved in AI report that their organizations struggle to estimate and demonstrate the technology’s value.

AWS GovCloud vs Azure Government Cloud - What's the Top Government Cloud Provider?

If you’re ready to leap to the government cloud, you’re likely looking back and forth between Amazon and Microsoft, wondering which is the best (and safest) bet. We’ve got you covered! Learn all you need to know from our cloud experts about which government cloud offering will work best for you – and it may come as a surprise, but there are other options outside of AWS and Azure… get into the details below!

What is GovCloud - Complete Guide to GovCloud in 2024

If you’re a U.S. federal, state, or local government agency trying to deliver services to the public faster without sacrificing a single inch of security, GovCloud is the PaaS (Platform as a Service) solution. But what exactly is GovCloud, and how can it ensure you deliver services more efficiently and effectively? We’ll tell you all you need to know so you can decide if you’re ready to upgrade your tech stack with this tool.

How to build automatic remediation workflows in Grafana Cloud

When incidents occur, engineers must jump into action to get systems back to running at peak performance. However, there are a myriad of challenges that can prevent them from resolving the issues swiftly. Imagine a scenario where a team of DevOps engineers manages a cloud-based e-commerce platform that experiences occasional spikes in traffic during peak shopping seasons. During one of those major sales events, the team notices a sharp spike in CPU usage across several critical application servers.

Git, your way: Expanded strategies for branch sync & merge

By popular demand, we are happy to introduce several new strategies for syncing and merging branches in Bitbucket Cloud. Our goal is to provide you the full functionality of rebase and merge within the Bitbucket UI to help you manage your Git history according to your team's preferences. These new options have been among our most highly voted feature requests. In brief, here's what we've added.

How Effective are Your Alerting Rules?

Recently, I came across a Reddit post highlighting the challenges of having ineffective alerting rules. Here at OnPage, we have experience with various companies that have dealt with just that, so I felt I should share some of our top tips for creating effective alerting rules in this blog. Read on to discover…

Docker Log Rotation - Definition, Configuration Guide, and Best Practices

Docker containers generate logs to monitor their operations, but without a mechanism in place to manage these logs, they can grow indefinitely, leading to excessive disk space consumption and performance degradation. Implementing Docker log rotation is crucial to control log file size and quantity, ensuring efficient log management and optimal system performance.
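
As a rough illustration of what controlling log file size and quantity can look like, the sketch below builds a daemon configuration fragment for Docker's `json-file` logging driver using its `max-size` and `max-file` options. The helper name `build_log_rotation_config` and the default values (10 MB per file, three files) are assumptions chosen for illustration, not recommendations from the article.

```python
import json


def build_log_rotation_config(max_size: str = "10m", max_file: int = 3) -> dict:
    """Return a daemon.json fragment that enables log rotation.

    Hypothetical helper: the name and default values are illustrative only.
    """
    if max_file < 1:
        raise ValueError("max_file must be at least 1")
    return {
        "log-driver": "json-file",
        "log-opts": {
            # Rotate a container's log file once it reaches this size.
            "max-size": max_size,
            # Keep at most this many log files per container.
            # Values under log-opts are written as strings in daemon.json.
            "max-file": str(max_file),
        },
    }


if __name__ == "__main__":
    # Print the fragment so it can be reviewed before being merged into
    # the daemon configuration (commonly /etc/docker/daemon.json on Linux).
    print(json.dumps(build_log_rotation_config(), indent=2))
```

Daemon-level logging settings generally apply only to containers created after the daemon is restarted, so existing containers would need to be recreated to pick them up.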

Syncing PagerDuty Schedules to Slack Groups

We’ve posted before about how engineers on call at Honeycomb aren’t expected to do project work, and that whenever they’re not dealing with interruptions, they’re free to work on whatever will make the on-call experience better. However, all of our engineering rotations rely on hand-off meetings where they update the Slack groups with everyone who’s on call. During my last shift, a small problem kept causing friction for some of our incident management automation.