What Can We Learn from AWS's December Outagepalooza?

2021’s slew of Internet outages and disruptions shows how connected and relatively fragile the Internet ecosystem is. Case in point: December’s trifecta of Amazon Web Services (AWS) outages, which really brought home the fact that no service is too big to fail. The reality is that the question about the next outage is not if, but when, where, and for how long. Pretending outages don’t exist or won’t happen is not only pointless but harmful to your business.

8 Issues With AWS Tags And How To Overcome Them For Good

AWS resource tagging is fundamental for effective cloud cost management. By creating and allocating cost-related tags in AWS, you can organize and manage your resources according to keys and values that make sense to you. This helps you better understand your cloud costs and manage your spending. But proper tagging isn't easy. While AWS provides several useful resources, you may still run into some issues that require more involved solutions.
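
To make the idea of organizing costs by tag keys and values concrete, here is a minimal sketch in plain Python. It does not call any AWS API; the resource list, the `monthly_cost` field, the tag names, and the `cost_by_tag` helper are all hypothetical and stand in for whatever your billing export actually contains.

```python
from collections import defaultdict

# Hypothetical inventory (not real AWS output): each resource carries a
# monthly cost and a dict of tag key/value pairs.
RESOURCES = [
    {"id": "i-001", "monthly_cost": 120.0, "tags": {"team": "payments", "env": "prod"}},
    {"id": "i-002", "monthly_cost": 45.5, "tags": {"team": "payments", "env": "dev"}},
    {"id": "vol-003", "monthly_cost": 30.0, "tags": {"team": "search", "env": "prod"}},
    {"id": "i-004", "monthly_cost": 80.0, "tags": {}},  # no tags at all
]


def cost_by_tag(resources, tag_key, untagged_label="(untagged)"):
    """Sum monthly cost per value of one tag key.

    Resources missing the tag are grouped under `untagged_label`, so the
    spend that cannot be attributed stays visible instead of disappearing.
    """
    totals = defaultdict(float)
    for resource in resources:
        value = resource["tags"].get(tag_key, untagged_label)
        totals[value] += resource["monthly_cost"]
    return dict(totals)


if __name__ == "__main__":
    for tag_value, total in cost_by_tag(RESOURCES, "team").items():
        print(f"{tag_value}: {total:.2f}")
```

Keeping an explicit bucket for untagged resources is a deliberate choice in this sketch: missing tags are one of the most common reasons a cost report fails to add up.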

Distributed network visibility, the ultimate weapon against chaos

It’s 2022, and the world is the technological paradise you always dreamed of. Space mining, smart cities, 3D printers to make your own Darth Vader mask… Just one little problem: society is based on digitization and communications, and you have no idea about the visibility of distributed networks. That is vitally important considering the rise of cybercrime. Well, don’t worry, we’ll help you.

How We Define SRE Work

At the time of writing this post, I have officially been at Honeycomb for one year as a site reliability engineer (SRE). I shared my initial experiences and impressions in an earlier post and thought it would make sense to check back in now that I’ve had the opportunity to spend time learning about the team, the culture, and the code base in more depth.

AWS EC2 Cost Optimization Best Practices

Amazon Elastic Compute Cloud (EC2) is one of the core services of AWS, designed to help users reduce the cost of acquiring and reserving hardware. EC2 represents the compute infrastructure of Amazon’s cloud service offerings, providing organizations a customizable selection of processors, storage, networking, operating systems, and purchasing models.

Exploring the Importance of Change Management in Healthcare

Change management is an organized, structured approach with methods that enable healthcare organizations to transform workflows seamlessly. Organizational change management requires the collective involvement of C-level executives and stakeholders to successfully implement changes within a care facility. Change is required when individuals, processes, teams, and tools cannot keep pace with the ever-changing needs and expectations of the organization.

AIOps in 2022 and Beyond: A Conversation with Gartner

Modern digital businesses adopt AIOps tools to enable continuous insights across an IT stack. These insights tell the full story of what’s happening behind systems, allowing IT teams to achieve the operational efficiencies and high availability that lead to customer satisfaction. Old siloed monitoring disciplines provide data specific to performance of the digital experience, IT infrastructure, application or network.

Cover Your DRaaS: Everything you need to know about Disaster Recovery

Unplanned downtime carries a hefty price tag for enterprises. In 2020, critical server outages cost enterprises on average at least $10,000 per hour, with 95% of respondents stating that the cost was $200,000 per hour or more. 40% said that the average cost was closer to $1 million per hour, and 17% lost $5 million or more for every hour offline. Those are some sobering statistics that demonstrate the importance of being prepared for the worst. But you’re thinking, “We back up everything!

Improved routing for Jira Cloud and Jira Server tickets with multi-project support

If you love Jira then you probably love customization, and we’ve made your integration with Jira Cloud and Jira Server even better with multi-project support! You can now route your incident tickets and follow-up work to remediation teams' Jira projects directly from FireHydrant, saving you valuable time and clean-up work. Let’s take a look at what has changed and some additional use cases unlocked with this integration.