Latest News

Setting up Route 53 Health Checks

We live in an age where the internet and digital data drive modern-day markets, which results in huge amounts of data being generated and consumed. Hence, it has become very important for online platforms to manage this traffic and serve their customers more efficiently. In this blog, we will explore the Amazon Route 53 service and see how it addresses domain name system routing and health check problems.

Crossing "The Last Mile" with an Incident Response System

Delivering dependable and high-performing IT services in 2022 requires coordination and collaboration across different workflows, areas of expertise, and even time zones. Whether serving in-house colleagues or external clients, there is immense pressure on IT management to create seamless experiences 24/7/365. Seconds matter when critical systems break down, and slow incident resolution can have costly ramifications on customer experience and employee productivity.

Driving Effective Communication in Nursing

Effective communication in nursing is central to providing top-quality patient care. Nurses communicate with patients to understand their health issues, and they provide them with the care and compassion needed for recovery. Communicating effectively with patients directly impacts patient health outcomes, and ineffective communication has far-reaching implications. As such, effective communication in nursing drives patient-centered care.

3 ways to improve your incident management posture today

Too many of us are still playing whack-a-mole when it comes to incidents: an incident is declared, the on-call engineer is paged, the incident is resolved and then forgotten — until next time. It’s time to start thinking in terms of proactive incident management, not just reactive incident response.

Summit Recap: How to adapt to a "Digital Everything" World

Every interaction with our customers, partners, and employees is special – but this year’s PagerDuty Summit went far beyond my wildest dreams. Together we committed to helping you learn and grow in how you manage business critical operations – in other words, getting you ready for anything in a world of Digital Everything.

Minimize MTTR to Mitigate Impact of Change Management

In the first blog of this demo series, we showed you how to use Restorepoint to remediate after a network breach. In this second blog of the three-part series, we walk you through a change management instance, showing how to speed problem resolution and how to mitigate the impact of poor change management to minimize MTTR.

Calling all Reliability Practitioners: Participate in the SRE Survey 2022

For the past four years, Catchpoint and various partners have been running a yearly SRE Survey. This year, Blameless is excited to partner with Catchpoint for the fifth annual survey. We want to hear from you if you are in a DevOps or SRE role or even if you work on reliability with some other title or role. There are tremendous, valuable learnings when we listen closely to practitioners.

Receiving PagerDuty alerts from MetricFire

One of the most critical aspects of monitoring your digital assets is getting a timely alert when something goes wrong. Even if you have finished building a monitoring stack and exposed metrics on a beautifully designed dashboard, your monitoring system does not serve its purpose if you cannot notice abnormal behavior and fail to take pre-emptive or follow-up action swiftly.

How the unicorn got its horn: a tale of market opportunity and technical innovation

Insight Partners is a leader in working with scale-up companies that have existing product/market fit and can use our help establishing best practices for their businesses. But my specific focus is on developer-driven companies. I look for the best technical teams that are building products that developers love and adore.