FireHydrant has a sophisticated set of response actions for coordinating communications, activities, and retrospectives for incidents that affect your services. Relay helps by automating remediations that involve orchestrating actions across your infrastructure. In this example workflow, an incident that affects an application deployed on Kubernetes can trigger a rollback to a previous version automatically.
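As a rough illustration of the rollback step only, the sketch below builds the standard `kubectl rollout undo` command for a Deployment named in an incident. The incident payload shape and the function names `build_rollback_command` and `handle_incident` are assumptions made for this example. They are not FireHydrant's or Relay's actual schema or API, and a real workflow would be defined in those tools rather than in a standalone script.

```python
import subprocess
from typing import List


def build_rollback_command(deployment: str, namespace: str = "default") -> List[str]:
    """Return the kubectl argv that rolls a Deployment back to its previous revision."""
    return [
        "kubectl", "rollout", "undo",
        f"deployment/{deployment}",
        "--namespace", namespace,
    ]


def handle_incident(incident: dict, dry_run: bool = True) -> List[str]:
    # `incident` is a hypothetical payload: {"deployment": ..., "namespace": ...}.
    # It does not reflect the real FireHydrant webhook format.
    cmd = build_rollback_command(
        incident["deployment"],
        incident.get("namespace", "default"),
    )
    if not dry_run:
        # Requires kubectl on PATH and credentials for the target cluster.
        subprocess.run(cmd, check=True)
    return cmd
```

Keeping `dry_run=True` as the default means the command is only constructed and returned, so it can be reviewed or logged before anything touches the cluster.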
A few months ago, I wrote about PagerDuty’s mission to reimagine the workplace in the wake of COVID-19, and wanted to share an update. When we look back at the last 10 months of work, one thing is clear: We are never going back to “normal.” Though it might sound daunting, I am excited about the opportunity it offers to forge our next normal and to make it a better normal than the one we’ve left.
Kubernetes is a popular container orchestration system at the heart of the Cloud Native Computing Foundation projects. It automates the deployment, lifecycle, and operations of containers, containerized applications, and "pods," which are groups of one or more containers. The platform itself, along with each of these workloads, may generate event data. There are different kinds of data associated with these processes.
Hospitals require a solution to streamline after-hours communication between patients and medical professionals. Exceptional after-hours care helps enhance the patient experience in critical situations. In this blog, we’ll discuss the importance of dedicated lines and after-hours live calls.
Welcome to part two of the five-part “Well-Architected Serverless” series. This article will discuss the second most crucial pillar of the AWS Well-Architected Framework (WAF): Operational Excellence (OPS). We have a Well-Architected webinar coming up! To learn more about the AWS Well-Architected Framework (WAF) through the serverless lens and how to build Well-Architected architectures, make sure to attend our upcoming webinar on Friday, 27 November.
IT automation is growing in adoption this year as IT organizations grapple with constantly changing priorities, the pressure of supporting large remote workforces, and tight resources. However, IT teams are hesitant to deploy automation workflows on production infrastructure that supports important business applications and services. Trust is an issue, and errors do occur. Unsupervised automation can sometimes create more problems by missing the actual context for issue resolution.