Latest Posts

Embracing virtual connections at AWS re:Invent 2020

This year has seen a complete re-imagining of tech conferences. Some were cancelled or postponed, while others have evolved and embraced the opportunity to go virtual. This meant innovating to bring the in-person event experience online.

Deployment Rollbacks via FireHydrant Runbook

FireHydrant has a sophisticated set of response actions for coordinating communications, activities, and retrospectives for incidents that affect your services. Relay helps by automating remediations that involve orchestrating actions across your infrastructure. In this example workflow, an incident that affects an application deployed on Kubernetes can trigger a rollback to a previous version automatically.

Accelerate Incident Response and Incident Management with AIOps. 5 Key Benefits in Cisco Environments

Artificial Intelligence for ITOps (AIOps) can help accelerate incident response by bringing incident context, impact assessment, triage data, and collaboration and automation tools together in one place.

10 Automated Service Desk Workflows We're Thankful for in 2020

Thanksgiving is just a couple days away for folks celebrating in the US, so what better time than now to share what we’re thankful for? Many of us are giving thanks for loved ones, health, a roof over our heads, and full bellies. We’re also paying homage to the unsung IT heroes and service providers in our organization who make our jobs easier—especially amid the unanticipated shift to remote work this year.

The Future of Work at PagerDuty: Why Go Back to Normal When We Can Go Back to Better?

A few months ago, I wrote about PagerDuty’s mission to reimagine the workplace in the wake of COVID-19, and wanted to share an update. When we look back at the last 10 months of work, one thing is clear: We are never going back to “normal.” Though it might sound daunting, I am excited about the opportunity it offers to forge our next normal and to make it a better normal than the one we’ve left.

Introducing Inputs Data Manager on Splunk Cloud

Splunk Cloud’s ecosystem of apps and technical add-ons boasts a comprehensive set of input sources that enrich customer data insights. Many of these inputs reside in Cloud contexts, such as AWS, Salesforce, Azure, GCP, and many others. The Inputs Data Manager was introduced to aid the ingestion of these cloud data sources. As a result, in many cases, customers no longer need to host their own infrastructure to run scripted and modular inputs.

The importance of metadata in your Kubernetes observability initiatives

Kubernetes is a popular container orchestration system at the heart of the Cloud Native Computing Foundation projects. It automates the deployment, lifecycle, and operations of containers, containerized applications, and "pods," which are groups of one or more containers. The platform itself, along with each of these workloads, may generate event data. There are different kinds of data associated with these processes.

3 Ways How Live Call Routing Increases Care Team Efficiency

Hospitals require a solution to streamline after-hours communication between patients and medical professionals. Exceptional after-hours care helps enhance the patient experience in critical situations. In this blog post, we’ll discuss the importance of dedicated lines and after-hours live calls.

AWS Well-Architected Framework in Serverless: Operational Excellence

Welcome to part two of the five-part “Well-Architected Serverless” series. This article will discuss the second most crucial pillar of the AWS Well-Architected Framework (WAF): Operational Excellence (OPS). We have a Well-Architected webinar coming up! To learn more about the WAF through the serverless lens and how to build Well-Architected architectures, make sure to attend our upcoming webinar on Friday, 27 November.

Workflow Automation with Human Supervision

IT automation is growing in adoption this year as IT organizations grapple with constantly changing priorities, the pressure of supporting large remote workforces, and tight resources. However, IT teams are hesitant to deploy automation workflows on production infrastructure that supports important business applications and services. Trust is an issue, and errors do occur: unsupervised automation can create more problems when it misses the actual context for issue resolution.