The latest News and Information on Incident Management, On-Call, Incident Response and related technologies.
Technology is changing the world faster than ever. Thanks in part to the rise of the Software-as-a-Service (SaaS) model, customers have come to expect the apps they use to be accessible at all times. As a result, companies are transforming the way their teams operate in order to meet these demands. And perhaps no team experiences the impact of a transformation like this more than IT.
This week’s AWS Summit in New York was an exciting one for both AWS and PagerDuty. The AWS team rolled out Amazon EventBridge, a set of APIs for AWS CloudWatch Events that makes it easy for AWS SaaS partners to inject events for their customers to process in AWS. PagerDuty is excited to continue and deepen our long partnership with AWS by supporting EventBridge as a launch partner.
I hear it all the time when talking to future BigPanda customers: “I’m not sure BigPanda can really help me correlate all these alerts together because our CMDB is very immature.” Or sometimes they don’t even have a CMDB and incorrectly assume this disqualifies them from meaningful noise reduction and alert correlation. I’m happy to tell you the same thing I tell the folks who are looking at BigPanda for the first time: “No CMDB? No problem!”
Last year at PagerDuty Summit 2018, we officially launched PagerDuty University (PDU), a training program that provides hands-on classroom training to current and prospective customers so they can get the most out of the PagerDuty platform. Since its debut, PDU has taught hundreds of learners how to optimize their instances to minimize downtime and improve responders’ quality of on-call life, in addition to sharing thought leadership and best practices with customers around the globe.
Software vendors and analysts love to rattle off scary numbers about how many thousands of dollars per minute or hour an infrastructure outage will cost the typical company. Those numbers can be scary indeed; for example, Gartner quotes $5,400 per minute as the cost borne by a medium to large-sized retailer. Your company, however, is most likely not identical to the “typical” company on which the numbers are based.
In the IT world, support teams race to resolve incidents, ensuring that clients and stakeholders remain satisfied. Yet many teams struggle to meet time-to-response goals due to ineffective alerting processes, human error and sluggish procedures.
The July 2019 update introduces the option to opt out of certain categories, as well as some enhancements to the web portal. You can now opt in to or out of certain categories under Settings -> Services & Systems. This works on a per-user basis and is useful when you do not want to receive certain alerts but your team members still need to get them. Another scenario is listening in: you can see what is going on while all notifications are muted.
No matter how you design your architecture or what technologies you implement, critical incidents will happen. When things go wrong, it is easy to get carried away and forget about the bigger picture. But your work isn’t done after you fix the immediate problem; now is the time to take a look at how the incident actually happened so that you can learn from it.