Operations | Monitoring | ITSM | DevOps | Cloud


Five Things Your APM Platform Should Do for Your Container Application Deployments

One of the chief complexities in running large-scale containerized applications is the need for continuous system and application monitoring. Containers are very different from traditional VMs and the three-tier applications that run on them. Monitoring needs to ensure that the SLAs promised to the business are being met, and it needs to forecast usage trends while identifying problem areas such as bugs, capacity challenges, slowing performance, and potential downtime.

Dynamic Sampling by Example

Last week, Rachel published a guide describing the advantages of dynamic sampling. In it, we discussed varying sample rates to achieve a target collection rate overall, and having different sample rates for distinct kinds of keys. We also teased the idea of combining the two techniques to preserve the most important events and traces for debugging without drowning them out in a sea of noise.
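To make the idea of per-key sample rates concrete, here is a minimal sketch in Python. It is not the algorithm from the guide: the function names, the equal-budget-per-key policy, and the example keys (HTTP status codes) are all assumptions chosen for illustration. Each key receives an equal share of the target number of kept events, so rare keys are kept in full while common keys are sampled heavily.

```python
import math
import random


def compute_sample_rates(counts, target_total):
    """Return a per-key sample rate N, meaning "keep 1 in N events".

    counts: events seen per key in the previous window (hypothetical input).
    target_total: how many events we would like to keep overall.
    Assumption: the budget is split equally across keys.
    """
    if not counts:
        return {}
    per_key_budget = target_total / len(counts)
    rates = {}
    for key, count in counts.items():
        # Never go below 1: a rate of 1 keeps every event for that key.
        rates[key] = max(1, math.ceil(count / per_key_budget))
    return rates


def should_keep(key, rates, rng=random):
    """Decide whether to keep one event; unseen keys are always kept."""
    rate = rates.get(key, 1)
    return rng.random() < 1.0 / rate


# Example: many successful requests, a handful of errors.
rates = compute_sample_rates({"200": 9000, "500": 30}, target_total=200)
```

With these inputs each key gets a budget of 100 events, so the common key is sampled at 1 in 90 while every event for the rare key is kept. A kept event should be stored with its sample rate so that counts can be re-weighted later.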

Why Your Lambda Functions May Be Doomed To Fail

AWS Lambda has a cool feature that can be both a blessing and a nightmare for a serverless application, depending on whether it’s properly handled by our code: the retry behavior. A retry occurs when an invocation of a Lambda function results in an error and the AWS Lambda platform automatically invokes the function again, with the same event payload. Before we get deeper, make sure you are familiar with the AWS documentation on the subject.
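Because a retry delivers the same event payload again, one common way to handle it is to make the handler idempotent. The following is a minimal sketch in Python under stated assumptions: the in-memory dictionary is only a stand-in for a durable store (memory is not shared across execution environments), and `process`, `side_effects`, and the event fields are hypothetical names invented for this example.

```python
import hashlib
import json

# Stand-in for a durable store such as a database table (assumption).
_processed = {}

# Records each time the hypothetical side effect actually runs.
side_effects = []


def idempotency_key(event):
    """Derive a stable key from the event payload."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def process(event):
    """Hypothetical business logic with a side effect."""
    side_effects.append(event["order_id"])
    return {"charged": event["amount"]}


def handler(event, context=None):
    """Lambda-style entry point that skips work already done for this payload."""
    key = idempotency_key(event)
    if key in _processed:
        # A retry of an event we already handled: return the stored result.
        return _processed[key]
    result = process(event)
    _processed[key] = result
    return result


# Simulate the first invocation followed by a retry with the same payload.
event = {"order_id": "A-1", "amount": 25}
first = handler(event)
second = handler(event)
```

Hashing the payload treats two genuinely separate events with identical content as duplicates, so a real application would more likely carry an explicit unique identifier in the event and key on that instead.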

Alert escalation - How it works in SIGNL4

Part of any manager's role is to make sure their team is taking accountability. Managers are not the front-line resolvers who handle issues; that is what they have a team for. However, managers do need to be aware of incidents as they occur and to make sure their team is taking ownership of those issues and resolving them. SIGNL4 takes the managerial work out of being a manager by providing alert ownership transparency.

PagerDuty: Your Journey To Real-Time Operations

In a world where people expect always-on, seamless digital experiences, it is essential that teams are empowered with the right tools and processes to work together and deliver in critical moments of truth. Our CEO, Jennifer Tejada, shares how PagerDuty acts as the central nervous system for the digital enterprise, helping connect teams to real-time opportunity and elevate work to the outcomes that matter.

PagerDuty Pulse May19

Catch up on all the exciting things we’ve released over the past several months. In this edition of PagerDuty Pulse, you’ll get a view into our Spring release, which helps teams across the enterprise effectively take action during the most critical moments with the power of data, intelligence, and automation at scale. We’re excited to release and share new enhancements across all of our products (Event Intelligence, Modern Incident Response, Analytics, Visibility), as well as to the core platform.

Best Practices for Monitoring Your Azure Environment

Adoption of cloud services, and Azure services in particular, has exploded in the last few years – over 60% of enterprises now use Azure. As Azure users deploy ever more sophisticated application architectures, it becomes even more important to have a logging and monitoring system that can handle the complexity. The ELK stack is the most popular tool for this, but it comes with its own challenges.

Setting up Zia, your service desk's conversational virtual IT support agent [Webinar]

The cloud version of ServiceDesk Plus now comes with your own virtual support agent, Zia, who can be the first point of contact for your service desk. Zia helps perform simple service desk activities and fetch information, so end users don't have to rely on a technician. And with access to a conversational virtual support agent, technicians in the field can now perform service desk activities with simple hands-free voice commands. Learn more about Zia and her capabilities in this webinar.