Latest News

MTBF Is an Integral Part of Business Operations - Here's Why

In today’s fast-paced digital world, your customers expect your services to be available 24 hours a day, seven days a week. If your services are unreliable, these customers will likely take their business elsewhere — and spread the word. To retain their business, you must understand and optimize your service and system health to ensure your services are reliable. Gauging your service and system health requires much more than knowing whether they’re on or off.

What's new: Updates to Event Intelligence, mobile, and more!

As we near the end of summer, we’re excited to announce a new set of updates and enhancements to the PagerDuty platform. These updates will help our users and customers: Make sure to view the latest PagerDuty Pulse, or hear from our community team and developer advocates, who have launched new programs to help you learn more about our latest products and best practices.

Call Handling - Relieve the burden of your service desk and on-call staff

Lately, I keep getting inquiries from customers about call handling. With the shift toward working from home, it is becoming more and more important to make on-call staff more accessible. Often the already overloaded service desk is used for this purpose. This leads to a) a decline in the quality of the service desk and b) delays between the receipt of the problem and the start of problem resolution.

Automate your LogDNA + PagerDuty Incident Workflow

LogDNA integrates with your PagerDuty instance to help trigger incidents based on log data coming in from your ingestion sources. This allows your teams to quickly understand when there are issues with your application, and where in the logs you can investigate to understand root cause. To help further accelerate your team’s ability to understand the state of your applications, we are introducing the ability to automatically resolve those PagerDuty Incidents directly from LogDNA.

Self-Compassion Instead of Self-Blame

The tech industry is competitive and not without challenges. People are always growing and improving by pushing their limits. Innovation comes in many forms. In order to foster a healthy culture while allowing people to flourish, organizations must carefully enact policies. These policies should encourage growth while discouraging competition and comparison. One of the core policies organizations implement to achieve these goals is blamelessness.

Best practices to help retailers make the grade for the holiday season

It’s hard to believe we’re already talking about the return to school, but it’s set to be a big one. In fact, this year promises to be the biggest in the last five years. The National Retail Federation expects back-to-school spending to reach $37.1B, up from $33.9B last year. Back-to-college spending is also expected to rise, reaching $71B this year. This increase is buoyed by parents and students gearing up for their first in-person classes after a year of virtual learning.

Introducing the Spike.sh Alert Reliability Engine

At Spike.sh, our mission is to help dev teams understand and resolve production issues faster. At the core of this is our Alert Reliability Engine, whose job is to make sure that a team member always gets an alert on their preferred channel. Currently, we support seven channels: phone call, SMS, mobile push notifications, email, Slack, Microsoft Teams, and Discord. We wanted to give you a peek into how we achieve high deliverability across these channels.