Latest Posts

Top tips: 5 lessons learned from the recent Microsoft Azure disruption to survive the next cloud outage

The recent Microsoft Azure outage had a profound impact, disrupted services for countless businesses and individuals around the globe, and exposed the risks of relying exclusively on cloud solutions. This incident, triggered by a mix of technical failures and unexpected complications, resulted in substantial downtime, access issues, and operational interruptions across multiple industries.

Setting up and Understanding OpenTelemetry Collector Pipelines Through Visualization

Observability provides many business benefits, but comes with costs as well. Once the (not-insignificant) work of picking a platform, taking an inventory of your applications and infrastructure, and getting buy-in from leadership (both from the business and engineering sides of the house) is done, you then have to actually instrument your applications to emit data, and build the data pipeline that sends that data to your observability system.

Debug (even) faster with 8 Sentry updates

Over the past few months, we introduced several new features and capabilities. While we released larger product updates like Trace Explorer, Insights modules, and our JavaScript V8 SDK (to name a few), it’s the smaller, iterative improvements that really make a big difference in your debugging workflow. Let’s dive into 8 recent updates that you should know about.

Unlock the Value of Cloud: Introducing Splunk Cloud Value Calculator

In the rapidly evolving digital landscape, organizations are increasingly turning to the cloud, powered by AI capabilities, to enhance efficiency, scalability, and innovation. Splunk, a leader in security and data observability, has been at the forefront of this transformation.

Carbon Footprint Reporting

We’re excited to share some big updates and enhancements that underscore our dedication to innovation and efficiency. Check out our new Carbon Footprint Reporting add-on feature, a sleek dark mode GUI theme, and major improvements to Business Entity tracking and Dashboard Widgets. Plus, we’ve added new custom components for better peripheral logging and expanded firmware update support for Panduit Gen6 rack PDU products.

Improving Developer Efficiency

Developers are expensive to hire, and it takes time to get new hires up to speed. Getting the most out of developers and retaining them should be a priority for any organization. Fortunately, developers like creating new stuff, and organizations want new functionality. Therefore, if there were a way to minimize the time spent fixing bugs, the new feature backlog would shrink, and happy developers would stay around.

Are you Prepared for Your Next Major Outage?

Software is not perfect. And ultimately, it’s not a matter of if you will have an outage, but of when. With the increasing complexity and frequency of IT incidents, is your organization prepared to respond and recover when each second counts? Here at PagerDuty, we’ve compiled a list of best practices to keep your systems up and running.

Steps to AIOps maturity: Improve MTTR with AI

Many organizations face increased costs from excess noise, manual workflows, and long outage times. These inefficiencies negatively impact budget, service uptime, and, ultimately, customer satisfaction. With effective use of AI, you can give operators the most relevant, full-context incident data, providing a greater understanding of an incident within seconds.

Without AI, Your Telemetry Data Pipeline Sucks

History is filled with stories of human triumph. One of the most famous such stories is that of John Henry, “The Steel Driving Man.” As the traditional American folk story goes, John Henry and his fellow workers were faced with the arrival of the steam engine, which threatened to replace their manual labor. To prove that human strength and skill could outperform the new technology, John Henry challenged the machine to a contest.