Latest News

Thought Leadership Panel: What is a "real" SRE?

Apr 21, 2020 By Blameless In Blameless

Blameless recently had the privilege of hosting SRE leaders Craig Sebenik, David Blank-Edelman, and Kurt Andersen to discuss how SREs can approach work as done vs. work as imagined, how to define SRE and DevOps and the complementary nature of the two, the ethics of purchasing packaged versions of open source software, and more. The transcript below has been lightly edited; if you’re interested in watching the full panel, you can do so here.

PagerDuty Recognized in G2's Annual Best Software Awards

Apr 21, 2020 By PagerDuty In PagerDuty

G2, the largest software marketplace and review platform, recently announced the 2020 winners of its annual Best Software Awards, which recognize 100 companies globally, and PagerDuty is thrilled to be named the leader in the Best Incident Management category.

What's New: Related Incidents, Business Response, Mobile Status Dashboard, & New Integrations

Apr 20, 2020 By Alex Ware In PagerDuty

An always-on world requires a proactive and preventative approach to managing your digital operations. PagerDuty is proud to announce our latest release, which helps streamline remote remediation by providing an at-a-glance overview of your system’s health. While we’re known for on-call management and incident response, PagerDuty does much more, including providing visibility into the business impact of an incident.

Extracting Insights from Metrics with AIOps for Better Observability

Apr 20, 2020 By Adam Frank In Moogsoft

In the second installment of this blog series, we’ll discuss the importance of analyzing metrics and how AIOps helps you with this fundamental pillar of observability. Without proper metrics analysis, you’re left blind to potential outages or, possibly worse, inundated with false-positive anomalies, leading to alert fatigue and ultimately business impact. Automated discovery and analysis can’t be achieved with legacy tools, nor will manual analysis by humans scale.

Advice for On-call Teams During COVID-19

Apr 16, 2020 By Rich Burroughs In FireHydrant

I’ve offered up some tips for folks who are on call during the COVID-19 crisis, but I thought it would be helpful to get more ideas from people with different perspectives. So I reached out to some people I trust to see what they had to say. They all have different viewpoints, but some themes emerge, like managing alerts, having empathy, and practicing self-care. The participants, in alphabetical order: Aaron Aldrich is a Developer Advocate at LaunchDarkly, with a focus on DevOps.

IT Teams Under "High Stress" Resolving Faster Than Ever Before

Apr 16, 2020 By Rachel Obstler In PagerDuty

Seemingly simple digital moments, like checking into a flight, trigger a complex technical flow of events under the IT covers. A simple swipe or click relies on a complex IT ecosystem made up of millions of lines of code, spanning multiple software applications, hybrid and multi-cloud technologies, state-of-the-art IT infrastructure, security apps, and more.

Modern ITSM Solutions: Flexibility in Incident Response

Apr 16, 2020 By AlertOps In AlertOps

We no longer live in a world where a few tools determine the way organizations structure their processes. From IT service delivery to incident response, modern IT operations solutions need to embody the flexibility that most enterprises require. The dynamic ITOps ecosystem has shifted to put choice back in the hands of the user. Now, IT solutions must follow suit. Modern incident response platforms, in particular, need to be flexible enough to mirror an enterprise’s architecture.

Remove Manual Bottlenecks in DevOps with AIOps

Apr 15, 2020 By Juan Perez In Moogsoft

DevOps pipelines generate massive amounts of data. To maintain the stability and speed of application delivery, operations leaders must analyze it quickly and continuously. But how can they keep DevOps — and their business — agile? Gartner’s “Augment Decision Making in DevOps Using AI Techniques” provides, in our view, the answer for operations leaders to make precise data-driven decisions and automate actions for rapid application delivery.

Getting SRE Buy-in from a VP or Director for Automated Metrics and Continuous Learning, Part 2

Apr 14, 2020 By Lyon Wong In Blameless

After getting managerial approval for incident management, your SRE buy-in program is well underway. How can you prove that it’s effective, and that adopting more best practices is necessary? In part 2 of this blog series, we’re going to share how to convince a VP or director to invest in additional SRE practices to strategically improve business results: automated metrics and continuous learning.

Meeting customer support SLAs on Freshdesk using proactive alerting and escalations with Zenduty

Apr 14, 2020 By Vishwa Krishnakumar In Zenduty

As businesses close more deals and add more accounts, they must still maintain their SLA levels and resolve customer support tickets within SLA timeframes. Having a solid support team is great, but supporting hundreds or thousands of users in the most efficient, cost-effective way while maintaining SLAs continues to be a challenge for the majority of companies. An SLA (service level agreement) policy lets you set standards of performance for your support team.
