
Latest News

Empowering Remote Users

2020 has certainly presented all of us with its fair share of challenges. Small businesses and large organizations alike have been forced to change policies and procedures to adapt to the concept of ‘the remote worker’. As more and more employees are working from home, it is critical that they are set up for success and armed with the tools to address issues that arise, no matter where they are located.

SREview Issue #8 December 2020

🎼 Frosty the SRE/ Was a jolly happy soul/ With his runbooks tight and automated/ and SLOs made out of gollldddddd! 🎼 It’s the most wonderful time of the year, and to celebrate, here’s your December issue of the SREview! This monthly zine features epic Tweets, content, and events happening in the SRE and resilience engineering community.

Create a New Integration in Opsgenie

Opsgenie is a powerful alert management service that allows you to flexibly set up teams for different alerting groups. Our development team have been working hard to deliver new features and integrations, and now you are able to integrate Opsgenie with RapidSpike to help with your website monitoring.

Fuelling Always-On Digital Services in the Financial Sector

The financial services sector in Australia has undergone seismic change recently with the rise of neo disruptors and a cashless society driven by the pandemic. Australia is quickly becoming one of the more mature markets to embrace digital transformation, with the federal government announcing it has committed $800 million to a digital infrastructure upgrade. As we move closer to 2021, the financial services sector will continue to see accelerated change and a greater reliance on digital technology.

10 Tips for Handling a Major Outage (When Your Website Is Down on Black Friday)

If your e-commerce website is down due to a Black Friday major outage, are you prepared to handle it? Customers demand exceptional digital retail experiences on Black Friday, Cyber Monday, and throughout the holidays. If your company can manage issues that can cause an e-commerce outage, it can limit their impact now and in the future. Ultimately, we want to help your team handle major incidents.

Digital Incidents in Retail Have Increased 37% Year-Over-Year

2020 will go down as one of the hardest years that brick-and-mortar businesses have ever experienced. By the end of March this year, half of the world’s population was estimated to be on “lockdown,” causing an unprecedented shift in priority for businesses from brick-and-mortar stores to ecommerce channels.

Introducing Blameless Runbook Documentation

At Blameless, our mission is to provide teams with the tools they need to operationalize SRE and embrace a culture of resilience. We help teams automate toil and adopt best practices across integrated incident management, comprehensive retrospectives, service level objectives, reliability insights, and more. We are very excited to announce that teams now have a new tool in their tool belts with our latest launch. Blameless Runbook Documentation is now available for early access.

Borrow Expertise With Runbook Automation

Every team has their experts. Maybe you’re the expert for a segment of your team’s applications—the person who’s always called when there’s a problem or when something unexpected happens or when things just look “weird” and the solution isn’t simple. Maybe there’s two of you or even, if you’re lucky, three!

Hybrid cloud: easing the migration pain

Most enterprises today are at some stage of a hybrid cloud migration, and no matter how they view this migration or implement it, the challenges they face are fairly common across the board. Does that mean there are common best practices for a hybrid cloud migration, and if so, what are they? Join us in a CTO Perspective discussion with Scott Stradley, Field CTO at BigPanda. Lean back and watch the interview, or if you prefer reading, take a few minutes to read the transcript.

Major incident reporting template: Downloadable and with a tutorial

According to a recent report by IBM, the damage caused by major IT incidents is greater than ever. An incident that results from a data breach will cost the organization an average of $3.86 million, with the average time to breach containment coming in at 280 days! And according to the ITIC, hourly downtime costs come in at over $300,000, with some reaching $1 million per hour.