Latest Posts

New Year's (observability) Resolutions

A new year has started and I've been pondering my hopes and dreams for the year to come. In the world of SRE, observability is the most prominent pillar of my work. So, I decided to drill into the topic of observability and what I'd like to see happen in the industry in 2023. Rather than focusing on any tool, technology, or methodology, I'll be exploring concepts that can be broadly applied in any organization.

Jira Automation Demystified

Repetitive tasks can be time-consuming. In an ideal world, automation would remove all of the grunt work when it comes to solving business problems, freeing us up to focus on more strategic decisions. Luckily, Jira can take a load of tasks off your hands – including tracking your issues, posts, features, and more. This post will walk you through the options available and offer top tips on how to set this up.

SLO walkthrough: measuring microservice performance

To improve reliability, we need to measure it, and to measure it we use SLOs (Service Level Objectives). Or at least, that’s what Google SRE has popularized. In practice, it can be difficult and time-consuming to identify the right things to measure, to get to the right data, and to surface the results in a way that engages the stakeholders and teams involved. And all this is especially hard as we scale our teams and applications across multiple technology stacks.

Incident response: Unlocking knowledge and breaking down silos

In a world of monolithic applications and microservices, responding to incidents can be a painful process, involving multiple people with siloed knowledge jumping between different tools to find the relevant data and take action. Individuals within a business often hold the knowledge of how a particular component works, or how it depends on other services. The key to successfully responding to incidents is unlocking this knowledge and breaking down the silos between teams.

FAQ: SquaredUp Cloud

SquaredUp Cloud has been in development for over two years (we first previewed it at SquaredUp Live, Spring 2021). It continues our mission to unlock and summarize data – think of it like “BI for engineering”. In building SquaredUp Cloud, we drew upon what we’ve learned with our Microsoft solutions over the last ten years, and built a solution independent of any one tool, like SCOM.

How I monitor cloud application costs in one simple but powerful dashboard

Although there are many great tools out there to get on top of application monitoring, there’s one vital metric that’s often overlooked by us technical folks – cost. In the days of running apps on servers in private datacenters, the kit was a one-time purchase that the systems team had to deal with. But running apps in public clouds is a different story. Whether you’re running on VMs, containers in Kubernetes, or entirely serverless, execution of your code adds to the bill.

How to get complete CI/CD pipeline observability

It's not like it used to be back in the day! Before CI/CD, we were building on-premises, service-oriented products following a system-style architecture, and we were able to map out the build system and end-to-end process in a PowerPoint or Visio document. Although time-consuming and inefficient, it was relatively straightforward and the build pipeline was unlikely to change drastically. But that's no longer the case.