
Uptime.com Check Types | How to Build the Ultimate Uptime Monitoring System

How much infrastructure for a domain or application can fail before the customer starts to notice? What about before your productivity is affected? The answers to these questions will help you fully utilize uptime monitoring. Here are just a few examples of services that can be monitored for better peace of mind.

Sentry for Good

Errors are expensive; they steal resources allocated for other things and can hurt revenue and user sentiment. And, for teams composed of volunteers working in their spare time, errors can take weeks to triage and resolve. So, despite what Google might tell you, Sentry for Good is not merely a solution to your pet’s pesky pheromone problems (although it is clearly also that, if PetSmart’s Google results are any indication).

Features review from 2017

This is the second edition of our features review from the past few years. Here we share some features that were created or updated in 2017. We are currently moving content from our old blog platform, so none of the features mentioned here are new, but it is always good to take a fresh look at things. And as we are always upgrading and enhancing features, some of the items have been edited to reflect the current state.

10 Reasons You Should Run Your Serverless Applications & FaaS on Kubernetes

Over the last year, along with Kubernetes, Serverless computing platforms have acquired tremendous mindshare among the development community. As Serverless implementations begin to proliferate, I want to make the case that there are tremendous synergies to be gained by bringing both these paradigms together. Some of these benefits have been covered in previous posts. The majority of enterprises are embarking on their DevOps journey. Scaling such processes across a large enterprise is complicated.

Serverless app to speed up all your Lambda functions

A while back, I wrote about how you can shave latency off every AWS SDK operation by enabling HTTP keep-alive. It had the desired effect and I saw lots of people apply this technique in their projects. But it also resulted in the same 10 lines of code being copied and pasted everywhere! I began thinking about ways to distribute an optimized version of the AWS SDK so everyone can benefit.

June 2019 Release Overview: Work In Real Time, All The Time, Wherever You Are

This month, we are excited to announce a new set of product capabilities and enhancements designed to ensure that teams can work in real time, all the time, wherever they are. Whether they’re on-the-go with their mobile devices or at their desks on a typical work day, we will continue to innovate without sacrificing ease-of-use and adoption.

How to Decode Your AWS Bill (and What's within DevOps' Control)

The typical AWS bill, otherwise known as the AWS Cost and Usage Report, includes line items that are useful to both finance and DevOps. However, many of the metrics that are within engineers’ and cloud architects’ control aren’t so simple to discover. To make cost a first-class operational metric for DevOps, teams need visibility into the data that’s relevant to engineering activity.

Investigating Timeouts with Tracing

Tracing is one of the key tools that Honeycomb offers to make sense of data. Over the last few weeks, we’ve made a number of improvements to our tracing interface — and, put together, those changes let you think about traces in a whole new way! Tracing makes it easier to understand control flow within a distributed system. We render traces with waterfall diagrams, which capture the execution history of individual requests.

OnPage and ConnectWise: Incident Alert Management Workflows

Let’s set the scene: You’re an on-call engineer, working for a dedicated support team. Your priorities are twofold: (1) speedy incident resolution and (2) satisfying clients and stakeholders. With these demands in mind, you adopt OnPage’s integration with ConnectWise. The integration streamlines the ticketing-to-alerting process, ensuring that your team achieves client service excellence.

17 Tech Support Tickets You'll Be Happy You Didn't Receive

If tech support had a motto, it’d be reminiscent of Rule #4 of the Auvik Way: Even when it’s not your fault, it’s your problem. But sometimes, there are problems so bad you wouldn’t want to deal with them. We’ve rounded up 17 examples from the r/techsupportgore subreddit that are sure to send a palm to your face and a shiver down your spine: Plugging in your USB receiver with a hammer for that flush-mounted look. Good luck getting that one out.