Snoozing alerts and advanced Slack notifications
We've introduced two very cool new features to Oh Dear: the ability to temporarily silence alerts and advanced Slack notifications.
Mattermost 5.22 ships with new features designed to help your team work faster.
Seemingly simple digital moments, like checking into a flight, trigger a complex technical flow of events under the IT covers. A simple swipe or click relies on a complex IT ecosystem made up of millions of lines of code, spanning multiple software applications, hybrid and multi-cloud technologies, state-of-the-art IT infrastructure, security apps, and more.
When working with observability data, a good portion of it comes in as time series data — things like CPU or memory utilization, network transfer, even application trace data. And the Elastic Stack offers powerful tools within Kibana for time series analysis, including TSVB (formerly Time Series Visual Builder). In this blog post, I’m going to attempt to demystify rates in TSVB by walking through three different types: positive rates, rate of change, and event rates.
Elasticsearch allows you to store, search, and analyze large amounts of structured and unstructured data. This speed, scale, and flexibility make the Elastic Stack a powerful solution for a wide variety of use cases, like system observability, security (threat hunting and prevention), enterprise search, and more. Because of this flexibility, effectively architecting your deployment’s data storage for scale is incredibly important.
We no longer live in a world where a few tools determine the way organizations structure their processes. From IT service delivery to incident response, modern IT operations solutions need to embody the flexibility that most enterprises require. The dynamic ITOps ecosystem has shifted to put choice back in the hands of the user. Now, IT solutions must follow suit. Modern incident response platforms, in particular, need the flexibility to mirror each enterprise's architecture.
We’ve been monitoring hundreds of thousands of serverless backend components for more than two years at Dashbird. In our experience, serverless infrastructure failures boil down to: These isolated faults become causes of failure due to dependencies in our cloud architectures (ref. Difference of Fault vs. Failure). If a serverless Lambda function relies on a database that is under stress, the entire API may start returning 5XX errors.
There is a dark side to digital transformation, but nobody wants to talk about it. I previously wrote about technical debt. But there’s also complexity debt. When a CIO decides to delay modernizing or upgrading systems, there are usually budget considerations and skills gaps that stand in the way. The IT leader’s job is one of continual evaluation of risk and opportunity amid rapid technology disruption.
I’ve offered up some tips for folks who are on call during the COVID-19 crisis, but I thought it would be helpful to get some more ideas from people with different perspectives. So I reached out to some people I trust to see what they had to say. They all have different viewpoints, but some themes emerge, like managing alerts, having empathy, and practicing self-care. The participants, in alphabetical order: Aaron Aldrich is a Developer Advocate at LaunchDarkly, with a focus on DevOps.
We’ve added production instances in two new locations: US-WEST-2 (Oregon) and AP-SOUTHEAST-2 (Sydney). These are just the first of many, but together they deliver serious performance improvements for our customers around the globe.