Latest Posts

4 Things you Need to Know about Writing Better Production Readiness Checklists

Feb 16, 2021 By Emily Arnott In Blameless

When we think of reliability tools, we may overlook the humble checklist. While tools like SLOs represent the cutting edge of SRE, checklists have been recommended in many industries such as surgery and aviation for almost a century. Checklists owe this long and widespread adoption to their usefulness. They can also help limit errors when deploying code to production. In this blog post, we’ll cover: Production checklists should be holistic.

Graphite Dropping Metrics: MetricFire can Help!

Feb 16, 2021 By Nick Campion In MetricFire

Sometimes a seemingly well-configured and fully functional monitoring system can malfunction and lose metrics. As a result, you get a distorted picture of what is happening with the monitored object. In this article, we will look at the possible causes of Graphite dropping metrics and how to avoid it. MetricFire specializes in monitoring systems. You can use our product with minimal configuration to gain in-depth insight into your environments.

Application Performance Monitoring: Why is it important for your organization?

Feb 16, 2021 By Motadata In Motadata

Application Performance Monitoring (APM) refers to monitoring or managing the performance of your code, application dependencies, transaction times, and overall user experience. It is an important technology that ensures applications perform as expected. The ultimate goal of performance monitoring is to give end users a top-quality experience.

An Intro to PromQL: Basic Concepts & Examples

Feb 16, 2021 By Gedalyah Reback In logz.io

PromQL, short for Prometheus Query Language, is the main way to query metrics within Prometheus. You can either display an expression’s result as a graph or export it using the HTTP API. PromQL uses three data types: scalars, range vectors, and instant vectors. It also uses strings, but only as literals. This intro will provide basic PromQL examples and concepts to understand as you get used to Prometheus queries.
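
As a rough illustration of the data types and the HTTP API mentioned above, here is a minimal Python sketch that builds instant-query URLs for Prometheus's `/api/v1/query` endpoint. The server address and the helper name `instant_query_url` are assumptions made for this example and do not come from the post itself. The sketch only builds URLs and sends no requests.

```python
from urllib.parse import urlencode

# Assumed address of a local Prometheus server; adjust for your setup.
PROMETHEUS_URL = "http://localhost:9090"

# One example expression per PromQL data type named in the teaser.
EXAMPLE_QUERIES = {
    "scalar": "42",            # a plain numeric value
    "instant vector": "up",    # one sample per matching time series
    "range vector": "up[5m]",  # five minutes of samples per time series
}


def instant_query_url(expr: str, base_url: str = PROMETHEUS_URL) -> str:
    """Build the URL for an instant query (hypothetical helper)."""
    # urlencode percent-encodes characters such as the square brackets.
    return f"{base_url}/api/v1/query?{urlencode({'query': expr})}"


if __name__ == "__main__":
    for kind, expr in EXAMPLE_QUERIES.items():
        print(f"{kind}: {instant_query_url(expr)}")
```

Fetching one of these URLs from a running Prometheus server should return a JSON document whose result type reflects the kind of expression that was evaluated.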

How Puppet Supports DevOps Workflows in the Windows Ecosystem

Feb 16, 2021 By Alexa Sevilla In Puppet

For Windows teams that adopt a DevOps approach, augmenting their native toolset (GPO, SCCM, PowerShell) can offer reliable and repeatable processes that successfully effect change. This quick overview highlights how Puppet Enterprise can complement existing Windows tools for better visibility and transparency across the automation processes.

The essential config settings you should use so you won't drop logs in Loki

Feb 16, 2021 By Owen Diehl In Grafana

In this post, we’re going to talk about tips for securing the reliability of Loki’s write path (where Loki ingests logs). More succinctly, how can Loki ensure we don’t lose logs? This is a common starting point for those who have tried out the single binary Loki deployment and decided to build a more production-ready deployment. Now, let’s look at the two tools Loki uses to prevent log loss.

Close the Loop with User Feedback

Feb 16, 2021 By Philipp Hofmann In Sentry

Everyone’s software crashes. As an engineer, you don’t feel your users’ frustration unless they reach out to customer support, write bad reviews, or tweet about it. This feedback often lacks the information needed to resolve the issue. In some cases, you can re-engage with the customer, but that process is time-consuming and inefficient. Another option would be to examine the crash reports, but sometimes they don’t give sufficient insight to fix the problem.

Online CNCF event: Why you should use NATS for your next Cloud native application

Feb 16, 2021 By Romaric Philogène In Qovery

When building Cloud applications, we often put significant effort into breaking down our monoliths into small pieces of code. These pieces are easier to maintain, but it is hard to make them communicate with each other. This is where NATS comes in. NATS is a simple and highly performant messaging system for Cloud-native apps. In this talk, I will share my experience using NATS at Qovery, why you should or should not use it, and how it differs from the well-known RabbitMQ and Kafka.

Three ways tight integration makes logging and monitoring easier

Feb 16, 2021 By John Day In Google Operations

Driving the productivity of software development and delivery teams is critical for any organization. Six years of research by DevOps Research and Assessment (DORA) showcase the role easy-to-use tooling plays in driving this productivity and, in turn, a better work/life balance for the team. The research finds that the highest-performing teams are 1.5x more likely to have tools they consider easy to use.

Using Let's Encrypt Free Certs with your Linux Servers

Feb 16, 2021 By GroundWork In GroundWork

Part 2 of our blog series on certificates focuses on a practical matter: using the free Let’s Encrypt certificates to secure servers that may not be publicly available, but still need better security than self-signed certs can give you. As we explained in our last blog on this subject, to use HTTPS encryption with certificates, you can choose from a number of options.
