Site Reliability Engineering Meets Traditional Operations

Dec 5, 2018 By Mike Lunt In Zenoss

Google has effectively made the discipline of site reliability engineering (SRE) a DevOps best practice by publishing two decades’ worth of lessons in keeping alive the most scalable apps on the planet. As more organizations make the shift (or “transformation,” as it were) to becoming IT organizations, the demand for reliability increases substantially for customer-facing services.

PagerDuty Incident Response Training (Summit Series Chicago 2017)

Dec 4, 2018 By PagerDuty In PagerDuty

Incident response training at PagerDuty Summit Series Chicago, September 27, 2017.

Celebrating Our Launch!!

Dec 4, 2018 By Lauren Detweiler In OpsMatters

If you've ever interacted with the operational applications and tools industry, odds are you've run into trouble finding one source for all your news and information. You're not alone: up until now, no such platform existed. We have great news for you.

ActiveMQ architecture and key metrics

Dec 4, 2018 By David M. Lentz In Datadog

Apache ActiveMQ is message-oriented middleware (MOM), a category of software that sends messages between applications. Using standards-based, asynchronous communication, ActiveMQ allows loose coupling of the elements in an IT environment, which is often foundational to enterprise messaging and distributed applications.

Collecting ActiveMQ metrics

Dec 4, 2018 By David M. Lentz In Datadog

In Part 1 of this series, we looked at how ActiveMQ works, and the key metrics you can monitor to ensure proper performance of your messaging infrastructure. In this post, we’ll show you some of the tools that you can use to collect ActiveMQ metrics. This includes tools that ship with ActiveMQ, and some other tools that make use of Java Management Extensions (JMX) to monitor ActiveMQ brokers and destinations.

Monitoring ActiveMQ with Datadog

Dec 4, 2018 By David M. Lentz In Datadog

As you operate and scale ActiveMQ, comprehensive monitoring will enable you to rapidly identify any bottlenecks and maintain the flow of data through your applications. Earlier in this series, we introduced some key ActiveMQ metrics to watch, and looked at some tools you can use to monitor ActiveMQ.

SolarWinds Adds SDN Monitoring Support to Industry-Leading Network Management Portfolio

Dec 4, 2018 By SolarWinds In SolarWinds

The latest updates bring Cisco ACI support and expanded anomaly detection capabilities to provide deeper visibility into network environments.

How to Diagnose and Fix AWS Lambda Iterator Age

Dec 4, 2018 By Mark Siebert In Blue Matador

AWS Lambda can use stream-based services as invocation sources, essentially making your Lambda function a consumer of those streams. Stream sources include Kinesis Streams and DynamoDB Streams. When you allow streams to invoke your Lambda function, Lambda emits a CloudWatch metric called IteratorAge. In this post, we discuss what this metric is and how to bring it down if it’s too high.

Streamlined Kubernetes Cluster Agent

Dec 4, 2018 By sematext In Sematext

Sematext provides a single pane of glass and machine-learning-powered alerts for logs, metrics, traces, and digital user experience data. The new Sematext agent is fully Docker Engine- and Kubernetes-aware. (Re)written in Go, it has a minimal memory and CPU footprint. It also collects Kubernetes metrics as efficiently as possible.

Stackdriver tips and tricks: Understanding metrics and building charts

Dec 4, 2018 By Joy Wang In Google Operations

Seeing what’s going on with your IT infrastructure, applications and services has always been critical to the success of modern businesses’ day-to-day operations. Google Stackdriver monitoring provides out-of-the-box visualizations and insights for Google Cloud Platform (GCP) users so you can easily understand your systems.
