Blog

Troubleshooting with Log Management - Best Practices

Dec 3, 2019 By Graylog In Graylog

We’ve already covered why log management is important, but we’ve only briefly touched on one of the best ways that managing your log files can help you and your enterprise: troubleshooting.

Monitoring and observability:

Dec 3, 2019 By Sean Porter In Sensu

Monitoring versus observability is a hotly debated topic. It’s been argued that they’re two distinct things: the former is just a high-level overview of a problem after the fact, while the latter enables you to be proactive. Observability has also been dismissed as jargon, much as “DevOps” sounds to the seasoned operator.

BugSplat's New Look

Dec 3, 2019 By Joey P In BugSplat

Today we're announcing BugSplat's new look to our customers via email. We've made some pretty significant updates, and we're excited to walk you through them.

The IT Status Page for Employees ROI Calculator

Dec 3, 2019 By StatusCast In StatusCast

Does your help desk get overwhelmed with tickets when an IT issue occurs because communication with employees is either insufficient or non-existent? Have you considered implementing an IT status page to fix this communication failure, but found it difficult to justify the expense? StatusCast built its IT Status Page ROI Calculator with you in mind.

What Is MTBF? Mean Time Between Failures Explained in Detail

Dec 3, 2019 By Carlos Schults In XpoLog

Time for another installment in the series where we explain in detail yet another important metric for tech organizations. After covering MTTD and MTTF, today we answer the question, “What is MTBF?” As the post title makes clear, MTBF stands for “Mean time between failures.” The acronym refers—like the others that came before it—to an important DevOps KPI. But what actually is it? What is it good for? How do I implement it?

Monitoring AWS Lambda functions with Datadog

Dec 2, 2019 By Evan Mouzakitis In Datadog

The “serverless” movement is taking the industry by storm and now, with Datadog, you can start monitoring your serverless applications and functions on AWS Lambda. As soon as you enable the Lambda integration, you’ll start to see your metrics in an out-of-the-box dashboard. Monitor and alert on AWS Lambda serverless functions in minutes with Datadog.

Monitor G Suite activity with Datadog

Dec 2, 2019 By Mallory Mooney In Datadog

G Suite is a collection of cloud-based productivity and collaboration tools developed by Google. Today, millions of teams use G Suite (e.g., Gmail, Drive, Hangouts) to streamline their workflows. Monitoring G Suite activity is an essential part of security monitoring and audits, especially if these applications have become tightly integrated with your organization’s data.

[KubeCon Recap] Configuring Cortex for Maximum Performance at Scale

Dec 2, 2019 By Julie Dam In Grafana

In this KubeCon + CloudNativeCon session last month, Grafana Labs Software Engineer Goutham Veeramachaneni offered a deep dive into how to make sure your Cortex will scale with your usage.

Danny Mican on his experience as an SRE at Auth0

Dec 2, 2019 By Prakya Vasudevan In Squadcast

Danny is an SRE at Auth0, where he currently manages the reliability of systems that authenticate over 2.5 billion logins per month and are expected to maintain 99.9% (“three nines”) availability. He loves learning about systems and making changes that positively impact client happiness, employee happiness, and long-term stability and growth.

Modern compliance with Sysdig Secure DevOps Platform

Dec 2, 2019 By Josh Ziman In Sysdig

Authorization to Operate (ATO) in a day and ongoing authorization are compliance nirvana. The ATO is the authorizing official’s statement that they accept the risk associated with the system running in production environments using live business data. Every compliance program is trying to reach the point where all of the information needed to make a risk decision is at hand and can be consumed by decision makers.
