
Blog

The IT Status Page for Employees ROI Calculator

Does your help desk get overwhelmed with tickets when an IT issue occurs because communication with employees is either insufficient or non-existent? Have you considered implementing an IT status page to close this communication gap, but found the expense difficult to justify? Statuscast built its IT Status Page ROI Calculator with you in mind.

What Is MTBF? Mean Time Between Failures Explained in Detail

Time for another installment in the series where we explain an important metric for tech organizations in detail. After covering MTTD and MTTF, today we answer the question, “What is MTBF?” As the post title makes clear, MTBF stands for “mean time between failures.” Like the acronyms covered before it, it refers to an important DevOps KPI. But what actually is it? What is it good for? How do I implement it?

Monitoring AWS Lambda functions with Datadog

The “serverless” movement is taking the industry by storm, and now, with Datadog, you can start monitoring your serverless applications and functions on AWS Lambda. As soon as you enable the Lambda integration, you’ll start to see your metrics in an out-of-the-box dashboard. Monitor and alert on AWS Lambda serverless functions in minutes with Datadog.

Monitor G Suite activity with Datadog

G Suite is a collection of cloud-based productivity and collaboration tools developed by Google. Today, millions of teams use G Suite (e.g., Gmail, Drive, Hangouts) to streamline their workflows. Monitoring G Suite activity is an essential part of security monitoring and audits, especially if these applications have become tightly integrated with your organization’s data.

Danny Mican on his experience as an SRE at Auth0

Danny is an SRE at Auth0, where he currently manages the reliability of systems that authenticate over 2.5 billion logins per month and are expected to have 99.9% (three nines) availability. He loves learning about systems and making changes that positively impact client happiness, employee happiness, and long-term stability and growth.

Modern compliance with Sysdig Secure DevOps Platform

Authorization to Operate (ATO) in a day and ongoing authorization are compliance nirvana. The ATO is the authorizing official’s statement that they accept the risk associated with the system running in production environments using live business data. Every compliance program aims for the same thing: having all of the information needed for a risk decision at hand, in a form that decision makers can consume.