Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

What Is MTBF? Mean Time Between Failures Explained in Detail

Time for another installment in the series where we explain in detail yet another important metric for tech organizations. After covering MTTD and MTTF, today we answer the question, “What is MTBF?” As the post title makes clear, MTBF stands for “Mean time between failures.” The acronym refers—like the others that came before it—to an important DevOps KPI. But what actually is it? What is it good for? How do I implement it?

Monitoring AWS Lambda functions with Datadog

The “serverless” movement is taking the industry by storm, and now, with Datadog, you can start monitoring your serverless applications and functions on AWS Lambda. As soon as you enable the Lambda integration, you’ll start to see your metrics in an out-of-the-box dashboard. Monitor and alert on AWS Lambda serverless functions in minutes with Datadog.

Monitor G Suite activity with Datadog

G Suite is a collection of cloud-based productivity and collaboration tools developed by Google. Today, millions of teams use G Suite (e.g., Gmail, Drive, Hangouts) to streamline their workflows. Monitoring G Suite activity is an essential part of security monitoring and audits, especially if these applications have become tightly integrated with your organization’s data.

Danny Mican on his experience as an SRE at Auth0

Danny is an SRE at Auth0 and currently manages the reliability of systems that authenticate over 2.5 billion logins per month and are expected to have 99.9% (three nines) availability. He loves learning about systems and making changes that positively impact client happiness, employee happiness, and long-term stability and growth.

Modern compliance with Sysdig Secure DevOps Platform

Authorization to Operate (ATO) in a day and ongoing authorization are compliance nirvana. The ATO is the authorizing official’s statement that they accept the risk associated with the system running in production environments using live business data. The idea that all of the information necessary to make a risk decision is at hand and can be consumed by decision makers is what every compliance program is trying to achieve.

Logz.io Enhancements and Changes with Kibana 7

We are happy to inform you that we are upgrading our user interface to support Kibana 7 for Logz.io! Kibana 7 offers users a long list of UI and UX enhancements that will make monitoring and troubleshooting your environment a much simpler and nicer experience. These enhancements include a cross-app dark theme, a new time picker, new filtering, a better dashboarding experience, and most importantly – a significant boost in performance. Shall we take a closer look?

Opsgenie's Microsoft Teams integration is now available in Microsoft AppSource

Utilizing ChatOps for issue resolution isn’t new, but the benefits of using a single tool for communicating and resolving issues give it lasting power. The ChatOps model enables teams to take action on their day-to-day work directly from collaboration platforms, including Microsoft Teams. Since many Dev and ITOps folks are using Microsoft Office 365 for their daily work, it was a natural next step for Opsgenie to align with Microsoft Teams.

Scout Now Partnering With API Management Leader DreamFactory

At Scout, we pride ourselves on building a tool that is focused on the developers’ ability to quickly identify performance issues within their applications so they can fix them and resume building the fun stuff. DreamFactory is a robust role-based access tool that helps with your API creation and management needs.

Powerful Ignore Rules for Noisy JavaScript Errors

Ignoring noisy and external errors is important to understanding the health of your client-side applications. Third-party scripts, user extensions, content crawlers, and non-impactful errors create lots of noise in web operations. With TrackJS Ignore Rules, you can filter out this noise and have a clear view of your web application quality.