Visibility is Critical to the Microsoft Teams User Experience

In today’s digital work environment, the need for seamless connectivity cannot be overstated, and any disruption significantly impacts productivity. Microsoft Teams has emerged as the most impactful application on a user’s day-to-day work life. According to Okta, the authentication vendor, Microsoft 365, together with Microsoft Teams, is the #1 application across enterprises.

How Cribl Stream Can Enhance Digital Operational Resilience Under DORA within Financial Services

In the swiftly changing digital realm of the finance and insurance sectors, sustaining operational resilience while complying with rigorous regulatory mandates is paramount. The Digital Operational Resilience Act (DORA) marks a significant regulatory milestone designed to ensure entities within the financial services sector are equipped to withstand, respond to, and recover from all types of ICT (Information and Communication Technology) related disruptions and threats.

7 ways to find and fix digital user frustration signals

Earning a customer's trust is tough, but losing it is unbelievably easy. That is why a happy customer stays longer. A 2019 Accenture consumer survey of over 20,000 users across 19 countries revealed that 47% of users avoid businesses whose user experience frustrates them. Interestingly, an equal 47% said they were willing to pay a premium for a frustration-free user experience that exceeds their expectations.

How to monitor a home VPN from anywhere with Grafana Cloud

I’m a senior solutions engineer here at Grafana Labs, but I recently found myself trying to solve a real-world problem in my homelab. The issue was, I have some services running there and I want to be able to access my home network when I’m away. Of course, I had to make sure my network remains safe when I do that, so I decided to deploy a simple and secure VPN.

AWS Cost Explorer Vs. Pricing Calculator: How To Estimate Costs

Managing cloud costs has been the top challenge in cloud computing for more than half a decade now. It’s bigger than cloud security or hybrid cloud management. Several studies estimate that more than a third of cloud budgets could not be accounted for in 2023 alone. If you are a current or prospective Amazon Web Services (AWS) customer, AWS Cost Explorer and AWS Pricing Calculator can help you manage your costs better.

Key metrics for monitoring etcd

Etcd is a distributed key-value data store that provides highly available, durable storage for distributed applications. In Kubernetes, etcd functions as part of the control plane, storing data about the actual and desired state of the resources in a cluster. Kubernetes controllers use etcd’s data to reconcile the cluster’s actual state to its desired state. This series focuses on monitoring etcd in Kubernetes.
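
The reconciliation idea described above can be illustrated with a minimal sketch. It assumes plain Python dictionaries of replica counts in place of real etcd data, and the function name `reconcile` and the action labels are hypothetical; it does not call any Kubernetes or etcd API.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, int]


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> List[Action]:
    """Compare desired and actual replica counts and list the changes needed.

    Hypothetical illustration only: a real controller reads both states
    through the Kubernetes API server, which persists them in etcd.
    """
    actions: List[Action] = []
    # Visit every resource named in either state, in a stable order.
    for name in sorted(desired.keys() | actual.keys()):
        want = desired.get(name, 0)
        have = actual.get(name, 0)
        if want > have:
            actions.append(("scale_up", name, want - have))
        elif want < have:
            actions.append(("scale_down", name, have - want))
    return actions


# Example states (made-up resource names).
desired_state = {"web": 3, "worker": 2}
actual_state = {"web": 1, "worker": 2, "legacy": 1}

plan = reconcile(desired_state, actual_state)
for action in plan:
    print(action)
```

Running the comparison again after the listed actions are applied would return an empty plan, which is the sense in which the actual state has converged on the desired state.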

Tools for collecting etcd metrics and logs

In Part 1 of this series, we looked at how etcd works and the role it plays in managing the state of a Kubernetes cluster. We also explored key etcd metrics you should monitor to ensure the health and performance of your etcd cluster. In this post, we’ll show you how you can use tools like Prometheus, Grafana, and etcdctl to collect and visualize etcd metrics. We’ll also show you how to collect etcd logs that provide context for those metrics.

How to monitor etcd with Datadog

So far in this series, we’ve walked through key etcd metrics and tools you can use to monitor etcd metrics and logs. In this post, we’ll show you how you can monitor etcd with Datadog. But first, we’ll show you how to set up and configure the Datadog Agent and Cluster Agent to send etcd monitoring data to your Datadog account.

What is a Kubernetes operator?

Operators take a real-world operations team’s knowledge, wisdom, and expertise, and codify it into a computer program that helps operate complex server applications like databases, messaging systems, or web applications. Operators provide implementations for operating applications that are testable and thus more reliable at runtime.