The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

Webinar: Debugging Lambda Performance Issues

One of the most common performance issues in serverless architectures, and in AWS Lambda specifically, is elevated latency from external services such as DynamoDB, Elasticsearch, or Stripe. In this webinar, we will focus on how to monitor, detect, and fix latency issues that arise when our Lambda functions need to talk to other services. Some of the topics we will cover include:

Why Monitoring Can Be the Lifesaver the Public Sector Needs

One of the business consequences of the pandemic, increased remote working, is causing technology challenges across most industries, including the public sector. Most employees who are now working from home have probably never had to do so before, while those still attending their place of work are mostly people in central government trying to keep the country running, or people in other critical roles.

Automated Root Cause Analysis & Anomaly Detection in Concert

Every day, IT operators try to prevent outages of business-critical applications. When prevention is not possible, they strive to reduce the mean time to repair (MTTR) as much as possible. Improving resolution time can be quite a challenge, but IT operators don't stand alone in it. They can use smart solutions that support Automated Root Cause Analysis and Anomaly Detection.

"Things get SREious": SRE from Home Recap

Without SRECon happening this year and the world turned upside down from COVID-19, we set out to hold a virtual event to bring SREs together to share their experiences of what has changed. Last week’s SRE from Home was exactly that. With 1900 registrants, 20 lively Slack channels, six illuminating and entertaining talks from a diverse range of experts in the field, and our #askanSRE panel answering attendees’ questions with candid generosity, it was an amazing, jam-packed day.

Bees Working Together: How ecobee's Engineers Adopted Honeycomb

At ecobee, adopting Honeycomb started as a grassroots effort. Engineers signed up for the free tier and quickly started sharing insights with teammates. When it came time for ecobee to make the “build vs. buy” decision for observability tooling, sticking with Honeycomb was the clear choice. Now on the enterprise plan, ecobee’s engineering squads rely on features like SLOs to support the business’s need to map engineering effort to user impact.

Serverless for Enterprises: Scale big or go home

We talk quite a bit about going serverless for SMEs and startups. However, it’s often those with an already huge infrastructure, such as enterprises, that find the move and the change daunting. We see many companies, from Coca-Cola to Netflix, managing it, but what does it look like in action? In this article, we share some best practices and insights on serverless designs that can scale massively and represent enterprise models.

sFlow vs NetFlow: What's the Difference?

In any given network, switches, routers, and firewalls may support different flow protocols. After all, there’s NetFlow, sFlow, IPFIX, and J-Flow, to name a few. With so many options, you may be wondering “Which flow protocol should I use?” It’s a common question, and it has a relatively simple answer: While some devices support multiple protocols, a device typically supports only one type of flow protocol, so you should use the protocol that both your device and your collector support.
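The selection rule above (use a protocol that both the device and the collector support) can be sketched in a few lines of Python. This is only an illustration: the function name `choose_flow_protocol` and the tie-breaking preference order are assumptions made for the example, not something the article specifies.

```python
from typing import Iterable, Optional

# Assumed preference order, used only to break ties when several
# protocols are shared. The article does not rank the protocols.
PREFERENCE = ("IPFIX", "NetFlow", "sFlow", "J-Flow")


def choose_flow_protocol(device: Iterable[str],
                         collector: Iterable[str]) -> Optional[str]:
    """Return a protocol supported by both sides, or None if none is shared."""
    common = set(device) & set(collector)
    for name in PREFERENCE:
        if name in common:
            return name
    # Fall back to any shared protocol missing from the preference list.
    return min(common) if common else None


# A device that exports only sFlow, paired with a multi-protocol collector.
print(choose_flow_protocol(["sFlow"], ["NetFlow", "sFlow", "IPFIX"]))  # sFlow
```

Since most devices export a single protocol, the intersection usually has at most one entry and the preference order rarely matters.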

Introducing LogDNA Web Server Template

With the ever-growing volume of application logs and analysis tools available, it can be time-consuming to set up your observability tools to keep up with best practices. Every new piece of infrastructure deployed also requires another dashboard and more monitoring to be put in place to ensure stability and reliability.

How Playtech Fixed Metrics Over-Collection with Observability

According to Forbes, 2.5 quintillion bytes of data are created every day. Data volumes have grown exponentially in recent years due to the growth of the Internet of Things (IoT) and sensors. The majority of all data has been collected in the last two years alone. For example, the U.S. generates over 2.5 million gigabytes of Internet data every minute, and over half of the world’s online traffic comes from mobile devices.