
Monitoring

The latest News and Information on Monitoring for Websites, Applications, APIs, Infrastructure, and other technologies.

ECS Monitoring Metrics that Help Optimize and Troubleshoot Tasks

Workloads that run on Amazon’s Elastic Container Service (ECS) require regular monitoring to ensure that containerized applications on AWS run and are managed properly – in short, ECS monitoring is a must. ECS can run containers on either EC2 or Fargate. Both are compute services, but EC2 lets users configure virtually every functional aspect, while Fargate offers fewer settings and is simpler to set up.

Grafana Tempo 2021: Year in review

Grafana Tempo has had quite a year. Just eight months after it was announced at ObservabilityCON 2020, the open source tracing solution went GA. Since the Tempo team released v1.0 in June, we have ingested more than 39 trillion spans, a 26x increase from last year. We also introduced Grafana Enterprise Traces, which is powered by Tempo, to the Grafana Enterprise Stack.

Datadog vs. Splunk vs. Scout | How Do They Compare?

Technology changes every day. New innovations appear constantly, and software and websites keep growing more advanced. Almost every service is now available on the internet, so keeping software healthy must be a top priority if customer service is not to suffer. Software monitoring, however, is not an easy task. It is a 24x7 job because any user can face an issue at any time.

Log4J Does What?!!!

You have probably heard of Log4Shell, the security vulnerability that has ‘earned’ itself a NIST severity score of 10. In this post I will show a simple example of how this vulnerability actually works. I will walk you through some basic usage of the Log4J library and then show how some fairly ordinary inputs into this library can cause truly unexpected, and potentially disastrous, outcomes.

Part I: A Journey Into the World of Advanced Security Monitoring

Dealing with hundreds of security alerts on a daily basis is a challenge, especially when many are false positives that take up too much of our valuable time to sift through. Let me tell you how our security team fixed this, as we built security around the JFrog products. First, let me tell you a little bit about our team.

New in StatusGator: See component statuses

A small but useful new feature is now available in StatusGator: You can see the status of all the components of a given service from your filter configuration page. As a reminder, component filters are a feature of all StatusGator paid plans. They allow you to filter your notifications and dashboard service statuses to specific components of a given service. Services such as large cloud providers often have dozens or even hundreds of individual regions or products.

Outage Alert: Top 10 Downtime Incidents of 2021

2021 has been an eye-opening year for both businesses and consumers who use popular websites and applications. We have all seen notable increases in the frequency and severity of outages as dependency on internet infrastructure grows – with no signs of slowing down. With our reliance on automation and connectivity expected to increase in 2022 – let’s review some of the top internet outages and website downtime incidents of 2021.

Grafana Loki 2021: Year in review

This year, we were excited to deliver the easiest version of Grafana Loki to use yet. With Loki 2.4, the Loki team introduced a simple, scalable deployment, and over the past 12 months, we added lots of great new features. Not to mention, we launched Grafana Enterprise Logs, a new addition to the Grafana Enterprise Stack that’s powered by Loki. But none of this would have been possible without our active community: In 2021, Loki had 166 contributors and 823 PRs in GitHub.

Ruby Application Manual Instrumentation for Distributed Traces

OpenTelemetry is a Cloud Native Computing Foundation project that aims to standardize the way application telemetry data is recorded and used by downstream platforms. This application trace data helps application owners understand the relationships between the components and services in their code, the request volume and latency introduced at each step, and ultimately where the bottlenecks are that result in a poor user experience.