The latest news and information on monitoring for websites, applications, APIs, infrastructure, and other technologies.

How Common Application Issues Kill Performance

In the modern era of digital businesses, web applications need to deliver on several fronts: performance, user experience, robustness, and scalability. However, many developers might agree that performance is of the utmost importance in any software application. The bells and whistles of a fancy UI and extensive functionality can sometimes force performance to take a back seat. Additionally, there are many reasons for performance to degrade over time.

Webinar: How to Monitor Serverless Apps - Jan 2021

The software we write does not always work as smoothly as we'd like. To know if something went wrong, find the root cause, and fix the problem, we need to monitor our system and get alerts whenever issues pop up. There are many useful tools and practices for non-serverless applications. As we adopt a serverless architecture, can we continue to use the same approach? Unfortunately, the answer is no.

A Practical Guide to Logstash: Parsing Common Log Patterns with Grok

In a previous post, we explored the basic concepts behind using Grok patterns with Logstash to parse files. We saw how versatile this combo is and how it can be adapted to process almost anything we want to throw at it. But the first few times you use something, it can be hard to figure out how to configure it for your specific use case.
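
To make the idea of a Grok pattern more concrete, here is a minimal sketch in Python (not Logstash itself) of how a `%{NAME:field}` expression can be expanded into a regular expression with named groups. The `BASE_PATTERNS` table, `grok_to_regex`, and `parse_line` are hypothetical names invented for this illustration, and the base patterns are heavily simplified stand-ins for the real Grok definitions.

```python
import re
from typing import Dict, Optional

# Hypothetical, simplified stand-ins for a few Grok base patterns.
# The real Logstash definitions are considerably more thorough.
BASE_PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "URIPATH": r"/\S*",
    "NUMBER": r"\d+(?:\.\d+)?",
}


def grok_to_regex(expression: str) -> "re.Pattern[str]":
    """Expand every %{NAME:field} placeholder into a named regex group."""

    def expand(match: "re.Match[str]") -> str:
        name, field = match.group(1), match.group(2)
        return f"(?P<{field}>{BASE_PATTERNS[name]})"

    # Text outside the placeholders is kept and treated as plain regex.
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expression))


def parse_line(expression: str, line: str) -> Optional[Dict[str, str]]:
    """Return the captured fields as a dict, or None if the line does not match."""
    match = grok_to_regex(expression).search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    expr = "%{IP:client} %{WORD:method} %{URIPATH:request} %{NUMBER:bytes} %{NUMBER:duration}"
    print(parse_line(expr, "55.3.244.1 GET /index.html 15824 0.043"))
```

All captured values come back as strings; converting them to numbers would be a separate step in this sketch.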

New Metrics for IT Operations: Part 2

This blog is the second in a two-part series and was adapted from The Enterprisers Project. At a time when CIOs can use cloud infrastructure to turn on new money-making services for customers overnight, how should we measure IT success? Hint: It's not about uptime. In part 1 of this series, we talked about how traditional IT metrics such as server capacity, I/O, utilization, and network throughput are less relevant today in our highly digital world.

Networks at Risk Due to Widespread Gaps in Basic Network Management Activities: Report

A significant portion of companies have vulnerabilities in their network management practices. These vulnerabilities include a lack of network visibility, configuration backups, proactive network planning, and up-to-date documentation. Despite these vulnerabilities, the majority of IT pros report high confidence in their networks, indicating a potential mismatch between perception and reality.

The Importance of Cloud Performance and Security Platforms

Work, education, and even many of our leisure activities have all moved online at an incredible pace due to current social distancing mandates. The digital backbone of the Internet and the SaaS services that drive our personal and professional lives are now foundational. Ensuring that these systems are operating optimally and securely is of paramount importance.

Kubernetes is eating the world; you can digest K8's plume

Innovation in hypervisor technology in the early 2000s, from both commercial and open source projects, was the genesis of the public cloud as we know it today. Virtualization and Moore’s law, together with advances in storage, mobile, and wireless technology, created a data explosion that continues to accelerate today.

The Elastic SSPL licensing change & ChaosSearch: FAQs

There’s no question that Elastic has built a truly amazing company, based on the Apache 2.0 open source business model, and on the shoulders of other projects like Lucene. Last week, Elastic announced that, starting with version 7.11, Elasticsearch will be licensed under the SSPL, a license that MongoDB released in 2018. So you may be wondering what this all means. Here are a few questions we anticipate will be frequently asked about this Elasticsearch licensing change.

Automating SSL Certificate Expiration Monitoring

In my previous job, monitoring certificate validity was critical to our team. These certificates were used to sign commercial transactions between the payment gateway (us) and other providers. That check was manual and depended on one person's calendar. So, if that person forgets to notify the team about a certificate's upcoming expiration and doesn’t start the procedure for getting a new one, well, the platform starts to fail.
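
As a rough illustration of what replacing that manual calendar check could look like, the sketch below uses only the Python standard library to read a server certificate's `notAfter` date and compute the days remaining. It is not taken from the article: the function names, the 30-day warning threshold, and the `example.com` host are assumptions made for this example.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the 'notAfter' string of the certificate served by host.

    Note: the default context verifies the certificate, so an already
    expired certificate raises an SSL error here instead of returning.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between now (timezone-aware) and the expiry date."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_renewal(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """True when the certificate expires within warn_days (assumed threshold)."""
    return days_until_expiry(not_after, now) <= warn_days


if __name__ == "__main__":
    # "example.com" is a placeholder; substitute the hosts you monitor.
    not_after = fetch_not_after("example.com")
    now = datetime.now(timezone.utc)
    print(not_after, days_until_expiry(not_after, now), needs_renewal(not_after, now))
```

Run on a schedule and wired to an alerting channel, a check along these lines removes the dependency on any single person remembering the date.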