Latest News

Building Resilience With the Splunk Platform One Use Case at a Time

You know that the Splunk platform is the ultimate tool to help advance your business on the path to resilience. You want to use it to see across hybrid environments, overcome alert fatigue, and get ahead of issues. If you're just starting out on your security journey, you might want to build an essential security foundation; if you're starting out in observability, you might want to accelerate your troubleshooting. You might be working in retail, telecommunications, or the public sector.

Centralized Log Management Best Practices and Tools

Centralized logging is a critical component of observability into modern infrastructure and applications. Without it, diagnosing problems and understanding user journeys can be difficult, leaving engineers blind to production incidents or interrupted customer experiences. Conversely, when the right engineers can access the right log data at the right time, they can quickly gain a better understanding of how their services are performing and troubleshoot problems faster.

Amazon Linux 2023: Why we're moving to AL2023

Amazon Web Services (AWS) recently announced the release of Amazon Linux 2023 (AL2023) as the next generation of Amazon Linux with enhancements to its already-proven reliability. Besides offering frequent updates and long-term support, AL2023 provides a predictable release cadence, flexibility, and control over new versions. It also eliminates the operational overhead that comes with creating custom policies to meet standard compliance requirements.

Server Monitoring Best Practices: 9 Tips to Improve Health and Performance

Businesses that have mission-critical applications deployed on servers often have operations teams dedicated to monitoring, maintaining, and ensuring the health and performance of these servers. Having a server monitoring system in place is critical, and so is monitoring the right parameters and following best practices. In this article, I’ll look at the key server monitoring best practices you should incorporate into your operations team’s processes to eliminate downtime.

Deploy OpenTelemetry to Kubernetes in 5 minutes

OpenTelemetry is an open-source observability framework that provides a vendor-neutral and language-agnostic way to collect and analyze telemetry data. This tutorial will show you how to integrate OpenTelemetry on Kubernetes, a popular container orchestration platform.

Predictions: a Deeper Dive into the Rise of the Machines

As Gaurav described in his retail predictions blog, the impact of AI and automation on the retail industry should not be underestimated. The compound effects of improvements in technology and labour shortages have created an ideal scenario for innovation. Here we will take a deeper look into some of the AI and automation use cases that we have seen in retail and outline some of the areas of focus to help you get started.

A Guide to Enterprise Observability Strategy

Observability is a critical step for digital transformation and cloud journeys. Any enterprise building applications and delivering them to customers is on the hook to keep those applications running smoothly to ensure seamless digital experiences. To gain visibility into a system’s health and performance, there is no real alternative to observability. The stakes are high for getting observability right — poor digital experiences can damage reputations and prevent revenue generation.

Key Elastic Dev Commands for Troubleshooting Disk Issues

Disk-related issues with Elasticsearch can present themselves through various symptoms. It is important to understand their root causes and know how to deal with them when they arise. As an Elasticsearch cluster administrator, you are likely to encounter some of the following cluster symptoms.