
Latest News

Monitor Druid with Datadog

Apache Druid is a data warehouse and analytics platform that can capture streaming data from message queues like Apache Kafka and batch data from static files. Druid can be a valuable component in your technology stack if you need to collect real-time data for online analytical processing (OLAP) tasks like reporting, ad-hoc querying, and dashboarding.

Introducing Datadog Real User Monitoring

The performance of your website is a key element in the success of your business—slow page load times and errors can degrade the user experience, leading to customer churn, fewer ad impressions, or abandoned shopping carts. To give you end-to-end visibility into the real-time activity and experience of individual users, we’re excited to add Real User Monitoring (RUM) to Datadog.

Kubernetes Observability with Logs and Metrics in Logz.io

Yesterday, we announced the beta release of Logz.io Infrastructure Monitoring — our Grafana-based monitoring solution, and the planned release of a Jaeger-based tracing solution. These additions to our platform complement our ELK-based Log Management product, together constituting what is the world’s only open source-based observability platform for monitoring, troubleshooting and securing distributed cloud workloads.

How to Explore Prometheus with Easy 'Hello World' Projects

In this blog post, I would like to share with you some of the projects that I used to get a better sense of what Prometheus can do. I am a very hands-on type of learner, and usually when I want to explore new technologies, I start with “hello world” apps and small toy projects. Therefore, the main goal of this blog is to share with you how easily you can set up Prometheus and how quickly you can create simple projects that can be monitored with Prometheus and visualized in Grafana.

Why you need a status page

There are as many ways to trigger an incident as there are new code deployments across the globe and, with the emergence of cloud-reliant businesses, uptime accountability has shifted from on-premise server teams to the service providers themselves. SLAs, SLOs, and websites dedicated to downtime have suddenly come to life in the internet age, and having a status page is now an industry standard.

ITAM - 4 Tips to Break Down Your Organizational Data Silos

With technology innovations occurring daily, it’s important for companies to ensure that their IT investment is working efficiently. To do this, a little something (or big, depending on how long it has been since you’ve taken an inventory!) called IT Asset Management (ITAM for short) helps you manage your IT assets throughout their lifecycle.

WHAT'S NEW Pandora FMS 741

As mentioned in release 740, even-numbered updates will focus on fixing bugs and odd-numbered ones on developing new features and improvements that help all users. Release 741 adds a new Discovery integration to monitor SAP centrally and remotely, a system to report bugs and suggestions, and Omnishell, the first step of Pandora FMS into the world of automation and IT infrastructure management.

UserCentric: Redefining online recruiting for doctors and nurses

How do you match health care practitioners to the right job? When The Postgraduate Medical Council of Victoria (PMCV) had to recruit doctors and nurses for the healthcare match system it administers, they needed an efficient solution that would take into account a high number of complex variables while remaining agile and, most importantly, accurate. At UserCentric, we devised a solution that gives PMCV administrators control over the entire recruiting experience.

Ransomware, interrupted: Sodinokibi and the supply chain

Last month, the Elastic Security Protections Team prevented an attempted ransomware attack targeting an organization monitored by one of our customers, an IT Managed Service Provider (MSP). We analyzed the alerts that were generated after an adversary’s process injection attempts were prevented by Elastic Endpoint Security on several endpoints. Adversaries often attempt to inject their malicious code into a running process before encrypting and holding the victim’s data to ransom.

Tips for Modern NOCs - Alleviating Incident Routing Bottlenecks

Critical and sev1 incidents are always a priority, but what about the dozens, and often hundreds, of lower-priority ones that sit in a queue waiting for a first response engineer to get to them? Do you find that no matter how much effort your team puts into minimizing the number of queued incidents, their number always seems to grow? If this sounds familiar, this blog is for you.