Operations | Monitoring | ITSM | DevOps | Cloud

Latest News

The Role and Responsibilities of a Data Center Technician

A data center technician holds the responsibility of installing, maintaining, and monitoring the equipment and systems within a data center. They ensure the proper functioning and updated status of both hardware and software, playing a vital role in safeguarding data security and availability.

3 new metrics CEOs want from Engineering leaders

Engineering leaders are accountable to shipping quality products on time and within budget. But a CEO's historical understanding of engineering operations has been pretty opaque: ships on time, or doesn't. And although DORA metrics are a good place to start, they don’t offer the complete picture needed for CEOs. With new tooling to support greater engineering efficiency, CEOs are learning that it's possible to get a richer picture of engineering productivity, quality, and velocity.

The Difficulties of Measuring Engineering

The report is so absurd and naive that it makes no sense to critique it in detail. - Kent Beck responding to the McKinsey Report. Luckily this was a hollow threat, because a few days later he and fellow blogger Gergely Orosz released a two part blog series critiquing not exactly Mckinsey's report but... any report that tried to put “effort based” metrics at the top of the list for things to track.

Monitoring Kubernetes tutorial: Using Grafana and Prometheus

Behind the trends of cloud-native architectures and microservices lies a technical complexity, a paradigm shift, and a rugged learning curve. This complexity manifests itself in the design, deployment, and security, as well as everything that concerns the monitoring and observability of applications running in distributed systems like Kubernetes. Fortunately, there are tools to help developers overcome these obstacles.

Graphite vs Prometheus

Graphite and Prometheus are both great tools for monitoring networks, servers, other infrastructure, and applications. Both Graphite and Prometheus are what we call time-series monitoring systems, meaning they both focus on monitoring metrics that record data points over time. At MetricFire we offer a hosted version of Graphite, so our users can try it out on our free trial and see which works better in their case.

Top 5 Resiliency Trends of 2023

In today’s world, resilience is no longer a conditioned desire or methodology to try but has become a necessity for sustained success in software development and IT operations. As DevOps and Agile teams keep moving forward to cross boundaries, come up with new methodologies, and drive innovation, it is now important to have the ability to quickly recover from failures, adapt to changing conditions, and maintain high performance under pressure.

Azure Integration Automates Asset Discovery on Tidal Accelerator

We’re excited to share an update on our Microsoft Azure integration that automates discovery and mapping of key cloud assets into Tidal Accelerator. Tidal has enabled a new integration that pulls information on Azure Virtual Machines (VMs), Azure App Service, and Azure Database instances, Elastic Pools and Servers, directly into Tidal Accelerator for further analysis.

Revolutionize Data Ingestion: Introducing Terraform Support for Splunk Cloud Platform

Splunk Cloud Platform has always been a powerful platform for aggregating, analyzing, and extracting actionable insights from your machine-generated data. As data volumes continue to grow exponentially, efficiently managing the ingestion of data into Splunk becomes crucial. To address this need, we are thrilled to announce the debut of Terraform support for the Splunk Cloud Platform.

Deploying a multi-availability zone Kubernetes cluster for High Availability

Many cloud infrastructure providers make deploying services as easy as a few clicks. However, making those services high availability (HA) is a different story. What happens to your service if your cloud provider has an Availability Zone (AZ) outage? Will your application still work, and more importantly, can you prove it will still work? In this blog, we'll discuss AZ redundancy with a focus on Kubernetes clusters.