Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

Introducing a New Splunk Add-On for OT Security

The lines between IT and OT are blurring. With IT and Operational Technology (OT) systems converging, ensuring the security of devices, applications, physical locations and networks has never been more difficult or more important. Security professionals increasingly recognize that they have a readiness and visibility problem in plain sight.

4 ways to save on IT costs in the asset life cycle [Part 2: Deployment]

Welcome back to our four-part series on cutting IT costs in the asset life cycle. In our last blog, we discussed challenges with spending during the asset procurement stage and learned about valuable reports that can help. In this part, we’ll talk about asset deployment and look at more reports that can help cut costs during this stage of the asset life cycle.

Here are the Metrics you Need to Understand Operational Health

In recent polls we’ve conducted with engineers and leaders, we’ve found that around 70% of participants used MTTA and MTTR among their main metrics. 20% of participants cited looking at planned versus unplanned work, and 10% said they currently look at no metrics. While MTTA and MTTR are good starting points, they’re no longer enough. As systems grow more complex, it can be difficult to gain insights into your services’ operational health.

Running Elasticsearch, Logstash, and Kibana on Kubernetes with Helm

Kubernetes (or “K8s”) is an open-source container orchestration tool originally developed by Google. In this tutorial, we will use Kubernetes to look at how we can overcome some of the operational challenges of working with the Elastic Stack.

[KubeCon + CloudNativeCon EU recap] Getting some Thanos into Cortex while scaling Prometheus

Yesterday at KubeCon + CloudNativeCon EU, Grafana Labs software engineer Marco Pracucci, a Cortex and Thanos maintainer, teamed up with Thor Hansen, a software engineer at HashiCorp, to give a presentation called “Scaling Prometheus: How we got some Thanos into Cortex.” In their talk, the pair discussed a new storage engine they have built into Cortex, how it can reduce the Cortex operational cost without compromising scalability and performance, and lessons learned from running Cortex at s

Announcing Shared Scenarios to Promote a Culture of Reliability

Get started with Gremlin’s Chaos Engineering tools to safely, securely, and simply inject failure into your systems to find weaknesses before they cause customer-facing issues. Today, Gremlin is excited to announce the ability to share a Scenario across your entire organization. This allows you to build up a library of reliability exercises that are customized to your company’s applications and technology.

Achieving Major Efficiencies through Migration from OpenShift to Rancher

Sometimes technology partnerships are greater than the sum of their parts. That’s the case with two Swiss companies that have come together to deliver Kubernetes solutions to their customers. VSHN is Switzerland’s leading 24/7 cloud operations partner and first Kubernetes Certified Service Provider. amazee.io is an open source container hosting provider that offers flexible solutions built for speed, security and scalability.