
Multus: how to escape the Kubernetes eth0 prison

Kubernetes has been successful for a number of reasons, not the least of which is that it takes care of things that application developers may not want to bother with – such as networking. Multus is a CNI plugin that can be used on top of Kubernetes to enable complex networking use cases.

Service-Aware AIOps and finding answers to the question of 'what can I automate?'

Based on our interactions with buyers evaluating vendors in the AIOps market, much of what we’re hearing chimes with this quote: “What will AI allow us to automate? We'll be able to automate everything that we can describe. The problem is: it's not clear what we can describe.” – Stephen Wolfram, computer scientist and physicist.

3 Regulatory Compliance Trends That Are Accelerating in 2020

A growing attack surface and the exponential rise of data have opened the floodgates for breaches, leading to increased scrutiny by regulatory agencies. It’s not surprising that in recent years, regulators have had to double down with compliance mandates that are more stringent and punitive than ever before.

How To Determine When a Host Stops Sending Logs to Splunk...Expeditiously

So I've only been at Splunk for 8 months, and in the short amount of time I've been here, one of the most common questions I've been asked is “How do I get an alert when Splunk is not receiving logs?” As a matter of fact, if I had $0.05 each time I was asked this question, I would have $0.25! Surprisingly, with this being such an often-asked question, I haven't been able to find much documentation on how to accomplish this using the native features of Splunk.

The essentials of monitoring AWS Elastic Load Balancing

AWS Elastic Load Balancing (ELB) dynamically distributes incoming application traffic across multiple EC2 instances and scales resources to meet traffic requirements. Elastic Load Balancing helps optimize the performance of various web and mobile applications by identifying failing EC2 instances before they affect the end-user experience.

ISO/IEC 20000 certification: What it is, why your organization needs it, and how to get it

One of the most important things that customers consider while purchasing a product or service is its credibility. A label that states the product has been tested, analyzed, and certified by an international regulatory body reassures customers about their purchase decision. This is why organizations today strive to get themselves benchmarked, differentiated, and validated. For this, they seek out regulatory bodies that develop and publish international standards.

GrafanaCONline Day 9 recap: Prometheus rate queries explained, and inside one company's adoption of a central telemetry platform

We’re into the third and last week of GrafanaCONline! We hope you’re able to check out all of our great online sessions. If you didn’t get a chance to watch yesterday’s sessions (or want to see them again), here’s a recap of day 9 of the conference.

The UX changes we made for Grafana 7.0 -- and what you can learn from them

Behind every part of Grafana, there are the ideas, creativity and commitment of the people who made it. While that includes code, it is not limited to it. Since August 2019, Grafana Labs has had a dedicated UX team, and we have been involved in countless recent features and improvements. We want to show you how we do our work, why you, our users, are at the heart of everything we do – and most importantly, how design changes can make software better.

Kubernetes disaster prevention and recovery

Yeah, Kubernetes is great at making sure your workloads run as needed. But another of its amazing benefits is its ability to recover from failure all by itself. On an everyday basis, Kubernetes takes care of the complicated task of container orchestration. However, as with any complicated system, there is always the chance that you’ll experience failures and downtime.