
Blog

Multus: how to escape the Kubernetes eth0 prison

Kubernetes has been successful for a number of reasons, not the least of which is that it takes care of things that application developers may not want to bother with, such as networking. Multus is a CNI plugin that runs on top of Kubernetes and lets a pod attach to more than one network interface, which enables complex networking use cases.

Service-Aware AIOps and finding answers to the question of 'what can I automate?'

Based on our interactions with buyers evaluating vendors in the AIOps market, much of what we’re hearing chimes with this quote from computer scientist and physicist Stephen Wolfram: “What will AI allow us to automate? We'll be able to automate everything that we can describe. The problem is: it's not clear what we can describe.”

How to perform simple testing of the state of critical network assets

Changes in services or devices can easily go unnoticed, and failing to recognize even subtle changes can later have unpleasant consequences. Below we list several examples of such incidents; the checks described are lightweight and can be run frequently against critical network assets. The cases below assume that any change in a device’s current state should be treated as a security issue.

3 Regulatory Compliance Trends That Are Accelerating in 2020

A growing attack surface and the exponential rise of data have opened the floodgates for breaches, leading to increased scrutiny by regulatory agencies. It’s not surprising that in recent years, regulators have had to double down with compliance mandates that are more stringent and punitive than ever before.

How To Determine When a Host Stops Sending Logs to Splunk...Expeditiously

So I've only been at Splunk for 8 months, and in the short amount of time I've been here, one of the most common questions I've been asked is “How do I get an alert when Splunk is not receiving logs?” As a matter of fact, if I had $0.05 each time I was asked this question, I would have $0.25! Surprisingly, with this being such an often-asked question, I haven't been able to find much documentation on how to accomplish this using the native features of Splunk.

Tracking COVID-19 Data in South America Using Telegraf and InfluxDB

I wanted to better understand how COVID-19 has been developing in South America. As I’ve recently started playing with InfluxDB, the open source time series database, I created a dashboard of cases and deaths using InfluxData’s platform. I usually use InfluxDB, Chronograf, Grafana, Zabbix and other similar solutions to monitor services and systems. However, until this point, I hadn’t used them to process and visualize other kinds of data.

Cloud Adoption is No Longer an Option for Federal Agencies

In May 2019, Bloomberg Government reported that federal agencies planned to move 272 information technology programs to the cloud in FY2020. Fast forward to April 2020: it reported that more than 1,800 federal IT programs are either migrating or considering migrating to the cloud in fiscal 2021, signifying a rapid increase in cloud adoption in the federal government. How might COVID-19 affect this explosive increase in cloud interest?

IT Risk Assessment vs. IT Risk Management: The Difference and What They Mean to the Service Desk

In life, risks can be perceived both negatively and positively. Taking a risk can sometimes yield great results, but other times, a risk is a yellow light of caution. For businesses in particular, IT risks like malware malfunctions and employee errors vary in size and can occur in several areas. If they are not managed properly, the result is disruption and valuable time spent resolving the issue. But even with risk present, there are measures IT can put in place to ward it off.