Monitoring Kubernetes with Prometheus

Kubernetes is one of the fastest-growing open-source projects on the market. It is a portable, extensible platform for managing containerized workloads and services, and companies are widely adopting it to build their major products. Docker is often used to run Kubernetes clusters on local systems for testing purposes. As adoption grows, it becomes essential for companies to monitor their Kubernetes containers.

Better Kubernetes application monitoring with GKE workload metrics

The newly released 2021 Accelerate State of DevOps Report found that teams who excel at modern operational practices are 1.4 times more likely to report greater software delivery and operational performance and 1.8 times more likely to report better business outcomes. A foundational element of modern operational practices is having monitoring tooling in place to track, analyze, and alert on important metrics.

AIOps and Performance Monitoring: A One-Two Punch for IT Operations

Sugar Ray Robinson and Jake LaMotta. Marvelous Marvin Hagler and Tommy Hearns. Muhammad Ali and Joe Frazier. All were among history’s greatest boxers, but when they met in the ring, each brought out the best in the other. It’s the same in IT management. There are tools and platforms that on their own are essential to IT operations; but when paired as an infrastructure management tandem, each complements the other, ensuring maximal efficacy of both systems.

Looking forward to KubeCon

KubeCon + CloudNativeCon North America is just around the corner. I’ve been looking forward to this event for a long time, especially since 2020 was virtual and it looks like there will be an in-person option this year. This should be a great event, and there are going to be a ton of awesome sessions. Last year was simply enormous, with over 15K attendees joining virtually.

How to mitigate the 0-day Apache path traversal vulnerability with Puppet or Bolt

Apache has disclosed a critical, actively exploited path traversal flaw in version 2.4.49 of the popular Apache web server. This path traversal means that an attacker can trivially read the contents of any file on the server that the Apache process has access to. This could expose highly sensitive information, even something as critical as the server's own SSL private keys. See the Sonatype blog for more technical information on the vulnerability.

Facebook, Instagram, and WhatsApp's Outage - Understanding MTTR

Yesterday, the most-used social media platforms in the world were inaccessible for 6 hours straight. Later, in a press release, Facebook revealed that the outage was due to configuration changes in their routers. There is no doubt that Facebook has a rigorous incident response plan, yet a small blind spot resulted in a significant business interruption. So how do we avoid this? The truth is, outages and performance issues are bound to happen in any network.

Adding Search to Rails with MeiliSearch

There are many ways to add search functionality to a Rails application. While many Rails developers choose to use the native search functionality built into popular databases like MySQL and Postgres, others need more flexible or feature-rich search functionality. Elasticsearch is probably the best-known option available, but it has its own issues. Firstly, it is a resource-hungry beast. To run Elasticsearch properly in production, you need a few beefy servers.

The Aftermath of the Facebook 6-Hour Outage

Less than 24 hours ago, the world came to a “social standstill” as Facebook and its sister companies, WhatsApp and Instagram, became unavailable, leaving their 3.5 billion users in a flap. The outage, which lasted almost 6 hours, shut off access for users and businesses all over the world and caused ripple effects that we will likely continue to see in the immediate (and perhaps not-so-immediate) future.

The Future of AIOps Includes an ITOps Strategy

One of the questions I get asked a lot by customers, prospects, and partners is, “Will AIOps make us irrelevant?” To them, AIOps is often equivalent to automated remediation: an AIOps system automatically detects an incident and kicks off a remediation process in response, knowing exactly what process will solve the problem. IT is out of the loop, data centers and NOCs just keep humming along unattended, and end users are none the wiser.