
Latest Posts

Any PLC alarm on your mobile device

Machine maintenance is an incredibly important task, and it is important to fix a machine before it fails completely. In reactive maintenance scenarios, speed of response is key. Once an issue is detected, it is important to communicate it as reliably and quickly as possible to the right engineer. Ideally, the machine is connected directly to the team of mobile engineers in charge and can let them know exactly what happened and what needs to be fixed.

The incident resolution mandate of telehealth and telepharmacy providers in the age of Covid-19

The incident management challenges of a pandemic-driven world and how to overcome them. “While the safety and well-being of workers affected by COVID-19 is the first priority, companies will also triage other essentials, such as incident management and stakeholder communications.” (PWC) In a pandemic-stricken world that is consuming products and services over the internet more than ever, digital and connectivity systems are under great strain.

Using alternate PING utilities to test your network

Low-latency networks are the ideal medium; in a typical intranet, all hosts can be reached within a few milliseconds. Monitoring tools, starting with the default ping utility (the one that sends ICMP Echo packets and waits for a response), give consistent results when the typical response time is above 10-15 milliseconds. When the response time drops significantly, to around 1 ms or even less, the results may begin to vary considerably.

Here's your Complete Definition of Software Reliability

We live in the era of software convenience, where we take for granted that hundreds of services are always at our fingertips. These applications become part of our daily routines because they are so reliable. However, this consistency makes reliability work invisible to the end user. It can be difficult to appreciate the effort behind maintaining a high-availability service. Because of that, people may misunderstand exactly what makes a service reliable.

Best PagerDuty Alternatives of 2020: An Independent Review by StatusGator

Modern applications offer more and more features, and the infrastructure needed to run them becomes increasingly complex. The need for Application Performance Monitoring (APM) and Network Performance Monitoring (NPM) tools like PagerDuty is obvious, as the cost of downtime can be exorbitant for a business of any size. Thus, every business needs to use PagerDuty or one of its alternatives to alert the Ops team should anything go awry.

How I'm using Grafana and Prometheus to monitor my 3D printing

My name is Jonathan Stines, and I am a Penetration Tester for Rapid7, a cybersecurity company located in Austin, Texas. A small handful of my former colleagues at Rapid7 now work at Grafana Labs and have said it was a pretty cool spot to have landed. I had a vague understanding of what Grafana was, but what really piqued my interest was seeing their sweet dashboards in the HBO series Silicon Valley.

All together now: Fleet-wide monitoring for your Compute Engine VMs

Cloud Monitoring has always provided comprehensive visibility into, and management of, individual Compute Engine virtual machines (VMs). But many Google Cloud customers have hundreds, thousands, or tens of thousands of VMs that they need to manage. Cloud Monitoring now gives you zero-config, out-of-the-box visibility into your entire Compute Engine VM fleet, with quick access to advanced Monitoring features such as installing the Cloud Monitoring agent and configuring fleet-wide alerts.

Working with a hybrid SquaredUp deployment

Many of our customers are hybrid, meaning they have both Azure and on-prem estates and, consequently, both SquaredUp for Azure and SquaredUp for SCOM deployments. In other cases, customers run multiple deployments of the same product, for example for multiple SCOM management groups or multiple Azure tenants.