Latest Posts

Demystifying Java Lambda Expressions

SRE and IT Operations play a critical role in ensuring reliable, high-performance applications. Yet SREs (Site Reliability Engineers) are often handed ‘thrown-over-the-wall’ code deployments to operate without insight into code-level features. In my previous article (“Is your Java Observability tool Lambda Expressions aware?”), I delved into one such code-level feature: Java lambda expressions, which can replace anonymous inner classes.

Finally: alerting and on-call scheduling for how you actually work

TL;DR You deserve a better alerting and on-call tool. So we built Signals. In our early days, we often used the tagline, “You just got paged. Now what?” It encapsulated how FireHydrant solved for all of the messy bits that come after your alert is fired, from incident declaration all the way through to retrospective. At the time, we saw alerting and on-call scheduling as a solved problem.

How to Simplify ITSM Integrations with iPaaS

Integration is a crucial aspect of modern IT service management. By linking their ITSM platform with external applications, businesses can simplify their IT operations, enhance efficiency and flexibility, and provide better customer service. Real-time integration allows businesses to quickly adapt to changing circumstances and ensure that IT services align with business requirements. However, ITSM integration can be complex and difficult.

Decoding .NET8: Unveiling Cloud-Native Observability

The .NET platform is taking cloud-native deployment and observability seriously, most notably with the .NET Aspire stack unveiled at the recent .NET Conf 2023. In the latest episode of OpenObservability Talks, we reviewed the journey to making .NET a “by default, out of the box observable platform,” as ASP.NET Core creator David Fowler put it.

Diving into JTAG - Debugging (Part 2)

As noted in my previous article Diving into JTAG protocol. Part 1 — Overview, JTAG was initially developed for testing integrated circuits and printed circuit boards. However, its potential for debugging was realized over time, and now JTAG has become the standard protocol for microcontroller debugging. Many Firmware and Embedded engineers first encountered it in this particular context.

Integrating Prometheus AlertManager with PagerDuty in Calico

In the fast-paced world of Kubernetes, guaranteeing optimal performance and reliability of the underlying infrastructure, such as container and Kubernetes networking, is crucial. One key part of achieving this is managing alerts and notifications effectively. This blog post emphasizes the significance of configuring alerts in a Kubernetes environment, particularly for Calico Enterprise and Cloud, which provide Kubernetes workload networking, security, and observability.

In Their Own Words: Three Ways NetOps Delivers Value to Customers

Now more than ever, modern networks play a pivotal role in today’s business operations. However, this increased importance comes with a challenge: Modern networks are becoming increasingly complex and heterogeneous. Network managers need to ensure optimal performance across various domains—from the data center to multi-cloud and software-defined networks. This requires the consolidation of vast amounts of data from across multi-vendor networks, including environments managed by third parties.

Is Waiting for the Thaw Unbear-able?

It’s not news that organizations are producing more data than ever. But to take advantage of this data, organizations need to collect, store, retain, and, at some point, analyze it. Most analysis tools also act as the retention point for this data. While this may (at first) appear to be the best option for performance, it quickly creates significant problems. First, those systems were never designed for the scale of today’s growing volume of data, currently at a 28% CAGR.

Step-by-step Guide to Monitor Riak Using Telegraf and MetricFire

Monitoring your databases is essential for maintaining performance, reliability, security, and compliance of your infrastructure. It allows you to stay ahead of potential issues, optimize resource utilization, and ensure a smooth and efficient operation of your database system. Effective monitoring of Riak involves collecting, analyzing, and acting on a variety of metrics and logs.

Mastering IPM: Key Takeaways from our Best Practices Series

As we conclude our Mastering IPM blog series, it's time to reflect on the wealth of insights we shared. From delving into the critical layers of the Internet Stack to navigating the intricacies of data analysis, each installment has provided valuable perspectives on optimizing digital experiences through Internet Performance Monitoring (IPM). Now, let's distill the key takeaways from the series.