Operations | Monitoring | ITSM | DevOps | Cloud

Latest Posts

Lambda Extensions Just Got Even Better

AWS announced AWS Lambda Extensions back in October 2020 and I wrote extensively about it at the time – what it is, how it works, and why you should care. In short, Lambda Extensions allow operational tools to integrate with your Lambda functions and run either in-process alongside your code or in a separate process. To better understand the problems they solve and their use cases, please read my previous article.

Keep Calm and Simplify Managing your SIEM events with Siemplify

We created our Logz.io Cloud SIEM with a clear goal: providing a rapidly deployable, flexible, and cost-effective security management tool that can serve broad datasets and withstand occasional bursts of events without breaking a sweat. However, our users were coming back to us with requests for more. After all, it’s great to proactively detect proliferating security threats, but what’s the next step beyond just identifying the threat?

Better Tools = Better Monitoring

Everyone loves tools. Whether you’re a weekend craftsman, an aspiring chef, or a serious IT professional, the tools you use can make your tasks much easier. Monitoring tools in IT are mainstays when it comes to keeping an eye on network infrastructure and enforcing company security policies. But just like anything in life, not all monitoring tools are created equal; in fact, many can harm your ability to respond to emerging issues within your network.

Cost Challenges That Keep Execs and Admins Awake at Night

Reining in costs and ensuring your IT organization maximizes its technical ROI is a delicate balancing act of office politics and well-rooted processes. IT cost challenges tend to vary from business to business, but they have one thing in common: they’re all manageable. Taking the time to study the most common IT revenue black holes starts with developing an in-depth understanding of how each one can affect IT productivity and the business’s bottom line.

Best practices for modern frontend monitoring

Single-page applications (SPAs) provide some significant benefits over multiple-page apps. For JavaScript developers using frameworks like React or Vue, they offer flexibility in moving application logic to the frontend, reducing the need for complex backend operations. For users, SPAs can provide a smooth experience with a highly interactive UI and fewer page loads. But, with increased sophistication, there are some tradeoffs.

Monitor kube-state-metrics v2.0 with Datadog

To manage complex containerized applications, modern DevOps teams need deep visibility into the status of their Kubernetes resources. By listening directly to the Kubernetes API, the open-source kube-state-metrics service generates key metrics about your Kubernetes objects, including pods, nodes, and deployments, which are essential for understanding the status and performance of your clusters.

Top SRE Toolchain Used By Site Reliability Engineers

We have compiled a list of the most popular and sought-after tools (some you may have heard of) that SREs need in their toolkit at every phase of a production system. Site reliability engineering (SRE) practices help organizations by ensuring that their deliverables function smoothly, with the utmost reliability and resilience. This can be achieved with a set of well-defined tools deployed at every phase of the production system to keep up with SRE best practices.

SRE fundamentals 2021: SLIs vs. SLAs vs. SLOs

A big part of ensuring the availability of your applications is establishing and monitoring service-level metrics—something that our Site Reliability Engineering (SRE) team does every day here at Google Cloud. The end goal of our SRE principles is to improve services and in turn the user experience. The concept of SRE starts with the idea that metrics should be closely tied to business objectives. In addition to business-level SLAs, we also use SLOs and SLIs in SRE planning and practice.