
Latest News

Top Practices for Runbook Automation

Runbooks, also known as playbooks, are documents that walk you through a certain task with specific steps. For example, a runbook for spinning up a new server might ask some questions about the purpose of the server and its estimated load, then lead you to the appropriate instructions and settings. Runbooks ease the cognitive load of these common tasks by clearly outlining the process for each.
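
As a rough illustration of the server example above, the question-then-instructions flow might be sketched in Python as follows. Everything here is hypothetical: the `ServerRequest` fields, the 1000 requests-per-second threshold, and the step wording are assumptions made for illustration, not part of any real runbook tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ServerRequest:
    """Answers to the runbook's opening questions (hypothetical fields)."""
    purpose: str        # assumed categories: "web" or "database"
    estimated_rps: int  # estimated load, assumed to be in requests per second


def choose_runbook_steps(request: ServerRequest) -> list[str]:
    """Map the answers to an ordered list of instructions."""
    # Assumed threshold: more than 1000 requests/second calls for a large instance.
    size = "large" if request.estimated_rps > 1000 else "small"
    steps = [f"Provision a {size} instance"]
    if request.purpose == "database":
        steps.append("Attach a dedicated data volume")
        steps.append("Enable scheduled backups")
    else:
        steps.append("Register the server with the load balancer")
    steps.append("Add the server to monitoring")
    return steps


if __name__ == "__main__":
    request = ServerRequest(purpose="database", estimated_rps=2500)
    for number, step in enumerate(choose_runbook_steps(request), start=1):
        print(f"{number}. {step}")
```

Encoding the branching this way keeps the decision logic in one place, so the same answers always lead to the same instructions.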

Performance tuning MongoDB with Chaos Engineering

You’ve pored over the MongoDB documentation, crafted highly polished and well-tuned queries, and confidently deployed your new code to production. Everything ran great at first, but once CPU or RAM usage hit a certain point, your queries suddenly slowed to a crawl. What happened, and how can you prepare for situations like this in the future? This is an unfortunate but common scenario with databases like MongoDB.

How to Automate the End-to-End Lifecycle of Machine Learning Applications

Machine Learning (and deep learning) applications are quickly gaining in popularity, but keeping the process agile by continuously improving it is getting more and more complex. There are many reasons for this, but primarily, behaviors are complex and difficult to anticipate, making them resistant to proper testing, harder to explain, and thus not easy to improve.

Alien Wavelengths to Shared Spectrum

Alien wavelengths are commonplace today. Here, a transponder pair from one optical system vendor connects to, and transmits over, the optical line system (OLS) from another vendor; the OLS consists primarily of fixed or reconfigurable multiplexer and amplification elements. (While it can be technically feasible to pair transponders from different vendors, this is typically not done, for commercial and operational reasons.)

Azure Functions Live - June 2020

The Azure Functions team has once again joined us for the monthly live webcast while staying remote and safe. In this live webcast, Jeff Hollan, Anthony, and Matthew joined us to give a picture of the latest happenings in the Azure Functions space. Without further delay, let us jump in, as there are tons of updates awaiting.

SRE: A Human Approach to Systems

In the world of technology, the stakes have never been higher. The move to the cloud and microservices to maximize agility has given rise to digital disruptors and unprecedented competitive threats. As distributed systems become increasingly complex, the scale of ‘unknown unknowns’ increases. On top of this, customer expectations are sky-high. The cost of downtime is catastrophic, with customers willing to churn if their needs are not promptly met.

AWS EBS Volumes: 5 Ways to Optimize Performance and Costs

Amazon Elastic Block Store (EBS) provides block storage for applications that are running in the cloud. However, not every company is getting the most out of the EBS volumes it is using. Some companies pay too much for EBS volumes because they do not fully use the allocated storage and IOPS. Other organizations pay high prices because they are using the wrong disk type for their needs. This article explains five techniques you can use to optimize the performance of your EBS workloads.

Netdata Agent v1.23: Kubernetes monitoring & eBPF observability

Deploying and monitoring performance for an entire Kubernetes cluster can be complex. To simplify the process, we’ve added service discovery functionality to eliminate complex configuration, in addition to more advanced monitoring for viewing activity inside containers. Service discovery identifies k8s pods running on a cluster and immediately starts monitoring system performance. All containers are identified, regardless of complexity.

Report: Lambda use among Blue Matador users in 2020

It’s no secret that AWS Lambda adoption has grown steadily since AWS first released it in 2015—and for good reason. The benefits of adopting Lambda are many: leveraging Lambda eliminates the need to provision and manage servers, enabling teams to just focus on their code without the mental and operational overhead of worrying about the underlying infrastructure.