
Latest News

Scaling Puppeteer & Playwright on Checkly with Terraform

Managing large numbers of checks by hand quickly becomes cumbersome. Luckily, Checkly's REST API allows us to automate most of the repetitive steps. Building on that API, the Checkly Terraform Provider takes automation one step further, enabling users to specify their active monitoring setup as code. In this article, we will be building on top of John Arundel's great intro from a few months back and showing how to manage multiple checks using groups and shared code snippets.

10 Tips for Implementing AIOps

With more and more people working from home and the ever-increasing complexity of IT infrastructure, it’s important to understand the best way to leverage Machine Learning (ML) and Artificial Intelligence (AI) to improve IT operations. ML and AI have promised to bring disruptive changes to IT operations, and many organizations have already decided to adopt Artificial Intelligence for IT Operations (AIOps) or to do it soon. Yet, implementing and deploying AIOps is still very challenging.

Is your infrastructure a transformation enabler or a roadblock?

According to research from McKinsey, only 16% of digital transformation efforts have successfully improved performance and equipped the organizations to sustain changes in the long term. That’s a shockingly low number. Why is this happening? And how can you prevent this from derailing your own digital transformation journey and putting your business models, revenue, and profitability at risk?

Kubeflow operators: lifecycle management for the ML stack

Canonical, the publisher of Ubuntu, releases Charmed Kubeflow, a set of charm operators to deliver the 20+ applications that make up the latest version of Kubeflow, for easy consumption anywhere, from workstations to on-prem, public cloud, and edge. Visit Charmed-kubeflow.io to learn more. Kubeflow provides the cloud-native interface between Kubernetes, the industry standard for software delivery and operations at scale, and data science tools: libraries, frameworks, pipelines, and notebooks.

Monitor serverless configuration changes with Datadog Deployment Tracking

Serverless architectures remove the need to provision and maintain infrastructure components like servers and containers, so developers can focus on writing and deploying code. However, serverless architectures also introduce new challenges to monitoring and observability. Teams building serverless applications can iterate quickly and deploy frequent code and configuration changes, making it difficult to track what impact these changes have on their applications.

Here are 4 Ways SRE Helps New Employees Onboard

Onboarding is an essential yet challenging part of the hiring process. As your organization matures, more of its processes become unique. This makes it harder for new employees to get up to speed. Investing in custom processes and tooling to achieve your specific goals is a valuable practice, but you must balance this with an investment in onboarding.

How to Reduce Your AWS Costs

Amazon Web Services (AWS) is a comprehensive cloud computing platform, providing an array of services to manage a business’s data infrastructure and help it grow. Its wide range of services gives you multiple options for an elastic approach to optimizing your costs. However, many users struggle to control their expenditure due to a variety of factors. Getting a complete understanding of how to reduce your AWS operational costs can be a daunting task.

Introducing Predictive Rebalancing: An application-driven approach for reliably utilizing spot instances

Here at Spot by NetApp we’re continuously innovating our machine learning models used for identifying and predicting spot capacity usage and interruptions for all major public clouds (AWS, Azure and GCP). These proprietary algorithms expand the ability to utilize spot capacity for production and mission-critical workloads, allowing our customers to enjoy up to 90% cloud compute cost reduction with SLAs and SLOs that guarantee availability.

Shipping Terraform Logs with the Logz.io Provider & API

Logz.io has deepened its partnership with HashiCorp over the last few months. Recently, we announced our integration with their service mesh, HashiCorp Consul. Simultaneously, we have worked on and completed an integration with their infrastructure orchestrator (a.k.a. infrastructure-as-code or IaC), Terraform. IaC tools take manual configurations and treat them as, well, code (along with procedures, build guides, run books, etc.).