Latest News

Streamline Incident Response with Komodor and Squadcast

As Kubernetes grows in popularity as the container orchestration platform powering the microservices revolution, managing, monitoring, and responding to incidents at scale becomes more complex. Challenges in real production environments include gaining full visibility into the health of your clusters and environment while also handling real-time incident management and response.

Using DORA metrics Mean Lead Time for Changes to deliver iterations faster

Raise your hand if you like shipping changes quickly. (Yes, let's assume that everything you're shipping has value and isn't a vanity project.) Chances are that you, the person reading this now, agreed with the above. When you start on a project, big or small, you want to keep changes moving along and avoid getting stuck. The less time between the beginning and end of a project, the faster you can shift your focus to other things.

AWS CloudTrail vs CloudWatch: Features & Instructions

In today’s digital world, cloud computing is necessary for businesses of all types and sizes, and Amazon Web Services (AWS) is undoubtedly the most popular cloud computing service provider. AWS provides a vast array of services, including CloudWatch and CloudTrail, that can monitor and log events in AWS resources. This article will compare AWS CloudWatch and CloudTrail, looking at their features, use cases, and technical considerations.

Site Reliability Engineering: Definition, Principles & How It Differs From DevOps

Site crashes and outages can cost hundreds of thousands in lost revenue and inconvenience users. Site Reliability Engineering helps build highly reliable and scalable systems, which is particularly important for companies that depend on their software to support customers performing critical operations. Hiring a Site Reliability Engineer is the best way to ensure a software system stays up and running at all times.

A guide to dynamic application security testing (DAST)

Dynamic application security testing (DAST) is a critical security measure for modern software delivery pipelines. It evaluates the security of web applications by actively testing them while they run, simulating real-world attacks to identify vulnerabilities. As the cybersecurity threat landscape has evolved, DAST has emerged as a key tool for enforcing application security in continuous integration and continuous delivery (CI/CD) pipelines.

The Role of AWS Direct Connect in Hybrid Cloud Networking

Amazon Web Services (AWS) Direct Connect is a service that connects your business to AWS without using the public internet. With it in place, data transferred between your company and AWS travels over a private, dedicated network connection instead. AWS Direct Connect provides virtual interfaces that connect directly to public AWS services, making the transfer of information quicker and more streamlined for your company.

How to Monitor a Heroku App with Graphite, Grafana and StatsD

This article explores how to monitor Heroku apps efficiently using MetricFire's HostedGraphite plugin and Grafana dashboards. By combining these tools, developers can gain valuable insights into their app's performance and resource utilization. This guide provides step-by-step instructions on setting up MetricFire, integrating StatsD, and creating comprehensive Grafana dashboards for effective monitoring and debugging.

Pipelines Full of Context: A GitLab CI/CD Journey

Do you know what version of your software is running in production? How often is that software deployed, and was it deployed right before last week's P0 incident? Which dependencies are being deployed along with that software, and are any of them potential security risks? These are all common observability questions that may be difficult to answer.