Latest News

Heroku to AWS Migration: Complete Guide for a Seamless Transition

Feeling overwhelmed by the complexities of migrating your application from Heroku to AWS? You're not alone. Many organizations face challenges like infrastructure management, scaling, and data transfer during this process. Migrating from a PaaS like Heroku to an IaaS like AWS offers benefits such as more control, scalability, and cost efficiency. However, the migration comes with its own challenges, especially the technical expertise required to carry it out smoothly.

Trusting AI for Incident Response: The Role of AI in Modern Incident Management

In an age where every second counts, the swift resolution of IT incidents can mean the difference between maintaining business continuity and enduring significant operational setbacks. As businesses increasingly embrace digitalization, the complexity and volume of incidents rise exponentially. This new reality calls for innovative approaches to incident management that can handle the unpredictability, scale, and urgency of modern IT ecosystems. Enter artificial intelligence (AI).

An Engineer's Checklist of Logging Best Practices

The best DevOps and SRE teams have shifted their approach to monitoring and logging their systems. These teams debug problems cohesively and rationally, regardless of the system’s complexity. Gone are the days of having a slew of logs that fail to explain the cause of alerts, system failures, and other unknowns.

What Does Archiving Mean? Definition and Examples

Archiving is a crucial concept in both personal and business data management, ensuring that important information is preserved for future use without cluttering up active systems. In today’s digital world, where vast amounts of data are generated every second, understanding the value of archiving and how it works can help organisations stay efficient, compliant, and secure.

Graphite vs Prometheus: Which One Is Best For Monitoring K8s?

Monitoring K8s is crucial to ensure that your applications run smoothly. But before you choose a monitoring solution, you need to ask which tools best fit your situation. There are several options, and Graphite and Prometheus are two of the leading ones. This article compares the two.

Canonical and OpenAirInterface to collaborate on open source telecom network infrastructure

Canonical is excited to announce that we are collaborating with OpenAirInterface (OAI) to drive the development and promotion of open source software for open radio access networks (Open RAN). Canonical will bring automation in software lifecycle management to OAI’s RAN stack, alongside additional infrastructure capabilities. This will better enable telcos to adopt open source software as the telecom industry transitions to Open RAN running on COTS hardware.

AWS NAT Gateway Pricing: Simple Strategies To Limit Costs

We are often asked where customers overspend on Amazon Web Services (AWS). NAT gateway costs, often driven up by misplaced data transfers, are definitely near the top of our list. This article will walk you through five steps you can take to find out which data transfers you’re overspending on and how you can eliminate those excess charges.

Simplifying AWS Testing: A Guide to AWS SDK Mock

Testing AWS services is an essential step in creating robust cloud applications. However, directly interacting with AWS during testing can be complicated, time-consuming, and expensive. The AWS SDK Mock is a JavaScript library designed to simplify this process by allowing developers to mock AWS SDK methods, making it easier to simulate AWS service interactions in a controlled environment. Primarily used with AWS SDK v2, AWS SDK Mock integrates with Sinon.js to mock AWS services like S3, SNS, and DynamoDB.

Interpreting your reliability test results

Gremlin’s default suite of reliability tests analyzes critical functions of modern services: scalability, redundancy, and resilience to dependency failures. Services that pass this suite of tests can be trusted to remain available during unexpected incidents. But what happens when a service fails a test? How do you take failed test results and turn them into actionable insights? This blog aims to answer that question.