
Latest Blogs

Building On-call: Continually testing with smoke tests

With the release of On-call, our system’s reliability had to be solid from the outset. Our customers have high expectations of a paging product—and internally, we would not be comfortable with releasing something that we weren’t sure would perform under pressure. While our earlier product, Response, was the core of a customer’s incident response process after an incident was detected, we’re now the first notification an engineer gets when something’s wrong.

Virtualization vs Cloud Computing: What's the Difference?

What’s the difference between virtualization and cloud computing? The server virtualization market is growing, driven by the need to modernize procurement procedures and manage compliance policies. Fortune Business Insights states that in 2023, 66% of businesses reported increased agility due to virtualization implementation. They also found that companies with over 100 computers have already adopted virtualization, and smaller businesses with fewer than 100 workstations are quickly following suit.

The Role of Machine Learning in Cybersecurity

Machine learning (ML) in cybersecurity dates back to the early 2000s and has become a key tool today in fighting cyber threats. According to Cybersecurity Ventures, global spending on cybersecurity products and services is expected to exceed $1.75 trillion cumulatively from 2021 to 2025, highlighting the increasing reliance on advanced technologies to combat cyber threats.

How AWS Regions Affect Cloud Costs (And How To Reduce Fees)

AWS is the most popular cloud service provider, partly due to its global data center network. This distribution enables organizations to configure their workloads to meet the needs of their global clients. However, AWS Regions charge different rates for almost everything, from compute and storage to data backup and retrieval services, and these cost variances can add up quickly.

Best Windows Server Monitoring Tools

Server monitoring involves continuously observing and tracking the performance, availability, and health of servers within an IT infrastructure, and it is a vital process for organizations that want to keep their servers performing well. Because server monitoring tools continuously track health and performance metrics, your organization can promptly detect issues such as hardware failures or software glitches and resolve them quickly.

How to Avoid Website Downtime

Website downtime refers to periods when a website is inaccessible or non-functional due to various issues. This can range from a few seconds to several hours or even days, depending on the severity of the problem and the efficiency of the recovery measures. During downtime, users cannot access the website's services or content, which can result in a loss of business and user trust.

Feature Friday #22: Don't fix, just warn

Did you know that CFEngine can simply warn about something not being in the desired state? Traditionally with CFEngine, you define your desired state and CFEngine works towards making that happen. Sometimes you might not want CFEngine to take action and instead warn that a given promise wants to change something. Let’s take a look at a contrived example.

Monitor Microsoft Fabric with Datadog

Microsoft Fabric is Microsoft’s new platform for all things data analytics—integrating key Azure data analysis products like Azure Data Factory, Azure Synapse, and Power BI into a unified platform. Fabric is intended to provide a one-stop shop where users with various levels of expertise across an organization can perform data analysis and collect insights.

Intelligent Alerting, Fewer Headaches: Insider View at ilert AIOps

You might have noticed that we released a series of AI-supported features last year. Intelligent alert grouping, developed to reduce alert fatigue, is the icing on the cake. With it, we combined all ilert AI features in a new powerful add-on that aims to reduce stress and give more clarity during IT incidents.

Troubleshooting Time Series Databases: Where Did My Metrics Go?

Complex modern applications rely heavily on observability, and metric monitoring is a crucial part of it. The most common metric monitoring process consists of four stages: data scraping, processing, storage, and visualization. If an issue arises, for example, when users ask, “I have already recorded metrics in the application, why can’t I see my metrics on Grafana?”, how should we troubleshoot it?