
Latest News

Build Organizational Trust With PagerDuty Business Response

Imagine the following scenario: A large retailer experiences a major IT incident that impacts their point-of-sale systems. Their on-call engineers are alerted to the issue and begin their work to resolve it immediately. Behind the scenes, teams are collaborating on a fix, but in the storefront, frustration and tension are growing. Customers are complaining about not being able to check out, and in-store personnel have no good answers as to why the outage happened—or when it will be resolved.

Pandora FMS port monitoring

The ancient Romans called what is now the Mediterranean Sea Mare Nostrum (Nostrum Mare) and regarded what is now the port city of Barcelona as the entry point to Hispania (today the Iberian Peninsula) from the city of Miseno, in the south of what is now Italy. Port monitoring was crucial for the ancient Romans, and it is for us too. “How?” you may wonder. Let’s see! Image: Roman trireme model, https://commons.wikimedia.org/wiki/File:Trireme_1.jpg.

Scaling Puppeteer & Playwright on Checkly with Terraform

Managing large numbers of checks by hand quickly becomes cumbersome. Luckily, Checkly's REST API allows us to automate most of the repetitive steps. Building on that API, the Checkly Terraform Provider takes automation one step further, enabling users to specify their active monitoring setup as code. In this article, we will be building on top of John Arundel's great intro from a few months back and showing how to manage multiple checks using groups and shared code snippets.

10 Tips for Implementing AIOps

With more and more people working from home and the ever-increasing complexity of IT infrastructure, it’s important to understand the best way to leverage Machine Learning (ML) and Artificial Intelligence (AI) to improve IT operations. ML and AI have promised to bring disruptive changes to IT operations, and many organizations have already adopted Artificial Intelligence for IT Operations (AIOps) or plan to do so soon. Yet implementing and deploying AIOps is still very challenging.

Infrastructure Monitoring With Amazon CloudWatch and OnPage Integration

The digitalization of business has transformed the world and its industries. Software that sustains digital initiatives is no longer categorized as a support function. Rather, it is integral to every business process. Modern organizations require infrastructure monitoring tools to detect anomalies and alerting systems to automate remediation processes.

Hiring and Managing IT During a Crisis

Martha Heller is CEO of Heller Search Associates, an IT leadership executive recruiting firm. Martha is a frequent keynote speaker at IT industry events and author of two books: The CIO Paradox: Battling the Contradictions of IT Leadership, and Be the Business: CIOs in the New Era of IT. She spoke about IT career trends on a recent webinar.

Is your infrastructure a transformation enabler or a roadblock?

According to research from McKinsey, only 16% of digital transformation efforts have successfully improved performance and equipped the organizations to sustain changes in the long term. That’s a shockingly low number. Why is this happening? And how can you prevent this from derailing your own digital transformation journey and putting your business models, revenue, and profitability at risk?

Kubeflow operators: lifecycle management for the ML stack

Canonical, the publisher of Ubuntu, releases Charmed Kubeflow, a set of charm operators to deliver the 20+ applications that make up the latest version of Kubeflow, for easy consumption anywhere, from workstations to on-prem, public cloud, and edge. Visit Charmed-kubeflow.io to learn more. Kubeflow provides the cloud-native interface between Kubernetes, the industry standard for software delivery and operations at scale, and data science tools: libraries, frameworks, pipelines, and notebooks.