
Latest News

SRE Redefines IT Operations as Architect of Sustainable Systems

Site Reliability Engineering (SRE) is a term that’s gaining attention and momentum, and for good reason. SRE applies software engineering practices to problems in infrastructure and operations. Organizations build SRE teams with a couple of goals in mind, including increasing scalability and developing solid software systems.

Cloud Native Security Must Go Beyond the Perimeter

One month after the MOVEit vulnerability was first reported, it continues to wreak havoc on U.S. agencies and commercial enterprises. Unfortunately, the victim list keeps growing and includes organizations such as the U.S. Department of Health and Human Services, the U.S. Department of Energy, Merchant Bank, Shell, and others.

Getting started with AWS CloudWatch

Of the more than 100 services that Amazon Web Services (AWS) provides, Amazon CloudWatch was one of the earliest. CloudWatch was announced on May 17th, 2009, and it was the 7th service released, after S3, SQS, SimpleDB, EBS, EC2, and EMR. AWS CloudWatch is a suite of tools that covers a wide range of cloud resources. Its functions include collecting logs and metrics, monitoring, visualization and alerting, and automated action in response to changes in operational health.

Cloud connectivity and interoperability

The post-pandemic world has transformed our work habits and the way business is conducted. Organizations now take a hybrid approach to work, in which employees may work from an office, while travelling, or from a remote location. This fundamental shift has accelerated the pace of cloud adoption, as the cloud makes data access possible from anywhere, at any time. But the cloud brings with it a set of complexities that must be managed.

What is Scalability?

The number of simultaneous requests that an application can successfully support is a measure of its scalability. The point at which an application can no longer successfully handle additional requests is its scalability limit. This limit is reached when a key hardware resource is exhausted and different or additional machines are needed. Scaling these resources can include any combination of CPU and physical memory (different or more computers), hard disk (larger hard drives, less "live" data, solid-state drives), and/or network bandwidth (multiple network interface controllers, larger NICs, fibre, and so on).
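The idea that the scalability limit is set by whichever hardware resource runs out first can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every request is assumed to consume a fixed amount of each resource, and the `Resource` class, the `scalability_limit` function, and all capacity figures are hypothetical, not measurements from a real system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Resource:
    """A hardware resource with a fixed capacity (hypothetical model)."""
    name: str
    capacity: int          # total units available, e.g. millicores, MB, Mbit/s
    cost_per_request: int  # units one in-flight request is assumed to consume


def scalability_limit(resources):
    """Return (max simultaneous requests, name of the bottleneck resource).

    Each resource can sustain capacity // cost_per_request requests;
    the application as a whole is limited by the smallest of those.
    """
    if not resources:
        raise ValueError("at least one resource is required")
    limits = {r.name: r.capacity // r.cost_per_request for r in resources}
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


# Illustrative figures only (assumptions, not benchmarks).
resources = [
    Resource("cpu_millicores", capacity=16_000, cost_per_request=50),
    Resource("memory_mb", capacity=32_768, cost_per_request=64),
    Resource("network_mbit", capacity=1_000, cost_per_request=4),
]

limit, bottleneck = scalability_limit(resources)
print(f"limit: {limit} requests, bottleneck: {bottleneck}")

# Doubling network bandwidth (e.g. adding a second NIC) moves the bottleneck.
upgraded = [
    replace(r, capacity=r.capacity * 2) if r.name == "network_mbit" else r
    for r in resources
]
new_limit, new_bottleneck = scalability_limit(upgraded)
print(f"after upgrade: {new_limit} requests, bottleneck: {new_bottleneck}")
```

In this sketch, scaling the exhausted resource raises the limit only until the next resource becomes the constraint, which is why scaling often involves a combination of CPU, memory, disk, and network changes.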

CD for machine learning: Deploy, monitor, retrain

While there are an increasing number of off-the-shelf machine learning (ML) solutions that promise to adapt to your specific requirements, organizations that are serious about investing in ML for the long term are building their own workflows tailored exactly to their data and the outcomes they expect. To make full use of this investment, ML models must be kept up to date and working from the freshest available data.
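One way to picture "kept up to date and working from the freshest available data" is a simple retraining check that a deployment pipeline could run on a schedule. The sketch below is an assumption for illustration, not the workflow the article describes: the `should_retrain` function, its parameters, and the default thresholds are all hypothetical.

```python
from datetime import datetime, timedelta


def should_retrain(
    last_trained: datetime,
    newest_data: datetime,
    live_accuracy: float,
    *,
    min_accuracy: float = 0.90,                   # hypothetical quality floor
    max_staleness: timedelta = timedelta(days=7)  # hypothetical freshness budget
) -> bool:
    """Decide whether a deployed model should be retrained.

    Retrain when monitored accuracy has fallen below the floor, or when
    the newest available data is more than max_staleness newer than the
    data the model was last trained on.
    """
    if live_accuracy < min_accuracy:
        return True
    return newest_data - last_trained > max_staleness


# Example with fixed, made-up timestamps.
trained_at = datetime(2023, 7, 1)
print(should_retrain(trained_at, datetime(2023, 7, 3), live_accuracy=0.95))
print(should_retrain(trained_at, datetime(2023, 7, 20), live_accuracy=0.95))
print(should_retrain(trained_at, datetime(2023, 7, 3), live_accuracy=0.80))
```

A real pipeline would likely replace these two fixed thresholds with monitoring signals suited to the model and data in question.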

Azure Distributed Transaction Performance Monitoring

In this article, we will explore Azure Distributed Transaction Performance Monitoring using Serverless360’s new feature called BAM Duration Monitoring. Our primary focus will be effectively monitoring a long-running business process implemented using the dynamic combination of Logic Apps and Data Factory.