Operations | Monitoring | ITSM | DevOps | Cloud

Latest News

Datadog's AWS re:Invent 2019 guide

AWS re:Invent is an annual gathering of tens of thousands of AWS staff, partners, and users for a full week of keynote sessions, feature announcements, customer case studies, hands-on workshops, and more. As in years past, we will be there with dozens of engineers, ready to answer your monitoring questions and show you the newest additions to Datadog.

Introducing Logz.io on the Azure Marketplace

The Azure Marketplace makes it easy for customers to identify, subscribe to, and use SaaS and API solutions by removing barriers in the procurement process. Using the payment terms they already have with Microsoft, Azure customers can click through a simple workflow to purchase an ISV solution without needing to negotiate new terms or speak directly with an Azure or ISV representative.

Best Tools for Git

Before diving into Git, it’s important to understand the concept of version control at a high level. Not all software developers use version control, and software development is not the only industry that does, but it is becoming more mainstream every year. You can bet you will encounter version control as a developer if you ever collaborate with others on a workplace team or on open source projects.

Site Reliability Engineering: Why you should adopt SRE

Site reliability engineering is a term coined by Google engineer Benjamin Treynor in 2003, when he was tasked with making sure that Google services were reliable, secure, and functional. He and his team eventually wrote the book on SRE, which is available online for free to anyone interested in researching and implementing SRE best practices.

Severity Matrix Updates

We’re on a mission to make responding to incidents a bit less chaotic. One of the best features we offer (we’re definitely not biased, no way) is a simple way to define how a severity gets determined when you open an incident. We call it the severity matrix, and today it has a new look. Previously, we had a preset list of conditions and impact that allowed you to pick a severity that matched them.

Practical Advice For Operationalizing IT Surveys

If you have successfully deployed an ITSM solution like Nexthink, you’ll next want to collect timely employee survey data to provide context for your support team. From our experience, we recommend you take the following steps: In most large-scale companies, internal communication is usually already owned by the corporate communications department.

Manage People, Projects, and Tasks Efficiently with Taagly

With technology playing an important role today, it is only logical for businesses to use software solutions to streamline communication, improve operations, and enhance efficiency. Many companies use task management applications to improve their operations and enhance productivity. People and tasks are two important elements that need to be managed well to eliminate bottlenecks and achieve success in business.

From Mayhem to Modernization: The Evolution of Critical Incident Management

Let’s face it, managing a critical incident has never been a walk in the park. Even in the “good old days,” before the great cloud revolution and the onslaught of digital transformations, an incident often meant mayhem. Processes were manual, time-consuming, and difficult to execute, document, and learn from. Getting all the right people in the “same room” at the right time was nearly impossible. Lots of time was wasted chasing down the right folks.

Business Intelligence and Log management - Opportunities and challenges

Business intelligence (BI) is all about making sense of huge amounts of data to extract meaningful, actionable insights. Log management tools such as Graylog, in turn, streamline data collection and analysis, so it’s easy to see how these two technologies make sense when coupled together.