
Latest News

Streamline incident management with BigPanda's offering in the Datadog Marketplace

BigPanda is a domain-agnostic AIOps platform that helps organizations detect and resolve incidents in their complex IT environments. By unifying and correlating data from monitoring, change, and topology tools, BigPanda enables teams to quickly pinpoint the root cause of issues and prevent costly outages.

Are you an MS Teams shop? We've got you covered with Blameless Incident Resolution

We have an exciting announcement. Blameless is providing early access to our Microsoft Teams integration. SRE and engineering teams can now resolve incidents faster without leaving the comfort of their favorite messaging tool. With the Blameless incident resolution product, Microsoft Teams users can now reduce toil in routine incident response processes through automation, codify processes with checklists, and craft retrospectives with the ‘add to timeline’ command.

Who is on standby? Simple question, simple answer.

In our feature session for the current Enterprise Alert release, we were asked if it was possible to make the on-call page available to every employee regardless of whether they have a user account in Enterprise Alert or not. This option has existed in Enterprise Alert for a long time, but admittedly it is not very well documented. So I would like to take this opportunity to show you what the on-call overview can offer you and how to share the on-call page.

Copy and Paste Multi-Team Schedules

With the release of Enterprise Alert 9, we have massively expanded our capabilities for tighter integration with almost any source system imaginable, and our front end has also received some much-requested updates. Among them are our multi-team schedules. These allow simple, clear on-call planning for several teams across different time zones, which is especially useful for international companies.

Integration of Enterprise Alert 9 with Azure Monitor

Our Azure Monitor connector provides seamless two-way integration of Enterprise Alert 9 with Azure Monitor. Once added to your Enterprise Alert instance, the connector automatically reads your Azure Monitor alerts and triggers alert notifications, e.g. to your team members on duty. It also synchronizes the alert status from Enterprise Alert 9 to Azure Monitor, so that when an alert is acknowledged or closed, the status of the corresponding alert in Azure Monitor is updated as well.

Error Budgets Explained (And How to Make One for Your Team)

Wondering what error budgets (EBs) are and how they are useful? We explain what they are, how they are defined, and how they can help your team. An error budget is the amount of acceptable unreliability a service can have before customer happiness is impacted. If a service is well within its budget, the developers can take more risks in their releases. If not, developers need to make safer choices.
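As a rough illustration of the definition above, an error budget is often computed as the complement of a service level objective (SLO) applied to a measurement window. The sketch below assumes a request-based SLO; the function name, the field names, and the example numbers are hypothetical and not taken from the article.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize a request-based error budget for one measurement window.

    slo_target is the fraction of requests that should succeed (e.g. 0.999).
    All names here are illustrative assumptions, not an established API.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_requests <= 0 or failed_requests < 0:
        raise ValueError("request counts must be positive / non-negative")

    # The budget is the share of requests that are allowed to fail.
    allowed_failures = (1.0 - slo_target) * total_requests
    remaining = allowed_failures - failed_requests

    return {
        "allowed_failures": allowed_failures,
        "remaining": remaining,
        "consumed_fraction": failed_requests / allowed_failures,
        # Within budget: the team can afford riskier releases.
        "within_budget": remaining > 0,
    }


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 400 failures, 99.9% SLO.
    report = error_budget(0.999, 1_000_000, 400)
    print(report)
```

With a 99.9% target over one million requests, roughly 1,000 failures are allowed, so 400 failures leave the service within its budget. Once the failures exceed the allowance, within_budget becomes False, which is the signal to make safer release choices.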

The 7 SRE Principles [And How to Put Them Into Practice]

Whether you're just adopting SRE or optimizing your current processes, we can help. We'll explain the 7 key principles of SRE and how to put them into practice. So, what are the SRE principles? SRE is a method that operates through principles. Instead of prescribing specific solutions, it guides you with best practices. These SRE principles help organizations decide what's best for them. Once you understand the principles, you can apply them in many areas.

Deliver Real-Time Alerts From Facility Management Systems

Facility managers, including service technicians, are expected to operate their facilities safely to meet the expectations of customers. They focus on the smooth functioning and maintenance of many components that fall within the scope of their facility. Typical components include roads, pavements, HVAC and plumbing systems. As a facility manager, staying on top of these siloed and geographically dispersed systems can be challenging.