Outages Happen. Now What?

Network outages happen more often than you think. We may not experience them directly or even know they're occurring at all. When outages affect household names like Facebook, Amazon, Microsoft, and others, however, we're sure to find out after the fact that there was an issue. Depending on the user's activities and the duration of the issue, stress and frustration levels can vary. When a marketer can’t get that ground-breaking advertisement up on Facebook, they can get antsy.

Webinar Recap: How Observability Impacts SRE, Development, and Security Teams

In today’s fast-paced and constantly evolving digital landscape, observability has become a critical component of effective software development. Companies increasingly rely on machine and telemetry data to fix customer problems, refine software and applications, and enhance security. However, while more data has given teams more insights, the value derived from that data isn’t keeping pace with its growth. So how can these teams derive more value from telemetry data?

Sumo Logic platform video

The Sumo Logic SaaS analytics platform makes the world's applications reliable and secure 24x7x365. Learn how Sumo Logic ingests data at scale, helps find and troubleshoot issues fast, and secures user experiences. We integrate with hundreds of out-of-the-box apps, making it easy to get more from your data quickly. Whether your data resides in multiple clouds or on-premises, you can now monitor, troubleshoot, and secure your apps from ONE platform powered by logs.

Analytics in Squadcast | Visualize Team and Organization Level Analytics | MTTA MTTR | Squadcast

Analyzing incident data plays a key role in doing SRE better. Squadcast's Analytics Dashboard helps you analyze the performance of your Organization or Team over a given time period. It also gives you more insight into past outages that affected your systems.

OnPage - Never Miss a Critical Alert Again (For IT, Clinical Comm. and Collab. & Crisis Comm.)

OnPage is an Incident Alert Management platform that elevates critical notifications to the right person on call to remediate critical events. With Alert-Until-Read capabilities, dynamic digital schedules, escalation policies, incident reports, and redundancies, OnPage aims to ensure that critical alerts are never missed. OnPage serves many industries, including healthcare, information technology, managed services, IoT, and manufacturing. With 250+ integrations, the solution extends incident alert management to popular ITSM (ticketing), RMM, monitoring, and cybersecurity tools. On the healthcare front, OnPage integrates with popular scheduling, IoT, nurse call, and EMR systems.

GitHub Tried to Change the Checksum for Release Archives. You Should Start Hosting Your Own.

Yesterday, GitHub changed how the archives it provides are generated. The result of this change surprised developers, triggering pipeline failures all over the world in most ecosystems. According to this GitHub post, this is a consequence of recent changes to Git itself, released almost six months ago and only now deployed within GitHub, with unforeseen impact. This change has thankfully been reverted.

Datadog's commitment to OpenTelemetry and the open source community

The OpenTelemetry (OTel) project is an open source initiative with the goal of providing vendor-neutral standards and tools that enable users to collect telemetry from any source in their environment and send it to any backend. A core tenet of Datadog is to provide a single, unified platform for customers to easily collect and monitor all of their observability data, regardless of where it comes from.

Test Observability with Sumo Logic

The software industry has seen many evolutions. There is a new disruption in the market every five years or so. Software testing cannot remain isolated from the latest trends and technologies. Testing strategies need to keep up with agile development, faster deployments, and increasing customer demand for reliability and user-friendly interfaces. They need to be able to grow just as quickly and just as reliably as the business logic.