Operations | Monitoring | ITSM | DevOps | Cloud

Scaling HashiCorp's Cloud Platform - Dash 2021 (HashiCorp)

Identifying bottlenecks during times of high load is critical to building a scalable software platform. Stress testing is one way to simulate high load on a system and allows you to proactively capture potential bottlenecks before they impact customers. Once a solution is implemented to address the bottleneck, you need a way to measure success and find a new limit. See how HashiCorp Cloud Platform (HCP) has developed a stress testing framework which heavily relies on Datadog’s custom metric capabilities in combination with some out of the box integrations to give HCP engineers a comprehensive view of their platform and how they used these insights to scale their concurrent data-plane provisioning by 300%.

Panel: Handling Incident Response - Dash 2021 (Datadog, PagerDuty)

When customer-impacting downtime happens, it’s crucial that responders are prepared and can resolve these issues as quickly as possible. Knowing the right tools to use, from wherever you are working from, will help to have a well-defined strategy in place to come together as a team, work the problem, and get to a solution quickly. In this roundtable discussion, PagerDuty and Datadog engineers chat about incident responses and how we use all the tools at our disposal to respond quickly and effectively.

Roundtable: The Complexities of Cloud Migration - Dash 2021 (Datadog, LaunchDarkly, StockX)

Often when completing a migration project, you’re having your organisation straddle between two systems. You’re fighting habits and changing attitudes while also attempting to complete a high-risk operation. Every software team at one stage in their career will have to complete a migration. Whether it’s to improve scalability and performance, or transition between an on-prem to cloud solution, you’ll need a deep understanding of your current environment to create a strategy that minimises downtime for your team.

How to do serverless monitoring right #shorts

Monitoring CPU load and memory usage is common practice, but with serverless no action is required. In this video, we quickly explain that if your Cloud Run instances start hitting high CPU load, Google Cloud will automatically spin up new instances for you, and vice versa!

Various policy engines for Kubernetes policies - Saiyam Pathak

Kubernetes configurations are complex to manage across developers and operators. External tools like Helm, Kustomize cannot ensure environment-specific configurations and admission controllers provide a way to do this. Now, various tools have evolved over time that helps solve this problem - OPA Gatekeeper, Kyverno, Kubewarden and jsPolicy. In this talk during ContainerDays 2021, Saiyam Pathak from Civo goes through the need for a policy engine and discusses how each of the tools help along with the differences between them and where these are headed to.

CPG: Understand your consumers and meet them at the shelves

Product Pricing Isn’t Hocus Pocus. . . Scott Weitzman sits down with Alex Barnes to discuss what exactly Revenue Growth Management is, how it works, and why now is the most important time to focus on consumer retention. As Scott learns, this is not just some magical system that makes prices, but rather something that allows CPG companies to understand their consumers better and help meet them at the shelves.

Streaming Auth0 Logs to Datadog | Sivamuthu Kumar (Computer Enterprises, Inc.)

Are you using Auth0 in your application for user logins? How will you monitor the Auth0 logs and detect user actions that could indicate security concerns? In this session, we will see how Datadog helps you to extend security monitoring by analyzing Auth0 User activities in the logs. And also we will see how to set up threat detection rules to trigger notifications automatically based on them.

Maintaining Operational Sanity Across 100+ AWS Accounts | Eric Mann / Ryan Tomac (Vacasa)

At Vacasa, AWS accounts represent the unit of isolation for distinct applications & services in our software ecosystem, providing security benefits and operational autonomy for our teams as we scale. Managing accounts at this scale requires strong DevOps practices to maintain security, operational sanity, and uniform observability across the system. In this talk, we’ll cover the benefits of such an approach, the practices that make it possible, and the important role Datadog plays.

Democratizing Delivery: Seamless Observability for Optimal Application Performance |Ekim Maurer(NS1)

When application delivery performance issues happen, observability is critical to diagnosing the problem at hand. The adage “it’s always DNS” means that observability must extend to the foundational layers of the application delivery and access networking stacks. Yet granting administrative access to core network services like DNS and DHCP may run contrary to an organization’s least-privileged access policies. In this session, attendees will learn how global internet companies and enterprises use NS1 and Datadog to provide democratized DNS observability and reach optimal application performance.