Metrics At Scale: Understanding When A Spike in Sales Isn't Good News (Part 1)
Why could a spike itself not always be good news? Why is it so important to find the relationships between time series metrics at scale?
In A Comedy of Errors, we talk to engineers about the weirdest, worst, and most interesting application and infrastructure issues they’ve encountered (and resolved) over the years. This week, we hear from Or Weis, co-founder and CEO of Rookout. Rookout’s focus is on collecting data in a seamless, immediate way that maximizes a developer’s insight into live code.
When you think of the top 100 sites in the world, you think of high-traffic domains and pages coded to perfection. In fact, even the most popular sites in the world have errors hidden behind the scenes that are still visible in your browser’s developer tools. These can affect your experience as a user directly, create inaccurate tracking data and security vulnerabilities, and even lose the company revenue.
We surveyed 1,264 chat users to find out, and we started with two seemingly simple questions. What we learned was fascinating and inspiring, so we gathered up the data and created the team chat guide.
If you’re building a new application from scratch and are responsible for maintaining its availability and performance, you might wonder whether you should be monitoring logs or metrics. For us, it’s a no-brainer that you’ll want both: metrics are fast and efficient for proactively monitoring the health of your system, while logs are essential for helping to troubleshoot the details of the issue itself to find the root cause.
With the proliferation of virtualization and high availability architecture, teams are chasing 99.999% uptime like knights of old hunted unicorns. Many site reliability engineers find more comfort in the Boy Scouts’ motto, “Always be prepared.” Your company’s Git server is mission critical to the daily operations of engineering and everyone they support. How do you create business continuity in the face of unpredictable circumstances?
Graphite metrics are one of the most common metric formats in application monitoring today. Originally designed in 2006 by Chris Davis at Orbitz and open-sourced in 2008, Graphite itself is a monitoring tool now used by organizations both large and small.
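To make the format concrete, here is a minimal sketch of Graphite's plaintext protocol, in which each metric is a single line of the form `<metric path> <value> <timestamp>`. The function names, the metric path `shop.sales.count`, and the host are illustrative assumptions; port 2003 is the usual default for Carbon's plaintext listener, but your setup may differ.

```python
import socket
import time
from typing import Optional


def format_graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Build one plaintext-protocol line: '<path> <value> <timestamp>\\n'."""
    if timestamp is None:
        # Graphite expects seconds since the Unix epoch.
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path: str, value: float, host: str = "localhost", port: int = 2003) -> None:
    """Send a single metric over TCP (host and port are assumptions)."""
    line = format_graphite_line(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric name, fixed timestamp for a reproducible example.
example = format_graphite_line("shop.sales.count", 42, 1500000000)
```

Calling `send_metric("shop.sales.count", 42)` would only succeed if a Carbon listener is actually reachable at the assumed host and port.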
Congratulations, VictorOps! OnPage would like to congratulate our contenders at VictorOps on their acquisition by Splunk. This acquisition of VictorOps validates the growing need for incident management and alerting platforms. As technology advances with sensor technology (IoT) and monitoring systems utilizing AI, automation is necessary to achieve improved productivity and business resiliency. Incident management and alert automation are therefore essential.
We are excited to announce our recent support of .NET Standard 2.0 and ASP.NET Core 2 applications for Raygun Crash Reporting. The update is for developers needing to target the .NET Standard 2 APIs. Our new provider targets both .NET Standard 1.6 and .NET Standard 2.0, so it can be used with both .NET Core 1 and .NET Core 2 applications. At the time of writing, it is just the .NET Core provider and ASP.NET Core provider that are .NET Core 2 compatible.
The Linux Audit framework is a kernel feature (paired with userspace tools) that can log system calls such as opening a file, killing a process, or creating a network connection. These audit logs can be used to monitor systems for suspicious activity.
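As a rough illustration of how such logs might be inspected, the sketch below splits an audit record into `key=value` fields. The sample line is a made-up, shortened record in the style typically seen in `audit.log`, and `parse_audit_line` is a hypothetical helper, not part of the audit userspace tools.

```python
import re

# Matches key=value pairs; values are either double-quoted or run to the next space.
_PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit_line(line: str) -> dict:
    """Split one audit record into a dict of its key=value fields."""
    fields = {}
    for key, raw in _PAIR.findall(line):
        fields[key] = raw.strip('"')  # drop surrounding quotes, if any
    return fields


# Illustrative record (assumed layout, abbreviated for readability).
sample = (
    'type=SYSCALL msg=audit(1364481363.243:24287): arch=c000003e syscall=2 '
    'success=no exit=-13 comm="cat" exe="/bin/cat" key="sshd_config"'
)
record = parse_audit_line(sample)

# A failed syscall on a watched file is the kind of event worth flagging.
suspicious = record["type"] == "SYSCALL" and record["success"] == "no"
```

In practice, the userspace tools that ship with the framework are the more robust way to query these logs; a parser like this is only meant to show the shape of the data.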