Metrics At Scale: Understanding When A Spike in Sales Isn't Good News (Part 1)
Why isn't a spike always good news? And why is it so important to find the relationships between time series metrics at scale?
For any company with an online presence, website traffic is important. The more visitors you attract, the more opportunities you’ll have to advertise your brand, establish relationships and ultimately sell your service or product. This is why a sudden drop in search engine traffic is a frightening prospect, since it ultimately leads to business losses and lower revenue.
As a long-time security professional, I’m always interested to hear about how companies like Datadog are keeping up with the changing security landscape. I can recall when the security organization was solely responsible for security, and we were focused on protecting the perimeter of our business. However, with the advent of the cloud, mobile, and web applications, that perimeter has disappeared.
Congratulations, VictorOps! OnPage would like to congratulate our competitors at VictorOps on their acquisition by Splunk. The acquisition validates the growing need for incident management and alerting platforms. As sensor technology (IoT) and AI-driven monitoring systems advance, automation becomes necessary to improve productivity and business resiliency. Incident management and alert automation are therefore essential.
I continue to be intrigued by the evolution of software architectures and their impact on business. In my 20+ year career, I’ve participated in four of these architecture transitions – the shift from client-server to the internet, the rise of 3-tier architectures underpinning rich internet applications, virtualization that upended the dominance of hardware providers, and now the shift to microservices-based architectures based on cloud infrastructure and software automation.
If you’re building a new application from scratch and are responsible for maintaining its availability and performance, you might wonder whether you should be monitoring logs or metrics. For us, it’s a no-brainer that you’ll want both: metrics are fast and efficient for proactively monitoring the health of your system, while logs are essential for helping to troubleshoot the details of the issue itself to find the root cause.
Graphite metrics are one of the most common metric formats in application monitoring today. Originally designed in 2006 by Chris Davis at Orbitz and open-sourced in 2008, Graphite itself is a monitoring tool now used by organizations both large and small.
We surveyed 1,264 chat users to find out, and we started with two seemingly simple questions. What we learned was fascinating and inspiring, so we gathered up the data and created the team chat guide.
With the proliferation of virtualization and high-availability architectures, teams are chasing 99.999% uptime like knights of old hunted unicorns. Many site reliability engineers find more comfort in the Boy Scouts’ motto, “Be prepared.” Your company’s Git server is mission critical to the daily operations of engineering and everyone it supports. How do you create business continuity in the face of unpredictable circumstances?