Alerting

February product updates. Anomaly detection, incident management and better UX

The past couple of months were a bit hectic, and there's a good reason for that. We set out to create a better experience for our users, and that's exactly what we did! Besides making a lot of quality-of-life changes, we've introduced new features that we think will speed up how quickly you debug your applications and give you a whole new perspective on all things AWS Lambda.

IT Ops reporting is broken. BigPanda Unified Analytics can help

Your IT Ops execs and your service owners want easy-to-understand reports on application and service uptime and performance, IT Ops and NOC team performance, and incidents by source, severity and other parameters. To produce them, your IT Ops team is probably wasting precious hours every week, hours that your IT Ops team doesn't have, wrangling with spreadsheets and general-purpose reporting tools that are hard to use and update. BigPanda Unified Analytics can change all of that.

4 Business Disasters That Could Have Been Avoided With Real-Time Anomaly Detection

Digital, network-connected systems are transforming every aspect of business — from your mission-critical workloads to your most rarely used applications. But the increases in scalability and cost efficiency come at a cost. Because every system is so reliant on network connectivity, unplanned downtime is becoming increasingly expensive.

5 Trends Transforming Digital and IT Operations Management

In 2018, digital transformation ushered in a radical shift in how enterprises harness customer insights, technology capabilities, and rapid experimentation to drive revenue growth, profitability, and market leadership. Enterprises spent $1.3 trillion in 2018 on digital transformation technologies like public cloud platforms, microservices and containers, edge computing, machine learning, and artificial intelligence to improve customer experiences, business agility, and employee engagement.

Automate Resource Adjustments for Amazon EC2 with Opsgenie Actions: A Use Case

Opsgenie Actions enable you to automate manual, repetitive tasks so that your resources are freed up to concentrate on higher-value work. This blog post is the second in a series of use cases in which we discuss how Opsgenie works with various third-party automation platforms to automate these traditionally manual tasks, right from the Opsgenie console or mobile app, to reduce interruptions for your on-call responders and ultimately help your bottom line.

Postmortems Part 2: How to Adopt a Learning Culture

Culture is the way we do things together. It’s the secret sauce that results in happy, healthy teams that consistently meet their goals. It’s also the hardest thing to define, cultivate, and change in an organization. True cultural change requires more than creating and communicating policies. It takes collaboration, persistence, and experimentation.

When it comes to system metrics, skip vanity and promote transparency

At Hosted Graphite, our users rely on us for a heavy-duty component of their business: monitoring their stack. This is a responsibility we take very seriously and we realize how critical it is for a user to know right away whether the problem detected is related to their own systems or to our system. That’s why we choose to publish our internal system metrics to our public status page.