
Latest Blogs

Creating a Project Post Mortem | The Why's and How's of Finishing Projects

A project post mortem is a lethal-sounding term that seeks to answer the question: did this project work? Was it worth the investment and the time, and if it wasn’t, can you learn from it? As the term implies, the project must be “no more” or “ceased to be” or “bereft of life.” Creating this document requires a time investment, and it’s tempting to just move on to the next project.

Automate Resource Adjustments for Amazon EC2 with Opsgenie Actions, A Use Case

Opsgenie Actions enable you to automate manual, repetitive tasks so that your resources are freed up to concentrate on higher-value work. This blog post is the second in a series of use cases in which we discuss how Opsgenie works with various third-party automation platforms to automate these traditionally manual tasks, right from the Opsgenie console or mobile app, reducing interruptions for your on-call responders and ultimately helping your bottom line.

Kafka Logging with the ELK Stack

Kafka and the ELK Stack are usually part of the same architectural solution, with Kafka acting as a buffer in front of Logstash to ensure resiliency. This article explores a different combination: using the ELK Stack to collect and analyze Kafka logs. As explained in a previous post, Kafka plays a key role in our architecture. As such, we’ve constructed a monitoring system to ensure data is flowing through the pipelines as expected.

The Sentry Workflow - Triage

We get it — errors suck. And you don’t want to spend too much of your time fixing them, dealing with them, investigating them, etc. In our Workflow blog post series, we’ll help you optimize your, well, workflow, from crash to resolution. To quote “the second-most frequently quoted writer in The Oxford Dictionary of Quotations after Shakespeare,” Alexander Pope, “to err is human.” There will always be errors, even in code written by the best developers.

Stackdriver usage and costs: a guide to understand and optimize spending

Google Stackdriver is a cloud-based managed services platform designed to give you visibility into app and infrastructure services. Stackdriver’s monitoring, logging and APM tools make it easy to navigate between data sources to view performance details and find the root causes of any issues.

How to Uncover Deep-Dive Citrix Analytics for Effective Troubleshooting

Monitoring, observability and analytics have become common phraseology in the Citrix world. Every Citrix project, whether it’s a new deployment, expansion, upgrade, or cloud migration, essentially needs analytics in the IT toolkit. Analytics means not just data, but insights obtained through continuous monitoring: periodic measurements of critical performance metrics that convey a lot about the health and availability of the Citrix Virtual Apps and Desktops session.

Postmortems Part 2: How to Adopt a Learning Culture

Culture is the way we do things together. It’s the secret sauce that results in happy, healthy teams that consistently meet their goals. It’s also the hardest thing to define, cultivate, and change in an organization. True cultural change requires more than creating and communicating policies. It takes collaboration, persistence, and experimentation.