Operations | Monitoring | ITSM | DevOps | Cloud

Elasticsearch Audit Logs and Analysis

Security is a top-of-mind topic for software companies, especially those that have experienced security breaches. Companies must secure data to avoid nefarious attacks and meet standards such as HIPAA and GDPR. Audit logs record the actions of all agents against your Elasticsearch resources. Companies can use audit logs to track activity throughout their platform to ensure usage is valid and log when events are blocked.

9 Best Practices for Application Logging that You Must Know

Have you ever glanced at your logs and wondered why they don't make sense? Perhaps you've misused your log levels, and now every log is labelled "Error." Alternatively, your logs may fail to provide clear information about what went wrong, or they may divulge valuable data that hackers could exploit. These issues can be resolved.

Sponsored Post

Announcing: Bitbucket for APM

Raygun's latest integration with Bitbucket gives you code-level insights into your traces, directly in APM. Today, Raygun expands its suite of integrations for APM, introducing the latest addition - Bitbucket. Once your Raygun account is integrated with Bitbucket, you'll be able to see method source code pulled directly from your repository when inspecting a method in APM. If this sounds interesting to you but you use GitHub instead of Bitbucket, don't worry, we've got you covered for that too. Gain greater context into code execution and get to the root cause of slow performance, faster.

Assign Read-Only Access to Users in Logz.io

Cloud monitoring and observability can involve all kinds of stakeholders. From DevOps engineers to site reliability engineers to software engineers, there are many reasons today’s technical roles would want to see exactly what is happening in production, and why specific events are happening. However, does that mean you’d want everyone in the company to access all of the data?

Root cause analysis using Metric Correlations

As the complexity of systems and applications continues to evolve and change, the number of metrics that need to be monitored grows in parallel. Whether you’re on a DevOps team, an SRE, or a developer building the code yourself, you may find many of these components fragmented across your infrastructure, making it increasingly difficult to identify the root cause of downtime or abnormal behavior.

Logging Agents Vs Log Libraries

Log management has been around for a long time, but how we manage our logs has changed profoundly over the years. For effective log management, there are times when you may have to trade off the new for the old, and vice versa. A clear understanding of log agents and log libraries will help assess what works best for different applications and infrastructures.

How Uptime.com Can Help Troubleshoot a Server Outage

Everyone has heard about the 3 AM wakeup call, but what about those troublesome issues that dig at your team and eat away at your SLA hours? Hard-to-diagnose issues can strike at any time. They drain your team, hurt morale, impede the customer experience… it’s just a whole mess. These kinds of incidents are ones that test what “response” really means to your organization, as fixing them is not always a simple task. Something has gone wrong.

Infrastructure as Code - IAC for Azure

Infrastructure as code and automated deployment and scale-up/down in Azure are becoming the new normal. Solution architects and system administrators are becoming coders, and scripting is becoming part of their day-to-day job. In parallel, a raft of vendors is providing products that try to avoid this need to script and to address the shortage of staff with the skills to script and code this now necessary functionality.

Incident Review - AWS Outage Led To Spikes In Response Times For Applications Using AWS Services

On Tuesday, August 31, users across large parts of the West Coast (US-West-2 region) were impacted by major spikes in response time. Some of AWS’ most critical services were affected, including Lambda and Kinesis. Tracking Service Level Indicators (SLIs) and Service Level Objectives (SLOs) is a must for SRE teams.