
Latest Posts

How to Notify Your Team of Errors: Email vs. Slack vs. PagerDuty

Site Reliability Engineering (SRE) and Operations (Ops) teams rely heavily on notifications. We use them to know what’s going on with application workloads and how applications are performing. Notifications are critical to ensuring SREs and Ops teams can resolve errors and reduce downtime. They’re also crucial when monitoring environments, not only in production but also during the dev-test or staging phase.

Accelerating Dev Workflows: Terminal-driven Debugging

The pursuit of Digital Transformation and DevOps practices has led to several benefits, such as increased deployment rates and better collaboration across teams. However, it has also led to endless abstraction, an increase in responsibilities, and many new tools (Kubernetes, hybrid clouds and all their services, etc.). This increase in complexity has turned observability into an essential component of all ecosystems.

The Spike Protection Bundle with Index Rate Alerting

For DevOps teams that want to accelerate release velocity and improve reliability, logs can unlock the insights you need to move faster. But for managers and budget owners, logging can be an unpredictable pain. Trying to estimate logging spend, especially with the adoption of microservices and container-based architecture, seems like an impossible task.

Announcing LogDNA Agent 3.2 GA: Take Control of Your Logs

The LogDNA Agent is a powerful way for developers and SREs to aggregate logs from their many applications and services into an easy-to-use web interface. With only three kubectl commands, installation is quick and simple for any number of connected systems. To help control the logs that are stored and surfaced in the LogDNA web interface, users can set Exclusion Rules, which exclude certain queries, hosts, and tags directly from the UI.

Using LogDNA To Troubleshoot In Production

In 1947, a moth found its way into a relay of the Mark II computer in the Computation Laboratory where Grace Hopper was employed. Since that time, software engineers and operations specialists have been plagued by “bugs.” In the age of DevOps, we can catch many bugs before they escape into a production environment. Still, some occasionally slip through, and when they do, they can spawn all kinds of unexpected problems.

Using LogDNA and your Logs to QA and Stage

An organization’s logging platform is a critical infrastructure component. Its purpose is to provide comprehensive and relevant information about the system to specific parties, whether the system is running or still being built. For example, developers need detailed and accurate logs when building and implementing services locally or in remote environments so that they can test new features.

Using LogDNA to Debug in Development

Developing scalable and reliable applications is a serious business. It requires precision, accuracy, effective teamwork, and convenient tooling. During the software construction phase, developers employ numerous techniques to debug and resolve issues within their programs. One of these techniques is to leverage monitoring and logging libraries to discover how the application behaves in edge cases or under load.

Why Logging Matters Throughout the Software Development Life Cycle (SDLC)

There are multiple phases in the software development process that need to be completed before the software can be released into production. Those phases, which are typically iterative, are part of what we call the software development life cycle, or SDLC. During this cycle, developers and software analysts also aim to satisfy nonfunctional requirements like reliability, maintainability, and performance.

Announcing the LogDNA and Sysdig Alert Integration

LogDNA Alerts are an important vehicle for relaying critical real-time pieces of log data within developer and SRE workflows. From Slack to PagerDuty, these Alert integrations help users understand if something unexpected is happening or simply if their logs need attention. This allows for shorter MTTD (mean time to detection) and improved productivity.