The latest News and Information on DevOps, CI/CD, Automation and related technologies.
The rapid pace of updates to operating systems, software frameworks, libraries, and programming language versions is a boon to fast-paced software development, but it has also come back to bite us: we now have to manage many dependencies, each with different versions, across different environments.
For many software engineers and developers, standard libraries and built-in objects are just not enough. To save time and increase efficiency, most developers build on work done by others. Whatever the coding problem, another programmer has likely already created a solution for it, so there is usually no need to repeat the problem-solving process. This is in the spirit of the Don't Repeat Yourself (DRY) principle.
An on-call schedule tells you and everyone on the team who will be the first responder when an issue happens in production. The on-call team member is responsible for investigating the issue and either fixing it themselves or bringing in other people who can help fix it. Having an on-call schedule is important for building reliable systems because making someone responsible for production issues ensures that those issues are not ignored.
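As a rough illustration of how a schedule answers "who is the first responder today?", here is a minimal sketch of a fixed round-robin rotation. The function name `on_call_for`, the team names, the start date, and the seven-day shift length are all assumptions made for this example, not something taken from a particular on-call tool.

```python
from datetime import date


def on_call_for(day: date, team: list[str], rotation_start: date,
                shift_days: int = 7) -> str:
    """Return the first responder for `day` under a round-robin rotation.

    Assumes every shift lasts `shift_days` days and the first person in
    `team` takes the shift beginning on `rotation_start`.
    """
    if not team:
        raise ValueError("team must not be empty")
    if day < rotation_start:
        raise ValueError("day is before the rotation started")
    # Number of whole shifts completed since the rotation began.
    shifts_elapsed = (day - rotation_start).days // shift_days
    # Wrap around the team list so the rotation repeats indefinitely.
    return team[shifts_elapsed % len(team)]


# Hypothetical team and start date (2024-01-01 is a Monday).
team = ["alice", "bob", "carol"]
start = date(2024, 1, 1)

print(on_call_for(date(2024, 1, 10), team, start))  # second week -> bob
```

A real schedule would also need overrides, handoffs, and escalation to the people who can help with a fix, which this sketch leaves out.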
Adding alerts across your monitoring tools is a proactive approach to reliability. But too many alerts can become counterproductive, because team members will start ignoring them or remove the alerting altogether. That is why you need a systematic approach to adding alerts and dealing with them.