Managing application updates in production and ensuring the reliability of software releases in Kubernetes environments can be challenging. Small changes can sometimes lead to unforeseen issues in production. These unexpected problems, combined with the lack of scalability and the high costs associated with managing complex solutions, can be daunting.
To increase business agility, IT organizations are deploying dynamic, modern architectures enabled by virtualization technologies. That includes containers, elastic clouds, microservices, and virtual machines. If you are rethinking your IT stack, you must also reconsider its management. IT operational silos limit business velocity.
While the Biden administration aggressively pushes federal agencies to modernize their IT infrastructures, ITOps managers are left wondering how to do so without making network management more complex than it already is. Modernization necessitates the addition of more tools, which can easily lead to tool sprawl and increase technical debt. Managers already use a multitude of vendor-specific tools to monitor different devices and applications. The last thing they want is to add more.
Containerized microservices have been the gold standard for cloud computing since they replaced the monolithic architecture over a decade ago. The flexibility, scalability, and velocity they enable for teams make them an obvious choice. Yet a strict one-service-per-function interpretation doesn’t quite serve everyone, especially when architectures get large. We’ll discuss how flexibility in service architecture might be the way to go.
Every software-driven business strives for optimum performance and user experience. Observability, which allows engineering and IT Ops teams to understand the internal state of their cloud applications and infrastructure based on available telemetry data, has emerged as a crucial practice in support of that goal. For years, application performance monitoring (APM) was the de facto practice and tooling that organizations used to keep tabs on their critical systems.
Goutham Veeramachaneni, a product manager at Grafana Labs, and Carrie Edwards, a senior software engineer at Grafana Labs, are both contributors to the Prometheus open source project. This post, which they wrote together, was originally published on the Prometheus.io blog in March 2024. The OpenTelemetry project is an observability framework and toolkit designed to create and manage telemetry data such as traces, metrics, and logs.
Artificial Intelligence (AI) technologies are evolving at breakneck speed. Today's cutting-edge model may become obsolete tomorrow. This rapid evolution, while exciting, presents a challenge: how to leverage the best current AI capabilities without being tied to a single model or provider. At InvGate, we apply a strategy that we call “agnostic AI” (as in platform-agnostic).
To become battle-tested, you need to go through battles, not just read books or mentor newcomers. Both are helpful, but the stakes are low. On the other hand, high-stakes jobs, such as running a big project or managing a team, are hard to get when you lack experience. So how can we solve this dilemma? Enter incident response.
I wish someone had told me that I shouldn’t hop between frameworks. In my experience, time spent context switching as a beginner is wasted effort, just like learning four programming languages in your first year. If I’d spent a solid year learning how to deploy services on AWS, then when it was time to learn Azure, I’d have seen more similarities than differences and found it a lot easier to pick up a second public cloud.