Automating Identity Lifecycle Management

Identifying every user who makes a request to a given system is vital to ensuring that action is only taken by, and information only returned to, those who need it. This happens in two steps: first, the requester is identified (authentication), and then that identity is used to determine which parts of the application they are allowed to access (authorization).
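
The two steps described above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the in-memory `USERS` and `PERMISSIONS` tables, the function names, and the status strings are all hypothetical, and a real system would delegate to an identity provider and store hashed credentials.

```python
import hmac
from typing import Optional

# Hypothetical in-memory stores, for illustration only.
# A real system would keep hashed credentials in an identity provider.
USERS = {"alice": "s3cret", "bob": "hunter2"}
PERMISSIONS = {"alice": {"reports", "billing"}, "bob": {"reports"}}


def authenticate(username: str, password: str) -> Optional[str]:
    """Step 1: establish who the requester is. Returns the identity or None."""
    expected = USERS.get(username)
    if expected is None:
        return None
    # Constant-time comparison avoids leaking information through timing.
    if hmac.compare_digest(expected.encode(), password.encode()):
        return username
    return None


def authorize(identity: str, resource: str) -> bool:
    """Step 2: decide whether that identity may access the resource."""
    return resource in PERMISSIONS.get(identity, set())


def handle_request(username: str, password: str, resource: str) -> str:
    """Run authentication first, then authorization, and report the result."""
    identity = authenticate(username, password)
    if identity is None:
        return "401 Unauthorized"
    if not authorize(identity, resource):
        return "403 Forbidden"
    return "200 OK"
```

Keeping the two checks in separate functions mirrors the distinction in the text: a failed identity check and a failed permission check are different outcomes and can be reported differently.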

SigNoz Community Call - August 2021

SigNoz is an open source alternative to DataDog and New Relic. In this community call, we discuss the technical architecture in detail and how data flows through the backend services. We also discuss steps to make SigNoz more performant, including ways to benchmark performance at different loads. We hold a community call on the last or second-to-last Saturday of every month.

DevOps' Problem with Speed-to-Market Explained: IBM MQ, Multi-Middleware Role in Deploying New Applications & Updates

If your organization is frustrated with how long it takes to roll out new applications and updates, it is not alone. Speed-to-market is an obsession at many companies today (see call-out box below), so anything that restricts or slows it down is a problem.

Artificial Intelligence will be the commander of the future wars

Artificial intelligence is one of several hot technologies that have the potential to transform the face of combat in the coming years. The Joint Artificial Intelligence Center was established by the Department of Defense to win the artificial intelligence war. According to some visions, AI might enable autonomous systems to execute missions, achieve sensor fusion, automate activities, and make better, faster judgments than people. AI is developing quickly, and those objectives may be met soon.

How the Pandemic Impacted the Government's Cloud Migration Plans

“Cloud-first” has been a government imperative for many years, but the pandemic usurped this strategy, making “cloud-now” a priority. The results have been transformational. The cloud made wide-scale government telework possible, but it’s also given agencies the opportunity to test drive new cloud applications and experience the scalability and security benefits first-hand.

Has the firefighting stopped? The effect of COVID-19 on on-call engineers

With digital becoming the primary channel for work, education, shopping, and entertainment in the last 18 months, it’s no surprise that workloads for technical teams and on-call engineers have increased. Data from PagerDuty’s inaugural platform insights report, The State of Digital Operations, highlights this reality. As of July 2021, the average number of events managed daily by PagerDuty is 37 million, with 61,000 of those being critical incidents.

Model-driven observability: Taming alert storms

In the first post of this series, we covered the general idea and benefits of model-driven observability with Juju. In the second post, we dived into the Juju topology and its benefits with respect to entity stability and metrics continuity. In this post, we discuss how the Juju topology enables grouping and management of alerts and helps prevent alert storms, and how that relates to SRE practices.