Operations | Monitoring | ITSM | DevOps | Cloud

The latest News and Information on DevOps, CI/CD, Automation and related technologies.

Live PostgreSQL Monitoring on Azure & AWS with pgNow

Managing credentials gets trickier as your Flyway project grows, especially when databases contain sensitive data. In this episode, Tony and Tonie break down how Flyway’s Property Resolvers help you keep secrets safe by pulling them securely from local or cloud-based stores, avoiding hardcoding them into config files.

Seeing the Bigger Picture: Why Security Needs Depth, Not Just Products

A recent BBC article, “Weak password allowed hackers to sink a 158-year-old company,” outlined a serious security lapse. This case reinforces the message that we, at Teneo, advocate every day: true resilience comes from defense in depth, i.e. policy, product and process, not just tools at the edge. In a recent customer engagement, we discussed a transition from VPN to ZTNA. ZTNA offers enhanced security, including continual checking, improved segmentation and a minimized attack surface.

Use Telegraf Without the Prometheus Complexity

Every system needs observability. You need to know what your CPU, memory, disk, and network are doing, and maybe keep an eye on database query latency or Redis connection counts. But setting that up isn’t always simple. You start with a couple of shell scripts. Then come exporters. Then Prometheus. Before long, you’re managing scrape configs, tuning retention, and watching dashboards fail under load after two days of data.

Lessons from Alaska's outage: Redundant ≠ resilient

Last Sunday, Alaska Airlines suffered a three-hour outage that led to more than 200 flight cancellations and disrupted 15,600 passengers. The culprit? “A critical piece of multi-redundant hardware at our data centers, manufactured by a third-party, experienced an unexpected failure. When that happened, it impacted several of our key systems that enable us to run various operations, necessitating the implementation of a ground stop to keep aircraft in position.”

Automate Disk Space Management on Windows with Resolve

Struggling with managing disk space issues on your servers or virtual machines? See how you can use Resolve to automate disk space addition and expansion on Windows systems, saving time, reducing manual errors, and eliminating the need for high-level administrative access. Whether you’re a system admin, IT operations engineer, or automation enthusiast, this demo highlights how you can streamline infrastructure tasks using intelligent automation.

Automating Linux Disk Expansion with Resolve: Add & Extend VM Disks in Minutes!

Running into disk space issues on your Linux servers or virtual machines? In this step-by-step demo, we show how Resolve’s powerful automation platform can help you automatically add and expand disk space on Linux systems, eliminating manual processes, reducing human error, and improving operational efficiency. Whether you're a system admin, IT operations engineer, or automation specialist, this demo highlights how to streamline critical disk management tasks that normally require elevated access and technical knowledge.

Build. Release. Run. Repeat. But Where's the Control?

In every engineering organization, from fintech unicorns to 20,000-seat global banks, delivery happens in a loop. Code gets built. Releases get pushed. Systems run 24/7. Then it all happens again. This cycle isn’t an opinionated lifecycle dreamed up by a consultant or vendor; it’s just the reality of software delivery today.

What to expect in a Gremlin workshop

Gremlin workshops give your team hands-on training with Gremlin so they can get real results and dramatically improve your reliability. Full transcript: The goal of our workshops is really to accelerate you and the team in your reliability journey. Whether you're starting out for the first time, or you're a more advanced user, this workshop is really designed to take you to the next level.

From Dial-Up to Colo: The Impact of AI on Data Center Design

In this episode of Uplink, we’re joined by Jay Smith, VP of Data Center Operations and Engineering at Evocative. With nearly 30 years in the industry, Jay unpacks how data centers are adapting to support AI’s massive power and cooling demands. This episode covers: why colo is thriving in the AI era; liquid cooling and rear-door heat exchangers; powering 275kW racks and beyond; how AI inference is shifting compute to the edge; and career opportunities in infrastructure without a degree.