Better integration tests in Cursor using proxymock

Cursor is fantastic at cranking out code changes. I recently used it to splice a brand-new downstream API call into one of our Go microservices, and the diff looked great. The unit tests finished before I lifted my coffee mug, yet I still had zero certainty the change would survive contact with real traffic. That gap is all about integration tests, so I paired Cursor with proxymock and the outerspace-go demo service to prove the behavior end to end.

Top 7 reasons behind poor user experiences and how to fix them

User experience (UX) has become a pivotal factor in a product's success. You've probably experienced it yourself by clicking away from a slow website or abandoning an app that just doesn't work right. For product owners, the difference between success and failure often comes down to how smoothly users can interact with the product. But here's the problem: creating that seamless experience is tougher than it looks.

AI-Suggested Alert Thresholds for Mobile Telemetry

Life is pretty good. I’ve shipped a mobile app and I’m (happily) drowning in telemetry. Battery impact, time in foreground/background per screen, crash rates, slow frames, network retries – the works. The data is brilliant; the challenge is turning signals into reliable alerts that catch real issues relevant to my app’s functions. So… what should I actually listen for, and where should I set the thresholds?

Five ITOps best practices to stay ahead during major third-party outages

When external providers fail (whether it was the CrowdStrike outage last year, the AWS outage last month, or the Cloudflare DNS outage yesterday), the symptoms inside your environment often look like internal issues: timeouts, login failures, API errors, service degradation, or sudden spikes in dependency-related alerts. It’s natural for teams to start searching through their own infrastructure first, but none of these symptoms clearly points to your systems as the root cause.

OnlineOrNot's lessons from Cloudflare's outage on 2025-11-18

On 2025-11-18 at 11:48 UTC, Cloudflare declared an incident affecting the global network (that also affected OnlineOrNot). OnlineOrNot monitors websites, APIs, web apps, and cron jobs, and also provides status pages. Although we partially mitigated the issue by enabling a fallback to AWS-based monitoring, between 13:00 UTC and 14:33 UTC failing checks went unreported, heartbeat checks over-reported, and status pages were unavailable.

Navigating External Outages: How Selector Cuts Through the Cloudflare Noise

Yesterday’s widespread Cloudflare outage reminds us how crucial external dependencies are to the stability of our own applications. When a key edge provider like Cloudflare goes down, the impact on your internal monitoring systems can look like a catastrophic internal system failure, triggering a massive storm of alerts and sending engineering teams into frantic, misdirected debugging sessions.

The database professional of the future: headlines from Redgate's Keynote at PASS Data Community Summit 2025

Redgate took the main stage earlier today to open PASS Data Community Summit with our keynote, where we shared our vision for the future of the database development experience – one driven by speed, safety, and the intelligent use of AI. As data estates grow in scale and complexity, and as organizations push to deliver software faster than ever, the role of the database is undergoing significant change.

OTel Updates: Complex Attributes Now Supported Across All Signals

OpenTelemetry now supports maps, heterogeneous arrays, and byte arrays across all signals. Here’s where these new types shine — and where simple primitives still fit naturally. If you’ve been working with OpenTelemetry for a while, you’re likely familiar with the straightforward key-value approach to attributes. It’s simple, fast, and works well with how most telemetry backends store, index, and query data.

What is AWS Fargate for Amazon ECS?

As cloud applications moved from VMs to containers and then to microservices, the amount of background work needed to keep everything running grew just as quickly. You gain speed and flexibility, but you also end up managing clusters, scaling rules, and capacity choices that don’t really add to the product you’re building. AWS Fargate steps in right there. It lets you run your ECS tasks without looking after any servers at all.