How to stop guessing where developer friction lives

Most platform teams know friction is a problem. They also struggle to figure out exactly where that friction lives. Developers lose time in ways that rarely show up on a roadmap. In many organizations, creating a new service can require multiple approvals and several Slack threads. Spinning up infrastructure can mean filing a ticket and waiting days. Onboarding to a new codebase involves a scavenger hunt through stale Confluence pages. None of these feel like emergencies in isolation.

What is engineering operations? A guide to the discipline transforming software teams

Engineering teams are writing more code than ever. AI coding tools have made individual developers dramatically more productive, yet most organizations report moving only about 20% faster than before. The real constraint has always been the operational fabric surrounding the act of writing code. The processes, standards, visibility, and coordination that determine whether hundreds of engineers and thousands of services ship reliable software at speed have always been where the real work happens.

Debugging Encrypted Microservice Traffic with Speedscale's eBPF Collector

Production bugs that reproduce only under real traffic are among the most frustrating in software development. You can stare at your logs, add traces to your code, add instrumentation, and still not be able to see the actual requests that went over the wire. That gets even harder when the requests are encrypted and the system is a black box. You can use tools like Wireshark or Kubeshark to capture the requests.

Introducing Cortex as the Engineering Operations Platform

Software engineering is once again being forced to evolve. We are entering the era of infinite code, where the cost of writing code tends to zero. The data tells us that companies are only moving 20% faster than when humans wrote code by hand. We’re writing orders of magnitude more code than ever, yet our processes are barely keeping up with what we had before. The chaos and complexity are only being amplified by this new shift in how we work as developers.

Why measuring things openly is the first step toward a stronger engineering culture

Most engineering leaders know they should be measuring more. What holds many of them back is a quieter concern about whether the organization is actually ready to see the numbers. This tension, however, did not keep Ganesh Datta, our co-founder and CTO, and Randy Shoup, SVP of Engineering at Thrive Market, from diving down this rabbit hole on the Braintrust podcast.

Why business context is the missing link in engineering performance

Think about the last time your team shipped something impressive. It was probably on time, with clean code and great metrics. And yet somewhere along the way, the business priorities had shifted, and what the team delivered was no longer the top priority. The work was solid, but the direction just wasn't quite right anymore. This is usually what happens when engineers are disconnected from business context.

Breaking up with Backstage: Why "free" open source isn't always free

We’ve all had that moment when it seems like you've solved your company's biggest engineering challenges after a weekend of hacking something together. Your prototype is so good, you feel, that the obvious next steps are to build a slide deck, rally the team around your work, and prepare the ticker tape parade for your hero's welcome. Jeff Schnitter, a Solution Architect at Cortex, knows this roller coaster of an experience all too well after his time at Workday.

Troubleshooting Microservices with OpenTelemetry Distributed Tracing

Distributed tracing doesn’t just show you what happened. It shows you why things broke. While logs tell you a service returned a 500 error and metrics show latency spiked, only traces reveal the full chain of causation: the upstream timeout that triggered a retry storm, the N+1 query pattern that saturated your connection pool, or the missing cache hit that turned a 50ms call into a 3-second database round trip.

How frictionless development created a trillion-dollar mistake

We've all heard from an engineering leader about the exact moment they realized their architecture had gotten too complex. It usually happens when they look at a service map and realize it looks like a box of tangled Christmas lights. This cognitive overload is exactly what Steve Evans, the former SVP of engineering at Chegg, reflected on in a recent post on LinkedIn. He argued that microservices were a trillion-dollar mistake because we often over-build for future problems that never actually arrive.

Cortex and Semgrep partner to strengthen application security and drive continuous improvement

At Cortex, our mission is to help engineering organizations deliver reliable, secure, efficient software, faster. With Cortex, teams can standardize against best practices and create a culture of continuous improvement to achieve this. Today, we’re excited to announce a formalized partnership with Semgrep, a leader in modern static analysis and code security.