Drew Barontini

Product Builder

Issue #70
13m read

Delivery Confidence

Right after releasing a product update, we spotted a handful of small bugs. The update had test coverage, full QA, and usage in a staging environment before release. Yet we still found issues. One of them was a bug that could only be replicated on production. I merged in not one, not two, but three separate Pull Requests before finally resolving it. It was related to a build configuration that only happened on the live site. No test or QA process caught it. And I couldn’t easily replicate it anywhere. The only way to resolve it was trial-and-error in the live environment.

Everything before real users use your product in a live environment is a simulation. You do your best to simulate the final product and its usage to increase your confidence. So you write tests, create QA processes, and let your team and select customers use it to give feedback. You take the feedback and determine what, how, and when to address it.

But, somehow, things still sneak through. No matter how much you test, a pesky bug surfaces only in production when real users start using it. How did we miss that? you ask yourself. Then a chorus of “worked on my machine” and “it never looked like that” echoes as you consider what happened. And the cycle continues, operating on hopes and dreams that it will eventually fix itself.

But it doesn’t. Why? Because software is complex. There are too many variables. And it’s not just the possible bugs and regressions introduced with new code; you have to contend with different browsers, devices, viewport sizes, internet speeds, and any number of user-specific configurations. The permutations are endless. So you live in the tension of spending time testing, but not too much time to miss deadlines and halt your team’s progress.

But where do you draw the line? And how do you know when it’s time to release? These are the questions I’ve been thinking about the past few weeks. While I have yet to arrive at a definitive answer, I developed a set of principles to make the process more reliable:

  1. Limit the scope of changes.
  2. Test the changes with real users.
  3. Test the changes in the live environment.

This idea is called Delivery Confidence, which lives in the Clarity Current of the Claritorium and Value Creation of Equilio.

The three pillars are:

  1. Scope Compression to reduce the surface area of changes into the smallest segment of generative value.
  2. Proxy Progression to improve the fidelity of usage and increase your confidence in the signals that emerge.
  3. Reality Immersion to validate your changes in a real, live environment where you can understand actual behavior.

Scope Compression

When you ask an engineer to estimate work, their calculation is based on two vectors:

  1. The horizontal breadth of changes: how many parts of the system the work touches.
  2. The vertical depth of changes: how much complexity the work adds within existing parts of the system.

Now, let’s be clear: estimates, in software, are largely useless. Humans are terrible at estimating effort, and software is too complex to estimate well. No answer to “how much work is it?” will ever yield a real increase in confidence about how long the work will take. Even in codebases I know intimately, performing familiar tasks, surprises always emerge.

All changes require local testing, branching in version control, review, tests, and deployment. There’s a minimum time requirement, and these parts of the process aren’t typically in an engineer’s calculus of effort. Their focus is purely on the isolated expected work, not the full lifecycle required to bring it to completion.

Not to mention that parts of the process have high variability in time. For example, how long it takes to review a Pull Request (PR) is highly variable, dependent on your team’s process and individual responsiveness. I’ve watched the smallest PRs sit for days because no one got around to reviewing them. Without a predictable process, the onus falls on the PR author to make constant requests for review.

The larger the scope, the larger the surface area and complexity of work and, with it, the lower the confidence of the estimate. We can then express scope and confidence as inversely proportional, drawing a clear relationship we can use to define mitigation strategies.

Higher scope = lower confidence

Lower scope = higher confidence

Scope is the surface area of changes.
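To make the relationship concrete, here’s a toy model (my illustration, not a formula from this essay): treat each changed component as an independent chance of a production surprise, so confidence decays multiplicatively as scope grows. The 10% surprise rate is an arbitrary assumption.

```python
def delivery_confidence(changed_components: int, p_surprise: float = 0.1) -> float:
    """Probability that no changed component produces a surprise in production.

    Assumes each change independently has a `p_surprise` chance of going wrong.
    """
    return (1 - p_surprise) ** changed_components

# Confidence drops quickly as scope grows:
for scope in (1, 3, 10):
    print(scope, round(delivery_confidence(scope), 2))
```

Even with a generous 90% per-change success rate, a ten-component change ships with roughly a one-in-three chance of no surprises at all, which is the inverse relationship in miniature.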

Your job is to constantly reduce, to simplify, to constrain the scope of work. Because another thing humans are bad at? Simplifying. Every conversation discussing work-in-progress creates a cacophony of requests to add, to increase, to expand the work. This happens so frequently I often use my hands to mime the idea of shrinking scope, pushing my hands together like I’m forming a snowball. And what a fitting metaphor: a snowball. Because, if you let it, scope will grow so unwieldy you lose track of the original intention, like a snowball gaining unstoppable speed down a mountain.

Saying no to adding more isn’t a permanent deferral; it’s an expression of patience, discipline, and a willingness to accept immediate improvements over extended timelines and reduced confidence.

Are the changes better than what exists today?

If yes, release it. Let the limited scope build momentum and generate feedback to create direction for the next iteration. Value, delivered continuously, is generative. To create a generative process, you need to compress the scope of work into its smallest unit of impactful value—an atomic unit of impact.

But how do you define the scope? And what constitutes a “unit of impactful value”?

Flows!

The sequence of steps a user takes as they move through your product defines a flow.

When you define a singular flow that moves the user to achieve an outcome, it focuses your efforts. You can ignore the dreaded edge cases that pull focus from the primary use case. And you create constraints as scope guardrails.

Principles

Here are the principles of Scope Compression:

  1. Surface area is risk. More changes increase your risk. Reduce the surface area to increase your confidence.
  2. Flows are units of value. Define one simple flow to focus your efforts on the highest yield of value for the effort.
  3. Edges are deferred by design. Don’t get lost in the edge cases and fringe concerns. Stay focused on the core design.

Practices

Here are the practices of Scope Compression:

  1. Flow Definition: Hold a focused session for defining the end-to-end process a user completes to achieve a goal.
  2. Change Budgeting: Create a list of changes you expect. Treat it like a budget you compare to as you do the work.
  3. Out-of-Scope Register: Create a list of changes that are explicitly out of scope to remind the team to stay focused.
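The budget and register practices above can live in a simple shared document, but here’s a minimal sketch of the bookkeeping they imply. The names and fields are mine, not a prescribed tool:

```python
from dataclasses import dataclass, field


@dataclass
class ScopeBudget:
    flow: str                    # the single flow this release serves
    budget: set[str]             # Change Budget: the changes you expect to make
    out_of_scope: set[str] = field(default_factory=set)  # Out-of-Scope Register
    actual: set[str] = field(default_factory=set)        # changes as they land

    def record(self, change: str) -> None:
        self.actual.add(change)

    def overruns(self) -> set[str]:
        """Changes that landed but were never budgeted: scope creep."""
        return self.actual - self.budget


budget = ScopeBudget(
    flow="user resets password",
    budget={"reset form", "reset email", "token expiry"},
    out_of_scope={"SSO", "passwordless login"},
)
budget.record("reset form")
budget.record("rate limiting")   # unplanned addition
print(budget.overruns())         # {'rate limiting'}
```

The point isn’t the code; it’s the comparison. Anything showing up in `overruns()` is the snowball starting to roll.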

Proxy Progression

When an engineer does the work, they test in a local instance of the product running on their computer. They write some code to do a thing, and then they check to see if the code does what they expected it to. Rinse and repeat until the work is complete—or complete enough to move to the next stage.

Typically, the work is then deployed to a staging or preview environment. This environment is very similar to the production environment, but it’s a safe space for testing without the risk of breaking the live product. My favorite approach is to automatically generate a dedicated preview URL for each PR. It makes the testing process much smoother than fighting over a single staging environment.

The code is reviewed in a PR, which includes looking at the code, adding comments, and testing the actual behavior. So, at this stage, there are two engineers (at least) involved in testing.

Each person involved in testing the work is attempting to represent the actual user. But up until an actual user tests it, you’re simulating usage with different people—the proxies.

The engineers write the code, test it, and review the expected functionality. But they are focused solely on what was changed.

Then a QA person reviews the work and makes sure there aren’t regressions breaking other parts of the product unintentionally. They follow a standard process, without the organic exploration a real user brings.

Then the larger internal team—PMs, designers, engineers, other stakeholders—test the feature and give more qualitative feedback.

Then you open up testing to a subset of real users, slowly releasing as you monitor data and determine when to release it to everyone.

This may not be your team’s exact process, but it’s what I’ve seen many times. And this isn’t for every update; it’s only for work with enough risk to warrant the added steps.
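That last step, releasing to a subset of users while you monitor, is often implemented with a percentage-based rollout. Here’s one common shape of that idea (a sketch of the general technique, not the author’s specific system): hash each user into a stable bucket and compare it to a rollout percentage.

```python
import hashlib


def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministic rollout check: the same user always gets the same answer,
    and raising `percent` only ever adds users, never removes them."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100   # stable bucket in 0..99
    return bucket < percent


# Start at 5%, watch the data, then widen toward 100%.
user = "user-42"
print(in_rollout(user, "new-checkout", 5))
print(in_rollout(user, "new-checkout", 100))  # True for everyone
```

Hashing the feature name together with the user id means different features roll out to different slices of users, so the same early adopters aren’t always the guinea pigs.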

Proxy Levels

The three proxies of Proxy Progression are:

  1. The Builders: The engineers writing the code and delivering the core functionality.
  2. The Testers: The internal product team of designers, engineers, strategists, QA, and other key stakeholders.
  3. The Users: The real customers who use your product actively.

The Builders make it work.

The Testers make sure it works well.

The Users validate the value of the work.

Confidence grows as validation moves away from the builder and toward the user.

Each level has a point of diminishing returns, where too much time creates noise, not signals. You should be mindful of the quality of feedback at each stage. Limiting the scope helps the work move faster through the stages, but, when the feedback stalls, ask yourself:

Is this something a better proxy could answer—or something only a user can answer?

If it’s something only a user can answer, focus on compressing the scope and shipping the smallest viable slice. Get it in users’ hands and validate more quickly. Develop a system for a subset of users to test work early. Don’t live in a land of simulation and proxies for long. The early proxies are only there to increase your confidence, but you won’t know until real users put it to the test in a real environment. No simulation or proxy will change that.

Principles

Here are the principles of Proxy Progression:

  1. Each proxy has a ceiling. Once you’ve learned what you can, move to the next level as quickly as you can.
  2. Simulation is not validation. Everything before a real user tests it is a simulation. Don’t live in the simulation for long.
  3. Progression over precision. You learn along the way, even if it’s just from the simulation. Keep moving and learning.

Practices

Here are the practices of Proxy Progression:

  1. AI Coding Agents: Equip your engineers with coding agents to move faster and be able to experiment more effectively.
  2. QA Sessions: Bring all the testers together in a live session to test work and share feedback to make the end product better.
  3. Private Betas: Get new work into real users’ hands as quickly as possible, using private betas to capture feedback early.

Reality Immersion

Most work is marked as “complete” before it hits reality with real users using it in a live environment. The engineers have already moved on to the next thing, leaving valuable validation work as an afterthought. It’s live in production, so it’s done, right? Not quite. This moment is when simulation transforms into validation. And sometimes what you initially release misses the mark, but is capable of getting there with focused fixes and improvements. Feedback serves as a compass, guiding the product’s development to a North Star of delivered value for paying customers. But you only get there through focused effort and intentional usage in the live product.

All the work before this is preparation, but now it’s put to the test and validated (or not). This is when confidence is realized. You see the work existing in the production environment with real users trying it. And that’s why getting to this stage quickly is critical. Reducing scope and having users test it early helps, but you need the work to live in the real environment to truly test it. All of those preview environments are simulations, too. The new work needs to mix and mingle with your entire product and user base. Only then will you have confidence in the value delivered, and how it should inform future work, which includes improvements to what you just released.

It’s like building a boat on land. You can design the boat perfectly, inspect every joint, and test it in controlled conditions, but you won’t know if it really works until it’s in the water. The water has a current, debris, and unexpected behaviors you can’t simulate. And the more insidious truth is that production (the water) is treated as a finish line, not the starting point. As the boat moves through the water, everyone watches from shore while building the next boat instead of staying in the water and making modifications in live conditions—adjusting the rudder, reinforcing weak seams, and responding to how people actually steer it.

There are two key parts of Reality Immersion:

  1. Launch into reality by getting the boat in the water to determine its buoyancy.
  2. Stay with it under force to make sure it can be steered in the right direction.

Reducing scope and moving through the simulation with haste isn’t about cutting corners; it’s about moving from the land to the water so you can make sure the boat works. Teams intuit this when building new products, but not when adding new features to existing ones.

The work really begins when people start using it.

I’ve said this for years about 0-1 products. And now I’m saying it about improvements to existing products, too. Reality Immersion is how you get the boat in the water and make sure it floats, moves, and takes people where they need to go—to reach their destination.

Principles

Here are the principles of Reality Immersion:

  1. Production is where truth appears. Get the work into the final context so you can really start to validate your work.
  2. Shipping is the start of learning. Don’t view the release to production as the end; view it as the beginning of the process.
  3. Time in reality over simulation. Don’t over-index on time in the simulation. Focus on time spent in the real environment.

Practices

Here are the practices of Reality Immersion:

  1. Early Releases: Create small releases that go all the way to production.
  2. Post-Launch Response: Monitor feedback and work with engineers to make improvements once the work is live.
  3. Behavior-Driven Iteration: Make changes based on how the product is actually used, not what you assumed.

The Throughline

Confidence is earned by answering unknowns, validating assumptions, and mitigating risks.

By limiting the scope and getting the product into real users’ hands in the live environment, you increase confidence in delivering results.

But remember: there’s no shortcut to learning.

Your confidence only grows as you embrace discomfort and learn by doing.

