The Gap Between Code and Cowork

In enterprises, people sit on a spectrum that runs from technical to non-technical. The power of agents landed on one end of it first: coding with Claude Code. Claude Cowork has since stretched that power toward the other end. But it stretched fast, and I think the speed left a gap in the middle.

Claude Cowork was marketed as the golden path: a text box where anyone can just ask a question and get an answer - and it will work as long as there is a connector ecosystem your organization has enabled, that supports your question. The way I see it though, is that path only became possible because of everything Anthropic's product team learned mastering agentic loops in Claude Code.

That is the point I want to make. Technical users drove adoption. They picked up Claude Code fast, and underneath, Claude Code is a simple thing: a loop where a model runs until it has finished a task.

The difference is control. A developer can shape that loop, because a developer thinks in systems. A knowledge worker is more inclined to type a question and take whatever comes back.

Golden path breaks easily

For finance workflows specifically, I have found far more value sitting between Claude Code and Cowork than in either one alone. There is a definite gap there.

Cowork is good at getting you the artifact, the answer you came for. But a lot of that output is not repeatable, not auditable, and not something I would fully trust as somebody in Finance where accuracy has a high bar.

The reason is the data. You are usually pulling different datasets from different sources that were never stitched together by a common primary key. And models are still weak at numerical reasoning, at confirming the data is complete, at the nuances and the edge cases because of missing the whole context. It is true because these systems just don't have the soft-context that a human would. (This is what the tech circles on X are harping on about when they say the context layer is where the next trillion-dollar opportunity lies. More on that in another post.)

What closed the gap for me

I have had a different experience with Claude Code - and I wouldn't say that I am on either of the extreme ends of the spectrum. But I am definitely somewhere in the middle, and I have managed to make coding agents work for me.

What has actually helped me in Finance is building pipelines I can run again, with my own verification loops baked in. That gives me a health check on every single sheet I pull together. For example, for each one, I can see:

whether the checksum matches what the GL says
whether the number of deals matches what the CRM has
whether the headcount in APAC lines up with the people Salesforce shows in that region
whether the figure I pulled from Workday is complete
whether there are pending or in-progress journal entries from Accounting that are not being counted
whether the number in Salesforce has changed since my last pull, so I know I am not working with something stale

These are not three or four checks. They are thousands of edge cases I deal with day to day, and the pipeline catches them so I do not have to.

The capability overhang

OpenAI has a name for the moment we are in: a capability overhang. Sam Altman has framed it as the distance between what the models can already do and what the world has figured out how to get out of them, a gap so large that even a frozen model would leave enormous value on the table.

That gap is exactly what informs the investment in the deployment ecosystem: the forward-deployed engineers who turn raw capability into auditable pipelines and productized systems for enterprises.

So we are nowhere near doomsday. The labs are making two bets at once. One is that the models get good enough that all of this plumbing and pipeline-building eventually evaporates. The other is that it does not, which is why they are pouring money into the deployment layer anyway.

Takeaway

I do not know which bet wins. Nobody does. That is the whole point of an overhang: the gap between what the models can do and what we have figured out how to extract is wide enough that the question of whether the plumbing eventually evaporates stays open for a while.

But that is exactly why the move is to build now. The value is sitting there unclaimed, and the people who can turn raw capability of coding agents into work systems are the ones capturing it while the window is open. Finding the gap in the middle of the spectrum, and equipping yourself with the mental models and systems thinking that unlock these latent capabilities, is how you do that.

That is why I see AI as a true force multiplier, and not in the abstract. I feel it in my own work. I have been able to expand the scope of what I do dramatically, and as an ambitious person, that has made the work more fulfilling.