Focus Is a Scaling Law, Whether You’re Scaling People or Agents

There’s a formula from 1967 that explains why your 200-person team ships slower than when it was 30. It wasn’t written about organizations; it was written about CPUs. But the principle still applies.

The Law

Amdahl’s Law describes the theoretical speedup of a task when you add more processors. The insight is disarmingly simple: if any fraction of the work is inherently serial (i.e. can’t be parallelized), then adding more processors hits diminishing returns fast. If 10% of your workload is serial, you will never get more than a 10x speedup no matter how many cores you throw at it; not 100x, not 50x, just ten.

The formula is clean:

Speedup = 1 / (S + (1 – S) / N)

Where S is the serial fraction and N is the number of processors. As N approaches infinity, speedup converges to 1/S. The serial fraction is the ceiling.
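The convergence is easy to see numerically. Here is a minimal sketch of the formula above, showing that with a 10% serial fraction the speedup creeps toward, but never reaches, 10x:

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Theoretical speedup with n processors when a fixed fraction
    of the work is serial (Amdahl's Law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

# With S = 10%, speedup approaches, but never reaches, 1/S = 10,
# no matter how large n gets.
for n in (10, 100, 1_000, 10_000):
    print(n, round(amdahl_speedup(0.10, n), 2))
```

Going from 10 to 100 processors roughly doubles the speedup; going from 1,000 to 10,000 barely moves it. That asymmetry is the whole point of the law.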

Now, the computing world didn’t just accept this ceiling. Gustafson’s Law (1988) offered the optimistic counterpoint: as you add processors, you can also scale the problem size, and when you do, the serial fraction shrinks as a proportion of total work. The entire GPU revolution is a testament to this; people restructured their problems to be massively data-parallel, effectively defeating Amdahl’s pessimism through reformulation.

This matters for the organizational analogy, because the same option theoretically exists for teams. You can surrender to your serial fraction, or you can reformulate how you work to shrink it. Most organizations believe they’re doing exactly this when they expand their product areas and grow the scope of the problems they tackle. But scaling the problem size in an organization isn’t as clean as scaling a matrix multiplication across more GPU cores. Expanding scope often introduces new coordination demands, new stakeholders, and new dependencies that increase S even as the total workload grows. The Gustafson escape hatch is real, but it requires deliberate restructuring of the work itself, not just doing more of it.

In computing, the serial portion is usually some shared resource: a memory bus, a lock, a dependency chain. In organizations, the serial portion is decision-making, alignment, and communication. And unlike transistors, people get tired.

Why 1.0 Teams Feel Fast

Most people who’ve built products have felt this: the 1.0 team moves at a pace that feels almost unreasonable. Twelve engineers ship what later takes sixty engineers twice as long to iterate on.

Not every 1.0 team gets this right; plenty flounder. But the ones that ship well tend to share a structural property: low serial fraction. The product doesn’t exist yet, so there are no live users to protect, no incumbent features to preserve, no competing roadmaps to reconcile. The requirements are comparatively clear (build the thing that does X), decisions are fast because the option space is constrained, the communication graph is small, and everyone shares the same mental model.

In Amdahl’s terms, S is low and most of the work is parallelizable. Each person or small group can take a chunk of the problem and run, and synchronization costs are minimal because the goal is singular and legible.

This is the part teams remember fondly and mistakenly attribute to culture or talent density. Those matter, but the dominant variable is the serial fraction; during a well-scoped 1.0, it’s naturally compressed.

The Post-1.0 Drag

Then the product launches and users arrive. Success creates options, and options destroy focus.

The roadmap fragments into a surface area: growth, retention, monetization, platform concerns, partner requests, technical debt, regulatory compliance. Each of these is legitimate, each pulls in a different direction, and each requires alignment across people and teams that didn’t need to coordinate before.

The serial fraction explodes. Not because the people got worse, but because the work changed character. Success forces a phase transition from discovery to delivery; from “figure out what to build” (high risk, low S) to “protect what we’ve built” (low risk, high S). Much of what fills the post-1.0 calendar is inherently serial: regulatory compliance, security reviews, privacy assessments, backward compatibility guarantees. You can’t parallelize a legal review or distribute a compliance decision across twelve engineers. The defensive surface area of a successful product is, almost by definition, non-parallelizable work.

A larger share of every engineer’s week is now spent in this serialized portion: syncs, design reviews, cross-team alignment, roadmap negotiation, stakeholder and dependency management. The meetings aren’t dysfunction; they’re a direct consequence of the problem becoming less parallelizable.

This is where things get worse than Amdahl’s Law alone predicts. In the original formula, S is fixed. But in organizations, S grows with team size; communication overhead scales roughly with the square of headcount, as every new person adds edges to the coordination graph. This is closer to Brooks’s Law (from The Mythical Man-Month) than Amdahl’s, and it makes the scaling picture uglier: you’re not just hitting a fixed ceiling, you’re watching the ceiling drop as you add people. And the human “interconnect” can’t be upgraded. It’s bounded by the latency of speech, the bandwidth of meetings, and something like Dunbar’s Number (the cognitive limit on how many working relationships a person can actually maintain). CPUs got faster buses; we got ‘chat’, which arguably made things worse.

Suppose a team starts at 10 engineers with S at 5%, giving an effective speedup of 6.9x. The team doubles to 20, but as it grows, S climbs to 25% from all the added coordination. The new effective speedup is 3.5x. You doubled the team and got slower in absolute terms: not just lower output per capita, but less total output than before. And if S grows quadratically with N, there’s an optimal team size where total output peaks. Add one more person beyond that point and you’ve entered negative scaling territory, where each hire makes the team slower in absolute terms, not just per capita. Most organizations never do this math explicitly, which is how you end up with teams of a hundred producing less than they did at sixty.
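The optimal-team-size claim can be made concrete with a toy model. This is a sketch under loudly stated assumptions: the `base` and `k` coefficients below are illustrative, not measurements of any real organization. It models S as growing with the square of headcount and searches for the output peak:

```python
def amdahl_speedup(s: float, n: int) -> float:
    """Effective speedup with n workers and serial fraction s."""
    return 1.0 / (s + (1.0 - s) / n)

def serial_fraction(n: int, base: float = 0.02, k: float = 0.0005) -> float:
    """Hypothetical model: S grows with the square of headcount,
    since every hire adds edges to the coordination graph.
    base and k are illustrative assumptions, not measured values."""
    return min(1.0, base + k * n * n)

# Find the headcount that maximizes total effective output.
best = max(range(1, 101), key=lambda n: amdahl_speedup(serial_fraction(n), n))
print(best, round(amdahl_speedup(serial_fraction(best), best), 2))
# Under these coefficients, output peaks at n = 10 and declines beyond it.
```

Change the coefficients and the peak moves, but the shape is robust: once S grows faster than linearly with N, there is always a headcount beyond which each hire reduces total output.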

None of this is new, of course. Coordination costs are well-studied in organizational theory; Coase and Williamson were writing about transaction costs and governance overhead decades before anyone applied Amdahl’s Law to an org chart. But the physics metaphor clarifies something that management frameworks sometimes obscure: there is a hard quantitative limit to what adding people can do, and it’s set by the serial fraction.

Focus Is the Governor

The pattern, once you see it, shows up everywhere. Teams that stay fast post-1.0 tend to have something in common: an almost stubborn narrowness about what they’re doing right now (not forever; just right now). They sequence rather than parallelize when the serial costs of parallelization exceed the gains, and they make explicit bets about what they are not doing, which is harder politically than it sounds.

“Focus” is doing a lot of work in that sentence, so let me be more specific. There are at least two distinct things that drive the serial fraction in organizations. The first is strategic clarity: does leadership know what to prioritize, and have they made that legible to the team? The second is structural coupling: does the org design create unnecessary dependencies between groups that could otherwise work independently? Put differently: strategic clarity is knowing what to do; structural decoupling is the ability to do it without asking permission. You can have clear strategy but terrible org structure, or clean structure with muddled priorities. Both inflate S, but they need different fixes.

What focus means in practice is reducing both: making fewer bets (strategic clarity) and designing teams so those bets can execute without constant cross-team synchronization (structural decoupling). This is the organizational equivalent of what GPU architects did; reformulating the problem to be more parallel, rather than throwing more cores at an inherently serial workload.

The hardest part of this isn’t knowing you should do it; it’s having the organizational standing to say no. Every priority you add, every initiative you run in parallel, every “also can we” in a planning meeting increases the serial fraction. They add synchronization points and force alignment conversations that wouldn’t otherwise need to happen. And many of them are imposed from above: VP-sponsored initiatives, customer commitments, competitive responses, regulatory deadlines. S is often not a choice the team lead gets to make; it’s a constraint they inherit.

The most insidious version of this problem is when the loss of focus is disguised as ambition. The roadmap looks impressive with fourteen workstreams and everyone is busy, but the serial fraction has quietly climbed to 40%, and no amount of additional headcount will get you past 2.5x speedup. You have a team of a hundred performing like a team of thirty, but with the coordination costs of a team of a hundred.

There’s a tradeoff worth naming here: you can reduce S by giving teams full autonomy (no alignment needed, everyone runs independently). But you risk building an incoherent product. The interesting question isn’t always “minimize S” but “what’s the right S for your situation?” At the extremes, S=0 is entropy (every team diverges, nothing integrates, the product dissolves into chaos) and S=1 is stasis (every decision requires full organizational consensus, nothing ships, bureaucracy calcifies). The job of leadership is to find the critical S that allows for coherence without strangulation, and to make that tradeoff deliberately rather than letting S inflate by accident.

Enter the Agents

This is where the story gets most interesting, because we’re about to replay all of this; faster, and with higher stakes.

The promise of agent-powered development is essentially an Amdahl’s Law play: add massively more parallel capacity by giving every engineer a fleet of tireless, fast, cheap workers. The analogy to adding cores is almost literal, and for certain classes of work (the kind where the task is well-specified, the interfaces are clean, and the dependencies are minimal), agents deliver real parallel speedup. They don’t get tired, don’t need to context-switch, and can work on twenty tasks simultaneously.

Agents also solve one piece of the scaling problem that humans structurally cannot: they don’t have communication bandwidth constraints with each other. Two agents don’t need a meeting to sync; they can share state through code, specs, and structured interfaces at machine speed. The communication graph doesn’t grow quadratically the way it does with people. This is a genuine structural advantage that removes one of the key mechanisms that makes S grow with team size in human organizations.

But “machine speed” has its own limits. Agents still operate within context windows and need shared taxonomies (consistent naming, clean interfaces, coherent architecture). If the underlying codebase is a tangle of implicit dependencies and undocumented conventions, agents hit a coherency wall: they can read code fast but they can’t infer intent from spaghetti. The communication overhead shrinks but it doesn’t vanish; it shifts from meetings to architecture.

But here’s what agents don’t solve: focus.

An agent fleet still needs to know what to build. It needs requirements, priorities, architectural direction, and product judgment, all of which come from humans. And if the humans haven’t resolved what the product should do (if there are three competing visions, or the roadmap is a sprawl of fourteen workstreams), then the agents inherit all of that incoherence. They’ll execute fast, but they’ll execute in conflicting directions. You’ll get twelve implementations of six features where you needed three implementations of two features.

There’s a subtler bottleneck too: verification bandwidth. Every piece of agent output needs a human to review, integrate, and validate it. That evaluation is serial, and it scales linearly with agent output volume. The faster and more prolific your agents become, the more human attention each cycle demands. If one engineer is orchestrating ten agents across three workstreams, the bottleneck isn’t the agents’ throughput; it’s the engineer’s ability to evaluate whether what came back is correct, coherent, and actually what the product needs.

The serial fraction for agent-powered teams isn’t “writing code.” Agents parallelized that away. The serial fraction is deciding what to build, verifying what was built, and owning the end-to-end problem; a chain that runs through a human at every link.

As N grows massive via agents, the ratio of deciders to doers shifts dramatically, and the serial fraction converges onto a single point of failure: the product owner’s cognitive load. Call it the sequential bottleneck of judgment. Even in a massively parallel agent fleet, truth is a serial resource; someone has to decide what “correct” means, what tradeoffs are acceptable, and whether the output actually serves the user. One person’s ability to hold the problem in their head, make those calls, and verify outputs becomes the governing constraint on the entire system. You’ve replaced a team bottleneck with an individual bottleneck, which is faster right up until that individual saturates.

In fact, agents may make the focus problem more acute. When execution is cheap, the temptation to pursue everything simultaneously gets stronger. “Why not try all fourteen workstreams? The agents can handle it.” But each workstream still needs a human to own the problem end-to-end: to define what done looks like, to make judgment calls when the spec is ambiguous, to reconcile conflicts when two workstreams step on each other, and to verify that the output is actually right. If you don’t have enough humans with enough ownership depth, you get a lot of code and very little product. Call it the hallucination of progress: the codebase is growing, PRs are landing, dashboards are green, but the serial cost of reconciling all those workstreams is growing exponentially, and no one has the bandwidth to notice that the pieces don’t fit together.

The Amdahl’s Law framing predicts this precisely. You’ve increased N enormously by adding agents, but if S hasn’t changed (if the serial fraction is still decision-making, verification, and product judgment), then your speedup is still capped at 1/S. You’ve just made the ceiling more visible.
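The ceiling is stark when you put the numbers from the fourteen-workstream scenario above into the formula. A sketch, using the 40% serial fraction as an assumed illustrative figure: even an effectively unbounded agent fleet cannot push past 1/S = 2.5x.

```python
def amdahl_speedup(s: float, n: int) -> float:
    """Effective speedup with n workers and serial fraction s."""
    return 1.0 / (s + (1.0 - s) / n)

# If decision-making, verification, and product judgment keep
# S at 40%, scaling n from a team to an agent fleet barely helps:
# the speedup saturates just below 1/S = 2.5x.
for n in (10, 1_000, 1_000_000):
    print(n, round(amdahl_speedup(0.40, n), 3))
```

The jump from 10 workers to a million buys you about 0.3x of additional speedup. All the leverage is in shrinking S, not growing N.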

The Same Lesson, Faster

The organizations that will use agents well are the same ones that already manage human parallelism well: the ones that invest in reducing S. That means clear ownership, sequenced bets, ruthless prioritization, and leaders willing to say no to good ideas because they’d increase the serial fraction.

The difference is that the penalty for getting this wrong will be faster and more expensive. With human teams, a bloated roadmap means people spend too much time in meetings and ship slowly. With agent-powered teams, a bloated roadmap means you burn through compute and get a codebase full of half-coherent features that no one fully owns. The failure mode is faster but structurally identical.

Amdahl published his law as a cautionary note to computer architects who thought they could just keep adding processors. The lesson was: before you add more parallelism, look at your serial fraction. Reduce that first. Everything else follows.

The same applies to your org chart, and soon, to your agent fleet.

Reduce S. Not because the rest is easy (reformulating the problem never is), but because nothing else you do matters until you do.

2 thoughts on “Focus Is a Scaling Law, Whether You’re Scaling People or Agents”

  1. Brilliant insight. Understanding the discipline of focus has never been more important than in the era of agentic workstreams.


  2. You need leaders who are willing to say, “this is a mission-oriented team; if you can’t disagree and commit, the exit door is over there.” For a while, I think too many teams in Silicon Valley took the culture of “autonomy” to an extreme and in the wrong way: they thought letting everyone be a decider was the way to retain talent, so they built a whole career progression system around the idea that “you don’t get to a very senior rank unless you can be a decider of product direction,” made “influence without authority” a Staff+ level expectation, and asked for artifacts of decision-making.

    The outcome is an organization where 80% of people spend their time and energy aligning with and influencing each other, and less than 20% are doing things, because in this system the ones who do feel less appreciated than the ones who talk. We need to make doing cool again.

