clev@world: ~
~/blog$ cat on-engineering-productivity.md
---
title: on engineering productivity.
date: february 2026
sources: [1]
---

there's a question that comes up in every engineering org eventually: how do we know if we're productive? it sounds reasonable. if you're spending money on engineers, you'd want to know if you're getting a good return. the problem is that every attempt to answer it makes things worse.

the most common approach is to count something. lines of code, commits, pull requests, story points, tickets closed. numbers feel objective. but what you're actually measuring is activity. and activity is not progress.

consider two engineers working on the same problem. one writes 3,000 lines of code that handles every edge case. the other writes 800 lines that do the same thing more elegantly. by any counting metric, the first engineer looks nearly four times more productive. but anyone who's built software knows the second one did the harder work. they thought more and typed less.

in software, doing less is often doing more. the engineer who deletes code is frequently more productive than the one who adds it. the one who says "we don't need to build this" might be the most productive person on the team. try putting that on a dashboard.

and even if you could accurately count functionality — say, with function points or some other system — you'd still be measuring the wrong thing. direct output and useful output are different things. if i ship 100 features and only 30 are useful, and you ship 50 features that are all useful, who's more productive? the dashboard says me. reality says you.

story points tried to fix the counting problem by measuring complexity instead of volume. in practice they just created a new game. teams learn within weeks how to inflate estimates, split work into point-friendly chunks, and look busy in the metrics while the product stalls. i've watched teams maintain perfect velocity charts while shipping nothing that mattered. the numbers were great. the product was dying.

there's also the problem of invisible contributions. some engineers ship features. others make the people around them faster — reviewing code, unblocking teammates, improving tooling, making architectural decisions that prevent six months of pain. their contribution is real. it shows up in the team's overall output. but you'd never see it on any individual metric. you'd have to be on the team to understand who's actually carrying weight.

now add llms to the picture.

an engineer with copilot or claude can produce code at a rate that would've been unthinkable five years ago. entire features in minutes. boilerplate gone. the raw volume of code a single person can output has gone up by an order of magnitude. by every traditional metric, productivity just exploded.

but did it?

generating code is easy now. an llm can reason, solve hard problems, architect systems. it's genuinely good at thinking through technical challenges. but it has no idea who your users are. it doesn't have your business context, can't tell which feature actually matters this quarter, or whether the thing you're asking it to build should exist at all. the bottleneck was never intelligence. it was taste and context.

when producing code becomes nearly free, the bottleneck shifts entirely to judgment. knowing what to build, how it fits into the system, what to leave out. the engineer who prompts an llm to generate 2,000 lines and ships it without deeply understanding what it does isn't being productive. they're creating future problems faster than anyone could before.

codebases are already bloating with generated code that nobody fully understands. features ship fast and break in unexpected ways because the person who "wrote" them can't explain the edge cases. velocity numbers go up. reliability goes down.

the engineers who are genuinely better with llms plan extensively before generating anything. they debate architecture, explore tradeoffs, then review the output function by function. the generation is the easy middle step. the planning and the scrutiny are the actual work. skip either one and you're just accumulating problems with a nicer interface.

people like to say "if you can't measure it, you can't manage it." that's a cop-out. businesses manage things they can't measure all the time. how do you measure the productivity of a legal team? a marketing department? a university? you can't. you manage them anyway.

software productivity is a property of the system. the codebase, the tools, the culture, the clarity of what you're building and why. an average engineer with clear priorities and a clean codebase will outperform a brilliant one swimming in chaos. every time. "how productive is this engineer?" is almost always the wrong question. "what is this team building and does it matter?" is the right one.

i can see why measuring productivity is seductive. if we could do it, we could assess software work objectively. but false measures only make things worse. this is one place where we have to admit our ignorance.

the best teams i've worked on never talked about productivity. they talked about the problem. they had opinions about what was worth building. they shipped something small, watched what happened, adjusted. the cycle was fast because there was nothing in the way.

we've been trying to measure engineering productivity for decades. we're no closer. ai tools made the gap between activity and impact even wider. maybe that's the answer. maybe the thing itself resists measurement. and the teams that accept this and focus on building the right thing instead of counting how fast they build will always be ahead.