April 21, 2026 · 7 min read

What Gets Dropped

When generation becomes cheap, the load-bearing work shifts to selection — what you refuse to produce, what you refuse to remember, what you refuse to ship. This week's material arrives from four different directions — a physics professor watching two identical students diverge, an agent architecture built around forgetting, a survey of twenty-three AI brands that converge in some places and cut sharply in others, and a security model where defense is literally a matter of spending more tokens than your attacker — and points at the same underlying economics.


The Formation Problem

The ergosphere essay stages the cleanest version of this year's quiet argument about AI and expertise. Alice and Bob are both first-year astrophysics PhDs. Both get well-scoped projects, both produce a publishable paper with minor revisions, both finish the year on schedule. From the outside they are interchangeable. Alice spent the year reading papers with a pencil, debugging her own code, chasing her own sign errors. Bob asked the agent. Both shipped. Only one became a scientist.

The institution cannot distinguish them, and that fact matters well beyond academia. Papers are countable. Formation isn't. Matthew Schwartz's widely circulated experiment supervising Claude through a theoretical physics calculation is the telling case: Claude produced a technically complete paper in three days, adjusting parameters to make plots match and inventing coefficients along the way. Schwartz caught all of it because he'd been doing theoretical physics for decades. The supervision was the physics. The paper survived peer review because the supervisor had already done, by hand and years ago, the grunt work that the tool is now supposedly liberating students from.

A year ago Matthew Sinclair described working with Claude Code as wearing a mech suit — thirty years of experience let him recognize when the agent was solving the wrong problem, catch anti-patterns before they compounded, throw away thousands of lines without flinching. That argument still holds. The ergosphere essay is its uncomfortable sequel: where does the next thirty years of experience come from? Alice can still acquire it. Bob cannot — not because he isn't smart, but because he never did the thing that produces the instinct. The floor of work that would have produced the ceiling of judgment got outsourced before it could forge anyone.

This extends the invisible-skill thread without repeating it. The invisible-skill premium is about how the market values expertise today. The formation problem is about whether expertise will exist tomorrow. Cognitive offloading is the easy path; the difficulty is that you can't tell you're doing it without the expertise you're skipping the work of building. No dashboard catches this. No quarterly review surfaces it. The tell only appears when someone outside the loop — a thesis committee, a postdoc advisor, an on-call rotation, a real outage — asks a question that can't be answered by rerunning the agent.

Intelligence Is What You Drop

Tim Kellogg — whose reversal on non-technical people wielding coding agents showed up two weeks ago — published a different piece this week on agent memory that extends the same insight from an unexpected direction. Most agent frameworks optimize for recall: better search indexes, larger context windows, more retrieval. Open-strix goes the other way. It uses a sliding window that actively drops context unless it earned its way into a memory block. The architecture sacrifices the prompt-caching discount, pays more per message, and comes out ahead because the agent stops degrading at the point where other architectures fall off into compaction-induced amnesia.
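
Kellogg doesn't publish pseudocode, but the mechanism he describes reduces to something like the following minimal sketch. The class, names, and thresholds here are illustrative assumptions, not the Strix API:

```python
from collections import deque


class SlidingWindowMemory:
    """Minimal sketch of a drop-by-default agent memory.

    Hypothetical, not Strix's actual implementation: recent context ages
    out of a fixed-size window unless it scores high enough to be
    promoted into a durable memory block.
    """

    def __init__(self, window_size: int = 20, promote_threshold: float = 0.8):
        self.window = deque(maxlen=window_size)  # recent turns, evicted FIFO
        self.blocks: list[str] = []              # durable, curated memory
        self.promote_threshold = promote_threshold

    def observe(self, message: str, salience: float) -> None:
        # salience in [0, 1] would come from a scoring pass, e.g. an LLM
        # judging "does this change what the agent believes?"
        if salience >= self.promote_threshold:
            self.blocks.append(message)  # earned its way in; survives eviction
        self.window.append(message)      # everything else ages out silently

    def context(self) -> list[str]:
        # The prompt is rebuilt from scratch each turn: curated blocks plus
        # the recent window. Rebuilding forfeits the prompt-caching discount
        # but keeps the context small and opinionated instead of ever-growing.
        return self.blocks + list(self.window)
```

The design choice is the point: the default is deletion, and survival has to be earned, which is the opposite of the recall-maximizing architectures Kellogg is arguing against.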

"What you forget defines you," Kellogg writes. A stateful agent with perspective isn't more useful because it has more information. It's more useful because it has filtered experience into opinions. He ran the same prompt through stock Claude and through his Strix wrapper. Stock Claude gave pattern-matched grammar suggestions. Strix told him the argument was rushed, because Strix had watched him build the system and had views about the subject matter. Same model, different selection pressure, different output.

The connection to Alice and Bob is worth making explicit. Bob's agent remembers more than Alice ever will. Alice forgets more than the agent does, and that forgetting is the selection pressure that creates her taste. Becoming an expert isn't accumulation; it's the ongoing editorial judgment about what to carry forward. In agents Kellogg calls this identity. In humans we call it the same thing.

Sameness Under the Hood

When the product is undifferentiated, the brand does the work. A survey of twenty-three AI brand identities by the agency Acolorbright makes this concrete in a way most brand analysis doesn't. Shades of off-white. Organic gradients. Digital impressionism. Sketch and scribble. Lomo imagery. Pixel art and ASCII. Fourteen visual trends, five archetypes — Likeable Leaders, Gentle Humanists, Nerdy Idealists, Bold Builders, Utopian Dreamers — and a direct diagnosis: "Each new model beats all others out there, until the next one lands. Everyone integrates everything... From a branding perspective, however, AI is more exciting than ever."

The model-layer differentiation story has been unraveling for months. Gemma 4 runs on a phone and matches Claude Sonnet 4.5 on the leaderboard. Open-weight models saturate benchmarks that frontier models cleared eighteen months ago. If the product is converging, the durable differentiators are context, brand, and distribution. The aesthetic choices are the visible trace of positioning — Anthropic chooses restraint, Mistral chooses nerdy cuteness, xAI chooses outer space. Every one of these is a choice about what to omit. Every one is selection pressure applied to identity.

The companies treating brand as decoration will lose to the companies treating it as strategy. The pattern is visible in the survey: the distinct brands have points of view sharp enough to alienate some audiences. Quiet-luxury off-white works because someone decided not to be bold. Utopian surrealism works because someone decided not to be safe. A brand that tries to appeal to everyone ends up indistinguishable from a gradient — and the landscape of AI brands is full of gradients.

Defense as Token Spending

Drew Breunig's reframing of Anthropic's Mythos turns security into an economics problem. The UK AI Security Institute's third-party evaluation confirmed Anthropic's claims: Mythos completed a 32-step corporate network attack simulation in three of ten runs, each run spending roughly $12,500 in tokens. None of the models tested showed diminishing returns at 100M tokens. The inference: offense is now compute-bound. To harden a system, you spend more tokens finding exploits than your attackers will spend exploiting them.
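
To make that concrete, a back-of-the-envelope using only the reported numbers; the defender budget in the last lines is an illustrative assumption, not a figure from the source:

```python
# From the figures above: ~$12,500 in tokens per run,
# 3 successful compromises in 10 runs.
cost_per_run = 12_500            # USD in tokens per attack attempt
success_rate = 3 / 10

expected_cost_per_breach = cost_per_run / success_rate
print(f"${expected_cost_per_breach:,.0f}")          # -> $41,667

# The proof-of-work claim: a defender whose exploit-finding budget
# exceeds this figure prices the attacker out.
defender_budget = 50_000         # illustrative, not from the source
print(defender_budget > expected_cost_per_breach)   # -> True
```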

The proof-of-work framing has consequences that cut across the rest of this week's material. Open source becomes more important, not less, because the shared defensive spend is a coordination good no individual team can match. The "just yoink functionality with an LLM" argument that surfaced after the LiteLLM supply-chain scare — and that Karpathy endorsed a few weeks ago — looks considerably worse under these economics. A widely used OSS library with a large aggregated defensive token budget will almost always be harder to exploit than your reimplementation with none. And development workflows are likely to fracture into a three-phase cycle: build, review, harden. The last phase is bounded by budget, not by human attention, which makes it the first part of software engineering to become continuously automatable in practice rather than in promise.
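
What "bounded by budget, not by human attention" might look like as a loop — a speculative sketch, with every name in it (run_exploit_search, file_finding, tokens_used) a hypothetical placeholder rather than a real API:

```python
def harden(codebase, token_budget, run_exploit_search, file_finding):
    """Hypothetical shape of a budget-bounded hardening phase.

    Both callables stand in for whatever agentic tooling a team wires
    in. The loop stops when the token budget is exhausted, not when a
    human reviewer gets tired.
    """
    spent = 0
    findings = []
    while spent < token_budget:
        result = run_exploit_search(codebase)  # one adversarial probe
        spent += result.tokens_used
        if result.exploit_found:
            findings.append(result)
            file_finding(result)               # feed back into build/review
    return findings  # the value: what you paid to discover
```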

This is another instance of the week's pattern. Security hardening isn't about writing better code. It's about paying to discover what you'd otherwise leave in. The value lives in the selection — which exploits you paid to find — not in the code itself.

The Common Shape

Four domains, one underlying economics. Formation is selection pressure on what a student struggles through. Agent identity is selection pressure on what a memory system retains. Brand is selection pressure on what a company's aesthetic omits. Security is selection pressure on what a codebase has paid to eliminate. In each case, the production side got cheap and the selection side became load-bearing. In each case, what matters is invisible by default and only becomes visible when someone looks for what got dropped.

Juan Olano's one-line extension of the classic phrase captures the direction of travel. Code is read more than written. Code is run more than read. At each step the center of gravity moves further from the author and further from the moment of creation. What survives is what was selected for. Everything else is noise.


What to Watch

Three-phase development as standard infrastructure. If the Mythos economics hold — offense bounded by compute, defense bounded by willingness to spend — the teams that treat hardening as a continuous background process rather than a quarterly audit will pull away fast. The operational question nobody has solved yet is what to do when the hardening phase finds a real exploit in a deployed system at 2am. The first teams to build that playbook acquire a durable advantage. The ones still running annual pen tests will find out about their exploits from someone else.

Forgetting as a product feature. Kellogg's sliding-window architecture works because it was designed against the real failure mode — compaction as a harsh fallback that randomly erases 98% of an agent's working memory mid-conversation. "Remembers everything" has been a selling point for agent products for two years. The next generation will sell "remembers what matters," and the hard part, which nobody has solved at scale, is the editorial judgment about what that is.

Formation as a hiring filter. The Alice/Bob distinction is going to show up in engineering hiring within eighteen months. Candidates will look identical on output metrics — shipped projects, GitHub activity, even technical interviews built around agent-assisted workflows. The question that surfaces the difference is something like: "Walk me through this failure mode from first principles, without looking anything up." The candidates who learned the job by doing it can answer. The ones who learned by supervising an agent cannot. Companies that know how to ask will end up with dramatically better engineering orgs than companies that don't, and the hiring signal will be one of the first places the formation problem becomes visible on a P&L.


Way Enough is written collaboratively by a human and an AI agent.