Neural Networks Got Too Big to Read

Where I Started

On why studying a genome and studying a neural network keep turning out to need the same tools.

Two days ago I wrote about how subquadratic attention rhymes with the exome era of genome sequencing—the same compromise, made twice, in two fields that have no business resembling each other. I framed it as a one-off worth noticing. A single coincidence.

I've stopped believing it was a one-off.

The more interpretability papers I read, the less large language models reminded me of software, and the more they reminded me of genomics. That seemed wrong at first. DNA is a biological molecule shaped by billions of years of evolution; a neural network is a mathematical structure trained by gradient descent on a pile of text. They have nothing to do with each other. But the workflows kept lining up. We sequence genomes; we probe networks. Geneticists run association studies; interpretability researchers hunt for circuits. Biologists turned CRISPR into a tool for editing genomes; AI researchers are starting to edit weights. At some point the resemblance got specific enough that I stopped treating it as a metaphor and started wondering what was generating it.

Traditional software doesn't feel like this

Most software is designed. Open a sorting algorithm, a compiler, a database engine, and the causal chain is visible. Functions exist because someone wrote them. Data structures exist because an engineer chose them. There are bugs, but the system is fundamentally legible, because it was assembled on purpose.

This holds even when the system is enormous. A modern CPU has tens of billions of transistors, and it still doesn't feel biological—every transistor exists in service of a design humans understand. Nobody studies quicksort the way a biologist studies a cell, because there's nothing hidden to recover. The mechanism is the documentation.

Large language models break that. Nobody programmed a model to understand sarcasm. Nobody hand-encoded arithmetic, or sat down to design a circuit for writing Python. Those capabilities showed up during training. We understand the optimization process well and the resulting object barely at all, and that gap is exactly the situation a biologist is in when they sequence something.

The parallel is in the process, not the artifact

When people compare biology and AI, they usually start by pairing the artifacts: genes to weights, DNA to tokens. I think that's the wrong end to grab.

The interesting parallel isn't between the things. It's between the processes that produced them.

A genome isn't designed. It's the residue of an optimization process that searched an enormous space over a very long time and kept what survived. A trained network isn't designed either. Gradient descent searches an enormous parameter space and keeps the configurations that minimize loss. In both cases you end up with a compressed information structure containing far more complexity than anyone deliberately put there. The genome is what evolution found. The model is what gradient descent found.

Both leave behind the same kind of object—too large and too tangled for anyone to hold in their head. Which, it turns out, matters more than where the object came from.

Association studies, twice

The clearest example is GWAS. After the Human Genome Project, scientists had the sequence and still couldn't say what most of it did. Reading the letters wasn't the same as understanding the system. So they ran genome-wide association studies: take a phenotype—height, cancer risk, drug response—and scan the genome for regions statistically associated with it. You don't need the mechanism up front. You're looking for signal inside a structure too large to inspect by hand, and the mechanism comes later, if it comes at all.

Mechanistic interpretability runs the same loop. Take a behavior—arithmetic, code generation, refusal, deception—and scan the model for the neurons, heads, or features associated with it. Observe behavior, search for associations, form a hypothesis, intervene, refine. The substrate is different and the procedure is identical, down to the order of the steps.
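To make the scan step concrete, here is a toy sketch in Python. It is an illustration under stated assumptions, not anyone's real pipeline: the data is synthetic, the names `pearson`, `association_scan`, and `PLANTED_UNIT` are mine, and a real GWAS would use regression with covariates and multiple-testing correction, which this skips. The same function works whether the columns are variants scored against a phenotype or neurons scored against a behavior.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def association_scan(units, trait):
    """Correlate every unit (column) with the trait; strongest association first."""
    n_units = len(units[0])
    hits = []
    for j in range(n_units):
        column = [row[j] for row in units]
        hits.append((j, pearson(column, trait)))
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


# Synthetic data (an assumption for illustration): 200 samples, 50 units,
# and only one planted unit has any real effect on the trait.
rng = random.Random(0)
PLANTED_UNIT = 17
units = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(200)]
trait = [2.0 * row[PLANTED_UNIT] + rng.gauss(0, 1) for row in units]

hits = association_scan(units, trait)
top_unit, top_r = hits[0]
print(f"strongest association: unit {top_unit}, r = {top_r:.2f}")
```

What the scan hands back is a ranked list of where to look. It says nothing about why the top unit matters, which is the part that comes later, if it comes at all.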

The instruments line up too. In genomics a probe is a tool built to reveal something specific hidden in the biology—a variant, an expression pattern, a methylation state. It isn't the biology; it's a question you can ask the biology. A linear probe in interpretability is the same move: a small classifier trained to check whether some concept is present in the activations. Different word, same job. Both fields are building instruments to interrogate structures they can't read directly.
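A linear probe is small enough to write out in full. The sketch below is a minimal version, assuming synthetic "activations" in which a concept shifts each vector along one fixed direction; `synthetic_activations`, `train_linear_probe`, and `CONCEPT_DIRECTION` are hypothetical names for illustration, and the probe is plain logistic regression fitted by gradient descent. In practice the activations would come from a real model layer and you would likely reach for a library instead.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def synthetic_activations(n, direction, shift, rng):
    """Gaussian noise vectors; when the concept is present (label 1),
    the vector is shifted along `direction` by `shift`."""
    data, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        data.append([rng.gauss(0, 1) + y * shift * d for d in direction])
        labels.append(y)
    return data, labels


def train_linear_probe(activations, labels, steps=300, lr=0.5):
    """Fit logistic regression by full-batch gradient descent."""
    dim = len(activations[0])
    n = len(activations)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(activations, labels):
            logit = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = sigmoid(logit) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def probe_accuracy(weights, bias, activations, labels):
    """Fraction of examples where the probe's prediction matches the label."""
    correct = 0
    for x, y in zip(activations, labels):
        logit = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((logit > 0) == (y == 1))
    return correct / len(labels)


# Assumed setup: 8-dimensional activations, concept along a unit-length direction.
rng = random.Random(1)
CONCEPT_DIRECTION = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
train_x, train_y = synthetic_activations(400, CONCEPT_DIRECTION, 4.0, rng)
test_x, test_y = synthetic_activations(200, CONCEPT_DIRECTION, 4.0, rng)

weights, bias = train_linear_probe(train_x, train_y)
accuracy = probe_accuracy(weights, bias, test_x, test_y)
print(f"held-out probe accuracy: {accuracy:.2f}")
```

High held-out accuracy says the concept is linearly readable from the activations. It does not say the model uses it, which is the same gap a molecular probe leaves between detecting a variant and knowing what the variant does.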

It's worth being precise about how far this goes, because it's tempting to map everything one-to-one:

	Genomics	Interpretability
Read the structure	Sequencing	Activation/weight extraction
Find what correlates with a trait	GWAS	Circuit and feature search
Ask a targeted question	Molecular probes	Linear probes
Measure a latent property	Assays (RNA-seq, ATAC-seq)	Benchmarks (MMLU, SWE-bench)
Modify the system	CRISPR, gene therapy	Weight editing, activation steering

Like any analogy, it breaks if you push it. But at the level of "what do you do when handed a large structure you didn't author," it holds better than it has any right to.

Benchmarks are assays

The row I find most useful is the one about measurement. The AI community talks about benchmarks as scores: MMLU, SWE-bench, leaderboards, who's on top this month. Genomics has a framing I think travels better: a benchmark is an assay.

An assay is a test built to estimate some hidden property of a system. RNA sequencing is an assay. Methylation profiling is an assay. The assay is never the thing itself; it's an instrument for guessing at an underlying state. Read benchmarks that way and they look different. MMLU isn't intelligence. SWE-bench isn't software engineering ability. They're assays for latent capabilities, and the gap between the score and the capability is the whole point.

This matters because an assay can saturate without capturing the thing it was measuring. A benchmark gets solved while the capability it was supposed to track stays poorly understood. Genomics learned that one the hard way, more than once. AI is learning it now, in public, on a leaderboard.

Reading, then editing

The strongest version of the parallel shows up when you move from looking to changing. The Human Genome Project gets the attention, but the real turn was CRISPR. Sequencing let scientists read the system; CRISPR let them rewrite it, and biology shifted from a mostly observational science into something closer to an engineering discipline.

AI looks like it's approaching the same hinge. Weight editing, activation steering, circuit interventions, representation engineering—these are early attempts to change behavior directly rather than just describe it. The comparison isn't flattering yet. Current techniques feel less like modern CRISPR and more like the first generation of gene therapy: they sometimes work, often fail, and mostly reveal how little we understand the substrate we're poking. But the order of operations is the familiar one. First you learn to read. Then you learn to intervene.
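For a sense of how small the basic move is, here is a toy sketch of activation steering in Python. Everything in it is an assumption made for illustration: a single hidden layer with made-up weights, a scalar readout standing in for "behavior," and a steering direction chosen to align with that readout. In real work the direction is usually estimated from the model itself, for example from differences in activations between contrasting prompts, and the effect is far less tidy.

```python
import math


def hidden_activations(weights, inputs):
    """One tanh layer: each row of `weights` produces one hidden unit."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def readout(hidden, readout_weights):
    """Scalar score read linearly off the hidden layer (stand-in for a behavior)."""
    return sum(h * r for h, r in zip(hidden, readout_weights))


def steer(hidden, direction, strength):
    """Activation steering: add a scaled direction vector to the hidden state."""
    return [h + strength * d for h, d in zip(hidden, direction)]


# Hypothetical toy parameters; nothing here comes from a trained model.
WEIGHTS = [
    [0.2, -0.1, 0.4],
    [-0.3, 0.5, 0.1],
    [0.0, 0.3, -0.2],
    [0.6, -0.4, 0.1],
]
READOUT = [1.0, -2.0, 0.5, 1.5]

# Assumed steering direction: the readout vector scaled to unit length.
readout_norm = math.sqrt(sum(r * r for r in READOUT))
DIRECTION = [r / readout_norm for r in READOUT]
STRENGTH = 2.0

inputs = [1.0, 0.5, -1.0]
hidden = hidden_activations(WEIGHTS, inputs)
base_score = readout(hidden, READOUT)
steered_score = readout(steer(hidden, DIRECTION, STRENGTH), READOUT)

print(f"base score:    {base_score:.3f}")
print(f"steered score: {steered_score:.3f}")
```

Because the readout here is linear and the direction was picked to match it, the score moves by exactly the steering strength times the readout's length. Real networks have many nonlinear layers downstream of the edit, which is a large part of why these interventions sometimes work and often fail.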

What I'm not claiming

Here's where I want to be careful, because the easy version of this essay overstates its case.

The tidy story is that both fields are information sciences, and that's why they rhyme. I don't buy it. Computer science has been an information science for decades, and nobody studies an operating system the way they study a cell. So "it's all information" can't be the thing.

The better story is that both fields study structures produced by optimization rather than design. I think that's true, and it's most of why the genome and the model land in the same bucket—optimization is a reliable way to manufacture something nobody can hold in their head. But I don't think optimization is the thing itself, because you can get to the same place without it. A legacy codebase that no living engineer fully understands gets the archaeological treatment too, and every line of it was written by hand. It wasn't optimized into existence; it was designed into illegibility, one reasonable decision at a time. What it shares with the genome and the model isn't an origin. It's a property: too large, too entangled, too far from any single author to be read directly. Optimization is the most common road to that property. It isn't the property.

Though I'll admit there's a loose end I can't tie off here. Optimization artifacts don't just feel illegible; they feel found. The genome is what evolution found, the model is what gradient descent found, and that phrasing sits right in a way it doesn't for the legacy codebase—nobody says the codebase is what the company found, even when no one understands it either. So opacity explains why we reach for the same tools, but it doesn't explain that extra flavor, the sense of having uncovered something rather than just inherited a mess. I don't think that's a hole in the argument so much as a different argument, and probably the next one.

I'd also flag that I've been describing evolution as if it were a clean optimizer that kept the solutions that worked. It mostly wasn't. A great deal of the genome is drift, repeats, transposons, and broken-down genes that survived because nothing killed them, not because they helped. Which, if you read my last piece, is exactly the trap: we spent years calling that ninety-eight percent junk, and a meaningful share of it turned out to do something. The mess is part of the artifact. And interpretability keeps running into its own version of the mess: polysemantic neurons, features that refuse to factor cleanly, dead structure that does nothing. If the analogy holds, that's not noise in the comparison. That's the comparison working.

There's a quieter caveat too. Some of this convergence isn't independent rediscovery at all. A lot of interpretability's vocabulary was carried in by people trained in or near biology and neuroscience. Probes and circuits didn't appear from nowhere; they were named by people who already had the words. How much of the resemblance is structural and how much is borrowed, I genuinely can't separate.

What's left

So here's the claim I'll actually plant a flag on, and it's the broad one.

The genomics resemblance isn't really about biology, and it isn't really about optimization. It's about a relationship. The moment a system gets large and opaque enough that you can't inspect it directly—however it got that way—your relationship to it changes, and the science you do on it changes with it. You stop reading the mechanism off the page and start interrogating the thing from the outside. You sequence it. You build probes. You run assays. You hunt for associations. You draw mechanistic maps. You learn to intervene. That toolkit isn't biology's and it isn't AI's. It's what's left when direct inspection stops working.

That's why a CPU doesn't feel biological and a language model does. Not because one is alive and the other isn't. Because we can still read the CPU and we can't read the model. The transistors are in service of a design we hold in our heads; the weights are in service of a loss we watched go down and a structure we never saw assemble. Same with the genome. Same with the legacy codebase nobody owns anymore. The sense that you're studying a living system shows up exactly when the system stops being legible, and not a moment before.

The exome essay was one tile in this pattern—a single place where the two fields made the identical move. I don't think it was special. I think it was the first one I happened to notice, and once you've seen the shape you start finding it everywhere, which is its own kind of warning: a pattern you can find everywhere is one you should be slightly suspicious of.

But I'll put it plainly, because I think it's the actual discovery here. The moment a system gets too large to read, working on it stops being engineering and becomes genomics.