Research program

The public statement of the program: the question it pursues, the established results it builds on, how the work proceeds, and the ordered tests it has to pass. This is a working document. It changes when the work does, and revisions are recorded in the lab notes.

The question

Can the architectures of learning systems be derived from physical and mathematical law?

Derivation has a specific meaning here, fixed before any result is claimed. The assumptions are stated and countable. Every step can be checked by someone who is not its author. Established systems must appear as special cases when the general structure is restricted. And the result has to disagree with current practice somewhere a numerical experiment can settle. Anything weaker is an analogy, and analogies cannot carry the weight of a foundation.

Starting points

The program does not start from speculation. Each of its load-bearing components is established, published science.

Computation is physical. Erasing one bit of information dissipates at least kT ln 2 of heat, where k is Boltzmann's constant and T the temperature: roughly 3 × 10⁻²¹ joules at room temperature. There is no abstract computer; every inference has an energy cost with a known floor. (Landauer, 1961)
Inference can be relaxation. Associative memory works as descent in an energy landscape. The computation is the physics of settling into a minimum. (Hopfield, 1982)
Attention is an energy method. The update rule of modern Hopfield networks is, term for term, the attention mechanism of transformers. The dominant architecture of the current era is an energy model that was not recognized as one. (Ramsauer et al., 2020)
Generation can be thermodynamics. Diffusion models were constructed directly on nonequilibrium thermodynamics: noising as entropy increase, generation as its learned reversal. (Sohl-Dickstein et al., 2015)
Waves can compute. The wave equation maps onto recurrent computation, and an inhomogeneous physical medium can be trained to classify spoken vowels as the waves propagate through it. (Hughes et al., 2019)
Learning can be physical too. Equilibrium propagation extracts gradients from a system's own relaxation dynamics, exact in the limit of small nudging, with no separate backpropagation pass. (Scellier and Bengio, 2017)
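
The correspondence between modern Hopfield retrieval and attention can be checked in a few lines. This is a minimal sketch, not code from the program: the function names, the toy patterns, and the value of beta are illustrative assumptions. It applies the update new_state = sum_i softmax(beta * <x_i, state>)_i * x_i and compares it with single-query softmax attention in which the stored patterns serve as both keys and values. The learned projection matrices of a transformer layer are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hopfield_update(patterns, state, beta):
    """One modern-Hopfield step: a softmax-weighted average of stored patterns."""
    weights = softmax([beta * dot(p, state) for p in patterns])
    return [sum(w * p[j] for w, p in zip(weights, patterns))
            for j in range(len(state))]


def attention(query, keys, values, scale):
    """Single-query softmax attention: softmax(scale * q . k_i) applied to values."""
    weights = softmax([scale * dot(query, k) for k in keys])
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))]


# Hypothetical toy data: three stored patterns and a noisy copy of the first.
patterns = [
    [1.0, 1.0, -1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
    [-1.0, -1.0, -1.0, 1.0],
]
noisy = [0.9, 0.8, -1.1, -0.2]
beta = 4.0  # illustrative inverse temperature

retrieved = hopfield_update(patterns, noisy, beta)
attended = attention(noisy, keys=patterns, values=patterns, scale=beta)
max_gap = max(abs(a - b) for a, b in zip(retrieved, attended))
```

With these toy patterns the noisy probe settles onto the first stored pattern in a single step, and the two functions return the same vector.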

Read separately, these are six results in six subfields. Read together, they outline a claim that nobody has carried to completion: entropy, energy, and wave dynamics are not metaphors for learning systems. They are what learning systems are made of, and the field keeps rediscovering this one fragment at a time. The program exists to do the assembly deliberately: one framework in which these results are consequences, and from which new architectures follow.

Diffusion is the instructive case. It is the one major model family whose objective came from physics, and it became one of the strongest in use. But the physics stops at the objective; the networks underneath are still chosen by trial and error. The one time physics was allowed to choose, it chose well. The program pushes the same move the rest of the way down, from the loss function into the architecture itself.

Four pillars

The problem decomposes into the substrate, its mathematics, its physical constraints, and the higher-order structure above them.

Pillar I

Wave and energy-based computation

The substrate. Which inference operations can superposition, interference, and relaxation implement directly, and at what cost in energy, time, and capacity. What a learning rule looks like when it must be a physical process acting on the same field that performs the inference.

Pillar II

Mathematics of intelligence

The structure. What the information geometry of wave-derived model families looks like. Which complexity classes bound learning and inference on these substrates. Which topological properties of a representation are invariants of the dynamics rather than accidents of training.

Pillar III

Physics of computation

The constraints. How far practical inference sits above the Landauer floor, and what closing the gap requires. What reversibility buys a learning system. Whether fluctuation theorems yield usable bounds on learning dynamics far from equilibrium.
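
Once an energy per bit operation has been measured, the distance to the Landauer floor is arithmetic. The sketch below shows that calculation under stated assumptions: the function names are illustrative, and the figure of 1e-12 joules per bit operation is a placeholder chosen only to exercise the code, not a measurement of any real system.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant, exact in the SI


def landauer_floor_joules(temperature_kelvin):
    """Minimum heat dissipated by erasing one bit: k * T * ln 2."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


def gap_factor(energy_per_bit_joules, temperature_kelvin):
    """How many times above the Landauer floor a given energy per bit sits."""
    return energy_per_bit_joules / landauer_floor_joules(temperature_kelvin)


floor_300k = landauer_floor_joules(300.0)  # about 2.87e-21 J

# Placeholder value, NOT a measurement: substitute a measured figure here.
hypothetical_energy_per_bit = 1e-12
hypothetical_gap = gap_factor(hypothetical_energy_per_bit, 300.0)

print(f"Landauer floor at 300 K: {floor_300k:.3e} J")
print(f"Placeholder gap factor:  {hypothetical_gap:.3e}")
```

A gap factor of 1 would mean operation at the floor. The calculation says nothing about what closing the gap requires; it only fixes the number to be closed.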

Pillar IV

Self-reference and higher-order structure

The ceiling. Which formal structures let a system model its own computation without paradox. What reflection costs. Whether the obstructions known from logic, fixed points and incompleteness, reappear as physical constraints in systems that reason about themselves.

Method

The working loop is short and it repeats.

Derive. Pencil-and-paper work under stated assumptions. The output is a model class or an inference procedure with its preconditions visible, never a mechanism with a story attached.

Simulate. Small-scale numerical experiments on field and wave dynamics. Their job is to break derivations cheaply and early, before anything is built on them.
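
One shape such an experiment can take is sketched below. It is an illustration, not the program's code. It tests a textbook derivation: on a string with fixed ends, the fundamental mode returns to its initial shape after one period 2L/c. The leapfrog scheme, the unit length and wave speed, the grid size, and all names are assumptions made for the example.

```python
import math


def simulate_string(num_cells, courant, num_steps):
    """Leapfrog integration of u_tt = c^2 u_xx on a unit string with fixed ends.

    Starts at rest in the fundamental mode sin(pi * x) and returns
    (initial_state, state_after_num_steps). Requires num_steps >= 1.
    courant is c * dt / dx and must not exceed 1 for stability.
    """
    xs = [i / num_cells for i in range(num_cells + 1)]
    initial = [math.sin(math.pi * x) for x in xs]
    initial[0] = initial[-1] = 0.0  # fixed ends
    r2 = courant ** 2

    # First step: second-order Taylor start with zero initial velocity.
    prev = initial[:]
    curr = prev[:]
    for i in range(1, num_cells):
        curr[i] = prev[i] + 0.5 * r2 * (prev[i + 1] - 2 * prev[i] + prev[i - 1])

    # Remaining steps: standard three-level leapfrog update.
    for _ in range(num_steps - 1):
        nxt = curr[:]  # copies the fixed (zero) end values
        for i in range(1, num_cells):
            lap = curr[i + 1] - 2 * curr[i] + curr[i - 1]
            nxt[i] = 2 * curr[i] - prev[i] + r2 * lap
        prev, curr = curr, nxt
    return initial, curr


num_cells = 100
courant = 0.5
# One period is 2L/c; with dt = courant * dx / c that is 2 * num_cells / courant steps.
steps_per_period = round(2 * num_cells / courant)

initial, final = simulate_string(num_cells, courant, steps_per_period)
period_error = max(abs(a - b) for a, b in zip(initial, final))
```

If the derivation were wrong, period_error would be of order one instead of near zero. What remains is the discretization error of the scheme, which shrinks as the grid is refined.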

Build. Whatever survives both is implemented as a reference system and run on real tasks. A result that cannot be implemented is recorded as incomplete, not announced as progress.

The sequence of tests

The program is judged against an ordered sequence. Each step can fail, and a failure invalidates everything after it. Progress is recorded in the lab notes. Nothing on this page claims that a step has been passed.

T1

Define the substrate

A formal definition of the computational medium: its state space, its dynamics, and what counts as computation in it. Precise enough that the later steps can fail.

T2

Recover what works

At least two established model families must drop out as limiting cases: associative memories from energy relaxation, diffusion from entropy flow. A foundation that cannot reproduce known successes is wrong.
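
For reference, the behavior that the associative-memory limit has to reproduce is sketched below. This is the classical Hopfield construction written out directly, not a derivation from the program's framework, and the names and toy patterns are illustrative. Weights come from the Hebbian rule, the energy is E = -1/2 sum_ij w_ij s_i s_j, and asynchronous sign updates never raise it.

```python
def hebbian_weights(patterns):
    """Symmetric Hebbian weights with zero diagonal."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, state):
    """Hopfield energy E = -1/2 * sum_ij w_ij s_i s_j."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))


def relax(w, state, sweeps=3):
    """Asynchronous sign updates; records the energy after every unit update."""
    s = state[:]
    energies = [energy(w, s)]
    for _ in range(sweeps):
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            s[i] = 1 if field >= 0 else -1
            energies.append(energy(w, s))
    return s, energies


# Hypothetical toy memory: two orthogonal patterns of eight units.
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = hebbian_weights(stored)

probe = stored[0][:]
probe[0] = -probe[0]  # corrupt one unit of the first pattern

recalled, energies = relax(weights, probe)
```

Here the corrupted probe relaxes back to the first stored pattern, and the recorded energies form a non-increasing sequence. A framework that passes this step has to yield the same dynamics as a restriction of its general structure.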

T3

Predict a divergence

The framework must disagree with current practice somewhere a numerical experiment can settle, before any larger system is built on it.

T4

Run something derived

An inference or learning procedure obtained from the theory, implemented, and measured on a task that was not chosen to flatter it.

Rejected assumptions. That any current architecture is a law of nature. That benchmark performance substitutes for explanation. That an analogy to physics is a result. That a claim without an artifact deserves publication.

Outputs

Ongoing

Lab notes

Dated working notes: positions, method decisions, readings, and negative results, published as they are written.

When results hold

Technical notes

Formal write-ups released when a derivation or an experiment survives scrutiny. None are published yet. The first will appear when there is one worth reading.

With every result

Reference implementations

Code for any procedure the theory produces, released so that anyone can rerun the result instead of taking it on trust.

Contact: research@enkaidu.com