I’ve been working on an alternative attention mechanism that treats language
as a physical field system instead of using standard O(n²) self-attention.
How it works:
- Tokens are mapped onto a continuous 1D field
- Information propagates along the field as damped waves, via the kernel k(t) = exp(-α·t)·cos(ω·t + φ)
- Each attention head has just 3 learnable physics parameters: frequency ω, damping α, and phase φ
- The field convolution is computed via FFT in O(n log n) (minimal sketch after this list)
- Heads self-organize into different roles (local grammar, medium context, long-range)
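
For concreteness, here's a minimal single-head sketch of the idea in NumPy. It's a stripped-down illustration (one scalar field channel, no cross-head coupling, my own padding choices), not the actual V3.5 code:

```python
import numpy as np

def wave_head(x, alpha, omega, phi):
    """One wave-field head: causally convolve the 1D token field x with a
    damped-cosine kernel k(t) = exp(-alpha*t) * cos(omega*t + phi).
    The convolution is done via FFT, so the cost is O(n log n)."""
    n = len(x)
    t = np.arange(n)
    k = np.exp(-alpha * t) * np.cos(omega * t + phi)  # kernel over past offsets t >= 0
    # Zero-pad to 2n so circular FFT convolution equals linear (causal) convolution
    X = np.fft.rfft(x, 2 * n)
    K = np.fft.rfft(k, 2 * n)
    y = np.fft.irfft(X * K, 2 * n)[:n]                # output at i depends only on x[:i+1]
    return y

# Toy usage: one scalar channel of a 2048-token field
x = np.random.randn(2048)
y = wave_head(x, alpha=0.05, omega=0.3, phi=0.0)
```

Since each kernel is fully determined by three scalars, the per-head parameter cost stays constant regardless of sequence length.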
Results (WikiText-2, 6M params, character tokenizer):
| Model | PPL | Accuracy | Complexity |
|---|---|---|---|
| Standard Transformer | 5.9 | 51.0% | O(n²) |
| Wave Field V3.5 | 6.2 | 50.5% | O(n log n) |
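
Both metrics are per-character: PPL is exp of the mean character cross-entropy, accuracy is top-1 next-character prediction. A quick sketch of the computation, using standard definitions rather than either model's exact eval code:

```python
import numpy as np

def lm_metrics(logits, targets):
    """Character-level perplexity and top-1 next-character accuracy.
    logits: (N, vocab) unnormalized scores; targets: (N,) true next-character ids.
    Standard definitions, shown for reference only."""
    z = logits - logits.max(axis=1, keepdims=True)            # stable log-softmax
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -logp[np.arange(len(targets)), targets].mean()      # mean cross-entropy in nats
    ppl = float(np.exp(nll))                                   # PPL = exp(loss)
    acc = float((logits.argmax(axis=1) == targets).mean())     # top-1 accuracy
    return ppl, acc
```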
At longer sequences the compute savings over O(n²) attention grow: 31x at 2K tokens, 107x at 8K, 367x at 32K.
Known limitations:
- With a BPE tokenizer (8K vocab), there's a significant capacity gap vs. the standard transformer
- This is a model capacity issue at small scale, not an architecture flaw
- Currently scaling to 100M params to see if the gap closes
What’s unique:
- Every bug during development was found through physics-based diagnostics (energy flow, conservation, causality tests), not guessing; a minimal causality check is sketched after this list
- Cross-head field coupling and wave interference for information routing
- Not a Mamba/Hyena variant — different approach entirely
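
To give a flavor of the physics-based diagnostics, here's a stripped-down causality check (names and tolerance are illustrative, and it reuses the `wave_head` sketch above): perturb the field at a future position and assert that earlier outputs don't move.

```python
import numpy as np

def check_causality(head_fn, n=1024, probe=700, tol=1e-8):
    """Diagnostic sketch: outputs at positions < probe must be unchanged when
    the input field is perturbed at position `probe` or later."""
    rng = np.random.default_rng(0)
    x = rng.standard_normal(n)
    x_pert = x.copy()
    x_pert[probe:] += rng.standard_normal(n - probe)   # perturb only the "future"
    y, y_pert = head_fn(x), head_fn(x_pert)
    leak = np.abs(y[:probe] - y_pert[:probe]).max()    # any change here means future leakage
    assert leak < tol, f"causality violated: leakage {leak:.2e}"
    return leak

# Usage with the wave_head sketch above
check_causality(lambda x: wave_head(x, alpha=0.05, omega=0.3, phi=0.0))
```

The energy and conservation checks follow the same pattern: compute a quantity the physics says should be bounded or conserved and assert it numerically.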
Happy to answer questions about the physics, architecture decisions, or results.