Rethinking Language Models

From attention mechanisms to phase space trajectories

The Takens-Based Transformer replaces quadratic attention with explicit delay-coordinate reconstruction, achieving O(N) complexity and fixed memory while modeling language as dynamical trajectories through semantic manifolds.

Simul Pariter — Together Equally


A Different Lens

Traditional View

Language models treat tokens as static points in semantic space. Attention mechanisms search backwards through history with O(N²) complexity, requiring ever-growing key-value caches.

Dynamical View

Language is a trajectory through semantic phase space. Context is embedded in the current position and momentum. Takens' delay embedding theorem motivates reconstructing this trajectory from delayed observations, here taken at exponentially spaced delays.

Phase Space Reconstruction

Instead of comparing every token to every other token, the TBT reconstructs the system's state from delay coordinates:

x_t = [e_t, e_{t-1}, e_{t-2}, e_{t-4}, e_{t-8}, e_{t-16}, ...]

Multi-scale structure: Recent delays capture syntax, distant delays capture narrative
Fixed memory: O(1) buffer size regardless of sequence length
Near-linear complexity: O(log N) lookups per token (O(N log N) per sequence), not O(N) attention comparisons per token (O(N²) per sequence)
CPU-friendly: Trained on commodity hardware without GPUs
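
The delay-coordinate reconstruction above can be sketched in a few lines of Python. This is a minimal illustration and not the MARINA implementation: the names exponential_delays and DelayBuffer are hypothetical, embeddings are plain lists of floats, and zero-padding at the start of a sequence is an assumption made here for simplicity.

```python
from collections import deque


def exponential_delays(max_delay):
    """Return the delays 0, 1, 2, 4, 8, ... up to and including max_delay."""
    delays = [0]
    d = 1
    while d <= max_delay:
        delays.append(d)
        d *= 2
    return delays


class DelayBuffer:
    """Fixed-size history of token embeddings (hypothetical helper)."""

    def __init__(self, max_delay, dim):
        self.delays = exponential_delays(max_delay)
        self.dim = dim
        # Holds only the last max_delay + 1 embeddings, so memory does not
        # grow with sequence length.
        self.history = deque(maxlen=max_delay + 1)

    def push(self, embedding):
        """Store e_t and return x_t = [e_t, e_{t-1}, e_{t-2}, e_{t-4}, ...]."""
        self.history.append(list(embedding))
        state = []
        for d in self.delays:
            if d < len(self.history):
                state.extend(self.history[-1 - d])
            else:
                # Assumption: zero-pad delays that reach before the start.
                state.extend([0.0] * self.dim)
        return state


# Usage: feed 20 one-dimensional "embeddings" whose value is the time step.
buf = DelayBuffer(max_delay=16, dim=1)
for t in range(20):
    x_t = buf.push([float(t)])
print(x_t)  # [19.0, 18.0, 17.0, 15.0, 11.0, 3.0]
```

The buffer never holds more than max_delay + 1 embeddings, which is what the fixed-memory bullet refers to. The reconstructed state would then be passed to whatever layers follow, which this sketch does not model.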

[Figure: delay taps at t-1, t-2, t-4, t-8, t-16. Exponential delays capture multiple timescales simultaneously.]

Architecture Comparison

Component	Standard Transformer	MARINA (TBT)
Context mechanism	Multi-head attention	Exponential delays
Complexity per token	O(N), i.e. O(N²) per sequence	O(log N)
Memory growth	O(N) KV-cache	O(1) fixed buffer
Context retrieval	Query-key similarity	Phase space embedding
Hardware requirement	GPU clusters	CPU sufficient
Interpretability	Attention weights	Manifold geometry
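
To make the complexity row concrete, the sketch below counts how many history positions each approach touches for the newest token. It is only a back-of-envelope comparison under stated assumptions: full causal attention is taken to compare against every token in the context, the delay scheme is taken to use taps at 0, 1, 2, 4, ... that fit inside the context, and the cost of the layers applied afterwards is ignored. Both function names are hypothetical.

```python
def attention_comparisons_per_token(n_context):
    """Keys the newest token is compared against under full causal attention."""
    return n_context


def delay_lookups_per_token(n_context):
    """Number of taps at delays 0, 1, 2, 4, ... that fit in n_context tokens."""
    if n_context < 1:
        return 0
    # Delay 0 plus every power of two not exceeding n_context - 1.
    return 1 + (n_context - 1).bit_length()


for n in (17, 1024, 4096, 65536):
    print(n, attention_comparisons_per_token(n), delay_lookups_per_token(n))
# 17 17 6
# 1024 1024 11
# 4096 4096 13
# 65536 65536 17
```

Under these assumptions the number of lookups grows by one each time the context doubles, while the number of attention comparisons doubles.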

Proof of Concept

Three experiments demonstrate that explicit phase space reconstruction can successfully model language across different regimes:

Brown Corpus

15M parameters

General linguistic dynamics on balanced English text. Achieves stable convergence (validation loss: 4.21, perplexity ~67) on CPU hardware.

Stable training over 44 epochs
Coherent text generation
55K word vocabulary
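
The reported loss and perplexity can be checked against each other. The sketch assumes the validation loss is a mean per-token cross-entropy measured in nats (natural logarithm), which the page does not state explicitly but which matches the quoted figures; the function name perplexity is a hypothetical helper.

```python
import math


def perplexity(cross_entropy_nats):
    """Perplexity is the exponential of the mean per-token cross-entropy."""
    return math.exp(cross_entropy_nats)


brown_ppl = perplexity(4.21)
print(round(brown_ppl, 1))  # 67.4
```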

Solar System Q&A

1.1M parameters

Precision memory through "tubular attractors." Progressive repetition forms narrow geometric channels connecting questions to answers.

100% basin separation
Memory fibre formation
Validation perplexity: 1.1 (4× training)

Corpus Ancora

84% validation improvement

Mythopoetic generation with thematic coherence. Repeated exposure strengthens geometric structure, improving both training and validation.

Broad attractor basins
Stylistic coherence
Geometric learning evidence

Key Discovery: Domain-Dependent Manifold Topology

Different linguistic tasks produce different geometric structures. Factual Q&A forms narrow "memory fibres" for precision, while creative generation forms broad basins for compositional flexibility. The same architecture learns the appropriate geometry for each domain.

The Paper

Introducing the Takens-Based Transformer

Kevin R. Haylett, PhD | December 2025 | Draft 0.1

This work presents a practical implementation of a Takens-based Transformer that fully replaces attention with exponential delay-coordinate reconstruction. The architecture achieves linear complexity and fixed memory usage while demonstrating stable convergence across general language modeling, structured reasoning, and creative generation tasks.


Highlights

Theoretical Foundation: Builds on Takens' Delay Embedding Theorem from dynamical systems theory
MARINA Architecture: Manifold-Aware Reconstruction and Inference Network
Channel Theory: Topological separation of user input, system output, and reasoning
Memory Fibres: Novel geometric primitive for factual knowledge encoding
Geometric Learning: Evidence that models learn manifold structure, not just statistics

Citation

Kevin R. Haylett, "Introducing the Takens-Based Transformer," (December 2025), available at https://finitemechanics.com/papers/takens-transformer.pdf

Further Exploration

Geofinitism

The philosophical framework underlying this work: meaning as geometric relationships in finite manifolds, measurement-first approaches, and the Five Pillars of finite reality.

Visit Geofinitism.com →

GitHub

GitHub repository with open-source code.

Takens-Embedding-Transformer →

Substack

Essays and updates on geometric thinking, language models, and the intersection of dynamical systems theory with AI research.

Read on Substack →

About

Kevin R. Haylett, PhD is an independent researcher with 25+ years of experience in medical engineering, neural networks, and nonlinear dynamical systems. His work emphasizes uncertainty over the Platonic Realm of perfection and approaches research through the lens of "Geofinitism."

For more details see:

Background and more information on Substack

Manchester, UK | kevin.haylett@gmail.com