Rethinking Language Models

From attention mechanisms to phase space trajectories

The Takens-Based Transformer replaces quadratic attention with explicit delay-coordinate reconstruction, achieving O(N) complexity and fixed memory while modeling language as dynamical trajectories through semantic manifolds.

Simul Pariter — Together Equally

A Different Lens

Traditional View

Language models treat tokens as static points in semantic space. Attention mechanisms search backwards through history with O(N²) complexity, requiring ever-growing key-value caches.

Static Points

Dynamical View

Language is a trajectory through semantic phase space. Context is embedded in the current position and momentum. Takens' theorem lets us reconstruct this trajectory from exponentially-spaced delays.
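To make the idea of delay-coordinate reconstruction concrete, here is a minimal sketch on a scalar signal rather than on language. The helper name `delay_embed` and the sine-wave input are illustrative assumptions, not part of the TBT code. Takens' theorem is classically stated for uniformly spaced delays; the exponential spacing used here follows this page's description.

```python
import numpy as np

def delay_embed(series, delays):
    """Stack delayed copies of a 1-D series into delay-coordinate vectors.

    Hypothetical helper for illustration only.
    """
    series = np.asarray(series, dtype=float)
    max_delay = max(delays)
    # Row i corresponds to time t = max_delay + i;
    # column j holds series[t - delays[j]].
    columns = [series[max_delay - d : len(series) - d] for d in delays]
    return np.stack(columns, axis=1)

# A toy observable: one coordinate of a periodic system.
t = np.linspace(0, 20 * np.pi, 2000)
signal = np.sin(t)

embedded = delay_embed(signal, delays=[0, 1, 2, 4, 8, 16])
print(embedded.shape)  # (1984, 6)
```

Each row of `embedded` is one reconstructed state: the current sample together with its delayed copies.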

Flowing Trajectory

Phase Space Reconstruction

Instead of comparing every token to every other token, the TBT reconstructs the system's state from delay coordinates:

x_t = [e_t, e_{t-1}, e_{t-2}, e_{t-4}, e_{t-8}, e_{t-16}, ...]
  • Multi-scale structure: Recent delays capture syntax, distant delays capture narrative
  • Fixed memory: O(1) buffer size regardless of sequence length
  • Linear complexity: a fixed set of O(log N) delay lookups per token, rather than the O(N²) pairwise comparisons attention makes over a sequence
  • CPU-friendly: Trained on commodity hardware without GPUs

[Figure: delay taps at t, t-1, t-2, t-4, t-8, t-16]

Exponential delays capture multiple timescales simultaneously
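The delay vector x_t and the fixed-memory claim can be sketched with a ring buffer. This is a minimal illustration under stated assumptions, not the MARINA implementation: the class name `DelayBuffer`, the zero padding before the start of the sequence, and the concatenation of taps are all choices made for this example.

```python
import numpy as np

class DelayBuffer:
    """Fixed-size ring buffer returning exponentially delayed embeddings.

    Hypothetical class for illustration; not taken from the TBT repository.
    """

    def __init__(self, dim, delays=(0, 1, 2, 4, 8, 16)):
        self.delays = tuple(delays)
        self.size = max(self.delays) + 1          # memory is set by the largest delay
        self.buffer = np.zeros((self.size, dim))  # zeros stand in for "before the start"
        self.step = 0                             # number of embeddings pushed so far

    def push(self, e_t):
        """Store the newest embedding and return the concatenated delay vector x_t."""
        self.buffer[self.step % self.size] = e_t
        taps = [
            self.buffer[(self.step - d) % self.size] if d <= self.step
            else np.zeros_like(e_t)               # delay reaches before the sequence start
            for d in self.delays
        ]
        self.step += 1
        return np.concatenate(taps)

rng = np.random.default_rng(0)
buf = DelayBuffer(dim=4)
for _ in range(100):
    x_t = buf.push(rng.normal(size=4))

print(x_t.shape, buf.buffer.shape)  # (24,) (17, 4)
```

However many tokens are pushed, the buffer keeps 17 rows, which is the sense in which memory is fixed.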

Architecture Comparison

| Component | Standard Transformer | MARINA (TBT) |
|---|---|---|
| Context mechanism | Multi-head attention | Exponential delays |
| Complexity per token | O(N) (O(N²) per sequence) | O(log N) |
| Memory growth | O(N) KV-cache | O(1) fixed buffer |
| Context retrieval | Query-key similarity | Phase space embedding |
| Hardware requirement | GPU clusters | CPU sufficient |
| Interpretability | Attention weights | Manifold geometry |
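As a rough check on the scaling rows of this comparison, the sketch below counts operations rather than measuring them. The function names and the cap of 16 on the largest delay are assumptions for illustration. It counts one comparison per earlier token for causal attention and one lookup per delay tap for the delay buffer.

```python
import math

def attention_cache_entries(n_tokens):
    """Entries held by a KV cache after n_tokens: one key/value pair per token."""
    return n_tokens

def attention_comparisons(n_tokens):
    """Total query-key comparisons for causal attention over the whole sequence."""
    return n_tokens * (n_tokens + 1) // 2

def delay_lookups(n_tokens, max_delay=16):
    """Total buffer lookups with taps at 0, 1, 2, 4, ..., max_delay (a power of two)."""
    taps_per_token = int(math.log2(max_delay)) + 2  # delay 0 plus 2^0 .. 2^k
    return n_tokens * taps_per_token

for n in (1_000, 10_000, 100_000):
    print(n, attention_comparisons(n), delay_lookups(n), attention_cache_entries(n))
```

With a fixed maximum delay the lookup count grows linearly in the sequence length, while the attention comparison count grows quadratically.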

Proof of Concept

Three experiments demonstrate that explicit phase space reconstruction can successfully model language across different regimes:

Brown Corpus

15M parameters

General linguistic dynamics on balanced English text. Achieves stable convergence (validation loss: 4.21, perplexity ~67) on CPU hardware.
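The reported loss and perplexity agree if the validation loss is a mean cross-entropy in nats, which is an assumption here, since the page does not state the units. In that case perplexity is the exponential of the loss:

```python
import math

validation_loss = 4.21                  # value reported above; assumed to be in nats
perplexity = math.exp(validation_loss)  # perplexity = exp(mean cross-entropy)

print(round(perplexity, 1))  # 67.4
```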

  • Stable training over 44 epochs
  • Coherent text generation
  • 55K word vocabulary

Solar System Q&A

1.1M parameters

Precision memory through "tubular attractors." Progressive repetition forms narrow geometric channels connecting questions to answers.

  • 100% basin separation
  • Memory fibre formation
  • Validation perplexity: 1.1 (4× training)

Corpus Ancora

84% validation improvement

Mythopoetic generation with thematic coherence. Repeated exposure strengthens geometric structure, improving both training and validation.

  • Broad attractor basins
  • Stylistic coherence
  • Geometric learning evidence

Key Discovery: Domain-Dependent Manifold Topology

Different linguistic tasks produce different geometric structures. Factual Q&A forms narrow "memory fibres" for precision, while creative generation forms broad basins for compositional flexibility. The same architecture learns the appropriate geometry for each domain.

The Paper

Introducing the Takens-Based Transformer

Kevin R. Haylett, PhD | December 2025 | Draft 0.1

This work presents a practical implementation of a Takens-based Transformer that fully replaces attention with exponential delay-coordinate reconstruction. The architecture achieves linear complexity and fixed memory usage while demonstrating stable convergence across general language modeling, structured reasoning, and creative generation tasks.

Highlights

  • Theoretical Foundation: Builds on Takens' Delay Embedding Theorem from dynamical systems theory
  • MARINA Architecture: Manifold-Aware Reconstruction and Inference Network
  • Channel Theory: Topological separation of user input, system output, and reasoning
  • Memory Fibres: Novel geometric primitive for factual knowledge encoding
  • Geometric Learning: Evidence that models learn manifold structure, not just statistics

Citation

Kevin R. Haylett, "Introducing the Takens-Based Transformer," (December 2025), available at https://finitemechanics.com/papers/takens-transformer.pdf

Further Exploration

Geofinitism

The philosophical framework underlying this work: meaning as geometric relationships in finite manifolds, measurement-first approaches, and the Five Pillars of finite reality.

Visit Geofinitism.com →

GitHub

Open-source code repository on GitHub.

Takens-Embedding-Transformer →

Substack

Essays and updates on geometric thinking, language models, and the intersection of dynamical systems theory with AI research.

Read on Substack →

About

Kevin R. Haylett, PhD, is an independent researcher with more than 25 years of experience in medical engineering, neural networks, and nonlinear dynamical systems. His work emphasizes uncertainty over the Platonic Realm of perfection and approaches research through the lens of "Geofinitism."

For more details see:

  • Background and more information on Substack
  • Manchester, UK | kevin.haylett@gmail.com