From sequence statistics to conformational attractors
TakensFold applies Takens' delay embedding theorem to protein structure prediction, treating the amino-acid sequence as the observable of a dynamical system converging to a stable folded attractor. The MARINA architecture achieves O(N) complexity and O(1) fixed memory — no attention, no positional encodings.
Standard approaches treat protein folding as a learned sequence-to-structure mapping. TakensFold treats it as a problem in nonlinear dynamical systems — recovering a hidden attractor geometry from an observed time series.
The standard view: protein folding is framed as a sequence-to-structure mapping. Large neural networks learn statistical correlations between amino-acid sequences and known 3D coordinates. Attention mechanisms require O(N²) computation and memory that grows with sequence length, which demands GPU clusters and massive datasets.
The TakensFold view: protein folding is a temporal dynamical process. A newly synthesised polypeptide explores conformational space, converging to a stable geometric attractor. The amino-acid sequence is the observable. Takens' theorem lets us reconstruct the attractor geometry from exponential delay coordinates of that sequence.
The amino-acid sequence is not merely a list of letters. It is a one-dimensional symbolic construction signal from which a three-dimensional molecular object is built.
Takens' theorem states that, under mild conditions, the full state space of a deterministic dynamical system can be reconstructed from delayed observations of a single time series. For a protein, the observable is the residue embedding processed position-by-position. The hidden state is the evolving conformation in 3D space.
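Delay coordinates of the form
$\mathbf{x}(t) = [\,\mathbf{e}(t),\ \mathbf{e}(t-\tau_1),\ \mathbf{e}(t-\tau_2),\ \ldots,\ \mathbf{e}(t-\tau_8)\,]$,
where $\mathbf{e}(t)$ is the residue embedding at position $t$ and $\tau_1, \ldots, \tau_8$ are the delays, produce a trajectory diffeomorphic to the original conformational attractor. Exponential spacing captures the natural multi-scale organisation of proteins: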
Delays = [1, 2, 4, 8, 16, 32, 64, 128]
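The delay construction can be illustrated with a minimal NumPy sketch. This is not the repository's code: the function name `delay_embed`, the zero-padding before the start of the chain, and the random stand-in embeddings are all assumptions made for illustration. Only the delay list and the embedding size of 128 come from the text.

```python
import numpy as np

DELAYS = [1, 2, 4, 8, 16, 32, 64, 128]  # exponential delays from the text

def delay_embed(embeddings, delays=DELAYS):
    """Stack e(t) with e(t - d) for every delay d.

    Positions that would fall before the start of the chain are left as
    zeros (an assumption of this sketch, not a documented choice).
    """
    n, dim = embeddings.shape
    out = np.zeros((n, (len(delays) + 1) * dim), dtype=embeddings.dtype)
    out[:, :dim] = embeddings  # block 0: the current residue e(t)
    for i, d in enumerate(delays, start=1):
        if d < n:
            # block i: the embedding seen d positions earlier
            out[d:, i * dim:(i + 1) * dim] = embeddings[:n - d]
    return out

rng = np.random.default_rng(0)
seq_embeddings = rng.normal(size=(200, 128))  # hypothetical 200-residue chain
states = delay_embed(seq_embeddings)
print(states.shape)  # (200, 1152)
```

Each row of `states` is one point on the reconstructed trajectory: nine blocks of 128 values, one for the current residue and one per delay.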
Training proteins are triplicated in the preprocessing pipeline. In a statistical model, repetition adds no new information. In a Takens-based architecture, repeated exposure deepens the learned attractor basins and thickens conformational trajectory filaments in phase space — directly improving prediction accuracy on structurally similar proteins.
MARINA (Manifold-Aware Reconstruction and Inference Network Architecture) has four core components, uses no attention and no positional encodings, and scales linearly with sequence length.
Each of the 20 standard amino acids (plus non-standard extension vocabulary) is mapped to a learned embedding vector of dimension embed_dim = 128. No positional encodings are used — temporal order is encoded implicitly through the delay structure.
At each position t, a delay-coordinate vector is constructed using a circular buffer of size 2^k + 1 (k = 7 for the longest delay of 2^7 = 128, giving 129 slots). This yields O(1) memory usage independent of sequence length. The resulting state vector has dimension (8 + 1) × 128 = 1,152.
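A streaming version of the same idea is sketched below in plain Python. The class name `DelayBuffer`, its `push` method, and the zero-padding at the start of the chain are hypothetical choices for this example. The buffer holds the longest delay (128) plus the current position, so 129 slots, regardless of how long the chain is.

```python
DELAYS = [1, 2, 4, 8, 16, 32, 64, 128]

class DelayBuffer:
    """Fixed-size ring buffer holding the most recent residue embeddings."""

    def __init__(self, embed_dim, delays=DELAYS):
        self.delays = delays
        self.size = max(delays) + 1  # 128 + 1 = 129 slots
        self.embed_dim = embed_dim
        self.slots = [[0.0] * embed_dim for _ in range(self.size)]
        self.t = -1  # index of the most recent position written

    def push(self, embedding):
        """Store e(t) and return the delay-coordinate state vector at t."""
        self.t += 1
        self.slots[self.t % self.size] = list(embedding)
        state = []
        for d in [0] + self.delays:
            if d <= self.t:
                state.extend(self.slots[(self.t - d) % self.size])
            else:
                # before the start of the chain: zero-pad (assumption)
                state.extend([0.0] * self.embed_dim)
        return state

buf = DelayBuffer(embed_dim=128)
for t in range(500):
    state = buf.push([float(t)] * 128)  # toy embedding: every entry equals t
print(len(buf.slots), len(state))  # 129 1152
```

The buffer never grows: after 500 positions it still holds 129 embeddings, which is the sense in which memory is O(1) in sequence length.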
The high-dimensional delay vector is projected onto a lower-dimensional manifold via a learned projection matrix Wp followed by LayerNorm. This matrix is the geometric core of the model — its rows encode which combinations of temporal scales are most informative for structure prediction.
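As a rough illustration of this step, the sketch below applies a random stand-in for Wp followed by a LayerNorm without learned gain or bias. The manifold dimension of 256 is a placeholder, since the text does not state it, and `project`, `layer_norm`, and `block_norms` are names invented for this example. The last lines show one possible way to read the matrix: summing squared weights per delay block gives a crude measure of how much each temporal scale contributes.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM = 9 * 128   # delay-vector size from the text
MANIFOLD_DIM = 256    # hypothetical; not stated in the text

# Random stand-in for the learned projection matrix Wp
W_p = rng.normal(scale=STATE_DIM ** -0.5, size=(MANIFOLD_DIM, STATE_DIM))

def layer_norm(x, eps=1e-5):
    """Normalise each row to zero mean and unit variance (no affine terms)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def project(states):
    """Map (n, STATE_DIM) delay vectors to (n, MANIFOLD_DIM) manifold states."""
    return layer_norm(states @ W_p.T)

# Weight energy per delay block: block 0 is e(t), blocks 1..8 are the delays
blocks = W_p.reshape(MANIFOLD_DIM, 9, 128)
block_norms = np.sqrt((blocks ** 2).sum(axis=(0, 2)))

z = project(rng.normal(size=(10, STATE_DIM)))
print(z.shape, block_norms.shape)  # (10, 256) (9,)
```

With a trained matrix, unusually large or small entries in `block_norms` would suggest which delays the model relies on; with the random stand-in used here they are all roughly equal.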
A stack of 6 feedforward residual layers (hidden dimension 512) performs non-linear mixing within each position's manifold state. Three independent linear heads then predict the x, y, z coordinates of the Cα atom. Training uses MSE loss in Ångström space.
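The forward pass described here might look roughly like the following NumPy sketch. The layer count (6), hidden width (512), three coordinate heads, and MSE loss come from the text. The ReLU activation, the omission of biases, the random weights, and the manifold dimension of 256 are assumptions, and the released implementation may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
MANIFOLD_DIM, HIDDEN_DIM, N_LAYERS = 256, 512, 6  # MANIFOLD_DIM is hypothetical

def init(shape):
    """Random weights scaled by fan-in; stand-ins for trained parameters."""
    return rng.normal(scale=shape[0] ** -0.5, size=shape)

layers = [(init((MANIFOLD_DIM, HIDDEN_DIM)), init((HIDDEN_DIM, MANIFOLD_DIM)))
          for _ in range(N_LAYERS)]
heads = [init((MANIFOLD_DIM, 1)) for _ in range(3)]  # one head each for x, y, z

def predict_ca(z):
    """Map (n_residues, MANIFOLD_DIM) states to (n_residues, 3) coordinates."""
    for w_in, w_out in layers:
        # residual feedforward block; ReLU is an assumed activation
        z = z + np.maximum(z @ w_in, 0.0) @ w_out
    return np.concatenate([z @ h for h in heads], axis=1)

def mse_loss(pred, target):
    """Mean squared error over all coordinates, in squared Ångström."""
    return float(np.mean((pred - target) ** 2))

z = rng.normal(size=(50, MANIFOLD_DIM))  # toy manifold states for 50 residues
pred = predict_ca(z)
print(pred.shape)  # (50, 3)
```

Every operation acts on one position's state at a time, so the cost of this stage grows linearly with the number of residues.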
| Property | AlphaFold2 / ESMFold | MARINA (TakensFold) |
|---|---|---|
| Context mechanism | Multi-head attention (Evoformer) | Exponential delay coordinates |
| Complexity per position | O(N²) | O(log N) |
| Memory footprint | O(N) growing | O(1) fixed circular buffer |
| Positional encodings | Yes (multiple forms) | None — implicit in delays |
| Attention | Central mechanism | None |
| Hardware requirement | GPU/TPU cluster for training | CPU sufficient (i7, 32 GB RAM) |
| Training data scale | Hundreds of millions of sequences | ~300–400 proteins (proof of concept) |
| Interpretability | Attention weights (indirect) | Manifold geometry (direct) |
| Open reproducibility | Partial | Full (Mozilla Public License 2.0) |
A proof-of-concept trained from scratch on modest hardware — demonstrating that the MARINA architecture can reconstruct coherent protein geometry from residue sequences without attention.
The model reconstructs coherent backbone geometry from the amino-acid sequence alone. The N-terminal region shows elevated error due to greater conformational freedom of termini; the remainder of the chain is predicted at ~0.5 Å.
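Per-residue error and overall RMSD of the kind quoted here can be computed as in the sketch below. The text does not say whether the reported figures were measured after structural superposition, so this example compares coordinates directly, without alignment; that is an assumption. The coordinates are synthetic toy values chosen only to show a terminal region with larger error, not results from the model.

```python
import numpy as np

def per_residue_error(pred, ref):
    """Euclidean distance between predicted and reference Cα, per residue."""
    return np.linalg.norm(pred - ref, axis=1)

def rmsd(pred, ref):
    """Root-mean-square deviation over all residues, without superposition."""
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

# Synthetic toy data: 100 residues, first 10 displaced more than the rest
ref = np.zeros((100, 3))
pred = ref.copy()
pred[:10, 0] += 2.0   # "terminal" region: 2 Å off along x
pred[10:, 0] += 1.0   # remainder: 1 Å off along x

errors = per_residue_error(pred, ref)
print(errors[:10].mean(), errors[10:].mean(), rmsd(pred, ref))
```

Reporting the terminal and core regions separately, as well as the overall figure, makes it clear how much of the total RMSD comes from a small number of flexible residues.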
Protein-ligand affinity is reframed as multiscale correspondence between two construction signals — the amino-acid sequence (protein) and the SMILES string (ligand). Both are observed symbolic time series from which hidden geometric constraints are reconstructed by Takens-style delay embeddings.
Standard protein models learn correlations between sequences and known structures. MARINA instead reconstructs the attractor geometry of the folding dynamical system. Repeated training exposure does not add statistical redundancy — it deepens geometric basin structure in phase space, directly analogous to the "memory fibre" phenomenon observed in the language modelling experiments. The projection matrix Wp and manifold trajectories offer direct geometric interpretability: rows of Wp reveal learned temporal scales; phase-space analysis can probe attractor stability and mutation effects.
Protein folding is reframed as phase-space reconstruction using Takens' delay embedding theorem. Rather than relying on attention or statistical pattern matching, the MARINA model reconstructs the folded geometry directly from exponential delay coordinates of the amino-acid sequence. It achieves 1.01 Å overall RMSD on 1A7S (227 residues), trained on modest hardware. The complete implementation is released as open-source code.
A Takens-based programme for sequence-to-structure and affinity modelling. The central argument: binding affinity is not a local property of a ligand touching a binding pocket. It is a measured scalar imposed on a multiscale correspondence between two construction signals — a protein sequence and a ligand SMILES string. Takens-style delay embeddings offer a principled method for reconstructing hidden geometric constraints from both signals.
Kevin R. Haylett, "Takens-Based Transformer for Protein Structure Prediction: A Proof-of-Concept Implementation with Open-Source Code," Selected Communications (May 2026), code available at https://github.com/KevinHaylett/takens-protein-prediction
The complete codebase enables full reproducibility on consumer hardware. All training and inference commands are documented in the repository README.
pip install torch numpy pandas \
matplotlib biopython
python pipeline/pdb_to_csv_batch.py
python pipeline/pdb_to_training.py
python train.py
python inference.py
Mozilla Public License 2.0 · Results folder includes 1A7S example outputs
The same MARINA architecture applied to language: replacing attention with exponential delay coordinates for O(N) complexity text generation. Three experiments across general, factual, and creative domains.
Takens Language Site →
The philosophical framework underlying this work. Meaning, mathematics, and measurement treated as finite, relational, and dynamical: symbols as finite marks embedded in trajectories of use, not Platonic abstractions.
Visit FiniteMechanics.com →
Essays and updates on geometric thinking, dynamical systems, protein structure, and the broader TBT research programme.
Read on Substack →
Kevin R. Haylett, PhD is an independent researcher with 25+ years of experience in medical engineering, neural networks, and nonlinear dynamical systems. His work applies dynamical systems theory to language models, protein structure, and the foundations of AI, emphasising finite measurement, geometric interpretability, and reproducibility on modest hardware.
TakensFold is one application of the broader Takens-Based Transformer programme, which has also been applied to language modelling and time-series tasks. The goal is not to replace existing large-scale methods, but to demonstrate that attractor reconstruction offers a principled, interpretable, and computationally efficient alternative route through the problem space.
For more background: A Journey Through Rhythms — from Heartbeats to Language Models
Manchester, UK · kevin.haylett@gmail.com