
The problem with plaintext memory

Plaintext agent memory stores facts by position. Context compression destroys position. A file like:

    User's GPU is RTX5090. Token budget is 1024. Current task is training LFM2.5.

becomes, after compression:

    User works on ML projects with GPU hardware.

Exact values are lost or degraded. Spectral Memory encodes them spectrally instead.

Encoding

Each fact is encoded as an amplitude-shift keyed (ASK) signal on a dedicated carrier frequency. All 40 channels are superimposed into a single composite waveform, then quantized to 512 token IDs drawn from the Hermes3 vocabulary. The result is a [MEM]...[/MEM] block — 512 tokens of opaque signal:
    [MEM]magistrate yi GOP unprotected veng cascade mirror...[/MEM]
This block is injected into the agent’s system prompt at session start.
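The encode step can be sketched as follows. This is a minimal illustration of ASK on per-fact carriers followed by quantization to token IDs; the carrier spacing, normalization scheme, and vocabulary size are assumptions for the sketch, not the system's actual parameters.

```python
import numpy as np

N_CHANNELS = 40     # one carrier frequency per fact slot (from the text)
N_SAMPLES = 512     # one composite sample per output token (from the text)
VOCAB_SIZE = 32768  # illustrative stand-in for the Hermes3 vocabulary size

def encode(amplitudes):
    """amplitudes: one value in [0, 1] per channel (ASK: value -> carrier amplitude)."""
    t = np.arange(N_SAMPLES)
    signal = np.zeros(N_SAMPLES)
    for ch, amp in enumerate(amplitudes):
        freq = (ch + 1) / (2 * N_CHANNELS)          # distinct carrier per channel
        signal += amp * np.sin(2 * np.pi * freq * t)  # superimpose all 40 channels
    # Normalize the composite waveform to [0, 1], then quantize each
    # sample to a token ID in the (stand-in) vocabulary.
    lo, hi = signal.min(), signal.max()
    norm = (signal - lo) / (hi - lo + 1e-9)
    return (norm * (VOCAB_SIZE - 1)).astype(int)

token_ids = encode([0.5] * N_CHANNELS)  # 512 IDs ready to render as [MEM]...[/MEM]
```

Rendering each ID through the tokenizer is what produces the opaque word salad shown above.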

Decoding

When the agent needs a fact, the fine-tuned Hermes3-3B reader model performs implicit frequency decomposition: given the 512-token signal and a natural-language question, it returns the encoded value directly.
    [MEMORY]{512 FDM tokens}[/MEMORY]
    Question: What GPU is the user using?
    Answer: RTX5090
No explicit Fourier transform. No dictionary lookup. No semantic search. The value is read directly from the spectral structure of the token sequence — a capability learned during fine-tuning on 100K+ encode/decode pairs.
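Assembling the reader prompt is straightforward. A minimal sketch, assuming the [MEMORY]/[/MEMORY] tag names from the example above; the helper name and space-joined token rendering are illustrative:

```python
def build_decode_prompt(mem_tokens, question):
    """Wrap the 512 signal tokens and append the question, per the format above."""
    block = "[MEMORY]" + " ".join(mem_tokens) + "[/MEMORY]"
    return f"{block}\nQuestion: {question}\nAnswer:"

prompt = build_decode_prompt(
    ["magistrate", "yi", "GOP"],  # truncated for illustration
    "What GPU is the user using?",
)
```

The reader model completes the text after `Answer:` with the decoded value.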

Fast path vs. slow path

The /decode endpoint has two paths:
  • Fast path (default): reads from persistent server-side state. ~0 ms. Used for routine agent queries.
  • Slow path (use_model=true): sends the .mem block to the Hermes3 inference endpoint. ~400 ms. Used to verify spectral encoding integrity or when state is unavailable.
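The dispatch between the two paths can be sketched as below. This is a hedged sketch, not the server's implementation: `state` stands in for the persistent server-side store (keyed by question here for simplicity), and `call_hermes3` stands in for the Hermes3 inference call.

```python
def decode(question, state, use_model=False, call_hermes3=None):
    """Two-path /decode dispatch sketch.

    Fast path: answer from persistent state (~0 ms).
    Slow path: forward to the model endpoint (~400 ms), forced via
    use_model=True or taken when state has no answer.
    """
    if not use_model and question in state:
        return state[question]      # fast path
    return call_hermes3(question)   # slow path

state = {"What GPU is the user using?": "RTX5090"}
answer = decode("What GPU is the user using?", state)
```

Forcing `use_model=True` bypasses state entirely, which is what makes it useful as an integrity check on the spectral encoding itself.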

Research basis

Spectral Memory is built on results from Frequency-Division Multiplexed In-Context Memory Encoding for Language Models (NeurIPS 2025). The fine-tuned Hermes3-3B model achieves 100% decision accuracy and 99.9% extra-channel accuracy across 1–5 hop reasoning chains with no compositionality gap.