Memory

Memory that
doesn't forget.*

A new architecture for AI agent memory. No vector database. No retrieval-augmented generation. No wiki to organize, no files to triage, no context-window tetris. Persistent, exact, and built to scale with your life — not against it.

Where the field is

Every existing AI memory system is a workaround.

If you've built with current AI infrastructure, you've used one of these. They each fail in their own way.

RAG (Retrieval-Augmented Generation)

Chunk your documents, embed each chunk into a vector, search by similarity to the query, stuff the top-K hits into the prompt. Brittle on relational reasoning, hallucinates retrievals when no chunk is similar enough, slow at scale, and fundamentally statistical — you get probabilistically relevant chunks, not the actual memory you wanted.
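The whole pipeline fits in a few lines, which is part of why it's everywhere. A minimal sketch — a toy bag-of-words embedding stands in for a real embedding model, and the chunks and query are purely illustrative — that also shows the brittleness: an exact token match ranks high, while a chunk that is obviously relevant but uses a different inflection scores zero.

```python
import math

def embed(text, vocab):
    """Toy bag-of-words embedding over a fixed vocabulary.
    A real system would call an embedding model here."""
    tokens = text.lower().split()
    vec = [float(tokens.count(w)) for w in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity of two unit vectors is just their dot product."""
    return sum(x * y for x, y in zip(a, b))

def top_k(query, chunks, vocab, k=2):
    """Rank chunks by similarity to the query; keep the best k."""
    q = embed(query, vocab)
    return sorted(chunks, key=lambda c: cosine(q, embed(c, vocab)), reverse=True)[:k]

chunks = [
    "The deploy key rotates every 90 days.",
    "Lunch is at noon on Fridays.",
    "Staging deploys run from the main branch.",
]
query = "how do deploys work"
vocab = sorted({w for text in chunks + [query] for w in text.lower().split()})
hits = top_k(query, chunks, vocab)
# "deploys" matches the staging chunk exactly; the "deploy key" chunk
# scores zero — one inflection away and this retriever misses it entirely.
```

Real embedding models soften the exact-match problem, but the shape of the failure is the same: retrieval is a similarity ranking, so what comes back is whatever scored highest, relevant or not.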

Vector databases

Pinecone, Weaviate, Chroma, Qdrant. The infrastructure under most "AI memory" products. Embedding-based search inherits all of RAG's failure modes plus operational ones: re-indexing costs, dimension lock-in, similarity thresholds you have to hand-tune for every dataset. Storing memory you can't deterministically retrieve.

Wikis and file dumps

"Just put everything in Notion / Obsidian / a markdown folder; the AI can search it." Now you're managing the organization yourself. You've moved the problem from the model to your own attention. The system requires you to remember where you put things to remember things at all.

Context-window stuffing

Paste the entire conversation history into every API call. Works for short threads. Hits the token wall fast, cost grows linearly with history length per call, and the quality of attention degrades with context size — even on million-token models, the middle of the context is where information goes to die.
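The cost curve is easy to see with back-of-the-envelope arithmetic (the helper and token counts are purely illustrative): each call resends everything so far, so total tokens sent grow quadratically with conversation length.

```python
def tokens_sent(history_lens):
    """Total tokens transmitted across a conversation when every
    API call resends the entire history accumulated so far."""
    total, running = 0, 0
    for n in history_lens:
        running += n      # history grows by this turn's tokens
        total += running  # and the whole history is sent again
    return total

# 100 turns of ~200 tokens each: only ~20k tokens of actual content...
print(tokens_sent([200] * 100))  # → 1010000 tokens actually sent
```

Fifty times the content's own size, and the multiplier keeps climbing with every turn — before accounting for the attention-quality falloff on top.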

"Long-term memory" research products

Mem0, Letta, MemGPT, ReasoningBank, Memp, MIND, and a dozen academic systems. Each captures a useful idea — extraction, hierarchical chunking, uncertainty-driven retrieval, procedural memory. But each is a partial answer, none has been integrated into a sovereign, on-device architecture, and none of them solves the underlying problem: AI memory built on top of a statistical retrieval layer is a workaround, not a primitive.

What we built

A memory primitive, not a memory product.*

Alpenglow's memory is built on our own substrate — a proprietary mathematical foundation that lets us treat memory as a first-class operation, not a search problem bolted on top of a model.

Every interaction your agent has, every document it reads, every piece of working state it accumulates, every learned procedure — all of it gets inscribed directly into the substrate. Recall isn't a similarity search. It's a deterministic operation. The memory you stored is the memory you get back.

We're keeping the specifics of how this works deliberately out of public materials. The capability is the claim; the implementation is patent-protected and not for marketing. What we will commit to publicly is the behavior:

  • Exact recall. When your agent retrieves a memory, it gets the actual stored memory — not a probabilistic approximation.
  • Persistent across sessions. Conversations from a year ago are as accessible as the one you had this morning.
  • Persistent across devices. Memory follows your account from Mac to iPhone to wherever else you run Alpenglow. Same memory, every surface.
  • Persistent across model swaps. Switch from Claude to GPT to a local model. Your agent's memory doesn't reset.
  • Scales without degradation. Performance and recall quality are constant whether you have a hundred memories or a hundred million.
  • Stays on your device. Memory inscriptions are written to your local substrate vault, encrypted at rest. Federation contributes anonymized signals, not your content (see how federation works).

What this lets you do

Things that are awkward today, ordinary here.

"What did we decide about X six months ago?"

Ask. The agent has the context — the actual conversation, not a paraphrase reconstructed from chunks.

"Has this idea come up before?"

The agent knows. Across every conversation, every document, every drafted note. No "I don't have access to past conversations" disclaimers.

"Pick up where we left off — in any thread, any time."

Continuity is the default, not something you have to engineer yourself with note-taking and prompt prefacing.

"Build on what you've already learned."

Successful procedures, learned preferences, patterns the agent figured out the hard way — they accumulate. Your agent at month twelve isn't the same agent it was on day one. It's smarter for having lived through everything you've worked on.

On scale

N = 1.00000000.*

That's the recall accuracy in our internal testing. Not "very high." Not "asymptotically approaching one." Exactly one, across every scale we've measured. Inject millions of synthetic memories, query for any of them at any depth, retrieve the exact memory back. Every time.

Retrieval depth is constant-bounded — the operational cost doesn't grow with how much memory has accumulated. There is no "scan the database" step. The architecture doesn't have one to optimize away.

We're being deliberately careful with the framing here. We may still need to tune how some categories of memory get flagged or how some classes of inscription get organized as real-world patterns surface during BETA. That's storage convention work, not retrieval-architecture work. The retrieval primitive itself is measured, bounded, and exact.

What we're claiming: in testing, recall accuracy is 1.00000000 and retrieval cost is constant-bounded across every scale we can generate. What we're not claiming: that no real-world workload will ever require us to evolve storage conventions, indexing heuristics, or how specific memory types are flagged. Those are above-the-architecture concerns, not architecture limits.

Where memory lives

On your device. Always.

The substrate vault is local. Memory inscriptions never get uploaded to a server we control. Cross-device sync moves memory directly device-to-device over an end-to-end encrypted channel — our cloud relays nothing.

Federation does flow back to us, but as anonymized signals, not your content — patterns that improve everyone's agents, decoupled from any individual identity or memory. See how federation works for the specifics.

Try the memory.

Get on the BETA. Spend a few days with it. Notice when it remembers things you didn't expect it to. That's where the difference is.

Join the BETA