Architecture

Neotoma's architecture is built on three foundations: append-only observation logs for immutability, deterministic reducers for consistent state composition, and schema-bound entity types for structural guarantees.

Memory evolves deterministically. Given the same observations, Neotoma produces the same entity snapshots. Every state change is versioned with full provenance. Nothing mutates silently; nothing overwrites implicitly.

This means you can inspect any entity at any point in time, diff two versions, and replay the full sequence of changes that produced the current state.
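To make the inspect, diff, and replay claims concrete, here is a minimal sketch of an append-only log with a deterministic reducer. The `Observation` fields, the `as_of` parameter, and the last-write-wins rule are assumptions made for illustration; they are not Neotoma's actual reducer or data model.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Observation:
    """One immutable fact about one field of one entity (hypothetical shape)."""
    entity_id: str
    field: str
    value: Any
    observed_at: int  # logical timestamp, assumed for this sketch
    source: str       # provenance: where the fact came from


def reduce_snapshot(log, entity_id: str, as_of: Optional[int] = None) -> dict:
    """Fold observations into a snapshot, optionally as of a past timestamp.

    Observations are sorted by a total order before folding, so the result
    does not depend on the order in which they appear in the log.
    """
    relevant = [
        o for o in log
        if o.entity_id == entity_id and (as_of is None or o.observed_at <= as_of)
    ]
    snapshot = {}
    for obs in sorted(relevant, key=lambda o: (o.observed_at, o.source, repr(o.value))):
        snapshot[obs.field] = obs.value  # later observations win
    return snapshot


def diff(old: dict, new: dict) -> dict:
    """Return {field: (old_value, new_value)} for every field that changed."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


log = [
    Observation("contact:ada", "email", "ada@old.example", 1, "chat-1"),
    Observation("contact:ada", "city", "London", 2, "chat-2"),
    Observation("contact:ada", "email", "ada@new.example", 3, "chat-3"),
]

v1 = reduce_snapshot(log, "contact:ada", as_of=2)  # state at a past point in time
v2 = reduce_snapshot(log, "contact:ada")           # current state
print(diff(v1, v2))
```

Because the reducer sorts before folding, replaying the same log in any order yields the same snapshot, which is the property the paragraph above describes.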

State Layer + Operational Layer(s)

Neotoma is the state layer: a deterministic, event-sourced, reducer-driven world model. Anything sitting above Neotoma is an operational layer: agents, pipelines, orchestration systems, custom applications. The boundary is a single, simple invariant.

State layer (Neotoma). Stores state, signals state changes, enforces determinism and immutability. Never decides, infers, or acts.
Operational layer(s). Read truth and write back via observations. May reason, plan, decide, and execute side effects. The artifacts of those activities, including plans, decisions, constraints, preferences, and rules, are themselves state and live in Neotoma.

Two-tier diagram

Operational systems (agents, pipelines, custom code) sit above one shared state layer.

Operational Layer(s): agents (Claude, Cursor, ChatGPT, ...) · pipelines / orchestrators · custom apps
  ↓ reads truth via retrieval
  ↓ writes via observations
Neotoma: State Layer: observations → reducers → entity snapshots → memory graph
  ↑ emits substrate signals (webhook / SSE; report-only, never strategy)

Strategy artifacts, including plans, decisions, constraints, preferences, and rules, are entities in the state layer. The act of strategizing happens in operational layers; the outputs are inert state.
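The boundary can be sketched as follows: the state layer records what it is given and reports that something changed, while the decision itself is made by the caller and stored as ordinary data. The `StateLayer` class, its method names, and the signal payload are hypothetical and exist only for this illustration.

```python
class StateLayer:
    """Toy state layer: stores observations and reports changes, never decides."""

    def __init__(self):
        self._observations = []  # append-only
        self._subscribers = []

    def subscribe(self, callback):
        """Register a listener for change signals (stand-in for webhook / SSE)."""
        self._subscribers.append(callback)

    def record(self, entity_type, entity_id, fields):
        """Append an observation and emit a report-only signal."""
        self._observations.append((entity_type, entity_id, dict(fields)))
        signal = {
            "event": "observation_recorded",  # hypothetical event name
            "entity_type": entity_type,
            "entity_id": entity_id,
        }
        for notify in self._subscribers:
            notify(signal)  # says what changed, not what to do about it

    def read(self, entity_id):
        """Merge all observations for an entity, later fields overriding earlier."""
        snapshot = {}
        for _, eid, fields in self._observations:
            if eid == entity_id:
                snapshot.update(fields)
        return snapshot


received = []
state = StateLayer()
state.subscribe(received.append)

# Operational layer: an agent forms a plan, then approves it.
# Both outcomes are written back as plain state.
state.record("plan", "plan:q3-renewal", {"goal": "renew contract", "status": "proposed"})
state.record("plan", "plan:q3-renewal", {"status": "approved"})

print(state.read("plan:q3-renewal"))
```

The plan ends up as an inert record that any other agent or pipeline can read; nothing in the state layer acts on it.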

How state flows

State flow

How structured writes become durable entities, relationships, and timeline state.

Structured payloads (entities JSON via MCP / CLI / REST)
  ↓ record observations
Observations (granular facts with provenance)
  ↓ reduce (deterministic)
Entity snapshots (current truth, versioned)
  ↓ relate
Memory graph (entities + relationships + timeline)
  ↻ replay: inspect any past state
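The three arrows in the flow above can be read as three small functions. This is a sketch under stated assumptions: the payload shape (`id`, `type`, `fields`), the function names, and the rule that edges to unknown entities are dropped are all invented for illustration and are not taken from Neotoma.

```python
def record(payload, source):
    """Record: flatten each entity into one observation per field, with provenance."""
    return [
        {"entity": e["id"], "type": e["type"], "field": f, "value": v, "source": source}
        for e in payload["entities"]
        for f, v in e["fields"].items()
    ]


def reduce_observations(observations):
    """Reduce: merge observations into one snapshot per entity (later values win)."""
    snapshots = {}
    for o in observations:
        snap = snapshots.setdefault(o["entity"], {"type": o["type"], "fields": {}})
        snap["fields"][o["field"]] = o["value"]
    return snapshots


def relate(snapshots, relationships):
    """Relate: keep typed edges whose endpoints both exist as entities."""
    known = set(snapshots)
    edges = [r for r in relationships if r["from"] in known and r["to"] in known]
    return {"entities": snapshots, "relationships": edges}


payload = {
    "entities": [
        {"id": "contract:acme-2024", "type": "contract",
         "fields": {"party": "Acme", "status": "active"}},
        {"id": "task:legal-review", "type": "task",
         "fields": {"title": "Legal review"}},
    ]
}
relationships = [
    {"from": "task:legal-review", "type": "REFERS_TO", "to": "contract:acme-2024"},
    {"from": "task:legal-review", "type": "PART_OF", "to": "project:unknown"},
]

observations = record(payload, source="mcp:session-1")
snapshots = reduce_observations(observations)
graph = relate(snapshots, relationships)
print(len(observations), sorted(snapshots), len(graph["relationships"]))
```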

How data enters Neotoma

Data enters through the store operation, called with a structured entities array (MCP, CLI, or REST). Observations are created from that payload; there is no server-side file interpretation or LLM extraction pipeline.

Structured ingestion

Callers pass typed JSON entities. Neotoma validates against schema, deduplicates, and records observations with full provenance. No LLM runs inside the store path.

The agent's own reasoning (or your app) produces the structured data. Chat, tool output, and agent-extracted facts all land here.

The agent is the author; Neotoma is the ledger.

The agent decides what to store; Neotoma ensures it is schema-valid, deduplicated, and provenance-tracked. There is no hidden LLM between the caller and the data layer.
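A store path of this kind might look roughly like the sketch below: validate every entity against its schema, skip exact duplicates, and append the rest with provenance. The `SCHEMAS` table, the `Ledger` class, the `entity_type` key, and the use of a SHA-256 content hash for deduplication are assumptions for illustration, not Neotoma's actual validation or deduplication rules.

```python
import hashlib
import json

# Hypothetical schema registry: entity type -> required fields and their types.
SCHEMAS = {"contact": {"name": str, "email": str}}


class Ledger:
    """Toy ledger: schema-checks, deduplicates, and records with provenance."""

    def __init__(self):
        self.observations = []  # append-only
        self._seen = set()      # content hashes already recorded

    def store(self, entities, source):
        """Validate all entities first, then record the ones not seen before."""
        for e in entities:
            schema = SCHEMAS.get(e.get("entity_type"))
            if schema is None:
                raise ValueError(f"unknown entity type: {e.get('entity_type')!r}")
            for field, expected in schema.items():
                if not isinstance(e.get(field), expected):
                    raise ValueError(f"{field!r} must be {expected.__name__}")

        recorded = 0
        for e in entities:
            # Canonical JSON (sorted keys) so equal payloads hash equally.
            digest = hashlib.sha256(json.dumps(e, sort_keys=True).encode("utf-8")).hexdigest()
            if digest in self._seen:
                continue  # exact duplicate: nothing new to record
            self._seen.add(digest)
            self.observations.append({"entity": e, "source": source, "content_hash": digest})
            recorded += 1
        return recorded


ledger = Ledger()
ada = {"entity_type": "contact", "name": "Ada", "email": "ada@example.com"}
first = ledger.store([ada], source="chat:turn-1")
second = ledger.store([dict(ada)], source="chat:turn-2")
print(first, second, len(ledger.observations))
```

Validation runs over the whole batch before anything is appended, so a bad entity fails the call with an explicit error instead of leaving a partial write.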

How retrieval works in Neotoma

Retrieval is not one thing. Neotoma supports three co-available retrieval modes; pick the one whose shape matches the question, and combine them when needed. The structured store is the source of truth; semantic search and graph traversal are layered indices over it.

Structured queries (primary). Look up entities by canonical identity, type, or schema field via retrieve_entity_by_identifier and retrieve_entities. Strongly consistent, schema-aware, deterministic.
Entity semantic search. Vector search runs over structured entity snapshots, scoped by entity type and structural filters. It is eventually consistent within a bound (about 10 s of embedding lag), but unlike retrieval-only memory it is grounded in versioned, deduplicated entities rather than free-text fragments.
Graph traversal. retrieve_related_entities and retrieve_graph_neighborhood walk typed relationships across entities (n-hop). Use when the question is about connections (“what tasks are tied to this contract?”) rather than text similarity.

Retrieval-only memory systems search free-text chunks and return whatever embeds nearest. Neotoma searches over the same structured rows you wrote, with full provenance back to source. The result of any retrieval is a real entity you can inspect, diff, and replay, not a snippet you have to trust.
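To illustrate what an n-hop walk over typed relationships involves, here is a small breadth-first traversal. It is a stand-in for the idea only, not the implementation of `retrieve_related_entities` or `retrieve_graph_neighborhood`; the `EDGES` data, the `neighborhood` function, and the choice to treat edges as undirected are assumptions.

```python
from collections import deque

# Hypothetical typed edges: (source entity, relationship type, target entity).
EDGES = [
    ("task:draft-renewal", "REFERS_TO", "contract:acme-2024"),
    ("task:legal-review", "REFERS_TO", "contract:acme-2024"),
    ("invoice:1042", "SETTLES", "contract:acme-2024"),
    ("task:legal-review", "PART_OF", "project:renewals"),
]


def neighborhood(start, max_hops, edge_types=None):
    """Return {entity_id: hop_distance} for entities within max_hops of start.

    Edges are followed in both directions; edge_types optionally restricts
    the walk to certain relationship types.
    """
    adjacency = {}
    for src, rel, dst in EDGES:
        if edge_types is None or rel in edge_types:
            adjacency.setdefault(src, set()).add(dst)
            adjacency.setdefault(dst, set()).add(src)

    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in sorted(adjacency.get(node, ())):
            if nxt not in distances:
                distances[nxt] = distances[node] + 1
                queue.append(nxt)
    return {entity: hops for entity, hops in distances.items() if entity != start}


# "What tasks are tied to this contract?"
tied = neighborhood("contract:acme-2024", max_hops=1, edge_types={"REFERS_TO"})
wider = neighborhood("contract:acme-2024", max_hops=2)
print(sorted(tied), wider["project:renewals"])
```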

Guarantees

Deterministic reduction. Same observations always produce the same entity snapshot. No ordering sensitivity, no hidden state.
Full provenance. Every field traces to a source, timestamp, and store operation. You can always answer "where did this value come from?"
Immutable history. Observations are append-only. Corrections add new observations; they do not erase previous ones.
Timeline replay. Reconstruct entity state at any past point. Diff versions. Audit what changed and why.
Schema-bound storage. Entity types have schemas. New fields extend the schema incrementally; nothing is untyped at rest.
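The provenance and immutability guarantees can be pictured with a log where a correction is just another appended observation. The `observe` and `snapshot_with_provenance` helpers and the record shape are hypothetical, chosen only to show how "where did this value come from?" can be answered per field.

```python
log = []  # append-only: entries are added, never edited or removed


def observe(entity_id, field, value, source):
    """Append one observation; earlier entries are left untouched."""
    log.append({"entity_id": entity_id, "field": field, "value": value, "source": source})


def snapshot_with_provenance(entity_id):
    """Current value of each field, plus the source and log position it came from."""
    fields = {}
    for seq, obs in enumerate(log):
        if obs["entity_id"] == entity_id:
            fields[obs["field"]] = {
                "value": obs["value"],
                "source": obs["source"],
                "observation": seq,
            }
    return fields


observe("invoice:1042", "amount", 1200, "email-import")
observe("invoice:1042", "amount", 1250, "user-correction")  # a correction, not an overwrite

current = snapshot_with_provenance("invoice:1042")
print(current["amount"], len(log))
```

The snapshot reports the corrected amount and names the correction as its source, while the original observation remains in the log for audit.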

Foundations

Foundation	What it means
Privacy-first	User-controlled memory, end-to-end encryption and row-level security, never used for training. Nothing is stored unless you approve it; no background scanning or implicit captures. Your data remains yours.
Deterministic	Same input always produces same output. Schema-first extraction, hash-based entity IDs, full provenance. No hallucinations or probabilistic behavior.
Immutable and verifiable	Every observation is append-only; history cannot be rewritten. Hash-based entity IDs ensure tamper-evident records and a full provenance chain from any state to its source.
Cross-platform	Works with ChatGPT, Claude, Cursor, and Claude Code via MCP. One memory system across tools; no platform lock-in.

These enable: immutable audit trail and time-travel queries, cryptographic integrity, event-sourced history, entity resolution across documents and agent data, timeline generation, structured ingestion via MCP/CLI/API, and persistent memory without context-window limits.
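As an illustration of how a hash-based entity ID supports entity resolution, the sketch below derives an ID from the entity type and a normalized name. The normalization rule, the `ent_` prefix, and the 24-character truncation are assumptions for this example; the source does not specify how Neotoma computes its IDs.

```python
import hashlib


def entity_id(entity_type: str, canonical_name: str) -> str:
    """Derive a deterministic ID from type and normalized name (hypothetical scheme)."""
    # Lowercase and collapse whitespace so trivially different spellings match.
    normalized = " ".join(canonical_name.lower().split())
    digest = hashlib.sha256(f"{entity_type}:{normalized}".encode("utf-8")).hexdigest()
    return f"ent_{digest[:24]}"


a = entity_id("company", "Acme  Corp")
b = entity_id("company", "acme corp")
c = entity_id("person", "acme corp")
print(a == b, a == c)
```

Two mentions that normalize to the same name and type resolve to the same ID without any lookup, and the same input always yields the same ID on any machine.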

How agents remember

Every agent follows a mandatory loop: retrieve context, store the conversation and entities, extract structured facts, then respond. Storage completes before any reply.

→ Retrieve. Bounded query for entities implied by the message.
→ Store. Persist conversation and extracted entities in one call.
→ Extract. Facts become typed entities with relationships.
→ Respond. Reply only after storage completes.

Invariant: responding before storing is forbidden.

See agent instructions for full behavioral requirements.
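One way to picture the store-before-respond invariant is a turn object that refuses to release a reply until storage has completed. The `Turn` class, `TurnOrderError`, and the callback signatures are hypothetical; real agents follow the agent instructions rather than this sketch.

```python
class TurnOrderError(RuntimeError):
    """Raised when a reply is attempted before the turn has been stored."""


class Turn:
    """One conversational turn that enforces: store first, respond second."""

    def __init__(self, message):
        self.message = message
        self.context = None
        self.stored = False

    def retrieve(self, lookup):
        """Fetch context for the message via a caller-supplied lookup function."""
        self.context = lookup(self.message)
        return self.context

    def store(self, persist, entities):
        """Persist the message and its entities in one call, then mark as stored."""
        persist(self.message, entities)
        self.stored = True  # only set once persist has returned without error

    def respond(self, reply):
        """Release the reply only if storage has completed."""
        if not self.stored:
            raise TurnOrderError("responding before storing is forbidden")
        return reply


saved = []
turn = Turn("Ada moved to Paris")
turn.retrieve(lambda msg: {"contact:ada": {"city": "London"}})
turn.store(
    lambda msg, entities: saved.append((msg, entities)),
    [{"entity_type": "contact", "name": "Ada", "city": "Paris"}],
)
reply = turn.respond("Noted: Ada now lives in Paris.")
print(reply)
```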

What this is not

Neotoma is not a RAG pipeline or embedding-first retrieval layer. Its core is structured, schema-based, and deterministic. Optional similarity search is available when an embedding provider is configured (via OPENAI_API_KEY); without one, retrieval falls back to keyword matching.
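The fallback can be sketched as a simple check on the environment followed by a plain keyword match. Only the `OPENAI_API_KEY` variable name comes from the text above; the function names, the `"semantic"` and `"keyword"` labels, and the all-terms matching rule are assumptions for illustration.

```python
import os


def choose_search_mode(env=None):
    """Pick 'semantic' when an embedding key is configured, else 'keyword'."""
    env = os.environ if env is None else env
    return "semantic" if env.get("OPENAI_API_KEY") else "keyword"


def keyword_search(query, snapshots):
    """Return IDs of entities whose field values contain every query term."""
    terms = query.lower().split()
    hits = []
    for entity_id, fields in snapshots.items():
        text = " ".join(str(v) for v in fields.values()).lower()
        if all(term in text for term in terms):
            hits.append(entity_id)
    return sorted(hits)


snapshots = {
    "contract:acme": {"party": "Acme Corp", "status": "active"},
    "task:1": {"title": "Call Acme"},
}

mode = choose_search_mode({})  # no key configured in this example
results = keyword_search("acme active", snapshots)
print(mode, results)
```

Either way the search runs over the same structured snapshots, so results remain real entities rather than text fragments.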

It is not an app, agent, or workflow engine. It is the lowest-level canonical source of truth for structured data (documents and agent-created data), exposed to AI tools via the Model Context Protocol (MCP).

Retrieval layers can read from Neotoma. Neotoma governs what they read.

Core terminology

Term	Definition
State layer	Neotoma's role: a deterministic, immutable structured memory substrate that other layers read and write.
Entity	Canonical representation of a person, company, or other object with a deterministic ID.
Entity snapshot	Current truth for an entity; computed by merging all observations about that entity.
Observation	Granular fact extracted from source; reducers merge observations into entity snapshots.
Source	Raw data (file, text, URL, or structured JSON) stored with content-addressed deduplication.
Provenance	Origin tracking (source, timestamp, user, interpretation) so every value traces back to its source.
Memory graph	The graph of sources, observations, entities, relationships, and events, connected by typed edges.
Reducer	Deterministic function that merges observations into an entity snapshot; same observations always yield the same snapshot.
Relationship	Typed connection between two entities (e.g. SETTLES, REFERS_TO, PART_OF).
Entity type	Classification (e.g. person, company, invoice) that determines the entity schema and resolution rules.
Storing	Writing to Neotoma via the unified store path: entities, raw file bytes (sources), or both together.
Retrieving	Querying entities, entity snapshots, observations, and related data from the memory graph.

Interfaces

Neotoma exposes three interfaces. All three use the same OpenAPI-backed operations, so the same guarantees apply regardless of how you interact with the system.

MCP Server

For AI agents (Claude, Cursor, Codex). Agents store and retrieve via tool calls.

MCP reference →

CLI

For developers. Init, store, retrieve, inspect, and manage from the terminal.

CLI reference →

REST API

For apps and integrations. OpenAPI-first; every operation is an HTTP endpoint.

API reference →

Core principles

Deterministic. Same input, same output. No probabilistic behavior at the data layer.
Schema-first. Entity types have schemas; extraction is structured, not freeform.
Explainable. Every value traces to a source and operation. No opaque transformations.
Entity-unified. Hash-based canonical IDs resolve duplicates across all data.
Timeline-aware. Date fields generate timeline events automatically.
Cross-platform. MCP, CLI, and REST API expose the same contract.
Privacy-first. User-controlled. Never used for training. Encryption at rest.
Immutable. Observations are append-only. History is never rewritten.
Provenance. Every fact links to its source, timestamp, and ingestion operation.
Explicit control. Nothing updates memory implicitly. The user decides what goes in.
Four-layer model. Structured payloads → Observations → Entity snapshots → Memory graph.
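The timeline-aware principle can be pictured as a pass over a snapshot that turns every date-valued field into an event. This is a sketch only: the `timeline_events` function, the event shape, and the rule of detecting dates by Python type are assumptions, not Neotoma's actual timeline generation.

```python
from datetime import date


def timeline_events(entity_id, snapshot):
    """Emit one event per date-valued field, ordered chronologically."""
    events = []
    for field, value in snapshot.items():
        if isinstance(value, date):  # also matches datetime, a subclass of date
            events.append({"entity_id": entity_id, "field": field, "date": value})
    return sorted(events, key=lambda e: (e["date"], e["field"]))


invoice = {
    "amount": 1200,
    "due_on": date(2024, 3, 31),
    "issued_on": date(2024, 3, 1),
}

events = timeline_events("invoice:1042", invoice)
print([(e["field"], e["date"].isoformat()) for e in events])
```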

Developer preview status

The developer preview exposes the core contract only: CLI for humans, MCP for agents, OpenAPI as the single source of truth.

What is guaranteed (even in preview)

No silent data loss: operations either succeed and are recorded or fail with explicit errors.
Explicit, inspectable state mutations: every change is a named operation with visible inputs; state is reconstructable from the audit trail.
Auditable operations: full provenance; CLI and MCP map to the same underlying contract.
Same contract for CLI and MCP: both use the same OpenAPI-backed operations.

What is not guaranteed yet

Stable schemas
Deterministic extraction across versions
Long-term replay compatibility
Backward compatibility

Breaking changes should be expected.