<!-- Full-page Markdown export (rendered HTML → GFM). Source: https://neotoma.io/ar/primitives/entity-snapshots Generated: 2026-04-27T12:50:35.236Z -->

# Entity snapshots

An entity snapshot is the deterministic reducer output for one entity: the system's current best answer to 'given every observation we have, what is the truth right now?' Snapshots are derived, cached, and recomputed; observations are the durable ground truth. Every snapshot field carries provenance back to the observation that produced it, and snapshots optionally carry an embedding for semantic search.

Source → Interpretation → Observation → Snapshot. Entity snapshots are the rightmost layer of the truth model: the merged view derived from observations, with provenance and embeddings attached.

## Schema

entity\_snapshots table (Postgres / hosted):

```sql
CREATE TABLE entity_snapshots (
  entity_id TEXT PRIMARY KEY,
  entity_type TEXT NOT NULL,
  schema_version TEXT NOT NULL,
  snapshot JSONB NOT NULL,
  computed_at TIMESTAMPTZ NOT NULL,
  observation_count INTEGER NOT NULL,
  last_observation_at TIMESTAMPTZ NOT NULL,
  provenance JSONB NOT NULL,
  user_id UUID NOT NULL,
  embedding vector(1536)
);
```

| Field | Type | Purpose |
| --- | --- | --- |
| entity\_id | TEXT | Foreign key to entities.id and primary key; at most one snapshot per entity |
| entity\_type | TEXT | Mirrors entities.entity\_type so reads avoid the join |
| schema\_version | TEXT | Schema version this snapshot was computed against |
| snapshot | JSONB | The merged, current truth for the entity, computed by the reducer |
| provenance | JSONB | Map of field → observation\_id, one entry per snapshot field; drives 'where did this come from?' views |
| observation\_count | INTEGER | Number of observations the snapshot was computed from |
| last\_observation\_at | TIMESTAMPTZ | Timestamp of the newest observation included in the snapshot |
| computed\_at | TIMESTAMPTZ | Wall-clock time of the most recent reducer run |
| embedding | vector(1536) | Optional embedding of the snapshot for semantic similarity search; a partial ivfflat index covers non-null rows |
| user\_id | UUID | Owner; mirrors entities.user\_id for RLS |

## Deterministic by construction

Same observations + same schema + same reducer config ⇒ same snapshot, byte-for-byte (modulo computed\_at). Re-running the reducer never randomly changes a field. This is what lets Neotoma replay historical state, audit truth, and detect non-determinism in custom reducers.

## Provenance map

provenance is a JSONB object whose keys are snapshot field names and whose values are the observation\_id that produced each value. From there the chain is fully resolvable: observation → source (raw bytes) and observation → interpretation (model, prompt, schema version). Every snapshot field has exactly one provenance entry; no field is unsourced.

## When the reducer runs

The reducer recomputes a snapshot when its observation set changes: a new observation arrives, a reinterpretation completes, an entity merge rewrites observations from a loser entity to a winner, or a schema upgrade requires recomputation against a new schema\_version. Reads never trigger recomputation; snapshots are cached state.

## Embeddings and vector parity

Snapshots optionally carry a 1536-dimensional embedding for semantic similarity search. The cosine ivfflat index (lists=100) is partial and covers only rows where embedding is not null.
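
To illustrate what "only covers rows where embedding is not null" means for a query, here is a brute-force TypeScript sketch of cosine nearest-neighbour search that skips rows without an embedding. It is not how the ivfflat index or sqlite-vec executes a query; `SnapshotRow`, `cosineDistance`, and `nearestSnapshots` are hypothetical names, and the sketch assumes non-zero vectors of equal length.

```typescript
// Hypothetical row shape: only the columns needed for the illustration.
interface SnapshotRow {
  entityId: string;
  embedding: number[] | null; // null when no embedding has been generated
}

// Cosine distance = 1 - cosine similarity. Assumes non-zero, equal-length vectors.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k rows closest to the query, ignoring rows with a null embedding
// (the same rows a partial index on non-null embeddings would leave out).
function nearestSnapshots(
  rows: SnapshotRow[],
  query: number[],
  k: number
): { entityId: string; distance: number }[] {
  return rows
    .filter(
      (r): r is SnapshotRow & { embedding: number[] } => r.embedding !== null
    )
    .map((r) => ({
      entityId: r.entityId,
      distance: cosineDistance(r.embedding, query),
    }))
    .sort((x, y) => {
      if (x.distance !== y.distance) return x.distance - y.distance;
      // Tie-break on entity id so the ordering is stable across runs.
      return x.entityId < y.entityId ? -1 : x.entityId > y.entityId ? 1 : 0;
    })
    .slice(0, k);
}
```

A row whose embedding is null never appears in the result, however large `k` is, which mirrors the behaviour of a search served from the partial index.
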
In local SQLite mode the column is mirrored into a sqlite-vec virtual table (entity\_embeddings\_vec) plus a join table (entity\_embedding\_rows), so that KNN queries have the same shape as in hosted mode.

## Merge deletes the loser, recomputes the winner

When two entities are merged, the loser's snapshot is deleted (the loser entity has no more observations pointing at it), and the winner's snapshot is recomputed deterministically from the union of observations. Reads filter merged entities by default, so the loser disappears from default views without any retroactive rewriting of history.

## Invariants

Every entity snapshot satisfies the following constraints:

MUST

- Have exactly one row per entity (PK on entity\_id)
- Be byte-for-byte reproducible from observations + schema + reducer config (modulo computed\_at)
- Carry a provenance entry for every field in snapshot
- Stamp schema\_version, observation\_count, last\_observation\_at, and computed\_at on every recomputation
- Be filtered by user ownership on every read path

MUST NOT

- Be edited directly by clients or agents; every change is the reducer reacting to new observations
- Survive an entity merge on the merged-from side: the loser's snapshot is deleted and the winner's is recomputed
- Be treated as durable ground truth; observations are durable, snapshots are derived
- Carry a value in snapshot without a corresponding provenance entry

## Related

- [Entity snapshots subsystem doc](https://github.com/markmhendrickson/neotoma/blob/main/docs/subsystems/entity_snapshots.md): Schema, computation, provenance, embeddings, merge interaction
- [Entities](/primitives/entities): Canonical entity row this snapshot is derived for
- [Observations](/primitives/observations): Atoms of ground truth the reducer composes
- [Reducer](https://github.com/markmhendrickson/neotoma/blob/main/docs/subsystems/reducer.md): Architecture, merge strategies, converters
- [Vector ops](https://github.com/markmhendrickson/neotoma/blob/main/docs/subsystems/vector_ops.md): Embedding generation, ivfflat tuning, sqlite-vec parity
- [Determinism doctrine](https://github.com/markmhendrickson/neotoma/blob/main/docs/architecture/determinism.md): Why snapshot reproducibility matters

## Where to go next

- [All primitive record types](/primitives): index of sources, interpretations, observations, relationships, and timeline events
- [Architecture](/architecture): how the primitives compose into Neotoma's deterministic state
- [Terminology](/terminology): canonical glossary of terms used across Neotoma docs
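
As a closing illustration of the determinism and provenance invariants described on this page, the following TypeScript sketch reduces a set of observations into a snapshot plus a provenance map. It is not Neotoma's reducer: the `Observation` and `Snapshot` shapes and the `reduceSnapshot` function are hypothetical, and the merge strategy (latest observation wins, ties broken by observation id) is an assumption chosen only to keep the example short. It also assumes timestamps are ISO-8601 strings in a single format and timezone, so that string comparison matches chronological order.

```typescript
// Hypothetical shapes for illustration; not Neotoma's actual types.
interface Observation {
  id: string;
  observedAt: string; // ISO-8601, same format and timezone for all rows (assumed)
  fields: Record<string, unknown>;
}

interface Snapshot {
  snapshot: Record<string, unknown>;
  provenance: Record<string, string>; // field name -> observation id
  observationCount: number;
  lastObservationAt: string | null;
}

function reduceSnapshot(observations: Observation[]): Snapshot {
  // Sort a copy into a total order (timestamp, then id) so that the order in
  // which observations are supplied cannot influence the result.
  const ordered = [...observations].sort((a, b) => {
    if (a.observedAt !== b.observedAt) return a.observedAt < b.observedAt ? -1 : 1;
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
  });

  // Assumed merge strategy: the latest observation that sets a field wins.
  const values = new Map<string, unknown>();
  const sources = new Map<string, string>();
  for (const obs of ordered) {
    for (const field of Object.keys(obs.fields)) {
      values.set(field, obs.fields[field]);
      sources.set(field, obs.id);
    }
  }

  // Emit fields in sorted order so serialization is stable, and give every
  // snapshot field exactly one provenance entry.
  const snapshot: Record<string, unknown> = {};
  const provenance: Record<string, string> = {};
  for (const field of [...values.keys()].sort()) {
    snapshot[field] = values.get(field);
    provenance[field] = sources.get(field)!;
  }

  return {
    snapshot,
    provenance,
    observationCount: ordered.length,
    lastObservationAt:
      ordered.length > 0 ? ordered[ordered.length - 1].observedAt : null,
  };
}
```

Because the function takes no clock reading and sorts its input first, calling it with the same observations in any order produces the same serialized result. A wall-clock value such as computed\_at would be stamped outside this function, which is why it is excluded from the reproducibility guarantee.
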