Observations

An observation is a granular, source-specific statement about an entity at a point in time. Observations are the only thing the reducer reads to compute an entity snapshot, and they are the only place ground truth lives. Snapshots are derived; observations are durable.

Source → Interpretation → Observation → Snapshot. Observations are the third layer, the immutable atoms of truth. Reducers merge them deterministically into snapshots.

Schema

Observation shape (conceptual)

interface Observation {
  id: string;
  entity_id: string;
  entity_type: string;
  schema_version: string;
  source_id: string;          // which raw artifact
  interpretation_id: string | null;  // which extraction run (null for structured writes)
  observed_at: Date;
  specificity_score: number;  // how specific this observation is
  source_priority: number;    // 0 = AI, 100 = structured, 1000 = correction
  fields: Record<string, unknown>;
  user_id: string;
}

Field	Type	Purpose
`entity_id`	`string`	The entity this observation is about (resolved during ingestion, user-scoped)
`schema_version`	`string`	Active entity schema version at extraction time
`source_id`	`string`	What raw content produced this observation
`interpretation_id`	`string | null`	Which extraction run produced this; NULL for structured (store_structured) writes
`observed_at`	`Date`	When this observation was made (or extracted from the source)
`specificity_score`	`number`	Reducer tie-break: how specific this observation is for its fields
`source_priority`	`number`	0 (AI) / 100 (structured agent) / 1000 (user correction). Corrections always win.
`fields`	`object`	The actual granular facts, whatever the schema admits at this version

The three-layer truth model

Source carries raw bytes. Interpretation carries the audit trail of how those bytes were read. Observations carry granular facts that link back to both. Snapshots are deterministically composed from all observations for an entity by the reducer. This separation is what lets Neotoma reinterpret without rewriting history and replay snapshots at any point in time.
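Point-in-time replay can be pictured as filtering the observation log before reducing it. The sketch below is illustrative only: `snapshotAt`, `TimedObservation`, and `lastWriteWins` are hypothetical names, the cutoff on `observed_at` is an assumption about how replay selects observations, and the stand-in reducer is deliberately simpler than Neotoma's actual merge rule.

```typescript
// Hypothetical minimal shape: only what point-in-time replay needs.
interface TimedObservation {
  id: string;
  entity_id: string;
  observed_at: Date;
  fields: Record<string, unknown>;
}

// Any deterministic function from observations to a snapshot.
type Reducer<O> = (observations: O[]) => Record<string, unknown>;

// Assumption: replay keeps observations with observed_at <= asOf and
// re-runs the same reducer over that subset. Nothing is rewritten.
function snapshotAt<O extends TimedObservation>(
  all: O[],
  entityId: string,
  asOf: Date,
  reduce: Reducer<O>,
): Record<string, unknown> {
  const visible = all.filter(
    (o) => o.entity_id === entityId && o.observed_at.getTime() <= asOf.getTime(),
  );
  return reduce(visible);
}

// Stand-in reducer for illustration only: later observations overwrite earlier ones.
const lastWriteWins: Reducer<TimedObservation> = (observations) => {
  const sorted = [...observations].sort(
    (a, b) => a.observed_at.getTime() - b.observed_at.getTime(),
  );
  const snapshot: Record<string, unknown> = {};
  for (const o of sorted) Object.assign(snapshot, o.fields);
  return snapshot;
};
```

Because the log is never edited, calling `snapshotAt` twice with the same cutoff and reducer yields the same snapshot.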

Immutability and provenance

Observations are never mutated. Every observation links to its source_id and, when it came from an extraction run, its interpretation_id, so any field on any snapshot can be traced back to the exact bytes and extractor that produced it. Reinterpretation, correction, and re-ingest always produce new observations, so the audit trail grows monotonically.
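One way to picture the append-only rule is an in-memory log that freezes each row on write. This is a sketch under stated assumptions, not Neotoma's storage layer: `ObservationLog` and `StoredObservation` are hypothetical names, and real enforcement would live in the database rather than in `Object.freeze`.

```typescript
// Hypothetical minimal shape carrying the provenance links.
interface StoredObservation {
  readonly id: string;
  readonly source_id: string;
  readonly interpretation_id: string | null;
  readonly fields: Readonly<Record<string, unknown>>;
}

// Append-only log: rows can be added and read, never updated or deleted.
class ObservationLog {
  private readonly rows: StoredObservation[] = [];

  append(obs: StoredObservation): StoredObservation {
    if (this.rows.some((r) => r.id === obs.id)) {
      throw new Error(`observation ${obs.id} already exists`);
    }
    // Copy and freeze so later changes to the caller's object cannot leak in.
    const frozen: StoredObservation = Object.freeze({
      ...obs,
      fields: Object.freeze({ ...obs.fields }),
    });
    this.rows.push(frozen);
    return frozen;
  }

  get(id: string): StoredObservation | undefined {
    return this.rows.find((r) => r.id === id);
  }

  get size(): number {
    return this.rows.length;
  }
}
```

A correction is then a second `append` with a new id. The earlier row stays readable, which is what keeps the trail from a snapshot field back to `source_id` and `interpretation_id` intact.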

Source priority and merging

When two observations disagree about the same field, the reducer picks the winner by comparing (source_priority, specificity_score, observed_at), in that order. User corrections write at priority 1000 and always win. Structured agent writes are at 100. AI interpretations are at 0. The outcome is deterministic and inspectable: the Inspector shows which observation produced each snapshot field.
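The merge rule can be sketched as a comparator plus a per-field scan. This is a minimal illustration, not the actual reducer. `MergeInput`, `compareObservations`, and `reduceFields` are hypothetical names. The direction of each comparison (higher `specificity_score` wins, later `observed_at` wins) and the final tie-break on `id` are assumptions; the text above only fixes the order of the keys and that higher `source_priority` wins.

```typescript
// Hypothetical minimal shape: only the fields the merge rule reads.
interface MergeInput {
  id: string;
  observed_at: Date;
  specificity_score: number;
  source_priority: number;
  fields: Record<string, unknown>;
}

interface FieldWinner {
  value: unknown;
  observation_id: string; // lets a UI show which observation produced the field
}

// Positive result means `a` beats `b`.
// Assumed directions: higher priority, then higher specificity, then later observed_at.
function compareObservations(a: MergeInput, b: MergeInput): number {
  if (a.source_priority !== b.source_priority) {
    return a.source_priority - b.source_priority;
  }
  if (a.specificity_score !== b.specificity_score) {
    return a.specificity_score - b.specificity_score;
  }
  const dt = a.observed_at.getTime() - b.observed_at.getTime();
  if (dt !== 0) return dt;
  // Assumption: compare ids last so full ties do not depend on input order.
  return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
}

// For every field, keep the value from the best observation that mentions it.
function reduceFields(observations: MergeInput[]): Record<string, FieldWinner> {
  const winners = new Map<string, MergeInput>();
  for (const obs of observations) {
    for (const key of Object.keys(obs.fields)) {
      const current = winners.get(key);
      if (current === undefined || compareObservations(obs, current) > 0) {
        winners.set(key, obs);
      }
    }
  }
  const snapshot: Record<string, FieldWinner> = {};
  winners.forEach((obs, key) => {
    snapshot[key] = { value: obs.fields[key], observation_id: obs.id };
  });
  return snapshot;
}
```

Merging happens per field, so an AI observation can still supply a field that no higher-priority observation mentions.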

Where observations come from

Three writers create observations: structured store_structured calls (source_priority 100, interpretation_id NULL), AI interpretation pipelines on completion (source_priority 0, interpretation_id set), and explicit user corrections via correct() (source_priority 1000). All writes flow through service_role on the MCP server.
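The mapping from writer to provenance fields can be written down as a small lookup. This is a hedged sketch: `WriterKind`, `SOURCE_PRIORITY`, and `provenanceFor` are hypothetical names, not part of the Neotoma API. The text does not say what `interpretation_id` a user correction carries, so the sketch passes through whatever the caller supplies for that case.

```typescript
// Hypothetical labels for the three writers described above.
type WriterKind = "structured" | "ai_interpretation" | "user_correction";

// Priorities as documented: 0 = AI, 100 = structured, 1000 = correction.
const SOURCE_PRIORITY: Record<WriterKind, number> = {
  ai_interpretation: 0,
  structured: 100,
  user_correction: 1000,
};

interface WriteProvenance {
  source_priority: number;
  interpretation_id: string | null;
}

function provenanceFor(
  writer: WriterKind,
  interpretationId: string | null = null,
): WriteProvenance {
  if (writer === "ai_interpretation") {
    // AI observations belong to an extraction run, so the id is required.
    if (interpretationId === null) {
      throw new Error("AI interpretation writes need an interpretation_id");
    }
    return { source_priority: SOURCE_PRIORITY[writer], interpretation_id: interpretationId };
  }
  if (writer === "structured") {
    // Structured writes bypass interpretation, so the id is always null.
    return { source_priority: SOURCE_PRIORITY[writer], interpretation_id: null };
  }
  // Assumption: corrections carry whatever the caller supplies (null by default).
  return { source_priority: SOURCE_PRIORITY[writer], interpretation_id: interpretationId };
}
```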

Invariants

Every observation satisfies the following constraints:

MUST

Link to a non-null source_id and a resolved entity_id
Be immutable: corrections and reinterpretations create new observations, not edits
Carry a source_priority that determines reducer tie-breaks deterministically
Pass attribution policy enforcement before write

MUST NOT

Be mutated after creation
Be created without a corresponding sources row (structured writes still own a synthetic source)
Use confidence directly in reducer merge logic; merging is decided by source_priority and specificity_score
Be returned across user boundaries; every read filters through source ownership
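Some of these constraints can be checked mechanically before a write. The sketch below covers only the structural ones (resolved `entity_id`, non-null `source_id`, an existing sources row, a numeric `source_priority`). `CandidateObservation` and `findViolations` are hypothetical names, and attribution policy enforcement and user-boundary filtering are out of scope here.

```typescript
// Hypothetical pre-write shape: ids may still be unresolved at this point.
interface CandidateObservation {
  entity_id: string | null;
  source_id: string | null;
  source_priority: number;
}

// Returns a list of human-readable violations; an empty list means the
// structural checks passed. `knownSourceIds` stands in for a sources-table lookup.
function findViolations(
  candidate: CandidateObservation,
  knownSourceIds: ReadonlySet<string>,
): string[] {
  const violations: string[] = [];
  if (!candidate.entity_id) {
    violations.push("entity_id must be resolved");
  }
  if (!candidate.source_id) {
    violations.push("source_id must be non-null");
  } else if (!knownSourceIds.has(candidate.source_id)) {
    violations.push("no sources row for source_id");
  }
  if (!Number.isFinite(candidate.source_priority)) {
    violations.push("source_priority must be a finite number");
  }
  return violations;
}
```

Returning every violation at once, rather than throwing on the first, is a design choice for this sketch so that a rejected write can report everything that needs fixing.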

Observation architecture: Three-layer model, lifecycle, snapshot computation, provenance
Sources: Raw artifact each observation links back to
Interpretations: Extraction run each AI-produced observation belongs to
Relationships: Edges follow the same observation-snapshot pattern
Versioned history: Why immutable observations matter for agent memory

Where to go next

All primitive record types: index of sources, interpretations, observations, relationships, and timeline events
Architecture: how the primitives compose into Neotoma's deterministic state
Terminology: canonical glossary of terms used across Neotoma docs