Observations

An observation is a granular, source-specific statement about an entity at a point in time. Observations are the only thing the reducer reads to compute an entity snapshot, and they are the only place ground truth lives. Snapshots are derived; observations are durable.

Source → Interpretation → Observation → Snapshot. Observations are the third layer, the immutable atoms of truth. Reducers merge them deterministically into snapshots.

Schema

Observation shape (conceptual)

    interface Observation {
      id: string;
      entity_id: string;
      entity_type: string;
      schema_version: string;
      source_id: string;                 // which raw artifact
      interpretation_id: string | null;  // which extraction run (NULL for structured writes)
      observed_at: Date;
      specificity_score: number;         // how specific this observation is
      source_priority: number;           // 0 = AI, 100 = structured, 1000 = correction
      fields: Record<string, unknown>;
      user_id: string;
    }

| Field | Type | Purpose |
| --- | --- | --- |
| entity_id | string | The entity this observation is about (resolved during ingestion, user-scoped) |
| schema_version | string | Active entity schema version at extraction time |
| source_id | string | What raw content produced this observation |
| interpretation_id | string \| null | Which extraction run produced this; NULL for structured store_structured writes |
| observed_at | Date | When this observation was made (or extracted from the source) |
| specificity_score | number | Reducer tie-break: how specific this observation is for its fields |
| source_priority | number | 0 (AI) / 100 (structured agent) / 1000 (user correction). Corrections always win. |
| fields | object | The actual granular facts, whatever the schema admits at this version |

The three-layer truth model

Source carries raw bytes. Interpretation carries the audit trail of how those bytes were read. Observations carry granular facts that link back to both. Snapshots are deterministically composed from all observations for an entity by the reducer. This separation is what lets Neotoma reinterpret without rewriting history and replay snapshots at any point in time.
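
Because snapshots are derived, a snapshot "as of" some instant can be recomputed by handing the reducer only the observations visible at that instant. The following is a minimal sketch of that idea, not Neotoma's implementation: `replayAt` and `lastWriteWins` are hypothetical names, the cutoff on `observed_at` is an assumption (a real system might cut on write time instead), and the toy reducer merely stands in for the real merge rules.

```typescript
// Minimal observation shape for this sketch.
interface Obs {
  id: string;
  entity_id: string;
  observed_at: Date;
  fields: Record<string, unknown>;
}

type Snapshot = Record<string, unknown>;
type Reducer = (observations: Obs[]) => Snapshot;

// Hypothetical helper: recompute one entity's snapshot as of a given instant.
// Assumption: "as of" means observed_at <= asOf.
function replayAt(all: Obs[], entityId: string, asOf: Date, reduce: Reducer): Snapshot {
  const visible = all.filter(
    (o) => o.entity_id === entityId && o.observed_at.getTime() <= asOf.getTime()
  );
  return reduce(visible);
}

// Toy stand-in reducer: later observations overwrite earlier ones per field.
// Sorting a copy keeps the input untouched and the result order-independent.
function lastWriteWins(observations: Obs[]): Snapshot {
  const sorted = observations.slice().sort((a, b) => {
    const dt = a.observed_at.getTime() - b.observed_at.getTime();
    if (dt !== 0) return dt;
    return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
  });
  const snapshot: Snapshot = {};
  for (const o of sorted) {
    for (const key of Object.keys(o.fields)) {
      snapshot[key] = o.fields[key];
    }
  }
  return snapshot;
}
```

Nothing is rewritten to answer a historical question: the same observation log yields different snapshots purely by changing the cutoff passed to `replayAt`.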

Immutability and provenance

Observations are never mutated. Every observation links to its source_id and, when an extraction run produced it, to its interpretation_id, so any field on any snapshot can be traced back to the exact bytes and extractor that produced it. Reinterpretation, correction, and re-ingest always produce new observations, so the audit trail grows monotonically.
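
As an illustration of the append-only discipline, here is a small in-memory sketch. `ObservationLog` and its methods are hypothetical and only model the behaviour described above: existing rows are frozen, a repeated id is rejected rather than overwritten, and provenance is a lookup from an observation id to its `source_id` and `interpretation_id`.

```typescript
// Minimal observation shape for this sketch.
interface Obs {
  id: string;
  entity_id: string;
  source_id: string;
  interpretation_id: string | null; // null for structured writes
  fields: Record<string, unknown>;
}

interface Provenance {
  source_id: string;
  interpretation_id: string | null;
}

// Hypothetical append-only store: rows can be added, never edited or removed.
class ObservationLog {
  private readonly rows: Obs[] = [];

  append(obs: Obs): Obs {
    if (this.rows.some((r) => r.id === obs.id)) {
      throw new Error("observation " + obs.id + " already exists; write a new observation instead");
    }
    // Copy, then freeze the row and its fields map (a shallow freeze:
    // nested values inside fields are not frozen by this sketch).
    const frozen: Obs = Object.freeze({ ...obs, fields: Object.freeze({ ...obs.fields }) });
    this.rows.push(frozen);
    return frozen;
  }

  // Trace an observation back to the source and extraction run behind it.
  provenance(observationId: string): Provenance | undefined {
    const row = this.rows.filter((r) => r.id === observationId)[0];
    if (row === undefined) return undefined;
    return { source_id: row.source_id, interpretation_id: row.interpretation_id };
  }

  count(): number {
    return this.rows.length;
  }
}
```

A correction in this model is simply another `append` with a new id; the earlier row stays in the log, which is why the row count only ever grows.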

Source priority and merging

When two observations disagree about the same field, the reducer picks the winner by comparing (source_priority, specificity_score, observed_at) in that order. User corrections write at priority 1000 and always win. Structured agent writes are at 100. AI interpretations are at 0. The ordering is deterministic and inspectable: the Inspector shows which observation produced each snapshot field.
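
The merge rule above can be sketched as a lexicographic comparison applied per field. This is an illustration only, not Neotoma's reducer: `beats` and `reduceSnapshot` are hypothetical names, and the sketch assumes that a higher `source_priority`, then a higher `specificity_score`, then a later `observed_at` wins. The final comparison on `id` is an invented tie-break so that the result never depends on input order.

```typescript
// Minimal observation shape for this sketch; only what the merge needs.
interface Obs {
  id: string;
  observed_at: Date;
  specificity_score: number;
  source_priority: number;
  fields: Record<string, unknown>;
}

interface SnapshotField {
  value: unknown;
  observation_id: string; // which observation won this field
}

// Assumed ordering: priority, then specificity, then recency.
// The id comparison is a hypothetical last resort, not from the docs.
function beats(a: Obs, b: Obs): boolean {
  if (a.source_priority !== b.source_priority) return a.source_priority > b.source_priority;
  if (a.specificity_score !== b.specificity_score) return a.specificity_score > b.specificity_score;
  const ta = a.observed_at.getTime();
  const tb = b.observed_at.getTime();
  if (ta !== tb) return ta > tb;
  return a.id > b.id;
}

// Pick a winner independently for every field any observation mentions.
function reduceSnapshot(observations: Obs[]): Record<string, SnapshotField> {
  // Prototype-free maps so field names like "toString" behave as plain keys.
  const winners: Record<string, Obs> = Object.create(null);
  const snapshot: Record<string, SnapshotField> = Object.create(null);
  for (const obs of observations) {
    for (const field of Object.keys(obs.fields)) {
      const current: Obs | undefined = winners[field];
      if (current === undefined || beats(obs, current)) {
        winners[field] = obs;
        snapshot[field] = { value: obs.fields[field], observation_id: obs.id };
      }
    }
  }
  return snapshot;
}
```

Keeping `observation_id` next to each value is what makes the outcome inspectable: every snapshot field names the observation that supplied it. Fields are merged independently, so a correction that touches only one field leaves the other fields to lower-priority observations.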

Where observations come from

Three writers create observations: structured store_structured calls (source_priority 100, interpretation_id NULL), AI interpretation pipelines on completion (source_priority 0, interpretation_id set), and explicit user corrections via correct() (source_priority 1000). All writes flow through service_role on the MCP server.
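
The mapping from writer to priority can be pictured with a small helper. This is a sketch under stated assumptions: `Writer`, `WriteRequest`, and `stampWrite` are hypothetical and are not the signatures of store_structured or correct(). Only the priority values and the interpretation_id rules for structured and AI writes come from the text above; the text does not say what a correction carries, so the sketch leaves that value as passed.

```typescript
// Hypothetical labels for the three writers described above.
type Writer = "ai_interpretation" | "structured" | "user_correction";

// Priority values as stated in the text.
const SOURCE_PRIORITY: Record<Writer, number> = {
  ai_interpretation: 0,
  structured: 100,
  user_correction: 1000,
};

interface WriteRequest {
  writer: Writer;
  source_id: string;
  interpretation_id: string | null;
  fields: Record<string, unknown>;
}

interface StampedWrite extends WriteRequest {
  source_priority: number;
}

// Hypothetical helper: attach the priority implied by the writer and
// reject the two combinations the text rules out.
function stampWrite(req: WriteRequest): StampedWrite {
  if (req.writer === "structured" && req.interpretation_id !== null) {
    throw new Error("structured writes carry interpretation_id = null");
  }
  if (req.writer === "ai_interpretation" && req.interpretation_id === null) {
    throw new Error("AI interpretation writes must reference their extraction run");
  }
  return { ...req, source_priority: SOURCE_PRIORITY[req.writer] };
}
```

Deriving `source_priority` from the writer, rather than accepting it from the caller, keeps a client from claiming a higher priority than its write path allows.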

Invariants

Every observation satisfies the following constraints:

MUST

  • Link to a non-null source_id and a resolved entity_id
  • Be immutable: corrections and reinterpretations create new observations, not edits
  • Carry a source_priority that determines reducer tie-breaks deterministically
  • Pass attribution policy enforcement before write

MUST NOT

  • Be mutated after creation
  • Be created without a corresponding sources row (structured writes still own a synthetic source)
  • Use confidence directly in reducer merge logic; that role is reserved for source_priority and specificity_score
  • Be returned across user boundaries; every read filters through source ownership
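
Some of these constraints lend themselves to mechanical checks. The sketch below covers only two of them, the source linkage required before a write and the ownership filter applied on reads. Attribution policy enforcement is not modelled. `writeViolations` and `visibleTo` are hypothetical names, and the sketch assumes each sources row records its owner in a `user_id` column.

```typescript
// Assumed shape of a sources row; the user_id column is an assumption.
interface SourceRow {
  id: string;
  user_id: string;
}

// Minimal observation shape for this sketch.
interface Obs {
  id: string;
  entity_id: string;
  source_id: string;
  source_priority: number;
  fields: Record<string, unknown>;
}

// Hypothetical pre-write check: returns human-readable problems, empty if none.
function writeViolations(obs: Obs, sources: SourceRow[]): string[] {
  const problems: string[] = [];
  if (obs.source_id.trim() === "") {
    problems.push("source_id is required");
  } else if (!sources.some((s) => s.id === obs.source_id)) {
    problems.push("no sources row for source_id " + obs.source_id);
  }
  if (obs.entity_id.trim() === "") {
    problems.push("entity_id must be resolved before write");
  }
  if (!isFinite(obs.source_priority)) {
    problems.push("source_priority must be a finite number");
  }
  return problems;
}

// Hypothetical read filter: an observation is visible only when the
// requesting user owns the source it links to.
function visibleTo(userId: string, observations: Obs[], sources: SourceRow[]): Obs[] {
  const owned: Record<string, boolean> = Object.create(null);
  for (const s of sources) {
    if (s.user_id === userId) owned[s.id] = true;
  }
  return observations.filter((o) => owned[o.source_id] === true);
}
```

Routing reads through the sources table, instead of trusting a field on the observation itself, mirrors the rule that every read filters through source ownership.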

Where to go next

  • All primitive record types: index of sources, interpretations, observations, relationships, and timeline events
  • Architecture: how the primitives compose into Neotoma's deterministic state
  • Terminology: canonical glossary of terms used across Neotoma docs