<!--
  Full-page Markdown export (rendered HTML → GFM).
  Source: https://neotoma.io/de/primitives/interpretations
  Generated: 2026-04-27T12:50:28.370Z
-->
# Interpretations

An interpretation is a versioned attempt to extract structured information from a single source. It exists as a first-class record so the system can audit how data was extracted, reinterpret without rewriting history, and track extraction quality over time. Structured agent writes (already-structured payloads) skip interpretations entirely.

The pipeline runs Source → Interpretation → Observation → Snapshot. Interpretations are the second layer: they record which model, prompt, and schema version produced which observations.

## Schema

interpretations table (Postgres / hosted)

```sql
CREATE TABLE interpretations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  source_id UUID NOT NULL REFERENCES sources(id),
  interpretation_config JSONB NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending',
  error_message TEXT,
  extracted_entities JSONB DEFAULT '[]',
  confidence NUMERIC(3,2),
  unknown_field_count INTEGER NOT NULL DEFAULT 0,
  extraction_completeness TEXT DEFAULT 'unknown',
  started_at TIMESTAMPTZ,
  completed_at TIMESTAMPTZ,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  archived_at TIMESTAMPTZ,
  user_id UUID NOT NULL
);
```

| Field | Type | Purpose |
| --- | --- | --- |
| id | UUID | Referenced by observations.interpretation\_id for full provenance |
| source\_id | UUID | The source that was interpreted |
| interpretation\_config | JSONB | Audit log of model, model\_version, extractor\_type, prompt\_version, temperature, schema\_version |
| status | TEXT | State machine: pending → running → completed \| failed |
| confidence | NUMERIC(3,2) | Aggregate model self-reported confidence in \[0.00, 1.00\]; advisory, not authoritative |
| unknown\_field\_count | INTEGER | Count of extracted fields that did not match the active schema and were routed to raw\_fragments |
| extraction\_completeness | TEXT | complete / partial / unknown; a coverage signal for the source |
| archived\_at | TIMESTAMPTZ | Set when a newer interpretation supersedes this one; the row stays queryable |

## interpretation\_config is an audit log, not a replay contract

interpretation\_config captures the model, prompt, extractor type, and schema version active at run start. Re-running with the same config can still produce different outputs: hosted models can change behind the same model name, sampling is not guaranteed to be reproducible even at a fixed temperature, and tools the extractor calls may themselves be non-deterministic. What Neotoma guarantees is that every output is permanently linked to the config that produced it.
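
One way to picture "captured at run start" is a frozen copy of the live settings, taken before extraction begins. The sketch below assumes a flat config whose field names follow the interpretation\_config row in the table above. The type InterpretationConfig, the function snapshotConfig, and all sample values are hypothetical and do not come from the Neotoma codebase.

```typescript
// Hypothetical shape; field names follow the interpretation_config table row.
interface InterpretationConfig {
  model: string;
  model_version: string;
  extractor_type: string;
  prompt_version: string;
  temperature: number;
  schema_version: string;
}

// Copy the live settings at run start and freeze the copy, so later changes
// to the live settings cannot alter what was recorded for this run.
function snapshotConfig(live: InterpretationConfig): Readonly<InterpretationConfig> {
  return Object.freeze({ ...live });
}

// Sample values are placeholders, not real model or version names.
const liveSettings: InterpretationConfig = {
  model: "example-model",
  model_version: "2026-01",
  extractor_type: "llm",
  prompt_version: "v7",
  temperature: 0,
  schema_version: "1.4",
};

const recordedConfig = snapshotConfig(liveSettings);
liveSettings.prompt_version = "v8"; // a later change to the live settings
```

The recorded copy still carries the prompt version that was active when the run started. It says nothing about whether a rerun with the same values would reproduce the output.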

## Status state machine

The lifecycle is pending (created, not started) → running (started\_at set) → completed | failed (terminal). Terminal states never transition back; a rerun creates a new row. confidence, unknown\_field\_count, extraction\_completeness, and completed\_at are written on the terminal transition.
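
The lifecycle can be sketched as a small transition table. The names RunStatus, ALLOWED\_TRANSITIONS, canTransition, and transition are hypothetical. The table assumes the strict reading of the text, where only running can reach a terminal state, and the actual service may allow other edges.

```typescript
type RunStatus = "pending" | "running" | "completed" | "failed";

// pending -> running -> completed | failed. Terminal states allow no further transitions.
const ALLOWED_TRANSITIONS: Record<RunStatus, readonly RunStatus[]> = {
  pending: ["running"],
  running: ["completed", "failed"],
  completed: [],
  failed: [],
};

function canTransition(from: RunStatus, to: RunStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the new status, or throws if the move is not in the table.
function transition(from: RunStatus, to: RunStatus): RunStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal status transition: ${from} -> ${to}`);
  }
  return to;
}
```

Because completed and failed have no outgoing edges, a rerun cannot be expressed as a transition on the same row and has to be a new row.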

## Reinterpretation creates new rows, never mutates

Reinterpretation always creates a new interpretation and new observations. The prior interpretation has archived\_at set, but its observations remain queryable in observation history. The reducer chooses between competing observations using source\_priority, specificity\_score, and observed\_at; corrections (priority 1000) always win.
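
A rough sketch of how such a choice could work is shown below. The text names the three inputs but not how they are combined, so the ordering used here is an assumption: higher source\_priority first, then higher specificity\_score, then the most recent observed\_at. The names CandidateObservation, compareCandidates, and pickWinner are hypothetical. Confidence is deliberately not one of the inputs.

```typescript
interface CandidateObservation {
  id: string;
  source_priority: number; // corrections use 1000 per the text above
  specificity_score: number;
  observed_at: string; // ISO 8601 timestamp (assumed representation)
}

// A negative result means `a` should win over `b`.
// Assumed order: higher priority, then higher specificity, then newer timestamp.
function compareCandidates(a: CandidateObservation, b: CandidateObservation): number {
  if (a.source_priority !== b.source_priority) {
    return b.source_priority - a.source_priority;
  }
  if (a.specificity_score !== b.specificity_score) {
    return b.specificity_score - a.specificity_score;
  }
  return Date.parse(b.observed_at) - Date.parse(a.observed_at);
}

function pickWinner(candidates: CandidateObservation[]): CandidateObservation | undefined {
  // Copy before sorting so the caller's array is not reordered.
  return candidates.slice().sort(compareCandidates)[0];
}
```

Under this ordering, an observation from an archived interpretation and one from its replacement compete like any other pair, and a correction at priority 1000 outranks both as long as no other source is given an equal or higher priority.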

## Quality signals

unknown\_field\_count flags schema drift: sustained spikes mean the schema is missing real-world fields and should be evolved via update\_schema\_incremental. extraction\_completeness (complete/partial/unknown) is set by the extractor at run end. confidence is advisory only; the reducer MUST NOT use it for merge decisions.
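
What counts as a "sustained spike" is not defined here. One plausible reading, sketched below, is that the most recent runs all exceed some threshold. The function name hasSustainedDrift, the window size, and the threshold are hypothetical choices for illustration, not values from Neotoma.

```typescript
// `counts` holds unknown_field_count for recent interpretations, oldest first.
// Returns true only when each of the last `window` runs is above `threshold`,
// so a single noisy run does not trigger a schema change.
function hasSustainedDrift(counts: number[], window: number, threshold: number): boolean {
  if (window <= 0 || counts.length < window) return false;
  const recent = counts.slice(-window);
  return recent.every((count) => count > threshold);
}
```

For example, with a window of 3 and a threshold of 5, the series 0, 1, 6, 7, 8 would be flagged, while 0, 0, 9, 0, 0 would not.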

## Invariants

Every interpretation satisfies the following constraints:

MUST

-   Carry a non-null source\_id, interpretation\_config, and user\_id
-   Capture model / extractor / prompt / schema version in interpretation\_config at run start
-   Be immutable in identifying fields after write; only status, timing, quality fields, and archived\_at may change
-   Pass attribution policy enforcement before write

MUST NOT

-   Be mutated in a way that retroactively changes which observations a row produced
-   Be hard-deleted (use archived\_at) outside of explicit user-initiated source deletion
-   Be created without a corresponding sources row
-   Be assumed deterministic for replay; only the audit-log linkage to config is guaranteed

## Related

-   [Interpretations subsystem doc](https://github.com/markmhendrickson/neotoma/blob/main/docs/subsystems/interpretations.md): full schema, status lifecycle, quality signals
-   [Sources](/primitives/sources): the raw artifact every interpretation points back to
-   [Observations](/primitives/observations): granular facts produced by completed interpretations
-   [MCP spec: reinterpret](https://github.com/markmhendrickson/neotoma/blob/main/docs/specs/MCP_SPEC.md): the reinterpret(source\_id, interpretation\_config?) tool
-   [Implementation](https://github.com/markmhendrickson/neotoma/blob/main/src/services/interpretation.ts): src/services/interpretation.ts, create / status transitions

## Where to go next

-   [All primitive record types](/primitives): index of sources, interpretations, observations, relationships, and timeline events
-   [Architecture](/architecture): how the primitives compose into Neotoma's deterministic state
-   [Terminology](/terminology): canonical glossary of terms used across Neotoma docs