<!--
  Full-page Markdown export (rendered HTML → GFM).
  Source: https://neotoma.io/bn/schemas/registry
  Generated: 2026-04-27T12:50:38.778Z
-->
# Schema registry

The schema registry is the table that holds every versioned entity schema in Neotoma. It is config-driven by design: domain-specific schemas (contact, invoice, task, …) live as data, not code, so schemas can evolve at runtime without redeploys. Every schema row pairs a field-by-field schema\_definition with a reducer\_config that controls how observations merge into the entity snapshot.

The registry is read on every observation write, every snapshot recomputation, and every schema-projection filter. It sits between the storage layer (sources/observations) and the deterministic reducer.

## Schema

schema\_registry table (Postgres / hosted)


```sql
CREATE TABLE schema_registry (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  entity_type TEXT NOT NULL,
  schema_version TEXT NOT NULL,
  schema_definition JSONB NOT NULL,
  reducer_config JSONB NOT NULL,
  active BOOLEAN DEFAULT true,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  user_id UUID REFERENCES auth.users(id),
  scope TEXT DEFAULT 'global' CHECK (scope IN ('global', 'user')),
  UNIQUE(entity_type, schema_version)
);
```

| Field | Type | Purpose |
| --- | --- | --- |
| entity\_type | TEXT | Domain type label (contact, invoice, task, conversation\_message, …) |
| schema\_version | TEXT | Semantic version (1.0.0, 1.1.0, 2.0.0); unique per entity\_type |
| schema\_definition | JSONB | Field map: name → { type, required?, validator?, converters?, description? } |
| reducer\_config | JSONB | Per-field merge\_policies the reducer uses to compose observations into snapshots |
| active | BOOLEAN | Exactly one active row per entity\_type (per scope) at a time; new writes pick this up immediately |
| scope | TEXT | global (shared) or user (per-user override that wins when caller's user\_id matches) |
| user\_id | UUID | Set when scope = 'user'; lets one tenant evolve their schema without affecting others |

## Schema definition format

schema\_definition is a JSONB object with a single fields key. Each field carries a type (string | number | date | boolean | array | object), an optional required flag, an optional validator function name, an optional preserveCase flag for canonicalization, an optional description, and an optional converters list for deterministic type coercion (e.g. nanosecond timestamp → ISO 8601 date). The shape is intentionally narrow: schemas describe data; they do not run code.
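
The shape described above could be mirrored in TypeScript roughly as follows. This is a minimal sketch, not the project's actual types: the interface names, the checkSchemaDefinition helper, the example field names, and the validate\_email validator name are all hypothetical, and it assumes converters are referenced by name.

```typescript
// Hypothetical TypeScript mirror of the schema_definition JSONB shape.
type FieldType = "string" | "number" | "date" | "boolean" | "array" | "object";

interface FieldDefinition {
  type: FieldType;
  required?: boolean;
  validator?: string; // name of a validator function, never the function itself
  preserveCase?: boolean;
  description?: string;
  converters?: string[]; // assumption: converters are referenced by name
}

interface SchemaDefinition {
  fields: Record<string, FieldDefinition>;
}

const FIELD_TYPES: readonly string[] = [
  "string", "number", "date", "boolean", "array", "object",
];

// Returns a list of structural problems; an empty list means the shape looks valid.
function checkSchemaDefinition(def: unknown): string[] {
  if (typeof def !== "object" || def === null) {
    return ["schema_definition must be an object"];
  }
  const fields = (def as { fields?: unknown }).fields;
  if (typeof fields !== "object" || fields === null) {
    return ["schema_definition must have a fields object"];
  }
  const problems: string[] = [];
  for (const [name, field] of Object.entries(fields as Record<string, unknown>)) {
    const type =
      typeof field === "object" && field !== null
        ? (field as { type?: unknown }).type
        : undefined;
    if (typeof type !== "string" || !FIELD_TYPES.includes(type)) {
      problems.push(`${name}: unsupported type ${String(type)}`);
    }
  }
  return problems;
}

// Example definition; the field names and validator name are illustrative only.
const contactDefinition: SchemaDefinition = {
  fields: {
    email: { type: "string", required: true, validator: "validate_email" },
    created_at: { type: "date", converters: ["timestamp_nanos_to_iso"] },
  },
};
```

Because validators and converters appear only as names, a definition like contactDefinition stays plain data that can be stored as JSONB.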

## Field type converters

Converters reconcile real-world data (numeric timestamps, stringified booleans, nested arrays) with the declared field type without losing the original value. A converter is one of a small registry of named, deterministic functions (timestamp\_nanos\_to\_iso, string\_to\_number, …). Successful conversions land in observations under the schema-typed field; the original value is mirrored into raw\_fragments with reason converted\_value\_original so reprocessing remains lossless.
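
To make the lossless-conversion idea concrete, here is a small sketch of a converter registry. Only the converter names timestamp\_nanos\_to\_iso and string\_to\_number and the reason converted\_value\_original come from the text above; the function bodies, the RawFragment shape, and convertField are assumptions. In particular, leaving the value untouched when no converter applies is a choice made for this example only.

```typescript
// Hypothetical converter registry; function bodies are illustrative.
type Converter = (value: unknown) => string | number | undefined;

const CONVERTER_REGISTRY: Record<string, Converter | undefined> = {
  // Nanoseconds since the Unix epoch -> ISO 8601 string.
  timestamp_nanos_to_iso: (value) => {
    let millis: number;
    if (typeof value === "string" && /^\d{7,}$/.test(value)) {
      millis = Number(value.slice(0, -6)); // drop the sub-millisecond digits
    } else if (typeof value === "number" && Number.isFinite(value) && value >= 0) {
      millis = Math.floor(value / 1e6);
    } else {
      return undefined;
    }
    const date = new Date(millis);
    return Number.isNaN(date.getTime()) ? undefined : date.toISOString();
  },
  // Numeric string -> number; anything else is left unconverted.
  string_to_number: (value) => {
    if (typeof value !== "string" || value.trim() === "") return undefined;
    const parsed = Number(value);
    return Number.isFinite(parsed) ? parsed : undefined;
  },
};

interface RawFragment {
  field: string;
  value: unknown;
  reason: string;
}

interface ConversionResult {
  value: unknown; // value to store under the schema-typed field
  fragments: RawFragment[]; // originals mirrored for lossless reprocessing
}

// Applies the first converter that succeeds and mirrors the original value.
function convertField(field: string, value: unknown, converters: string[]): ConversionResult {
  for (const name of converters) {
    const converted = CONVERTER_REGISTRY[name]?.(value);
    if (converted !== undefined) {
      return {
        value: converted,
        fragments: [{ field, value, reason: "converted_value_original" }],
      };
    }
  }
  // Assumption: with no successful conversion the value passes through unchanged.
  return { value, fragments: [] };
}
```

For example, convertField("created\_at", "1700000000000000000", \["timestamp\_nanos\_to\_iso"\]) yields the ISO string for that instant plus one fragment holding the original digits.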

## Global vs user-specific schemas

Schemas resolve user-specific first, global second. A user-specific schema row (scope = 'user', user\_id = caller) lets a tenant pilot new fields or stricter validators without affecting other users. When a user-specific pattern proves out across many users with consistent types, it can be promoted to a global schema via reconciliation.
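
The resolution order can be sketched as a pure function over registry rows. The SchemaRow interface and resolveActiveSchema are hypothetical names used for illustration; the real lookup presumably runs as a database query.

```typescript
// Hypothetical in-memory view of the columns relevant to resolution.
interface SchemaRow {
  entity_type: string;
  schema_version: string;
  active: boolean;
  scope: "global" | "user";
  user_id: string | null;
}

// User-specific first, global second: an active row scoped to the caller wins.
function resolveActiveSchema(
  rows: SchemaRow[],
  entityType: string,
  callerId: string | null,
): SchemaRow | undefined {
  const candidates = rows.filter((r) => r.entity_type === entityType && r.active);
  const userRow =
    callerId === null
      ? undefined
      : candidates.find((r) => r.scope === "user" && r.user_id === callerId);
  return userRow ?? candidates.find((r) => r.scope === "global");
}

// Illustrative data: one global contact schema and one override for user "u1".
const exampleRows: SchemaRow[] = [
  { entity_type: "contact", schema_version: "1.0.0", active: true, scope: "global", user_id: null },
  { entity_type: "contact", schema_version: "1.1.0", active: true, scope: "user", user_id: "u1" },
];
```

With exampleRows, a call for user "u1" resolves to version 1.1.0, while any other caller falls back to the global 1.0.0 row.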

## Auto-enhancement from raw\_fragments

Unknown fields encountered at extraction time go to raw\_fragments. With auto-enhancement enabled, the system analyses fragment frequency, type consistency, and source diversity, then promotes high-confidence fields (≥95% type consistency, ≥2 sources, ≥3 occurrences by default) into the active schema as a minor version bump. Field blacklists, name validators, and idempotency guards keep noise out.
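
The default thresholds above translate into a simple promotion check. The following is a sketch under stated assumptions: FragmentStats, isPromotable, and bumpMinor are hypothetical names, and type consistency is taken to mean the share of occurrences that have the most common observed type, which the text does not spell out.

```typescript
// Hypothetical per-field statistics gathered from raw_fragments.
interface FragmentStats {
  field: string;
  occurrences: number;
  sources: number; // number of distinct sources the field appeared in
  typeCounts: Record<string, number>; // observed type -> occurrence count
}

interface Thresholds {
  minTypeConsistency: number;
  minSources: number;
  minOccurrences: number;
}

// Defaults quoted in the text: >=95% consistency, >=2 sources, >=3 occurrences.
const DEFAULT_THRESHOLDS: Thresholds = {
  minTypeConsistency: 0.95,
  minSources: 2,
  minOccurrences: 3,
};

// Assumption: consistency = share of occurrences with the dominant type.
function isPromotable(stats: FragmentStats, t: Thresholds = DEFAULT_THRESHOLDS): boolean {
  if (stats.occurrences < t.minOccurrences || stats.sources < t.minSources) {
    return false;
  }
  const dominant = Math.max(0, ...Object.values(stats.typeCounts));
  return dominant / stats.occurrences >= t.minTypeConsistency;
}

// A promotion is a minor version bump: 1.0.0 -> 1.1.0.
function bumpMinor(version: string): string {
  const parts = version.split(".").map(Number);
  const major = parts[0] ?? 0;
  const minor = parts[1] ?? 0;
  return `${major}.${minor + 1}.0`;
}
```

The blacklist, name-validation, and idempotency guards mentioned above are left out of this sketch; they would run before isPromotable is consulted.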

## Service interface

-   register() inserts a new (entity\_type, schema\_version) row.
-   activate() flips active = true on one version and false on the others within the same scope.
-   updateSchemaIncremental() is the safe upgrade path: pass fields\_to\_add and/or fields\_to\_remove, optionally bump the version, optionally migrate historical raw\_fragments.
-   loadActiveSchema() is the read used by ingestion and the reducer.
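
An in-memory sketch of three of these operations shows how they interact. The class name, method signatures, and row shape are assumptions rather than the service's real API; rows are registered inactive here so that activation is an explicit step, and updateSchemaIncremental() is left out.

```typescript
// Hypothetical row shape; JSONB columns are modelled as plain objects.
interface RegistryRow {
  entity_type: string;
  schema_version: string;
  schema_definition: object;
  reducer_config: object;
  scope: "global" | "user";
  user_id: string | null;
  active: boolean;
}

class InMemorySchemaRegistry {
  private rows: RegistryRow[] = [];

  // Inserts a new (entity_type, schema_version) row; duplicates are rejected.
  register(row: Omit<RegistryRow, "active">): RegistryRow {
    const duplicate = this.rows.some(
      (r) => r.entity_type === row.entity_type && r.schema_version === row.schema_version,
    );
    if (duplicate) {
      throw new Error(`${row.entity_type}@${row.schema_version} is already registered`);
    }
    const stored: RegistryRow = { ...row, active: false };
    this.rows.push(stored);
    return stored;
  }

  // Marks one version active and every sibling in the same scope inactive.
  activate(entityType: string, schemaVersion: string): void {
    const target = this.rows.find(
      (r) => r.entity_type === entityType && r.schema_version === schemaVersion,
    );
    if (target === undefined) {
      throw new Error(`${entityType}@${schemaVersion} is not registered`);
    }
    for (const r of this.rows) {
      const sameScope =
        r.entity_type === entityType && r.scope === target.scope && r.user_id === target.user_id;
      if (sameScope) r.active = r === target;
    }
  }

  // Read path: the caller's active user-scoped row if any, else the active global row.
  loadActiveSchema(entityType: string, userId: string | null = null): RegistryRow | undefined {
    const active = this.rows.filter((r) => r.entity_type === entityType && r.active);
    const own =
      userId === null
        ? undefined
        : active.find((r) => r.scope === "user" && r.user_id === userId);
    return own ?? active.find((r) => r.scope === "global");
  }
}
```

Registering 1.0.0 and 1.1.0 for the same entity type and activating each in turn leaves only the most recently activated version visible to loadActiveSchema().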

## Invariants

MUST

-   Carry a non-null entity\_type, schema\_version, schema\_definition, and reducer\_config
-   Have at most one active row per (entity\_type, scope, user\_id) combination
-   Be referenced by every observation via observation.schema\_version (immutable on observations)
-   Be the single source of truth for both validation and reducer merge policies
-   Validate every converter against CONVERTER\_REGISTRY before registration

MUST NOT

-   Mutate schema\_definition or reducer\_config in place; register a new schema\_version instead
-   Allow more than one active version per entity\_type within the same scope
-   Carry merge logic, which lives in the reducer; the registry holds only declarative merge\_policies
-   Be edited from outside the schema registry service
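
Two of these invariants lend themselves to mechanical checks. The helpers below are a hypothetical sketch: findActiveConflicts and findUnknownConverters are illustrative names, the known-converter set stands in for CONVERTER\_REGISTRY, and converters are assumed to be referenced by name.

```typescript
// Hypothetical summary of the columns needed for the active-row invariant.
interface RegistryRowSummary {
  entity_type: string;
  schema_version: string;
  scope: "global" | "user";
  user_id: string | null;
  active: boolean;
}

// Returns the (entity_type, scope, user_id) keys that have more than one active row.
function findActiveConflicts(rows: RegistryRowSummary[]): string[] {
  const counts = new Map<string, number>();
  for (const row of rows) {
    if (!row.active) continue;
    const key = `${row.entity_type}|${row.scope}|${row.user_id ?? ""}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([key]) => key);
}

// Returns converter names used by a definition that the registry does not know.
function findUnknownConverters(
  fields: Record<string, { converters?: string[] }>,
  knownConverters: ReadonlySet<string>,
): string[] {
  const unknown: string[] = [];
  for (const field of Object.values(fields)) {
    for (const name of field.converters ?? []) {
      if (!knownConverters.has(name) && !unknown.includes(name)) {
        unknown.push(name);
      }
    }
  }
  return unknown;
}
```

A registration path could refuse any definition for which findUnknownConverters returns a non-empty list, and a periodic audit could alert on any key reported by findActiveConflicts.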

## Related

-   [Schema registry doc](https://github.com/markmhendrickson/neotoma/blob/main/docs/subsystems/schema_registry.md) - Full table definition, definition format, service interface
-   [Merge policies](/schemas/merge-policies) - How reducer\_config drives deterministic snapshot merging
-   [Storage layers](/schemas/storage-layers) - Three-layer storage: raw\_text, properties, raw\_fragments
-   [Versioning & evolution](/schemas/versioning) - Semver rules, breaking changes, schema snapshot exports
-   [Schema definitions (code)](https://github.com/markmhendrickson/neotoma/blob/main/src/services/schema_definitions.ts) - src/services/schema\_definitions.ts, the current source of truth in code

## Where to go next

-   [All schema concepts](/schemas) - registry, merge policies, storage layers, versioning
-   [Primitive record types](/primitives) - sources, observations, snapshots, and the rest of Neotoma's atoms
-   [Schema management workflows](/schema-management) - CLI commands for listing, validating, and evolving schemas