<!--
Full-page Markdown export (rendered HTML → GFM).
Source: https://neotoma.io/de/schemas/registry
Generated: 2026-04-27T12:50:39.119Z
-->
# Schema registry
The schema registry is the table that holds every versioned entity schema in Neotoma. It is config-driven by design: domain-specific schemas (contact, invoice, task, …) live as data, not code, so schemas can evolve at runtime without redeploys. Every schema row pairs a field-by-field schema\_definition with a reducer\_config that controls how observations merge into the entity snapshot.
The registry is read on every observation write, every snapshot recomputation, and every schema-projection filter. It sits between the storage layer (sources/observations) and the deterministic reducer.
## Schema
schema\_registry table (Postgres / hosted)

```sql
CREATE TABLE schema_registry (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  entity_type TEXT NOT NULL,
  schema_version TEXT NOT NULL,
  schema_definition JSONB NOT NULL,
  reducer_config JSONB NOT NULL,
  active BOOLEAN DEFAULT true,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  user_id UUID REFERENCES auth.users(id),
  scope TEXT DEFAULT 'global' CHECK (scope IN ('global', 'user')),
  UNIQUE(entity_type, schema_version)
);
```

| Field | Type | Purpose |
| --- | --- | --- |
| entity\_type | TEXT | Domain type label (contact, invoice, task, conversation\_message, …) |
| schema\_version | TEXT | Semantic version (1.0.0, 1.1.0, 2.0.0); unique per entity\_type |
| schema\_definition | JSONB | Field map: name → { type, required?, validator?, converters?, description? } |
| reducer\_config | JSONB | Per-field merge\_policies the reducer uses to compose observations into snapshots |
| active | BOOLEAN | Exactly one active row per entity\_type (per scope) at a time; new writes pick this up immediately |
| scope | TEXT | global (shared) or user (per-user override that wins when caller's user\_id matches) |
| user\_id | UUID | Set when scope = 'user'; lets one tenant evolve their schema without affecting others |
## Schema definition format
schema\_definition is a JSONB object with a single fields key. Each field carries a type (string | number | date | boolean | array | object), an optional required flag, an optional validator function name, an optional preserveCase flag for canonicalization, an optional description, and an optional converters list for deterministic type coercion (e.g. nanosecond timestamp → ISO 8601 date). The shape is intentionally narrow: schemas describe data; they do not run code.
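
The field map described above could be typed and populated roughly as follows. This is a minimal sketch, not the project's actual types: the names FieldType, FieldDefinition, and SchemaDefinition, the contact fields, and the validator name is\_email are hypothetical, and converters is shown as a plain list of names, which may differ from the real entry shape. Only the field attributes and the converter name timestamp\_nanos\_to\_iso come from this page.

```typescript
// Hypothetical types mirroring the documented schema_definition shape.
type FieldType = "string" | "number" | "date" | "boolean" | "array" | "object";

interface FieldDefinition {
  type: FieldType;
  required?: boolean;
  validator?: string; // name of a validator function, never the function itself
  preserveCase?: boolean; // canonicalization hint
  description?: string;
  converters?: string[]; // assumed here to be a list of converter names
}

interface SchemaDefinition {
  fields: Record<string, FieldDefinition>;
}

// Illustrative contact schema; the field names are assumptions.
const contactSchema: SchemaDefinition = {
  fields: {
    name: { type: "string", required: true, preserveCase: true },
    email: { type: "string", validator: "is_email" },
    last_seen: { type: "date", converters: ["timestamp_nanos_to_iso"] },
  },
};

// The definition is pure data, so it survives a JSON round trip unchanged.
const roundTripped: SchemaDefinition = JSON.parse(JSON.stringify(contactSchema));
```

Because validators and converters are referenced by name only, the whole definition can be stored in a JSONB column and loaded back without evaluating anything.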
## Field type converters
Converters reconcile real-world data (numeric timestamps, stringified booleans, nested arrays) with the declared field type without losing the original value. A converter is one of a small registry of named, deterministic functions (timestamp\_nanos\_to\_iso, string\_to\_number, …). Successful conversions land in observations under the schema-typed field; the original value is mirrored into raw\_fragments with reason converted\_value\_original so reprocessing remains lossless.
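
To make the lossless-conversion idea concrete, here is a small sketch of a converter registry and a helper that applies it. The Converter and RawFragment shapes, the convertField helper, and the pass-through behaviour when no converter applies are all assumptions for illustration; only the converter names and the converted\_value\_original reason are taken from this page.

```typescript
// Hypothetical shapes; a converter returns undefined when it cannot convert.
type Converter = (value: unknown) => string | number | undefined;

interface RawFragment {
  field: string;
  value: unknown;
  reason: string;
}

// Stand-in for the converter registry: named, pure, deterministic functions.
const CONVERTER_REGISTRY: Record<string, Converter> = {
  // Nanoseconds since the Unix epoch -> ISO 8601 (sub-millisecond digits are dropped).
  timestamp_nanos_to_iso: (value) => {
    if (typeof value !== "number" || !Number.isFinite(value)) return undefined;
    const date = new Date(Math.floor(value / 1e6));
    return Number.isNaN(date.getTime()) ? undefined : date.toISOString();
  },
  // Numeric string -> number; rejects empty and non-numeric strings.
  string_to_number: (value) => {
    if (typeof value !== "string" || value.trim() === "") return undefined;
    const parsed = Number(value);
    return Number.isFinite(parsed) ? parsed : undefined;
  },
};

// Tries each named converter in order. On success, the original value is
// mirrored into a raw fragment so that reprocessing stays lossless.
function convertField(
  field: string,
  original: unknown,
  converterNames: string[],
): { value: unknown; rawFragments: RawFragment[] } {
  for (const name of converterNames) {
    const convert = CONVERTER_REGISTRY[name];
    if (!convert) throw new Error(`Unknown converter: ${name}`);
    const converted = convert(original);
    if (converted !== undefined) {
      return {
        value: converted,
        rawFragments: [{ field, value: original, reason: "converted_value_original" }],
      };
    }
  }
  // Assumption: when nothing converts, the value passes through untouched.
  return { value: original, rawFragments: [] };
}

const seen = convertField("last_seen", 1.7e18, ["timestamp_nanos_to_iso"]);
console.log(seen.value); // "2023-11-14T22:13:20.000Z"
```

Rejecting unknown converter names up front mirrors the invariant that every converter is validated against CONVERTER\_REGISTRY before a schema is registered.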
## Global vs user-specific schemas
Schemas resolve user-specific first, global second. A user-specific schema row (scope = 'user', user\_id = caller) lets a tenant pilot new fields or stricter validators without affecting other users. When a user-specific pattern proves out across many users with consistent types, it can be promoted to a global schema via reconciliation.
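
The resolution order (user-specific first, global second) can be sketched as a lookup over registry rows. The ScopedSchemaRow type and the resolveActiveSchema function are hypothetical names introduced for this example; the column names follow the table definition on this page.

```typescript
// Hypothetical in-memory view of the columns that matter for resolution.
interface ScopedSchemaRow {
  entity_type: string;
  schema_version: string;
  active: boolean;
  scope: "global" | "user";
  user_id: string | null;
}

// Returns the caller's active user-scoped schema if one exists,
// otherwise the active global schema, otherwise undefined.
function resolveActiveSchema(
  rows: ScopedSchemaRow[],
  entityType: string,
  callerId: string | null,
): ScopedSchemaRow | undefined {
  const candidates = rows.filter((r) => r.entity_type === entityType && r.active);
  const userRow =
    callerId === null
      ? undefined
      : candidates.find((r) => r.scope === "user" && r.user_id === callerId);
  return userRow ?? candidates.find((r) => r.scope === "global");
}

const exampleRows: ScopedSchemaRow[] = [
  { entity_type: "contact", schema_version: "1.0.0", active: true, scope: "global", user_id: null },
  { entity_type: "contact", schema_version: "1.1.0", active: true, scope: "user", user_id: "user-a" },
];

console.log(resolveActiveSchema(exampleRows, "contact", "user-a")?.schema_version); // "1.1.0"
console.log(resolveActiveSchema(exampleRows, "contact", "user-b")?.schema_version); // "1.0.0"
```

Here user-a sees their piloted version while every other caller keeps resolving to the global schema.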
## Auto-enhancement from raw\_fragments
Unknown fields encountered at extraction time go to raw\_fragments. With auto-enhancement enabled, the system analyses fragment frequency, type consistency, and source diversity, then promotes high-confidence fields (≥95% type consistency, ≥2 sources, ≥3 occurrences by default) into the active schema as a minor version bump. Field blacklists, name validators, and idempotency guards keep noise out.
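
The default promotion thresholds can be expressed as a simple eligibility check. This is a sketch under stated assumptions: FragmentStats, PromotionThresholds, and isPromotable are hypothetical names, and type consistency is assumed to mean the share of occurrences that have the most common observed type. Only the three default threshold values and the existence of a field blacklist come from this page.

```typescript
// Hypothetical aggregate of what was seen for one unknown field in raw_fragments.
interface FragmentStats {
  field: string;
  occurrences: number;
  typeCounts: Record<string, number>; // observed type name -> number of occurrences
  sourceCount: number; // number of distinct sources that produced the field
}

interface PromotionThresholds {
  minTypeConsistency: number;
  minSources: number;
  minOccurrences: number;
}

// Defaults as stated above: >=95% type consistency, >=2 sources, >=3 occurrences.
const DEFAULT_THRESHOLDS: PromotionThresholds = {
  minTypeConsistency: 0.95,
  minSources: 2,
  minOccurrences: 3,
};

// True when a field clears the blacklist and all three thresholds.
function isPromotable(
  stats: FragmentStats,
  blacklist: readonly string[],
  thresholds: PromotionThresholds = DEFAULT_THRESHOLDS,
): boolean {
  if (blacklist.includes(stats.field)) return false;
  if (stats.occurrences < thresholds.minOccurrences) return false;
  if (stats.sourceCount < thresholds.minSources) return false;
  // Assumption: consistency = share of occurrences with the dominant type.
  const dominant = Math.max(0, ...Object.values(stats.typeCounts));
  return dominant / stats.occurrences >= thresholds.minTypeConsistency;
}

const nickname: FragmentStats = {
  field: "nickname",
  occurrences: 20,
  typeCounts: { string: 20 },
  sourceCount: 3,
};
console.log(isPromotable(nickname, ["password"])); // true
```

Note that with the default minimum of three occurrences, a field seen only three or four times must have a single consistent type to reach 95%.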
## Service interface
register() inserts a new (entity\_type, schema\_version) row. activate() flips active = true on one version and false on the others within the same scope. updateSchemaIncremental() is the safe upgrade path: pass fields\_to\_add and/or fields\_to\_remove, optionally bump the version, optionally migrate historical raw\_fragments. loadActiveSchema() is the read used by ingestion and the reducer.
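
The following in-memory sketch illustrates how these four operations could fit together. It is not the real service: the method names come from this page, but every signature is an assumption, scope and user\_id are ignored, rows are registered as inactive until activate() is called, updateSchemaIncremental() always requires a new version, and the migration of historical raw\_fragments is left out.

```typescript
// Hypothetical simplified row; the real table also stores reducer_config, scope, and user_id.
interface FieldSpec {
  type: string;
}

interface RegistryRow {
  entity_type: string;
  schema_version: string;
  fields: Record<string, FieldSpec>;
  active: boolean;
}

class InMemorySchemaRegistry {
  private rows: RegistryRow[] = [];

  // Inserts a new (entity_type, schema_version) row; duplicates are rejected.
  register(entityType: string, version: string, fields: Record<string, FieldSpec>): RegistryRow {
    if (this.rows.some((r) => r.entity_type === entityType && r.schema_version === version)) {
      throw new Error(`Schema ${entityType}@${version} already exists`);
    }
    const row: RegistryRow = {
      entity_type: entityType,
      schema_version: version,
      fields: { ...fields },
      active: false,
    };
    this.rows.push(row);
    return row;
  }

  // Marks one version active and every other version of the same type inactive.
  activate(entityType: string, version: string): void {
    const target = this.rows.find(
      (r) => r.entity_type === entityType && r.schema_version === version,
    );
    if (!target) throw new Error(`Schema ${entityType}@${version} not found`);
    for (const row of this.rows) {
      if (row.entity_type === entityType) row.active = row === target;
    }
  }

  // The read path used by ingestion and the reducer.
  loadActiveSchema(entityType: string): RegistryRow | undefined {
    return this.rows.find((r) => r.entity_type === entityType && r.active);
  }

  // Derives a new version from the active one instead of mutating it in place.
  updateSchemaIncremental(
    entityType: string,
    options: {
      schema_version: string;
      fields_to_add?: Record<string, FieldSpec>;
      fields_to_remove?: string[];
    },
  ): RegistryRow {
    const current = this.loadActiveSchema(entityType);
    if (!current) throw new Error(`No active schema for ${entityType}`);
    const fields: Record<string, FieldSpec> = {
      ...current.fields,
      ...(options.fields_to_add ?? {}),
    };
    for (const name of options.fields_to_remove ?? []) delete fields[name];
    const next = this.register(entityType, options.schema_version, fields);
    this.activate(entityType, options.schema_version);
    return next;
  }
}

const registry = new InMemorySchemaRegistry();
registry.register("contact", "1.0.0", { name: { type: "string" } });
registry.activate("contact", "1.0.0");
registry.updateSchemaIncremental("contact", {
  schema_version: "1.1.0",
  fields_to_add: { email: { type: "string" } },
});
console.log(registry.loadActiveSchema("contact")?.schema_version); // "1.1.0"
```

Routing the incremental update through register() and activate() keeps the earlier version untouched, which is consistent with the rule that schema definitions are never mutated in place.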
## Invariants
MUST
- Carry a non-null entity\_type, schema\_version, schema\_definition, and reducer\_config
- Have at most one active row per (entity\_type, scope, user\_id) combination
- Be referenced by every observation via observation.schema\_version (immutable on observations)
- Be the single source of truth for both validation and reducer merge policies
- Validate every converter against CONVERTER\_REGISTRY before registration
MUST NOT
- Mutate schema\_definition or reducer\_config in place; register a new schema\_version instead
- Allow more than one active version per entity\_type within the same scope
- Carry merge logic (that lives in the reducer); the registry holds only declarative merge\_policies
- Be edited from outside the schema registry service
## Related
- [Schema registry doc](https://github.com/markmhendrickson/neotoma/blob/main/docs/subsystems/schema_registry.md): Full table definition, definition format, service interface
- [Merge policies](/schemas/merge-policies): How reducer\_config drives deterministic snapshot merging
- [Storage layers](/schemas/storage-layers): Three-layer storage: raw\_text, properties, raw\_fragments
- [Versioning & evolution](/schemas/versioning): Semver rules, breaking changes, schema snapshot exports
- [Schema definitions (code)](https://github.com/markmhendrickson/neotoma/blob/main/src/services/schema_definitions.ts): src/services/schema\_definitions.ts, the current source of truth in code
## Where to go next
- [All schema concepts](/schemas): registry, merge policies, storage layers, versioning
- [Primitive record types](/primitives): sources, observations, snapshots, and the rest of Neotoma's atoms
- [Schema management workflows](/schema-management): CLI commands for listing, validating, and evolving schemas