Neotoma
Building pipelines mode

When you're building pipelines

Entity resolution by inference. Corrections that don't stick. Memory regressions you absorb because the architecture won't.

This is one of three operational modes of the same person: someone building an operating system for their own AI agents. The other two are Operating mode and Infrastructure debugging mode.

You enter this mode when you're wiring together multi-step agent pipelines. You ship agents that work in demos. In production, entity resolution drifts, memory regresses, and when something goes wrong you can't trace why. Half your engineering effort goes toward compensating for a memory layer that doesn't hold its shape. You're an inference babysitter. Neotoma puts you on solid ground.

Escaping

Inference babysitter — absorbing variance the architecture doesn't handle

Into

Builder who ships on solid ground

Compensating for memory → building on top of it

Tax you pay

Prompt engineering workarounds, dedup hacks, memory regression fixes

What you get back

Product velocity, shipping confidence, roadmap ambition

Same question, different outcome

Without a state layer, agents return stale or wrong data. With Neotoma, every response reads from versioned, schema-bound state.

Cross-session memory
without state layer
Continue the onboarding workflow for Acme Corp.
No onboarding workflow found. Starting fresh.
with state layer
Continue the onboarding workflow for Acme Corp.
Resuming step 4 of 7. Last update: 2 hours ago.

Agent starts from zero every session

The agent accumulated context across a multi-turn workflow, then the session ended. Next session, everything was gone: no accumulated facts, no entity history, no continuity.
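A minimal sketch of the resume pattern, assuming a hypothetical StateStore backed by SQLite; the class and method names are illustrative stand-ins for a persistent state layer, not Neotoma's API.

```python
# Minimal sketch of cross-session resume. `StateStore`, `save_step`, and
# `load_step` are hypothetical names, not Neotoma's API.
import json
import sqlite3
import time

class StateStore:
    """Durable workflow state that outlives any single agent session."""

    def __init__(self, path="agent_state.db"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS workflow_state "
            "(entity TEXT PRIMARY KEY, payload TEXT, updated REAL)"
        )

    def save_step(self, entity, step, total, facts):
        payload = json.dumps({"step": step, "total": total, "facts": facts})
        self.db.execute(
            "INSERT INTO workflow_state VALUES (?, ?, ?) "
            "ON CONFLICT(entity) DO UPDATE SET "
            "payload = excluded.payload, updated = excluded.updated",
            (entity, payload, time.time()),
        )
        self.db.commit()

    def load_step(self, entity):
        row = self.db.execute(
            "SELECT payload FROM workflow_state WHERE entity = ?", (entity,)
        ).fetchone()
        return json.loads(row[0]) if row else None

# Session 1: the agent records progress before the session ends.
StateStore().save_step("Acme Corp", step=4, total=7,
                       facts=["contract signed", "sandbox provisioned"])

# Session 2, hours later and in a fresh process: resume instead of starting over.
state = StateStore().load_step("Acme Corp")
if state:
    print(f"Resuming step {state['step']} of {state['total']}.")
else:
    print("No onboarding workflow found. Starting fresh.")
```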

Multi-agent coordination
without state layer
What's the latest company summary for Apex?
Apex: 12 employees, Series A. (stale from research agent)
with state layer
What's the latest company summary for Apex?
Apex: 45 employees, Series B. Merged from 2 agent sources.

Two agents, conflicting state

A research agent and a writing agent both updated the same entity. Without versioned writes, one silently overwrote the other, and the final output mixed stale and current data.
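One way to make concurrent writes safe is optimistic, version-checked updates. The sketch below assumes a hypothetical EntityStore with explicit version numbers; it illustrates the conflict-detection idea, not Neotoma's implementation.

```python
# Sketch of version-checked (optimistic) writes, one way to prevent silent
# overwrites. The `EntityStore` here is hypothetical, not Neotoma's API.
class VersionConflict(Exception):
    pass

class EntityStore:
    def __init__(self):
        self._entities = {}  # entity_id -> (version, attributes)

    def read(self, entity_id):
        version, attrs = self._entities.get(entity_id, (0, {}))
        return version, dict(attrs)

    def write(self, entity_id, expected_version, updates, source):
        current, attrs = self._entities.get(entity_id, (0, {}))
        if expected_version != current:
            # Another agent wrote since this one read: surface the conflict
            # instead of silently mixing stale and current data.
            raise VersionConflict(f"{source} expected v{expected_version}, found v{current}")
        self._entities[entity_id] = (current + 1, {**attrs, **updates})

store = EntityStore()
version, _ = store.read("apex")

# The research agent writes first...
store.write("apex", version, {"employees": 45, "round": "Series B"}, source="research_agent")

# ...so the writing agent's stale update is rejected rather than applied silently.
try:
    store.write("apex", version, {"summary": "12 employees, Series A"}, source="writing_agent")
except VersionConflict as err:
    print("Write rejected; re-read and merge:", err)
```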

Pipeline debugging
without state layer
Why did the pipeline recommend vendor B?
Recommendation based on available data.
with state layer
Why did the pipeline recommend vendor B?
Based on observation #3021 (cost matrix v3) at step 2 of 5.

Can't trace output back to source

An orchestration pipeline produced an incorrect recommendation. Without provenance links, the team couldn't determine which step introduced the error or what data it was based on.
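The provenance pattern in miniature: each step output keeps links back to the observations it was derived from. The dataclass shapes below are assumptions for illustration, not Neotoma's schema.

```python
# Sketch of provenance links: a derived output records which observation it
# came from and at which pipeline step. Field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Observation:
    obs_id: str
    description: str

@dataclass
class StepOutput:
    value: str
    step: int
    total_steps: int
    sources: list = field(default_factory=list)  # observation IDs behind this output

cost_matrix = Observation(obs_id="#3021", description="cost matrix v3")

recommendation = StepOutput(
    value="vendor B",
    step=2,
    total_steps=5,
    sources=[cost_matrix.obs_id],
)

# "Why did the pipeline recommend vendor B?" now has a concrete answer.
print(f"Based on observation {recommendation.sources[0]} "
      f"({cost_matrix.description}) at step {recommendation.step} of {recommendation.total_steps}.")
```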

Entity resolution
without state layer
Get all open tasks for client 'J. Martinez'.
Found 2 tasks for 'Jose Martinez', 1 for 'J. Martinez'.
with state layer
Get all open tasks for client 'J. Martinez'.
3 open tasks for Jose Martinez (canonical ID: ent_8f2a).

Duplicate entities, divergent state

Multiple agent sessions created separate records for the same real-world entity. Without canonical resolution, downstream agents reasoned over duplicates with conflicting attributes.
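A rough sketch of canonical resolution, assuming a hypothetical EntityResolver that maps aliases to one canonical ID so every write lands on a single record rather than a new duplicate.

```python
# Sketch of alias-to-canonical resolution. The resolver, IDs, and record shape
# are invented for illustration.
class EntityResolver:
    def __init__(self):
        self._records = {}   # canonical_id -> record
        self._aliases = {}   # normalized alias -> canonical_id

    def register(self, canonical_id, name, aliases=()):
        self._records[canonical_id] = {"name": name, "open_tasks": []}
        for alias in (name, *aliases):
            self._aliases[alias.lower()] = canonical_id

    def record(self, mention):
        canonical_id = self._aliases[mention.lower()]
        return canonical_id, self._records[canonical_id]

resolver = EntityResolver()
resolver.register("ent_8f2a", "Jose Martinez", aliases=["J. Martinez"])

# Different sessions refer to the same person differently; every write lands on one record.
for mention, task in [
    ("Jose Martinez", "Send onboarding packet"),
    ("J. Martinez", "Schedule kickoff call"),
    ("j. martinez", "Confirm billing contact"),
]:
    _, rec = resolver.record(mention)
    rec["open_tasks"].append(task)

canonical_id, rec = resolver.record("J. Martinez")
print(f"{len(rec['open_tasks'])} open tasks for {rec['name']} (canonical ID: {canonical_id})")
```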

Why this happens

No shared state for agents: token-based or conversation-only; no cross-session, cross-agent state
No provenance: cannot trace agent decisions or outputs to source data
No deterministic layer: need reproducible, explainable state for eval, debug, and compliance
Fragmented context across orchestration steps and multi-agent workflows
Sensitive client data flows through external memory services with no storage or access audit

Failure modes without a memory guarantee

Silent state mutation between agent sessions
Non-replayable pipelines; can't reconstruct agent reasoning
Context loss across orchestration steps and agent handoffs
Evaluation gaps; no audit trail linking outputs to source facts
Client data in third-party memory with no access audit
Framework-specific memory; no portability across agent tools

Agents forget between sessions

Token-based and conversation-only memory resets every session. Agents can't accumulate facts, track entity evolution, or reference decisions from prior runs. Each session starts from scratch or depends on brittle prompt injection.

No provenance means no trust

When an agent produces an output, there's no way to trace it back to the source data that informed it. Debugging, evaluation, and compliance all require knowing why the agent said what it said, and most memory systems can't answer that.

State mutates silently across pipeline steps

In multi-step orchestration, one agent's write can silently overwrite another's. Without versioned, hash-based state evolution, pipelines produce non-reproducible results that can't be replayed or audited.
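Hash-chaining is one way to make that evolution tamper-evident. The sketch below is a generic illustration of the idea, not Neotoma's internal format: each version's hash covers the previous hash plus the new state, so any silent mutation breaks the chain and shows up on replay.

```python
# Sketch of hash-chained state versions. Any in-place mutation of an earlier
# version invalidates every hash that follows it.
import hashlib
import json

def version_hash(prev_hash: str, state: dict) -> str:
    payload = json.dumps({"prev": prev_hash, "state": state}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

history = []

def append_version(state: dict):
    prev = history[-1]["hash"] if history else "genesis"
    history.append({"state": state, "prev": prev, "hash": version_hash(prev, state)})

def verify(history) -> bool:
    prev = "genesis"
    for entry in history:
        if entry["prev"] != prev or entry["hash"] != version_hash(prev, entry["state"]):
            return False
        prev = entry["hash"]
    return True

append_version({"vendor": None})
append_version({"vendor": "B", "basis": "#3021"})
print(verify(history))               # True: the pipeline replays exactly as recorded

history[0]["state"]["vendor"] = "A"  # a silent in-place mutation...
print(verify(history))               # False: ...is caught by the hash chain
```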

Client data in memory services you don't control

Agent workflows route customer data through external memory APIs with no visibility into storage location, access controls, or retention policies. When a client asks "where is my data stored?", most agent architectures can't answer.

Memory locked to one framework

LangChain memory doesn't port to CrewAI. CrewAI state doesn't port to custom orchestration. Every framework ships its own memory layer with its own API, its own storage format, and no interoperability. Switching frameworks means starting state management over.



How Neotoma solves this

Neotoma removes the tax your team pays compensating for unreliable memory. Entities resolve once and persist. State evolves through versioned, auditable transitions. Every fact traces to provenance. Cross-session state with full audit trail.

What actually changes

You stop compensating for memory and start building on top of it. New features compound instead of regressing. You add a capability and it works across sessions because the state it depends on persists.

You ship to more users and the entity graph gets richer, not messier, because schema constraints and merge rules handle what used to be manual cleanup. Your roadmap shifts from memory regression fixes to new capabilities.

A customer reports an issue and you trace it to a specific observation in thirty seconds. You start trusting your own system enough to build ambitiously on it.

How is this different from what you're already using?

Framework-native memory

LangChain, CrewAI, and custom frameworks each ship their own memory abstraction. None are portable. None version state. None provide provenance. Switching frameworks means starting state management over.

RAG / vector retrieval

Retrieval finds things at query time by similarity. It doesn't persist canonical entities, maintain provenance, or guarantee the same result twice. Retrieval and a truth layer solve different problems and will coexist.

Provider-hosted memory

ChatGPT memory, Claude memory: conversation-scoped, provider-bound, non-deterministic. No cross-platform access, no correction mechanism, no audit trail.

Retrieval and state are different paradigms, not a feature gap. Embedding-based search and agentic search both optimize for flexible, on-demand access. A truth layer optimizes for consistency and verifiability. If your agents need to reason over canonical entities across sessions, not just find relevant context within one, you need a state layer underneath the retrieval.
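A compressed sketch of that layering, with every name invented for illustration: retrieval proposes context, while a canonical entity record underneath supplies the versioned values the agent actually reasons over.

```python
# Sketch of retrieval layered on a state layer. `retrieve` and the entity table
# are stand-ins; nothing here is Neotoma's API.
def retrieve(query: str) -> list[str]:
    # Stand-in for embedding or agentic search: returns text that *mentions*
    # an entity, with no guarantee the figures are current.
    return ["Apex raised a Series A and has about 12 employees."]

CANONICAL_ENTITIES = {
    "apex": {"id": "ent_4c19", "employees": 45, "round": "Series B", "version": 7},
}

def answer(query: str) -> str:
    context = retrieve(query)
    entity = CANONICAL_ENTITIES["apex"]  # resolved once, versioned, shared across agents
    return (
        f"Apex: {entity['employees']} employees, {entity['round']} "
        f"(canonical {entity['id']}, v{entity['version']}); retrieved context: {context[0]!r}"
    )

print(answer("What's the latest on Apex?"))
```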

Data types for durable memory

The entity types you'll store most often. An illustrative record sketch follows the list.

agent_session

Session state, context windows, and accumulated facts across agent runs

action

Agent actions with inputs, outputs, and provenance links

pipeline

Multi-step orchestration workflows with step-level audit

evaluation

Eval results, benchmarks, and regression tracking

audit_event

Immutable log of state transitions and entity mutations

tool_config

Agent tool configurations, MCP server bindings, and runtime parameters

entity_graph

Resolved entities with typed relationships and temporal evolution

runbook

Operational procedures and agent behavioral rules
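As an illustration of what these records might look like in practice, here is a hypothetical shape for the action type from the list above; the field names are assumptions, not Neotoma's published schema.

```python
# Illustrative shape for an `action` record: an agent action with inputs,
# outputs, and provenance links. Every field name here is an assumption.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Action:
    action_id: str
    session_id: str              # the agent_session this action ran in
    tool: str                    # which tool_config was invoked
    inputs: dict
    outputs: dict
    source_observations: tuple   # provenance: observation IDs this action read from
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

action = Action(
    action_id="act_0192",
    session_id="sess_77ab",
    tool="vendor_cost_compare",
    inputs={"vendors": ["A", "B"], "matrix": "cost matrix v3"},
    outputs={"recommendation": "vendor B"},
    source_observations=("#3021",),
)
print(action.action_id, "->", action.outputs["recommendation"], "from", action.source_observations)
```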

When you don't need this

For single-session, stateless agent tasks (one-shot summarization, code generation, document Q&A), retrieval is sufficient and simpler. Neotoma is for agents that accumulate facts across sessions, resolve entities, track commitments, and need to explain their reasoning after the fact.

Other modes

The same person operates in multiple modes. The tax differs; the architecture that removes it is the same.

In building mode, the tax is prompt engineering workarounds, dedup hacks, and memory regression fixes. Neotoma removes that tax and gives you a deterministic, provenance-backed substrate to build on.

Built because retrieval kept re-inferring entities that should have been resolved once and persisted.

Deep dive: Why agent memory needs more than RAG