When you're building pipelines
Your agent guesses entities every session. Corrections don't persist. Memory regressions ship because the architecture can't prevent them.
You wire together multi-step agent pipelines. They work in demos. In production, entities drift, memory regresses, and when something goes wrong you can't trace why. Half your effort goes toward compensating for a memory layer that doesn't hold its shape. Neotoma puts you on solid ground.
Escaping
Babysitting inference - absorbing variance the architecture doesn't handle
Into
Builder who ships on solid ground
Tax you pay
Prompt workarounds, dedup hacks, memory regression fixes
What you get back
Product velocity, shipping confidence, roadmap ambition
Same question, different outcome
Without a state layer, agents return stale or wrong data. With Neotoma, every response reads from versioned, schema-bound state.
Agent starts from zero every session
The agent built up context over a multi-turn workflow, then the session ended. Next session, everything was gone - no accumulated facts, no continuity.
Two agents, conflicting state
A research agent and a writing agent both updated the same record. One silently overwrote the other, and the final output mixed stale and current data.
Can't trace output back to source
A pipeline produced an incorrect recommendation. Without provenance, there was no way to determine which step introduced the error or what data it was based on.
Duplicate records, divergent state
Multiple sessions created separate records for the same person. Downstream agents reasoned over duplicates with conflicting details.
Why this happens
Failure modes without a memory guarantee
Agents forget between sessions
Conversation-scoped memory resets every session. Agents can't accumulate facts, track how records evolve, or reference decisions from prior runs. Each session starts from scratch.
No way to trace why the agent said what it said
When an agent produces an output, there's no link back to the data that informed it. Debugging, evaluation, and compliance all require knowing why - and most memory systems can't answer that.
State changes silently across pipeline steps
In multi-step pipelines, one agent's write can silently overwrite another's. Without versioned state, pipelines produce results that can't be replayed or compared.
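The overwrite is easy to reproduce with nothing more exotic than a shared dict and two read-then-write-back agents. This is a toy sketch, not Neotoma or any framework's API; all names are hypothetical:

```python
# Two agents read the same snapshot of a shared record, then each writes a
# whole record back. Last write wins; the other update silently vanishes.
shared_state = {"acme_corp": {"status": "prospect"}}

before = dict(shared_state["acme_corp"])           # both agents start from here

research_update = {**before, "employees": 250}     # research agent enriches
writing_update = {**before, "status": "customer"}  # writing agent updates status

shared_state["acme_corp"] = research_update
shared_state["acme_corp"] = writing_update         # discards "employees", no error raised

assert "employees" not in shared_state["acme_corp"]
```

Nothing failed loudly: the research agent's write is simply gone, and downstream steps read a record that mixes one agent's view with a stale snapshot.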
Data in memory services you can't audit
Agent workflows route data through external memory APIs with no visibility into storage, access controls, or retention. When someone asks "where is my data?", most agent setups can't answer.
Memory locked to one framework
LangChain memory doesn't port to CrewAI. CrewAI state doesn't port to custom orchestration. Every framework ships its own memory with its own API and no interoperability.
AI needs
What you need from your AI tools, and what current tools don't provide.
- Memory that persists across sessions and agents
- Same input, same output - no silent changes between runs
- Every agent output traces back to the facts it was based on
- Entity resolution so agents reason over canonical records, not duplicates
- Audit trail for debugging and compliance across pipeline steps
How Neotoma solves this
Neotoma removes the tax you pay compensating for unreliable memory. Entities resolve once and persist. Every fact traces back to its source. New features compound instead of regressing.
Persistent, versioned memory
Every state change is versioned. Same inputs always produce the same output. No silent overwrites - pipelines can be replayed and compared.
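The core idea can be sketched as an append-only log where writes never destroy prior versions and any old state can be read back. This is an illustrative minimal model, not Neotoma's actual storage or API:

```python
# Minimal append-only versioned store: every put() creates a new version,
# and get() can read state as of any earlier version for replay/comparison.
from dataclasses import dataclass, field

@dataclass
class VersionedStore:
    history: list = field(default_factory=list)  # append-only list of (key, value)

    def put(self, key, value):
        self.history.append((key, value))        # never overwrite in place
        return len(self.history) - 1             # version id of this write

    def get(self, key, at_version=None):
        # Latest value for key as of a given version (or the newest overall).
        end = len(self.history) if at_version is None else at_version + 1
        for k, v in reversed(self.history[:end]):
            if k == key:
                return v
        return None

store = VersionedStore()
v0 = store.put("acme:status", "prospect")
v1 = store.put("acme:status", "customer")

assert store.get("acme:status") == "customer"                   # current state
assert store.get("acme:status", at_version=v0) == "prospect"    # replay old state
```

Because writes append instead of overwriting, a pipeline run can be replayed against the exact state it saw, and two runs can be diffed version by version.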
Full provenance and audit trail
Every record and relationship links back to where it came from. Ask "where did this come from?" and get a traceable chain from output to source.
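A provenance chain is just a walk over source links from an output record back to an external source. The record shape and IDs below are hypothetical, shown only to make the idea concrete:

```python
# Each derived record carries a link to what it was derived from.
# Walking those links answers "where did this come from?" mechanically.
records = {
    "rec:summary-17": {"value": "Acme renewed", "derived_from": "rec:email-42"},
    "rec:email-42":   {"value": "raw email body", "derived_from": "src:imap-inbox"},
}

def provenance_chain(record_id):
    # Follow derived_from links until we leave the record store (an external source).
    chain = [record_id]
    while record_id in records:
        record_id = records[record_id]["derived_from"]
        chain.append(record_id)
    return chain

assert provenance_chain("rec:summary-17") == [
    "rec:summary-17", "rec:email-42", "src:imap-inbox",
]
```

The same walk serves debugging (which step introduced the error), evaluation (what data grounded this output), and compliance (what was this answer based on).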
Open source, no lock-in
MIT-licensed. No vendor lock-in, no proprietary memory format. Your data is yours - stored locally, accessible from any MCP-compatible tool, portable across frameworks.
Entity resolution across sessions
Agents accumulate facts across sessions without creating duplicates. Canonical IDs, typed relationships, and timelines survive restarts, handoffs, and pipeline re-runs.
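The anti-duplication behavior can be sketched as keying records on a normalized identity so new sessions enrich one canonical record instead of minting another. The normalization rule here (lowercase email) is a deliberately simplified stand-in:

```python
# Dedup via canonical IDs: sessions that mention the same person resolve to
# one record and merge their facts, instead of creating divergent duplicates.
canonical = {}  # normalized identity key -> canonical record

def normalize(name, email):
    return (email or name).strip().lower()

def resolve(name, email=None, **facts):
    key = normalize(name, email)
    record = canonical.setdefault(key, {"id": f"ent:{key}", "name": name, "facts": {}})
    record["facts"].update(facts)   # later sessions enrich, not duplicate
    return record

a = resolve("Dana Smith", "dana@acme.example", role="CTO")        # session 1
b = resolve("dana smith", "dana@acme.example", company="Acme")    # session 2

assert a is b                                                     # one record, not two
assert a["facts"] == {"role": "CTO", "company": "Acme"}
```

Real resolution is harder than an email key, but the invariant is the same: downstream agents always reason over the canonical record.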
What actually changes
You stop compensating for memory and start building on top of it. New features compound instead of regressing. You add a capability and it works across sessions because the state it depends on persists.
The records graph gets richer as usage grows, not messier - constraints and resolution rules handle what used to be manual cleanup. Your roadmap shifts from memory regression fixes to new capabilities.
When something goes wrong, you trace it to a specific record in thirty seconds. You start trusting your own system enough to build ambitiously on it.
How is this different from what you're already using?
Framework-native memory
LangChain, CrewAI, and custom frameworks each ship their own memory. None are portable across tools. None version state. None provide provenance. Switching frameworks means starting over.
Retrieval / vector search
Retrieval finds relevant context at query time by similarity. It doesn't persist canonical records, track provenance, or guarantee the same result twice. Retrieval and a persistent memory layer solve different problems.
Provider-hosted memory
ChatGPT memory, Claude memory: conversation-scoped, provider-bound, non-auditable. No cross-tool access, no correction mechanism, no trail.
Retrieval and persistent memory are different paradigms. Retrieval optimizes for flexible, on-demand access. Persistent memory optimizes for consistency and verifiability. If your agents need to reason over canonical records across sessions - not just find relevant context within one - you need a persistent layer underneath retrieval.
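The difference shows up in a toy contrast: similarity retrieval over note snapshots can surface whichever text best matches the query, even when it is stale, while a canonical lookup is unambiguous. Word overlap stands in for vector similarity here, purely for illustration:

```python
# Similarity retrieval over snapshots vs. a canonical record lookup.
snapshots = [
    "Acme Corp status prospect contacted in March",   # stale note, verbose
    "Acme status customer",                           # current, terse
]
current = {"acme": "customer"}   # persistent layer: one canonical record

def retrieve(query):
    # Word-overlap scoring as a stand-in for embedding similarity.
    q = set(query.lower().split())
    return max(snapshots, key=lambda s: len(q & set(s.lower().split())))

hit = retrieve("what is the Acme Corp status")
assert "prospect" in hit             # retrieval surfaced the stale snapshot
assert current["acme"] == "customer" # canonical lookup is current by construction
```

Neither tool is wrong: retrieval answers "what text is most relevant", the persistent layer answers "what is true now", which is why the two compose rather than compete.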
Data types for better recall
The entity types you'll store most often.
agent_session
Session state and accumulated facts across agent runs
action
Agent actions with inputs, outputs, and provenance links
pipeline
Multi-step workflows with step-level audit
evaluation
Eval results, benchmarks, and regression tracking
audit_event
Immutable log of state transitions and corrections
tool_config
Agent tool configurations and runtime parameters
entity_graph
Resolved records with typed relationships and temporal evolution
runbook
Operational procedures and agent behavioral rules
When you don't need this
For single-session, stateless tasks (one-shot summarization, code generation, document Q&A), retrieval is sufficient. Neotoma is for agents that accumulate facts across sessions, resolve entities, track commitments, and need to explain their reasoning after the fact.
Other modes
The same person operates in multiple modes. The tax differs; the architecture that removes it is the same.
The tax is prompt workarounds, dedup hacks, and memory regressions. Neotoma removes that tax and gives you a persistent, provenance-backed substrate to build on.
Built because retrieval kept re-guessing entities that should have been resolved once and persisted.
Deep dive: Why agent memory needs more than RAG