The @neotoma/agent package is the protocol-enforcing agent harness SDK for Neotoma. It wraps @neotoma/client with the canonical store-first turn protocol so developers building agents with any LLM provider get correct Neotoma memory behavior without learning the interaction rules by hand.
Provider-agnostic. This package does not call any LLM. Compose it around any agent loop — Claude, OpenAI, custom HTTP, whatever.
Install
npm install @neotoma/agent @neotoma/client
When to reach for this package
- You are building an agent that should persist its turns to Neotoma.
- You want correct conversation / conversation_message shapes, PART_OF and REFERS_TO edges, and deterministic idempotency keys by construction, not by hand.
- You do not want your agent code to learn the Neotoma turn lifecycle.
If you only need to write a single ad-hoc record, use @neotoma/client directly. If you are wiring a host harness (Claude Code, Cursor, OpenCode, Codex), use the dedicated hook plugin — it already depends on @neotoma/agent under the hood.
Quick start (provider-agnostic)
import { HttpTransport } from "@neotoma/client";
import { withMemory } from "@neotoma/agent";

const wrapped = withMemory(
  async (userMessage, ctx) => {
    // ctx.retrieved gives you entities Neotoma already knows that match the
    // user message. Use them however you want: build the prompt, ground the
    // reply, mention them to the model. Up to you.
    return await yourAgent(userMessage, ctx.retrieved);
  },
  {
    transport: new HttpTransport({
      baseUrl: process.env.NEOTOMA_URL!,
      token: process.env.NEOTOMA_TOKEN!,
    }),
    conversationId: "conv-2026-05-20",
    platform: "my-agent",
  }
);

const { assistantMessage } = await wrapped("What do we know about Acme Corp?");
On every call, the wrapper:
- Runs bounded retrieval: extracts identifiers from the user message and looks them up in Neotoma (best-effort; falls back to an empty set on failure).
- Stores the user message as a conversation_message PART_OF the conversation, with REFERS_TO edges to retrieved entities.
- Invokes your agent with the retrieved set available on ctx.retrieved.
- Stores the assistant reply the same way, with REFERS_TO edges to cited entities.
- Uses idempotency keys that are deterministic per turn, so re-runs collapse onto one observation.
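The steps above can be sketched with a toy in-memory store. This is an illustration of the ordering only, not the SDK's implementation: `FakeStore`, `runTurnSketch`, the substring-based lookup, and the key format are all hypothetical names and choices made for this example.

```typescript
// Hypothetical in-memory stand-in for Neotoma; NOT the real @neotoma/client API.
type Role = "user" | "assistant";

interface StoredMessage {
  key: string;        // idempotency key (format assumed for this sketch)
  role: Role;
  text: string;
  partOf: string;     // PART_OF -> conversation id
  refersTo: string[]; // REFERS_TO -> entity ids
}

class FakeStore {
  readonly messages = new Map<string, StoredMessage>();
  private readonly entities: Record<string, string>;

  constructor(entities: Record<string, string>) {
    this.entities = entities;
  }

  // Returns ids of known entities whose name appears in the text.
  async lookup(text: string): Promise<string[]> {
    const lower = text.toLowerCase();
    return Object.entries(this.entities)
      .filter(([, name]) => lower.includes(name.toLowerCase()))
      .map(([id]) => id);
  }

  // Writing the same key twice overwrites, so re-runs collapse to one record.
  async put(message: StoredMessage): Promise<void> {
    this.messages.set(message.key, message);
  }
}

async function runTurnSketch(
  store: FakeStore,
  conversationId: string,
  turnId: string,
  userMessage: string,
  agent: (message: string, retrieved: string[]) => Promise<string>,
): Promise<string> {
  // 1. Bounded retrieval, best-effort: fall back to an empty set on failure.
  let retrieved: string[] = [];
  try {
    retrieved = await store.lookup(userMessage);
  } catch {
    retrieved = [];
  }

  // 2. Store the user message before the agent runs (store-first).
  await store.put({
    key: `${conversationId}:${turnId}:user`,
    role: "user",
    text: userMessage,
    partOf: conversationId,
    refersTo: retrieved,
  });

  // 3. Invoke the agent with the retrieved set.
  const reply = await agent(userMessage, retrieved);

  // 4. Store the assistant reply the same way.
  await store.put({
    key: `${conversationId}:${turnId}:assistant`,
    role: "assistant",
    text: reply,
    partOf: conversationId,
    refersTo: retrieved,
  });
  return reply;
}
```

Running the same turn twice against this store leaves two records (one per role), which is the collapse behavior the last bullet describes.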
Example: Claude (Anthropic SDK)
import Anthropic from "@anthropic-ai/sdk";
import { HttpTransport } from "@neotoma/client";
import { withMemory } from "@neotoma/agent";

const anthropic = new Anthropic();
const transport = new HttpTransport({
  baseUrl: process.env.NEOTOMA_URL!,
  token: process.env.NEOTOMA_TOKEN!,
});

const claudeWithMemory = withMemory(
  async (userMessage, ctx) => {
    const systemContext = ctx.retrieved.length
      ? `Known entities relevant to this message:\n${JSON.stringify(ctx.retrieved, null, 2)}`
      : "";
    const response = await anthropic.messages.create({
      model: "claude-opus-4-7",
      max_tokens: 1024,
      system: systemContext,
      messages: [{ role: "user", content: userMessage }],
    });
    const block = response.content[0];
    return block.type === "text" ? block.text : "";
  },
  { transport, conversationId: "claude-session-1", platform: "claude-api" }
);

const { assistantMessage } = await claudeWithMemory("Tell me about Acme Corp.");
Example: OpenAI
import OpenAI from "openai";
import { HttpTransport } from "@neotoma/client";
import { withMemory } from "@neotoma/agent";

const openai = new OpenAI();
const transport = new HttpTransport({
  baseUrl: process.env.NEOTOMA_URL!,
  token: process.env.NEOTOMA_TOKEN!,
});

const gptWithMemory = withMemory(
  async (userMessage, ctx) => {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content: ctx.retrieved.length
            ? `Known entities relevant to this message:\n${JSON.stringify(ctx.retrieved, null, 2)}`
            : "You are a helpful assistant.",
        },
        { role: "user", content: userMessage },
      ],
    });
    return response.choices[0].message.content ?? "";
  },
  { transport, conversationId: "gpt-session-1", platform: "openai" }
);

const { assistantMessage } = await gptWithMemory("Tell me about Acme Corp.");
Explicit lifecycle control
For streaming, multi-step tool loops, or any case where withMemory is too coarse, use NeotomaMemory directly:
import { NeotomaMemory } from "@neotoma/agent";

const memory = new NeotomaMemory({ transport, conversationId, platform: "my-agent" });
const opened = await memory.openTurn({ turnId, userMessage });

// ... run any agent loop, possibly multiple LLM calls and tool invocations,
// using opened.retrieved as context ...

await memory.closeTurn({
  turnId,
  assistantMessage,
  refersTo: opened.retrievedEntityIds,
});
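For the streaming case, one possible shape is to open the turn, forward chunks to the caller as they arrive, and close the turn once with the accumulated reply. The sketch below declares its own `TurnMemory` and `OpenedTurn` types that mirror only the two calls shown above; whether a NeotomaMemory instance matches them exactly is an assumption, and `runStreamingTurn`, `streamReply`, and `onChunk` are hypothetical names.

```typescript
// Local structural types mirroring only the openTurn/closeTurn calls shown
// above. The real NeotomaMemory types may differ; treat these as assumptions.
interface OpenedTurn {
  retrieved: unknown[];
  retrievedEntityIds: string[];
}

interface TurnMemory {
  openTurn(args: { turnId: string; userMessage: string }): Promise<OpenedTurn>;
  closeTurn(args: {
    turnId: string;
    assistantMessage: string;
    refersTo: string[];
  }): Promise<unknown>;
}

// Runs one streamed turn: open, forward chunks as they arrive, then close
// exactly once with the full assistant reply.
async function runStreamingTurn(
  memory: TurnMemory,
  turnId: string,
  userMessage: string,
  streamReply: (userMessage: string, retrieved: unknown[]) => AsyncIterable<string>,
  onChunk: (chunk: string) => void = () => {},
): Promise<string> {
  const opened = await memory.openTurn({ turnId, userMessage });

  let assistantMessage = "";
  for await (const chunk of streamReply(userMessage, opened.retrieved)) {
    assistantMessage += chunk; // accumulate the full reply for storage
    onChunk(chunk);            // let the caller render partial output
  }

  await memory.closeTurn({
    turnId,
    assistantMessage,
    refersTo: opened.retrievedEntityIds,
  });
  return assistantMessage;
}
```

In this sketch a stream that throws leaves the turn unclosed; whether to close it with a partial reply instead is a design choice left to the caller.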
What this package gives you over @neotoma/client
@neotoma/client is the low-level REST/local transport — it knows nothing about turn lifecycles. @neotoma/agent is the protocol layer that enforces:
- Bounded retrieval before write.
- conversation / conversation_message entity shapes.
- PART_OF edges from messages to conversation.
- REFERS_TO edges from messages to retrieved / cited entities.
- Deterministic idempotency keys per (conversation_id, turn_id, role).
These are easy to get wrong by hand. The SDK gets them right by construction.
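To make the idempotency point concrete, here is a minimal sketch of one way a deterministic key could be derived from (conversation_id, turn_id, role). The key format the SDK actually uses is not documented in this section, so `idempotencyKey` and its colon-joined encoding are assumptions for illustration only.

```typescript
type Role = "user" | "assistant";

// Hypothetical key derivation: the same (conversationId, turnId, role) triple
// always yields the same string. Each part is URI-encoded so a ":" inside an
// id cannot be confused with the separator.
function idempotencyKey(conversationId: string, turnId: string, role: Role): string {
  return [conversationId, turnId, role].map((part) => encodeURIComponent(part)).join(":");
}

// Toy store keyed by idempotency key: a retried write replaces the earlier
// one instead of creating a duplicate observation.
const observations = new Map<string, string>();

function storeObservation(key: string, text: string): void {
  observations.set(key, text);
}

const key = idempotencyKey("conv-1", "turn-1", "user");
storeObservation(key, "What do we know about Acme Corp?");
storeObservation(key, "What do we know about Acme Corp?"); // retry of the same turn
```

Because the key depends only on the triple and not on a timestamp or random value, a crashed and retried turn writes to the same slot.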
Coverage vs. MCP instructions
@neotoma/agent enforces the mechanical protocol described in the agent instructions: turn ordering, entity shapes, relationships, idempotency. It does not enforce behavioral policies (QA reflection, display-rule output, issue-filing consent, PII stripping in issue bodies, custom entity extraction beyond the user / assistant message pair). Those remain agent-side concerns; see the agent instructions for the full contract.
See also
- Python SDK — the same protocol layer for Python agents.
- Agent instructions — full behavioral contract for agents.
- MCP reference — when not using the SDK directly.