TypeScript SDK (@neotoma/agent)

The @neotoma/agent package is the protocol-enforcing agent harness SDK for Neotoma. It wraps @neotoma/client with the canonical store-first turn protocol so developers building agents with any LLM provider get correct Neotoma memory behavior without learning the interaction rules by hand.

Provider-agnostic. This package does not call any LLM. Compose it around any agent loop — Claude, OpenAI, custom HTTP, whatever.

Install

npm install @neotoma/agent @neotoma/client

When to reach for this package

  • You are building an agent that should persist its turns to Neotoma.
  • You want correct conversation / conversation_message shapes, PART_OF and REFERS_TO edges, and deterministic idempotency keys by construction — not by hand.
  • You do not want your agent code to learn the Neotoma turn lifecycle.

If you only need to write a single ad-hoc record, use @neotoma/client directly. If you are wiring a host harness (Claude Code, Cursor, OpenCode, Codex), use the dedicated hook plugin — it already depends on @neotoma/agent under the hood.

Quick start (provider-agnostic)

import { HttpTransport } from "@neotoma/client";
import { withMemory } from "@neotoma/agent";

const wrapped = withMemory(
  async (userMessage, ctx) => {
    // ctx.retrieved gives you entities Neotoma already knows that match the
    // user message. Use them however you want — build the prompt, ground the
    // reply, mention them to the model. Up to you.
    return await yourAgent(userMessage, ctx.retrieved);
  },
  {
    transport: new HttpTransport({
      baseUrl: process.env.NEOTOMA_URL!,
      token: process.env.NEOTOMA_TOKEN!,
    }),
    conversationId: "conv-2026-05-20",
    platform: "my-agent",
  }
);

const { assistantMessage } = await wrapped("What do we know about Acme Corp?");

On every call, the wrapper:

  1. Runs bounded retrieval: extracts identifiers from the user message and looks them up in Neotoma (best-effort; falls back to an empty set on failure).
  2. Stores the user message as a conversation_message PART_OF the conversation, with REFERS_TO edges to retrieved entities.
  3. Invokes your agent with the retrieved set available on ctx.retrieved.
  4. Stores the assistant reply the same way, with REFERS_TO edges to cited entities.
  5. Derives idempotency keys deterministically per turn, so re-runs collapse onto one observation.
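
The ordering of these steps can be pictured with a small in-memory sketch. This is not the package's implementation: InMemoryStore, runTurn, the substring-based lookup, and the JSON key format are all hypothetical stand-ins, chosen only to show the order of operations and why re-running a turn does not duplicate observations.

```typescript
// Hypothetical in-memory model of the turn protocol. None of these names come
// from @neotoma/agent; they only illustrate the ordering of the steps.
type Role = "user" | "assistant";
interface Entity { id: string; name: string }
interface StoredMessage {
  role: Role;
  text: string;
  partOf: string;     // conversation id, standing in for a PART_OF edge
  refersTo: string[]; // entity ids, standing in for REFERS_TO edges
}

class InMemoryStore {
  readonly entities = new Map<string, Entity>();
  readonly messages = new Map<string, StoredMessage>(); // keyed by idempotency key

  // Naive lookup: an entity matches if its name appears in the text.
  async retrieve(text: string): Promise<Entity[]> {
    const lower = text.toLowerCase();
    return Array.from(this.entities.values()).filter((e) =>
      lower.includes(e.name.toLowerCase())
    );
  }

  // Writing the same key twice overwrites rather than appends.
  async put(key: string, message: StoredMessage): Promise<void> {
    this.messages.set(key, message);
  }
}

async function runTurn(
  store: InMemoryStore,
  conversationId: string,
  turnId: string,
  userMessage: string,
  agent: (userMessage: string, retrieved: Entity[]) => Promise<string>
): Promise<string> {
  const key = (role: Role) => JSON.stringify([conversationId, turnId, role]);

  // 1. Bounded retrieval, best-effort: a failed lookup yields an empty set.
  let retrieved: Entity[] = [];
  try {
    retrieved = await store.retrieve(userMessage);
  } catch {
    retrieved = [];
  }
  const refersTo = retrieved.map((e) => e.id);

  // 2. Store the user message before the agent runs.
  await store.put(key("user"), {
    role: "user", text: userMessage, partOf: conversationId, refersTo,
  });

  // 3. Invoke the agent with the retrieved set.
  const reply = await agent(userMessage, retrieved);

  // 4. Store the reply. Simplification: reuses the retrieved ids as "cited".
  await store.put(key("assistant"), {
    role: "assistant", text: reply, partOf: conversationId, refersTo,
  });

  // 5. Both keys depend only on (conversation, turn, role), so a re-run overwrites.
  return reply;
}
```

Calling runTurn twice with the same conversationId and turnId leaves exactly two entries in store.messages, one per role, which is the "collapse onto one observation" behavior in miniature.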

Example: Claude (Anthropic SDK)

import Anthropic from "@anthropic-ai/sdk";
import { HttpTransport } from "@neotoma/client";
import { withMemory } from "@neotoma/agent";

const anthropic = new Anthropic();
const transport = new HttpTransport({
  baseUrl: process.env.NEOTOMA_URL!,
  token: process.env.NEOTOMA_TOKEN!,
});

const claudeWithMemory = withMemory(
  async (userMessage, ctx) => {
    const systemContext = ctx.retrieved.length
      ? `Known entities relevant to this message:\n${JSON.stringify(ctx.retrieved, null, 2)}`
      : "";

    const response = await anthropic.messages.create({
      model: "claude-opus-4-7",
      max_tokens: 1024,
      system: systemContext,
      messages: [{ role: "user", content: userMessage }],
    });

    const block = response.content[0];
    return block.type === "text" ? block.text : "";
  },
  { transport, conversationId: "claude-session-1", platform: "claude-api" }
);

const { assistantMessage } = await claudeWithMemory("Tell me about Acme Corp.");

Example: OpenAI

import OpenAI from "openai";
import { HttpTransport } from "@neotoma/client";
import { withMemory } from "@neotoma/agent";

const openai = new OpenAI();
const transport = new HttpTransport({
  baseUrl: process.env.NEOTOMA_URL!,
  token: process.env.NEOTOMA_TOKEN!,
});

const gptWithMemory = withMemory(
  async (userMessage, ctx) => {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content: ctx.retrieved.length
            ? `Known entities relevant to this message:\n${JSON.stringify(ctx.retrieved, null, 2)}`
            : "You are a helpful assistant.",
        },
        { role: "user", content: userMessage },
      ],
    });
    return response.choices[0].message.content ?? "";
  },
  { transport, conversationId: "gpt-session-1", platform: "openai" }
);

const { assistantMessage } = await gptWithMemory("Tell me about Acme Corp.");

Explicit lifecycle control

For streaming, multi-step tool loops, or any case where withMemory is too coarse, use NeotomaMemory directly:

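import { HttpTransport } from "@neotoma/client";
import { NeotomaMemory } from "@neotoma/agent";

// Same transport setup as in the examples above.
const transport = new HttpTransport({
  baseUrl: process.env.NEOTOMA_URL!,
  token: process.env.NEOTOMA_TOKEN!,
});
const conversationId = "conv-2026-05-20";
const turnId = "turn-1"; // placeholder; use any id that is stable for this turn
const userMessage = "What do we know about Acme Corp?";

const memory = new NeotomaMemory({ transport, conversationId, platform: "my-agent" });

const opened = await memory.openTurn({ turnId, userMessage });

// ... run any agent loop, possibly multiple LLM calls and tool invocations,
// using opened.retrieved as context ...
const assistantMessage = await yourAgent(userMessage, opened.retrieved);

await memory.closeTurn({
  turnId,
  assistantMessage,
  refersTo: opened.retrievedEntityIds,
});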
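
For the streaming case, one possible shape is to open the turn, forward chunks as they arrive, and close the turn once with the accumulated text. The sketch below is written against a minimal TurnMemory interface inferred from the openTurn and closeTurn calls above, so it runs without the package installed. The field types and the streamTurn helper are assumptions for illustration, not the SDK's documented API.

```typescript
// Structural interface inferred from the openTurn/closeTurn calls shown above.
// The exact types in @neotoma/agent may differ; treat these as assumptions.
interface OpenedTurn {
  retrieved: unknown[];
  retrievedEntityIds: string[];
}
interface TurnMemory {
  openTurn(args: { turnId: string; userMessage: string }): Promise<OpenedTurn>;
  closeTurn(args: {
    turnId: string;
    assistantMessage: string;
    refersTo: string[];
  }): Promise<unknown>;
}

// Hypothetical helper: forwards each chunk to the caller as it arrives and
// persists the full reply once, after the stream has finished.
async function streamTurn(
  memory: TurnMemory,
  turnId: string,
  userMessage: string,
  generate: (retrieved: unknown[]) => AsyncIterable<string>,
  onChunk: (chunk: string) => void
): Promise<string> {
  const opened = await memory.openTurn({ turnId, userMessage });

  let assistantMessage = "";
  for await (const chunk of generate(opened.retrieved)) {
    assistantMessage += chunk;
    onChunk(chunk);
  }

  await memory.closeTurn({
    turnId,
    assistantMessage,
    refersTo: opened.retrievedEntityIds,
  });
  return assistantMessage;
}
```

If generate throws partway through, this sketch never calls closeTurn. Whether to close the turn with the partial text in that case is a policy choice left to the caller.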

What this package gives you over @neotoma/client

@neotoma/client is the low-level REST/local transport — it knows nothing about turn lifecycles. @neotoma/agent is the protocol layer that enforces:

  • Bounded retrieval before write.
  • conversation / conversation_message entity shapes.
  • PART_OF edges from messages to conversation.
  • REFERS_TO edges from messages to retrieved / cited entities.
  • Deterministic idempotency keys per (conversation_id, turn_id, role).

These are easy to get wrong by hand. The SDK gets them right by construction.
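
To see why keying on (conversation_id, turn_id, role) makes retries safe, here is a minimal sketch. The idempotencyKey function and its JSON-tuple format are assumptions for illustration; this page does not specify the key format the SDK actually uses.

```typescript
// Hypothetical key derivation; the SDK's real key format is not documented here.
type MessageRole = "user" | "assistant";

function idempotencyKey(
  conversationId: string,
  turnId: string,
  role: MessageRole
): string {
  // JSON-encoding the tuple avoids the ambiguity of plain concatenation,
  // e.g. ("a:b", "c") vs ("a", "b:c") when ":" is used as a separator.
  return JSON.stringify([conversationId, turnId, role]);
}

// A store keyed this way turns a retried write into an overwrite.
const observations = new Map<string, string>();

function write(
  conversationId: string,
  turnId: string,
  role: MessageRole,
  text: string
): void {
  observations.set(idempotencyKey(conversationId, turnId, role), text);
}

write("conv-2026-05-20", "turn-1", "user", "What do we know about Acme Corp?");
write("conv-2026-05-20", "turn-1", "user", "What do we know about Acme Corp?"); // retry
write("conv-2026-05-20", "turn-1", "assistant", "Acme Corp is ...");

console.log(observations.size); // 2
```

The key contains no timestamp or random component, so the same turn always maps to the same key regardless of when or how often it is written.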

Coverage vs. MCP instructions

@neotoma/agent enforces the mechanical protocol described in the agent instructions: turn ordering, entity shapes, relationships, and idempotency. It does not enforce behavioral policies (QA reflection, display-rule output, issue-filing consent, PII stripping in issue bodies, custom entity extraction beyond the user / assistant message pair). Those remain agent-side concerns; see the agent instructions for the full contract.

See also