Debugging mode

When you're debugging infrastructure

Two runs. Same inputs. Different state. No replay, no diff, no explanation.

You're in this mode when something breaks and you need to understand why. Your observability stack watches everything except the thing that matters: what the agent believed and why. Debugging means reconstructing truth from scattered logs and hoping to reproduce the issue. Neotoma gives you replayable state so debugging takes minutes, not days.

Escaping

Log archaeologist — reverse-engineering truth from logs

Into

Platform engineer with replayable state

Manual orchestration → declarative trust

Tax you pay

Writing glue (checkpoint logic, custom diffing, state serialization)

What you get back

Debugging speed, platform design time, sleep

Same question, different outcome

Without a state layer, agents return stale or wrong data. With Neotoma, every response reads from versioned, schema-bound state.

Reproducibility
without state layer
Replay yesterday's ingestion pipeline.
Pipeline completed. 3 entity conflicts unresolved.
with state layer
Replay yesterday's ingestion pipeline.
Pipeline replayed deterministically. State matches v47.

Same pipeline, different results

Two runs of the same pipeline with identical inputs returned different state. Without versioned state, there was no way to detect or explain the drift.
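
A minimal sketch of why versioned state makes this drift detectable. It assumes a hypothetical observation format (id, ts, entity, field, value) and a last-write-wins reducer; none of these names come from Neotoma's API. Observations are folded in a canonical order, and the resulting state is hashed so two runs can be compared directly.

```python
import hashlib
import json

def replay(observations):
    """Fold observations into state in a canonical order (timestamp, then id).

    Sorting first means the result does not depend on arrival order.
    """
    state = {}
    for obs in sorted(observations, key=lambda o: (o["ts"], o["id"])):
        state.setdefault(obs["entity"], {})[obs["field"]] = obs["value"]
    return state

def state_hash(state):
    """Hash a canonical JSON encoding so equal states give equal digests."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical observations, for illustration only.
observations = [
    {"id": 1, "ts": 100, "entity": "acme-config", "field": "rate_limit", "value": 50},
    {"id": 2, "ts": 105, "entity": "acme-config", "field": "rate_limit", "value": 80},
    {"id": 3, "ts": 110, "entity": "acme-config", "field": "region", "value": "eu-west-1"},
]

run_a = replay(observations)
run_b = replay(list(reversed(observations)))  # same inputs, different arrival order
runs_match = state_hash(run_a) == state_hash(run_b)
```

If the two hashes ever differ, the inputs or the reducer changed between runs, and the stored observations show which.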

Visibility
without state layer
What changed on entity acme-config since deploy?
No changes detected.
with state layer
What changed on entity acme-config since deploy?
2 mutations: field 'rate_limit' updated at 14:32, 'region' at 14:38.

Invisible overwrite, broken downstream

An upstream agent silently overwrote a shared record. Downstream consumers read stale data and produced incorrect output. The change was invisible.
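
One way to make such an overwrite visible is to append every mutation rather than replace the value in place. The sketch below is illustrative only: the VersionedRecord class, its methods, the actor names, and the sample values are assumptions, not Neotoma's API.

```python
from datetime import datetime, timezone

class VersionedRecord:
    """Hypothetical record that appends every mutation instead of overwriting."""

    def __init__(self):
        self._history = []   # every change, oldest first
        self._current = {}   # latest value per field

    def set(self, field, value, at, actor):
        self._history.append({
            "at": at, "field": field,
            "old": self._current.get(field), "new": value, "actor": actor,
        })
        self._current[field] = value

    def changes_since(self, since):
        """Return all mutations recorded strictly after `since`."""
        return [c for c in self._history if c["at"] > since]

def utc(hour, minute):
    # Arbitrary illustrative date; only the time of day matters here.
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)

record = VersionedRecord()
record.set("rate_limit", 50, at=utc(9, 0), actor="bootstrap")
record.set("region", "us-east-1", at=utc(9, 0), actor="bootstrap")

deploy_time = utc(14, 0)
record.set("rate_limit", 80, at=utc(14, 32), actor="upstream-agent")
record.set("region", "eu-west-1", at=utc(14, 38), actor="upstream-agent")

changes = record.changes_since(deploy_time)
for c in changes:
    print(f"{c['at']:%H:%M} {c['field']}: {c['old']!r} -> {c['new']!r} (by {c['actor']})")
```

Because the previous value and the actor are stored with each change, a downstream consumer can see both what was overwritten and who did it.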

Compliance & audit
without state layer
Trace output of eval run 2841 to source.
Source data unavailable. Log retention expired.
with state layer
Trace output of eval run 2841 to source.
Output traces to observations #4091, #4092. Full chain available.

Missing provenance, failed audit

An evaluation needed to trace an agent's output to its source data. Without an immutable log, the trail had to be reconstructed manually from scattered logs.
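
Tracing an output to its sources amounts to walking provenance links until only source observations remain. A minimal sketch, assuming each derived record lists the ids it was derived from; the id scheme and the intermediate "summary:17" node are hypothetical:

```python
def trace_to_sources(node_id, derived_from):
    """Walk provenance links back to leaf nodes (source observations)."""
    sources, stack, seen = [], [node_id], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parents = derived_from.get(current, [])
        if parents:
            stack.extend(parents)    # derived record: keep walking
        else:
            sources.append(current)  # no parents: this is a source
    return sorted(sources)

# Hypothetical provenance links, for illustration only.
derived_from = {
    "eval_run:2841": ["summary:17"],
    "summary:17": ["observation:4091", "observation:4092"],
}

sources = trace_to_sources("eval_run:2841", derived_from)
```

The walk only works if the links are written when each record is created and never edited afterwards, which is what an immutable log provides.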

State reconstruction
without state layer
Reconstruct agent state at 03:12 UTC crash.
State unavailable. Last checkpoint: 22:00 UTC.
with state layer
Reconstruct agent state at 03:12 UTC crash.
State reconstructed from 847 observations. Timeline to 03:12 ready.

Can't reconstruct state after failure

A production agent crashed mid-run. In-memory state was lost. Without an append-only log, there was no way to reconstruct what the agent knew at the time of failure.
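
With an append-only log, state at any moment can be rebuilt by replaying entries up to that moment. The sketch below assumes a hypothetical log entry shape (at, key, value) and invented sample data; it is not Neotoma's storage format.

```python
from datetime import datetime, timezone

def state_at(log, as_of):
    """Rebuild state by replaying log entries with a timestamp <= as_of."""
    state = {}
    for entry in sorted(log, key=lambda e: e["at"]):
        if entry["at"] > as_of:
            break  # everything after this point happened later
        state[entry["key"]] = entry["value"]
    return state

def utc(day, hour, minute):
    # Arbitrary illustrative dates.
    return datetime(2024, 1, day, hour, minute, tzinfo=timezone.utc)

# Hypothetical log, for illustration only.
log = [
    {"at": utc(1, 22, 0), "key": "step", "value": "ingest"},
    {"at": utc(2, 2, 47), "key": "step", "value": "resolve"},
    {"at": utc(2, 3, 10), "key": "pending_entities", "value": 3},
    {"at": utc(2, 3, 15), "key": "step", "value": "restarted"},  # after the crash
]

crash_time = utc(2, 3, 12)
state_before_crash = state_at(log, crash_time)
```

Unlike a periodic checkpoint, this recovers the writes made between the last checkpoint and the failure, provided each write reached the log before the crash.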

Why this happens

Same inputs produce different state across runs, with no way to explain why
State changes are invisible to your observability stack
Debugging means reading logs and guessing at what the agent believed
No trail connecting state changes to the pipeline step that caused them
Memory locked to one vendor's runtime; switching means starting over
Agent state routed through services with no data residency guarantees

Failure modes without a memory guarantee

Agent runs aren't reproducible
State mutates invisibly between sessions
Can't trace output back to source data
State drifts depending on processing order
No proof of data residency for compliance
State layer locked to one vendor

Agent runs aren't reproducible

Two runs with identical inputs produce different results. State changes between sessions are invisible: there is no versioned history to compare and no log to replay. Debugging means reading logs and guessing.

State changes are invisible

Values overwrite in place. When a record changes, the previous value is gone. No diff, no provenance, no way to know which agent or pipeline step introduced the change.

No audit trail for compliance or evaluation

Evaluation needs to compare outputs against known-good state. Compliance needs to trace decisions to source data. Without an immutable log, neither is possible without manual reconstruction.

Memory locked to one vendor's runtime

Each agent runtime ships its own memory: none portable, none interoperable. Switching frameworks means rebuilding state management from scratch.

No data residency guarantees

Agent state flows through third-party APIs with no contractual guarantee about where it's stored or who can access it. For teams with compliance obligations, opaque provider memory is a gap that manual audits cannot close.

If you can't replay an agent run, you can't debug it. If you can't debug it, you can't iterate. Neotoma makes agent state inspectable, diffable, and replayable, so your debugging cycle takes minutes, not days.

AI needs

What you need from your AI tools, and what current tools don't provide.

How Neotoma solves this

Neotoma replaces the glue you write by hand — checkpoint logic, state serialization, custom diffing — with primitives that just work. Replay any timeline, diff any state, trace any output to its source.

What actually changes

You stop writing glue: checkpoint logic, state serialization, custom diffing, retry handlers. The guarantees you've been hand-rolling become primitives you build on.

When something fails, you query the timeline instead of reconstructing it from logs. Post-mortems take thirty minutes because provenance answers "what changed and when" directly.

The job shifts from reactive firefighting to proactive platform design.

Key differences

How your needs differ from the Building pipelines mode:

  • Focus: infrastructure and platform layer, below application agents and above compute
  • Adoption motion: evaluate guarantees first, then standardize across teams
  • Success metric: reproducible runs and auditability, not just better agent outputs

Data types for better recall

The entity types you'll store most often.

agent_session

Session state with versioned context and accumulated facts

action

Agent actions with inputs, outputs, timestamps, and provenance

pipeline

Multi-step workflows with step-level state tracking

evaluation

Eval results, benchmarks, and regression tracking

audit_event

Immutable log of state transitions and corrections

tool_config

Agent tool configurations and runtime parameters

entity_graph

Resolved records with typed relationships and temporal evolution

runbook

Operational procedures and agent behavioral rules with version history

When you don't need this

If your agents are stateless request-response (no accumulated context, no record tracking), standard logging and tracing are sufficient. Neotoma is for when agents accumulate state across sessions and pipeline steps, and you need that state to be reproducible, traceable, and auditable.

Other modes

The same person operates in multiple modes. The tax differs; the architecture that removes it is the same.

The tax is writing glue: checkpoint logic, custom diffing, state serialization. Neotoma removes that tax and gives you primitives to build on instead.

Built because I hit every failure mode on this page while running a twelve-server agent stack against a production monorepo.

Deep dive: Building structural barriers that incumbents can't copy