Standalone Memory Layer

Three integration shapes — one-shot batch (/process), lifecycle-scoped batch (sessions.start → end with messages), or real-time turn-by-turn (sessions.start → turn → end). All three give you Sonzai's behavioral intelligence without giving up control of the chat loop.

Choosing Your Integration Shape

There are three ways to feed conversations into Sonzai. The first two are batch (you send a transcript after the conversation); the third is real-time (you submit each turn as it happens). Pick exactly one per conversation — chaining them runs extraction twice on the same messages.

A. /process — one-shot batch
Single call. Auto-creates a session if you don't pass one. Best for external LLM transcripts, benchmarks, and any flow without a long-lived session lifecycle.
B. sessions.start → end({ messages }) — lifecycle batch
Open a session, do your full conversation off-platform, then close with the transcript on .end(). Use when you want explicit session boundaries, async polling, or session-scoped tools — but still ingest in one shot.
C. sessions.start → turn() × N → end() — real-time
Open a session and submit each exchange via .turn() as the conversation happens. Sync mood lands inline (~300–500ms); deeper extraction runs asynchronously 5–15s later. Best for chat companions, voice AI, and agent frameworks.

| | A. /process | B. sessions.end({ messages }) | C. sessions.turn() × N |
|---|---|---|---|
| Calls per conversation | 1 | 2 (start + end) | 2 + N (start + N × turn + end) |
| Sonzai in the hot path? | No | No | Yes — .context() and .turn() flank each turn |
| Context per turn | Pre-session only (optional getContext call) | Pre-session only (optional getContext call) | Fresh, query-specific via .context() |
| Extraction timing | Whole transcript, inline | Whole transcript, inline (or async on tenants where enabled) | Per-turn — sync mood inline, deeper extraction 5–15s later |
| Lifecycle ownership | Implicit (auto-session) | Explicit | Explicit |
| Best for | External transcripts, benchmarks, no-lifecycle ingest | Explicit boundaries + async processing, session-scoped tools, batch ingest | Chat companions, voice AI, agent frameworks |
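The three shapes can be sketched side by side. Only the operation names (/process, sessions.start / turn / end) come from this page; the client construction, signatures, and return fields below are assumptions, so the sketch runs against an in-memory stub rather than a real Sonzai client.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Minimal in-memory stand-in for the SDK surface described above,
// so the three shapes can be compared without network calls.
function makeStubClient() {
  const ingested: Message[][] = []; // each entry = one batch of ingested messages
  return {
    process: async (opts: { messages: Message[] }) => {
      ingested.push(opts.messages);       // Shape A: one-shot batch
      return { sessionId: "auto-1" };     // session auto-created
    },
    sessions: {
      start: async () => ({ sessionId: "s-1" }),
      end: async (_id: string, opts?: { messages?: Message[] }) => {
        if (opts?.messages) ingested.push(opts.messages); // Shape B: transcript on .end()
      },
      turn: async (_id: string, exchange: Message[]) => {
        ingested.push(exchange);          // Shape C: per-turn submission
        return { mood: "neutral" };       // sync mood returned inline
      },
    },
    ingested,
  };
}

async function demo(): Promise<number> {
  const sonzai = makeStubClient();
  const transcript: Message[] = [
    { role: "user", content: "hi" },
    { role: "assistant", content: "hello" },
  ];

  // A. /process: single call, auto-session
  await sonzai.process({ messages: transcript });

  // B. lifecycle batch: explicit boundaries, transcript on .end()
  const b = await sonzai.sessions.start();
  await sonzai.sessions.end(b.sessionId, { messages: transcript });

  // C. real-time: submit each exchange as it happens, close without a transcript
  const c = await sonzai.sessions.start();
  await sonzai.sessions.turn(c.sessionId, transcript); // one user/assistant exchange
  await sonzai.sessions.end(c.sessionId); // no messages: turns were already ingested

  return sonzai.ingested.length;
}

demo().then((n) => console.log(n)); // one batch each for A, B, and C's single turn
```

Note that shape C's `.end()` carries no messages; passing the transcript there as well would be the double-extraction mistake the warning below describes.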

A and B are functionally equivalent for fact extraction — both extract facts and side-effects from the full transcript inline. The only differences are lifecycle ergonomics (B gives you an explicit session and supports async polling) and call count.

C is a different shape: Sonzai is part of every turn instead of seeing the conversation only at the end.

Don't mix shapes within one conversation

Calling .turn() per turn (C) and .end({ messages }) with the same transcript (B) extracts the same messages twice. Pick one shape per conversation. The pattern docs below show C and B/A separately.

The rest of this section groups A and B together as Pattern 2: Post-Session Processing (since they share the same "extract a transcript at the end" semantics) and treats C as Pattern 1: Memory Middleware (real-time turn submission).
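For Pattern 1 (shape C), "Sonzai in the hot path" means each turn is flanked by two calls: fresh context before your model call, turn submission after it. A minimal middleware sketch, assuming hypothetical `context()` / `turn()` signatures (only the method names appear on this page) and a placeholder `callLLM` for your own model:

```typescript
type Msg = { role: "user" | "assistant"; content: string };

// Assumed interface; the real SDK's signatures may differ.
interface MemoryClient {
  context(sessionId: string, query: string): Promise<string>;
  turn(sessionId: string, exchange: Msg[]): Promise<{ mood: string }>;
}

async function chatTurn(
  memory: MemoryClient,
  sessionId: string,
  userText: string,
  callLLM: (systemContext: string, userText: string) => Promise<string>,
): Promise<string> {
  // 1. Pull fresh, query-specific context before the model call
  const ctx = await memory.context(sessionId, userText);

  // 2. The chat loop stays yours; the memory layer only flanks it
  const reply = await callLLM(ctx, userText);

  // 3. Submit the exchange: sync mood lands inline (~300–500ms),
  //    deeper extraction runs in the background 5–15s later
  await memory.turn(sessionId, [
    { role: "user", content: userText },
    { role: "assistant", content: reply },
  ]);

  return reply;
}
```

The key design point is step 3: `.turn()` is awaited per exchange, and the session is later closed with a bare `.end()` rather than a transcript, which is what keeps each message from being extracted twice.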

What runs when — extraction is light, consolidation is automatic

/turn, /process, and sessions.end are intentionally lightweight. They extract facts and a session summary from the transcript and persist them — that's it. The expensive work (cross-session dedup, clustering, diary deepening, decay) is scheduled automatically by the platform and is rate-limited so it doesn't run on every call.

| Layer | When it runs | Triggered by | Cost |
|---|---|---|---|
| Sync mood update (Pattern 1 /turn only) | Inline, ~300–500ms | Your .turn() call | Light — one short LLM call |
| Background extraction (facts, personality, habits) | 5–15 seconds after /turn | Automatic — no caller action | Light — one LLM call per chunk |
| Fact extraction + session summary (batch) | Inline, on every /process or sessions.end({ messages }) | Your call | Light — one LLM call per chunk |
| Post-session consolidation (dedup, crossref, bundle precompute, pattern detection) | ~8 hours after the session ends | Automatic | Medium |
| Daily consolidation + diary | Once per day | Automatic schedule | Medium |
| Deep consolidation (wakeup/habit dedup, decay, cluster reconcile, weekly summaries) | Daily / weekly | Automatic schedule | Heavy |

This means you can call /turn per turn (Pattern 1), or /process once at the end (Pattern 2), without paying for heavy consolidation each time. The platform de-duplicates and consolidates in the background.

Practical implication

Don't try to "save calls" by skipping /turn between turns. Each call only does sync mood + queues deferred extraction (cheap). Skipping it means losing per-turn behavioral signal. The expensive consolidation runs on its own schedule no matter how many times you call.
