Standalone Memory Layer
Three integration shapes: one-shot batch (/process), lifecycle batch (sessions.start → end carrying messages), or real-time per-turn (sessions.start → turn → end). All three give you Sonzai's behavioral intelligence without giving up control of the conversation loop.
Choosing an integration shape
There are three ways to feed conversations into Sonzai. The first two are batch (you send a transcript after the conversation); the third is real-time (you submit each turn as it happens). Pick exactly one per conversation — chaining them runs extraction twice on the same messages.
| | A. /process | B. sessions.end({ messages }) | C. sessions.turn() × N |
|---|---|---|---|
| Calls per conversation | 1 | 2 (start + end) | 2 + N (start + N × turn + end) |
| Sonzai in the hot path? | No | No | Yes — .context() and .turn() flank each turn |
| Context per turn | Pre-session only (optional getContext call) | Pre-session only (optional getContext call) | Fresh, query-specific via .context() |
| Extraction timing | Whole transcript, inline | Whole transcript, inline (or async on tenants where enabled) | Per-turn — sync mood inline, deeper extraction 5–15s later |
| Lifecycle ownership | Implicit (auto-session) | Explicit | Explicit |
| Best for | External transcripts, benchmarks, no-lifecycle ingest | Explicit boundaries + async processing, session-scoped tools, batch ingest | Chat companions, voice AI, agent frameworks |
A and B are functionally equivalent for fact extraction — both extract facts and side-effects from the full transcript inline. The only differences are lifecycle ergonomics (B gives you an explicit session and supports async polling) and call count.
C is a different shape: Sonzai is part of every turn instead of seeing the conversation only at the end.
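As a sketch of the two batch shapes, here is what A and B look like side by side. The `SonzaiBatch` interface below is illustrative only (the real client's method names and signatures may differ); it exists to show that both shapes submit the same full transcript, and differ only in whether the session lifecycle is implicit or explicit.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Illustrative client surface — NOT the real SDK shape.
interface SonzaiBatch {
  process(userId: string, messages: Message[]): Promise<void>;        // Shape A
  sessionsStart(userId: string): Promise<{ sessionId: string }>;      // Shape B
  sessionsEnd(sessionId: string, messages: Message[]): Promise<void>; // Shape B
}

// Shape A: one call, implicit session, inline extraction of the whole transcript.
async function ingestOneShot(client: SonzaiBatch, userId: string, transcript: Message[]) {
  await client.process(userId, transcript);
}

// Shape B: explicit session boundaries, same extraction semantics, 2 calls.
async function ingestWithLifecycle(client: SonzaiBatch, userId: string, transcript: Message[]) {
  const { sessionId } = await client.sessionsStart(userId);
  await client.sessionsEnd(sessionId, transcript);
}
```

Either function ingests a conversation exactly once; per the warning below, never run both on the same transcript.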
Don't mix shapes within one conversation
Calling .turn() per turn (C) and .end({ messages }) with the same transcript (B) extracts the same messages twice. Pick one shape per conversation. The pattern docs below show C and B/A separately.
The rest of this section groups A and B together as Pattern 2: Post-Session Processing (since they share the same "extract a transcript at the end" semantics) and treats C as Pattern 1: Memory Middleware (real-time turn submission).
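A minimal sketch of Pattern 1's per-turn loop, assuming an illustrative `SonzaiSession` handle (the `.context()` and `.turn()` names come from the table above; everything else here, including the prompt shape and the `llm` callback, is a placeholder for your own stack):

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Illustrative session handle — the real SDK surface may differ.
interface SonzaiSession {
  context(query: string): Promise<string>;                 // fresh, query-specific context
  turn(user: Message, assistant: Message): Promise<void>;  // sync mood inline; deeper extraction deferred
  end(): Promise<void>;
}

// Pattern 1: Sonzai flanks each turn, but your code still owns the loop.
async function chatTurn(
  session: SonzaiSession,
  llm: (prompt: string) => Promise<string>,  // your model call, whatever it is
  userText: string,
): Promise<string> {
  const memory = await session.context(userText);            // 1. pull context for this query
  const reply = await llm(`${memory}\n\nUser: ${userText}`); // 2. generate with your own model
  await session.turn(                                        // 3. submit the turn (cheap)
    { role: "user", content: userText },
    { role: "assistant", content: reply },
  );
  return reply;
}
```

When the conversation finishes you would call `session.end()` once, without passing messages again, since every turn has already been submitted.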
When each layer runs: extraction is lightweight, consolidation is automatic
/turn, /process, and sessions.end are intentionally lightweight. They extract facts and a session summary from the transcript and persist them — that's it. The expensive work (cross-session dedup, clustering, diary deepening, decay) is scheduled automatically by the platform and is rate-limited so it doesn't run on every call.
| Layer | When it runs | Triggered by | Cost |
|---|---|---|---|
| Sync mood update (Pattern 1 /turn only) | Inline, ~300–500ms | Your .turn() call | Light — one short LLM call |
| Background extraction (facts, personality, habits) | 5–15 seconds after /turn | Automatic — no caller action | Light — one LLM call per chunk |
| Fact extraction + session summary (batch) | Inline, on every /process or sessions.end({ messages }) | Your call | Light — one LLM call per chunk |
| Post-session consolidation (dedup, crossref, bundle precompute, pattern detection) | ~8 hours after the session ends | Automatic | Medium |
| Daily consolidation + diary | Once per day | Automatic schedule | Medium |
| Deep consolidation (wakeup/habit dedup, decay, cluster reconcile, weekly summaries) | Daily / weekly | Automatic schedule | Heavy |
This means you can call /turn per turn (Pattern 1), or /process once at the end (Pattern 2), without paying for heavy consolidation each time. The platform de-duplicates and consolidates in the background.
Practical implication
Don't try to "save calls" by skipping /turn between turns. Each call only does sync mood + queues deferred extraction (cheap). Skipping it means losing per-turn behavioral signal. The expensive consolidation runs on its own schedule no matter how many times you call.