Pattern 5: Standalone Memory (Batch)
One call after the conversation is done. Ship the full transcript to /process (or sessions.end with messages) and let Sonzai extract facts, mood, personality, and habits in the background.
You own the entire conversation. Sonzai never sees it in real time. When the conversation ends — call wraps, support case closes, journaling session finishes — you POST the transcript once and Sonzai's extractor turns it into facts, mood updates, personality drift, habit detection, and proactive-outreach signals. Best for tutoring, fitness, CRM, voice calls, journaling, and any flow where Sonzai in the hot path is undesirable or impossible.
When to use this
- Latency budget can't tolerate a per-turn /turn round-trip.
- The transcript already exists (recorded calls, Gong/Zoom exports, journal entries).
- You want bulk ingest after the fact — replay logs, migrate users, benchmark agent quality.
When to switch
- You want fresh per-turn context — Pattern 4: Standalone Realtime.
- You're happy ceding the LLM call too — Pattern 1: Managed Runtime.
Architecture
┌─────────────┐ ┌──────────────────┐ ┌──────────────┐
│ Your App │ │ Sonzai API │ │ Your LLM │
└──────┬──────┘ └────────┬─────────┘ └──────┬───────┘
│ │ │
│ GET /context │ │
│────────────────────>│ (optional pre-session │
│ <── user profile ──│ personalization) │
│ │ │
│ ══ Your conversation (Sonzai not involved) ═════════│
│ │ │ │
│ Chat ──────────────┼──────────────────────>│ │
│ <── reply ─────────┼───────────────────────│ │
│ [N turns, your loop, your tools] │ │
│ │ │ │
│ ════════════════════════════════════════════════════════│
│ │ │
│ /process or sessions.end({ messages }) │
│────────────────────>│── extract facts, │
│ (full transcript) │ personality, mood, │
│ │ habits, interests │
│ <── extractions ───│ (Sonzai LLM) │
│ │ │
│ Use insights │ │
│ (push notif, │ │
│ dashboard, │ │
│ exercises, …) │ │
└─────────────────────┴───────────────────────┘
End-to-end snippet
The simplest path is /process: one call, auto-creates the session,
returns the generated session_id for correlation. Use the explicit
sessions.start → end({ messages }) lifecycle when you need
session-scoped tools, durations, or async polling.
import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function ingestTranscript(
  agentId: string,
  userId: string,
  transcript: { role: "user" | "assistant" | "tool"; content: string; tool_calls?: any[] }[],
) {
  // One call. Auto-creates a session. Tool messages allowed.
  const result = await sonzai.agents.process(agentId, {
    userId,
    messages: transcript,
    provider: "gemini", // optional override
    model: "gemini-3.1-flash-lite-preview", // optional override
  });

  // result.session_id is the auto-created session id.
  // Pull extractions from the read endpoints when ready:
  const memory = await sonzai.agents.memory.list(agentId, { userId });
  const mood = await sonzai.agents.getMood(agentId, { userId });

  return { sessionId: result.session_id, memory, mood };
}
Pick one trigger, not both
/process and sessions.end({ messages }) are functionally equivalent
for batch ingest — both extract facts and side effects from the full
transcript inline. Don't do both for the same transcript, or
extraction runs twice. Use /process for the simple one-call shape.
Use sessions.start + sessions.end({ messages }) when you want
explicit lifecycle, async polling, or session-scoped tools.
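One way to enforce "exactly one trigger per transcript" is a content fingerprint. The sketch below is a hypothetical helper, not part of the Sonzai SDK: hash the transcript, and skip the /process (or sessions.end) call if that hash was already ingested.

```typescript
// Hypothetical ingest-once guard (not part of the Sonzai SDK).
// Uses an in-memory Set; swap in Redis/DB for multi-process deployments.
import { createHash } from "node:crypto";

type Message = { role: string; content: string };

const ingested = new Set<string>();

// Stable fingerprint of a transcript's content.
function fingerprint(messages: Message[]): string {
  return createHash("sha256")
    .update(JSON.stringify(messages.map((m) => [m.role, m.content])))
    .digest("hex");
}

// True the first time a transcript is seen; false on repeats,
// so callers fire /process OR sessions.end({ messages }) — never both.
export function shouldIngest(messages: Message[]): boolean {
  const key = fingerprint(messages);
  if (ingested.has(key)) return false;
  ingested.add(key);
  return true;
}
```

Gate whichever trigger you chose behind `shouldIngest(transcript)`; a retry or a duplicate webhook delivery then becomes a no-op instead of a double extraction.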
What runs when
/process and sessions.end are intentionally lightweight: they extract
facts and a session summary inline (one LLM call per chunk). The
expensive cross-session work (dedup, clustering, diary, decay) is
scheduled automatically by the platform — you don't pay for it on
every call.
Where to next
Pattern 4: Standalone Memory (Real-Time)
You own the LLM and the chat loop. Sonzai owns memory, mood, personality, and relationships. Per-turn — sessions.start → loop of session.context() + your LLM + session.turn() → sessions.end.
Personality System
Create agents with distinct personalities and watch them evolve through interaction.