Pattern 1: Managed Agent Runtime
The highest-level path. You hand Sonzai a user message, Sonzai returns a streamed reply, and memory, mood, personality, and tool execution all happen on our side.
You point your app at client.agents.chat, or its streaming variant agents.chatStream (or open an explicit session with sessions.start → chat turns → sessions.end). Sonzai assembles the system prompt from the agent's identity, recalls relevant memories, runs the LLM, streams tokens back, executes any registered tools, and updates state, all in a single call. You write the least code in this pattern. It is the right default for chat companions, support agents, and anything where letting Sonzai own the full agent loop is acceptable.
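To make "one call per turn" concrete, here is a minimal sketch of a single managed turn that buffers the streamed tokens into a string. It reuses the agents.chatStream call from the full snippet below; omitting sessionId relies on the auto-created session described at the end of this section.
import { Sonzai } from "@sonzai-labs/agents";
const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
// One managed turn: Sonzai assembles context, runs the LLM, executes any
// registered tools, and writes state back. We only collect the tokens.
// Omitting sessionId assumes Sonzai auto-creates a session (see
// "Skip the explicit session" below).
async function oneTurn(agentId: string, userId: string, text: string): Promise<string> {
  let reply = "";
  for await (const event of sonzai.agents.chatStream({
    agent: agentId,
    userId,
    messages: [{ role: "user", content: text }],
  })) {
    reply += event.choices?.[0]?.delta?.content ?? "";
  }
  return reply;
}
console.log(await oneTurn("agent-uuid", "user-123", "Hi! How's your day going?"));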
When to use this
- You want one HTTP call per turn and zero memory plumbing.
- You're happy letting Sonzai pick the LLM provider, or overriding it yourself (a hypothetical sketch follows this list).
- You want personality, mood, memory, voice, KB search, and proactive notifications to all "just work" without orchestrating them yourself.
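As a sketch of that override: the provider and model option names below are hypothetical; the architecture notes only promise that the provider/model choice is yours, so check the SDK reference for the real parameter names.
// Hypothetical sketch, reusing the `sonzai` client constructed in the
// sketch above. "provider" and "model" are illustrative option names,
// not confirmed SDK parameters; without them Sonzai picks defaults.
for await (const event of sonzai.agents.chatStream({
  agent: "agent-uuid",
  userId: "user-123",
  messages: [{ role: "user", content: "Hello!" }],
  provider: "anthropic", // hypothetical option name
  model: "claude-sonnet-4.5", // hypothetical option name
})) {
  process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}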
When to switch
- Your own LLM is non-negotiable — switch to Pattern 4: Standalone Realtime.
- Conversation already happens off-platform (recorded calls, transcripts, batch ingest) — switch to Pattern 5: Standalone Batch.
- Inside Claude Desktop / Cursor / an MCP-compatible IDE — switch to Pattern 2: MCP.
- Already using OpenClaw — switch to Pattern 3: OpenClaw.
Architecture
┌─────────────┐ ┌───────────────────────────────────┐
│ Your App │ │ Sonzai │
└──────┬──────┘ └──────────────────┬────────────────┘
│ │
│ agents.chat({ messages }) │
│───────────────────────────────>│ • assemble context
│ │ (memory, mood,
│ │ personality, KB,
│ │ relationship)
│ │ • run LLM (your choice
│ │ of provider/model)
│ │ • execute registered
│ │ tools (if any)
│ <── SSE stream ───────────────│ • write back: facts,
│ tokens + done │ mood, personality,
│ │ goals, habits
│ │
│ (optional) sessions.end │
│───────────────────────────────>│ • consolidate, dedup,
│ │ diary, clustering
End-to-end snippet
The simplest complete flow: open an explicit session, drive a streaming chat, end the session.
import { Sonzai } from "@sonzai-labs/agents";
const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID = "agent-uuid";
const USER_ID = "user-123";
const SESSION_ID = crypto.randomUUID();
// 1. Start an explicit session (optional — agents.chat will auto-create one
// if you don't, but explicit sessions let you scope tools and lifecycle).
await sonzai.agents.sessions.start(AGENT_ID, {
userId: USER_ID,
sessionId: SESSION_ID,
});
// 2. Drive turns. Sonzai owns context assembly, the LLM call, tool exec,
// and writeback. You stream the reply straight to your UI.
for await (const event of sonzai.agents.chatStream({
agent: AGENT_ID,
sessionId: SESSION_ID,
userId: USER_ID,
messages: [{ role: "user", content: "Hi! How's your day going?" }],
language: "en",
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}
// 3. End the session — triggers fact extraction + consolidation.
await sonzai.agents.sessions.end(AGENT_ID, {
userId: USER_ID,
sessionId: SESSION_ID,
totalMessages: 2,
});
Skip the explicit session
If you don't call sessions.start, Sonzai opens one on the first agents.chat call and closes it on idle. The session ID still flows through to extracted facts. Use the explicit lifecycle when you need session-scoped tools, predictable boundaries, or replay semantics.
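And when you do want explicit boundaries, the payoff looks like this: two turns share one sessionId, so the consolidation pass at sessions.end treats them as a single conversation. The calls reuse the API from the snippet above; passing only the newest user message each turn assumes Sonzai replays the session history server-side.
// Two turns, one explicit session (AGENT_ID, USER_ID, and the `sonzai`
// client are as defined in the end-to-end snippet above). Both turns
// share sessionId, so sessions.end consolidates them together.
// Sending only the newest message per turn assumes Sonzai keeps the
// session history server-side.
const sessionId = crypto.randomUUID();
await sonzai.agents.sessions.start(AGENT_ID, { userId: USER_ID, sessionId });
for (const text of ["What did I say my goal was?", "Any progress since then?"]) {
  for await (const event of sonzai.agents.chatStream({
    agent: AGENT_ID,
    sessionId,
    userId: USER_ID,
    messages: [{ role: "user", content: text }],
  })) {
    process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
  }
}
await sonzai.agents.sessions.end(AGENT_ID, {
  userId: USER_ID,
  sessionId,
  totalMessages: 4, // two user + two assistant messages
});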