
Pattern 1: Managed Agent Runtime

The highest-level path. You hand Sonzai a user message, Sonzai returns a streamed reply, and memory, mood, personality, and tool execution all happen on our side.

You point your app at client.agents.chat (or open an explicit session with sessions.start → chat turns → sessions.end). Sonzai assembles the system prompt from the agent's identity, recalls relevant memories, runs the LLM, streams tokens back, executes any registered tools, and updates state — all in a single call. You write the least code in this pattern. It is the right default for chat companions, support agents, and anything where Sonzai owning the full agent loop is acceptable.

When to use this

  • You want one HTTP call per turn and zero memory plumbing.
  • You're happy letting Sonzai pick the LLM provider (or passing your own override).
  • You want personality, mood, memory, voice, KB search, and proactive notifications to all "just work" without orchestrating them yourself.

When to switch

Architecture

┌─────────────┐     ┌───────────────────────────────────┐
│  Your App   │     │            Sonzai                 │
└──────┬──────┘     └──────────────────┬────────────────┘
       │                               │
       │  agents.chat({ messages })    │
       │──────────────────────────────>│  • assemble context
       │                               │     (memory, mood,
       │                               │      personality, KB,
       │                               │      relationship)
       │                               │  • run LLM (your choice
       │                               │      of provider/model)
       │                               │  • execute registered
       │                               │      tools (if any)
       │  <── SSE stream ──────────────│  • write back: facts,
       │      tokens + done            │      mood, personality,
       │                               │      goals, habits
       │                               │
       │  (optional) sessions.end      │
       │──────────────────────────────>│  • consolidate, dedup,
       │                               │      diary, clustering

End-to-end snippet

The simplest complete flow: open an explicit session, drive a streaming chat, end the session.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID  = "agent-uuid";
const USER_ID   = "user-123";
const SESSION_ID = crypto.randomUUID();

// 1. Start an explicit session (optional — agents.chat will auto-create one
//    if you don't, but explicit sessions let you scope tools and lifecycle).
await sonzai.agents.sessions.start(AGENT_ID, {
  userId:    USER_ID,
  sessionId: SESSION_ID,
});

// 2. Drive turns. Sonzai owns context assembly, the LLM call, tool exec,
//    and writeback. You stream the reply straight to your UI.
for await (const event of sonzai.agents.chatStream({
  agent:     AGENT_ID,
  sessionId: SESSION_ID,
  userId:    USER_ID,
  messages:  [{ role: "user", content: "Hi! How's your day going?" }],
  language:  "en",
})) {
  process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

// 3. End the session — triggers fact extraction + consolidation.
await sonzai.agents.sessions.end(AGENT_ID, {
  userId:        USER_ID,
  sessionId:     SESSION_ID,
  totalMessages: 2,
});
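If you'd rather buffer the whole reply than write each token to stdout, a small pure helper does it. This is not part of the SDK — just a local fold over the event shape the loop above already reads:

```typescript
// Shape of each streamed event, as consumed in the loop above.
type StreamEvent = { choices?: Array<{ delta?: { content?: string } }> };

// Fold the streamed deltas into one reply string. Events with no delta
// content (e.g. a final "done" event) contribute nothing.
function collectReply(events: Iterable<StreamEvent>): string {
  let reply = "";
  for (const event of events) {
    reply += event.choices?.[0]?.delta?.content ?? "";
  }
  return reply;
}
```

For the live stream, the same fold works with `for await…of` over `chatStream(...)` in place of the plain `for…of`.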

Skip the explicit session

If you don't call sessions.start, Sonzai opens one on the first agents.chat call and closes it on idle. The session ID still flows through to extracted facts. Use the explicit lifecycle when you need session-scoped tools, predictable boundaries, or replay semantics.
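Concretely, the implicit-session turn is the explicit one minus `sessionId` — a sketch of the request shape only, with field names taken from the snippet above:

```typescript
// Implicit-session turn: same fields as the explicit snippet, but no
// sessionId and no sessions.start/end calls — Sonzai opens a session on
// the first chat call and closes it on idle.
const turn = {
  agent:    "agent-uuid",
  userId:   "user-123",
  messages: [{ role: "user", content: "Hi! How's your day going?" }],
  language: "en",
};

// Passed straight to the streaming call:
// for await (const event of sonzai.agents.chatStream(turn)) {
//   process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
// }
```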
