Advance Time

Advance Time compresses real-world time into simulated time for an agent. Useful for character AI that needs in-game time to pass faster than real time, game loops that simulate days of agent state in seconds, or anywhere you want to see what the agent would be like after a period of elapsed time — without actually waiting for it.

What you can build with it

Character AI / visual novel time skips — the protagonist sleeps for 8 hours; advance agent time by 8 hours and get the diary entry and mood changes that would have happened overnight
Tamagotchi and life-sim game loops — in-game days pass faster than real time; call advanceTime each tick to keep agent state (mood, memory, habits) in sync with the game clock
Tutorial onboarding — show a new user what their companion will "remember" after a week by fast-forwarding through a sample history before they send their first real message
Deterministic replay — reproduce the exact agent state after X hours at any time, for debugging, snapshotting, or building a save/load system
Eval and benchmarking — compress long-running scenarios into fast test runs (see Also useful for evaluation below)

Quickstart

Advance an agent's clock by 24 hours and inspect the diary entry generated for that simulated day.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const result = await client.workbench.advanceTime({
agentId: "agent_abc",
userId:  "user_123",
simulatedHours: 24,
});

// result.diary_entries contains any diary entries generated during the window
console.log(result.days_processed);   // 1
console.log(result.diary_entries);    // [{ content: "...", created_at: "..." }]

Core concepts

What fires when time advances

A single advanceTime call runs the full production background worker fleet for each complete 24-hour day in the window, then resolves any proactive wakeups due within it. Concretely:

Diary generation — one diary entry per simulated day, written from the agent's perspective
Mood decay — emotional state drifts toward the agent's baseline at the rate it would in real time
Memory consolidation — facts, events, and commitments are consolidated and deduplicated as they normally would be overnight
Constellation extraction — personality signals extracted from conversation history are processed on schedule
Scheduled wakeups — any wakeup whose scheduled_at falls inside the advance window fires with its intent

Pass simulatedHours: 25 (one day plus a sliver) when you need the weekly consolidation gate to tick over.

Deterministic state transitions

Given the same agent state at the start, the same advanceTime call produces the same output. There is no randomness seeded from wall-clock time. This makes Advance Time suitable for save/load, replay, and regression testing.

Async mode for long windows

For advances that would exceed a proxy read timeout (Cloudflare's limit is ~100 s, which corresponds to roughly 4–5 simulated days depending on agent complexity), pass runAsync: true. The API returns immediately with a job descriptor; poll getAdvanceTimeJob until the status is terminal.

// Kick off a long advance asynchronously
const job = await client.workbench.advanceTime({
agentId: "agent_abc",
userId:  "user_123",
simulatedHours: 168, // one week
runAsync: true,
}) as { job_id: string; status: string };

console.log(job.job_id, job.status); // "job_01HX...", "running"

// Poll until done (30-minute TTL in Redis)
let state = await client.workbench.getAdvanceTimeJob(job.job_id);
while (state.status === "running") {
await new Promise(r => setTimeout(r, 2000));
state = await client.workbench.getAdvanceTimeJob(job.job_id);
}

console.log(state.status); // "succeeded"
console.log(state.result); // full AdvanceTimeResponse

Time granularity

The smallest meaningful unit is one full 24-hour simulated day. Background jobs (diary, consolidation, constellation) run once per day. Sub-day advances (e.g. simulatedHours: 8) still process wakeups and mood decay but will not generate a diary entry unless a full day boundary is crossed.

Full API

Method	Returns	Description
`advanceTime(options)`	`AdvanceTimeResponse` — or `{ job_id, status }` when async	Advance simulated time. Key fields: `days_processed`, `diary_entries`, `wakeups_fired`, consolidation counters.
`getAdvanceTimeJob(jobId)`	`{ job_id, status, result?, error? }`	Poll an async advance-time job. `status` is `"running"`, `"succeeded"`, or `"failed"`. Job state has a 30-minute TTL.

advanceTime options

Field	Type	Required	Description
`agentId` / `agent_id`	`string`	Yes	Agent UUID
`userId` / `user_id`	`string`	Yes	User ID — must match the ID used during chat
`simulatedHours` / `simulated_hours`	`number`	Yes	Hours to advance
`simulatedBaseOffsetHours` / `simulated_base_offset_hours`	`number`	No	Hours already processed by prior calls in the same gap (default `0`)
`runAsync` / `run_async` / `async`	`boolean`	No	Return a job descriptor immediately instead of blocking (default `false`)

Combines with other features

With Scheduled Reminders — fast-forwarding pending reminders

Any schedule whose next_fire_at falls within the advance window fires automatically. Advance 48 hours and two daily reminders will have fired — their intents processed, messages generated, and state updated — exactly as if real time had passed.

// Create a daily 09:00 reminder
await client.schedules.create("agent_abc", "user_123", {
  cadence: { simple: { frequency: "daily", times: ["09:00"] }, timezone: "UTC" },
  intent:  "check in on how the user is feeling",
  check_type: "reminder",
});

// Advance 48 hours — both 09:00 fires trigger inside the window
const result = await client.workbench.advanceTime({
  agentId: "agent_abc",
  userId:  "user_123",
  simulatedHours: 48,
});

console.log(result.wakeups_fired); // 2

With Memory / Diary — replay compressed history

When time advances, a diary entry is generated for each simulated day. The agent "remembers" what happened during the gap — emotional tone, recurring themes, relationship developments — the same way it would after real days of conversation. Use this to give a new user a companion that already feels lived-in, or to let a character "grow" between chapters of a story.

With Wakeups — time-travel scheduled wakeups

Any wakeup scheduled with a scheduled_at inside the advance window fires during the advance, including its LLM-generated proactive message. This lets you test wakeup copy and timing without waiting for the real clock to reach the fire time.

Tutorials

Advance Time is a primitive that chains with scheduled reminders, wakeups, and memory. There is no standalone end-to-end tutorial yet. See the linked Mind Layer pages below for how it combines with other features.

Next steps

Scheduled Reminders — schedules fire automatically during a time advance
Memory — diary entries are generated for each simulated day in the window
Evaluation — if you are using Advance Time for benchmarking, see the eval workflow

Also useful for evaluation

If you are running a benchmark suite, advanceTime lets you compress long-running scenarios into fast test runs. Advance a simulated week in seconds, inspect the diary entries and mood state, then score the result. Pair with the evaluation workflow to measure agent behavior quality after arbitrary amounts of simulated elapsed time.

Agent Insights

As the agent talks to a user over time, it builds up a derived view of who they are — what they care about, what they're working toward, who's in their life, and how their mood trends. Agent Insights exposes that derived state as readable (and for some signals, writable) endpoints. These are not things you author; the context engine extracts them automatically from conversations.

Automatic — no setup required

All insight signals are produced by the context engine during and after each conversation. You do not need to call any write endpoint to populate them — they fill in on their own. The read endpoints on this page let you surface what the agent has learned.

What you can build with it

Personalized dashboards — show the user exactly what the agent has learned about them, building transparency and trust
Weekly wrap-ups — "here's what's on your mind this week" compiled from diary entries and top interests
Relationship-aware UX — surface the list of people the agent knows about so users can review or correct them
Goal-tracking integrations — sync agent-tracked goals to external task managers or CRMs after they are detected
Engagement health scoring — aggregate breakthroughs, mood trend, and habit streaks into a single user health metric

Quickstart

Fetch habits, goals, and interests for a user in one pass.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const agentId = "agent_abc";
const userId  = "user_123";

const [habits, goals, interests] = await Promise.all([
client.agents.listHabits(agentId, { userId }),
client.agents.listGoals(agentId, { userId }),
client.agents.getInterests(agentId, { userId }),
]);

console.log("habits:",    habits.habits.length);
console.log("goals:",     goals.goals.length);
console.log("interests:", interests.interests.length);

Core concepts

Derived, not authored. These signals are extracted from conversation text by the context engine. You do not push them in; the agent surfaces them automatically as it talks.

Per-instance scoping. Pass instanceId (TS/Python) or instanceID (Go) to filter results to a specific agent instance — useful when an agent is deployed in multiple scenarios or chat contexts for the same user.

Write endpoints for some signals. Goals and habits can be explicitly created, updated, or deleted when your application needs to drive a specific state (e.g., seeding a goal when a user starts onboarding, or marking a goal achieved after a purchase event). Interests, relationships, diary, constellation, and breakthroughs are read-only.

Read latency. Derived signals update at conversation turn-end, not in real time during a turn. Reads immediately after a chat call may not yet reflect the latest turn.

Full API

Read endpoints

Method	Go	Returns	Description
`listHabits(agentId, { userId?, instanceId? })`	`ListHabits`	`HabitsResponse`	Extracted recurring behaviors
`listGoals(agentId, { userId?, instanceId? })`	`ListGoals`	`GoalsResponse`	Active and achieved goals
`getInterests(agentId, { userId?, instanceId? })`	`GetInterests`	`InterestsResponse`	Topics and themes the user cares about
`getRelationships(agentId, { userId?, instanceId? })`	`GetRelationships`	`RelationshipsResponse`	People mentioned across conversations
`getDiary(agentId, { userId?, instanceId? })`	`GetDiary`	`DiaryResponse`	Agent-authored diary entries per session
`getConstellation(agentId, { userId?, instanceId? })`	`GetConstellation`	`ConstellationResponse`	Memory clusters (nodes, edges, insights)
`listBreakthroughs(agentId, { userId?, instanceId? })`	`ListBreakthroughs`	`BreakthroughsResponse`	Significant emotional or relationship milestones

Write endpoints (goals and habits)

Goals and habits support full CRUD. All other insight types are read-only.

Method	Go	Description
`createGoal(agentId, opts)`	`CreateGoal`	Seed a goal before or during a workflow
`updateGoal(agentId, goalId, opts)`	`UpdateGoal`	Change status, priority, or description
`deleteGoal(agentId, goalId, opts?)`	`DeleteGoal`	Soft-delete (abandon) a goal
`createHabit(agentId, opts)`	`CreateHabit`	Manually seed a habit
`updateHabit(agentId, habitName, opts)`	`UpdateHabit`	Update strength or description
`deleteHabit(agentId, habitName, opts?)`	`DeleteHabit`	Remove a habit

Habits

Habits are recurring behaviors the context engine detects across conversations — things like "user meditates in the morning" or "user reviews their tasks every Sunday." Each habit has a strength (0-1) that rises with observations and a formed flag that is set once the habit is considered stable.

const habits = await client.agents.listHabits("agent_abc", {
userId: "user_123",
});

for (const h of habits.habits) {
console.log(h.name, h.category, h.strength, h.formed);
}

Goals

Goals represent what the user is working toward. They are extracted automatically from conversation intent — "I want to run a 5K by June" becomes a goal with a type, title, and priority. Goals have a status field: active, achieved, or abandoned.

// Read
const goals = await client.agents.listGoals("agent_abc", { userId: "user_123" });
for (const g of goals.goals) {
console.log(g.title, g.status, g.priority);
}

// Seed a goal for a new workflow
const goal = await client.agents.createGoal("agent_abc", {
userId:      "user_123",
title:       "Complete onboarding",
description: "Finish all onboarding steps",
type:        "task",
priority:    1,
});

// Mark achieved after a business event
await client.agents.updateGoal("agent_abc", goal.goal_id, {
userId: "user_123",
status: "achieved",
});

Interests

Interests are topics and themes the context engine identifies as meaningful to the user — things like "machine learning", "hiking", or "Italian cooking." Unlike goals, interests have no lifecycle status; they accumulate over time.

const interests = await client.agents.getInterests("agent_abc", {
userId: "user_123",
});

for (const i of interests.interests) {
console.log(i.topic, i.category);
}

Relationships

Relationships are the people the user mentions across conversations — friends, family, colleagues, and others the agent has learned about. Each entry includes the person's name, their relationship to the user, and any context the agent has collected.

const rel = await client.agents.getRelationships("agent_abc", {
userId: "user_123",
});

for (const r of rel.relationships) {
console.log(r.name, r.relationship_type, r.context);
}

Diary

The diary contains agent-authored entries written at session end — reflections on what happened, what was learned, and how the relationship is evolving. Each entry is anchored to a session and a timestamp. Diary entries are the richest narrative signal available.

const diary = await client.agents.getDiary("agent_abc", {
userId: "user_123",
});

for (const entry of diary.entries) {
console.log(entry.created_at, entry.content);
}

Constellation

The constellation is the agent's knowledge graph for a user — a set of nodes (concepts, people, themes) and edges (relationships between them) that the context engine builds from recurring patterns across memory. Nodes have a significance score and a node_type.

const c = await client.agents.getConstellation("agent_abc", {
userId: "user_123",
});

for (const node of c.nodes) {
console.log(node.label, node.node_type, node.significance);
}

Breakthroughs

Breakthroughs are significant relationship or emotional milestones detected by the platform — moments where the agent's understanding of the user meaningfully deepened, or where a notable shift in the relationship dynamic was recorded.

const bt = await client.agents.listBreakthroughs("agent_abc", {
userId: "user_123",
});

for (const b of bt.items) {
console.log(b.type, b.description, b.timestamp);
}

Combines with other features

With Memory — insights are summaries over raw facts

Insight signals are derived summaries; the underlying evidence lives in memory. Fetch habits to learn what patterns exist, then use memory.search to pull the raw conversation facts behind one of them.

const habits = await client.agents.listHabits("agent_abc", { userId: "user_123" });
const topHabit = habits.habits[0];

// Find the raw memories that support this habit
const facts = await client.agents.memory.search("agent_abc", {
userId: "user_123",
query:  topHabit.name,
limit:  10,
});

console.log(`Found ${facts.results.length} facts supporting "${topHabit.name}"`);

With Emotions — mood + insights for a full user picture

getMood and these insight endpoints together form the agent's complete understanding of a user at a point in time. Fetch both to power a user-facing "how the agent sees you" view or a support dashboard.

const [mood, goals, diary] = await Promise.all([
client.agents.getMood("agent_abc", { userId: "user_123" }),
client.agents.listGoals("agent_abc", { userId: "user_123" }),
client.agents.getDiary("agent_abc", { userId: "user_123" }),
]);

console.log("Current mood:", mood.label);
console.log("Active goals:", goals.goals.filter(g => g.status === "active").length);
console.log("Diary entries:", diary.entries.length);

With Advance Time — replay insight formation

Advance Time fast-forwards the context engine's processing — generating new diary entries, decaying mood, and updating derived signals — without waiting real time. This is useful for simulating what the agent would know after a period of elapsed time, and for testing insight endpoints against a populated state.

// Advance 7 days to populate diary entries and update insights
const result = await client.workbench.advanceTime({
  agentId: "agent_abc",
  userId:  "user_123",
  simulatedHours: 168,
});

// Now read the insights that formed during that window
const diary = await client.agents.getDiary("agent_abc", { userId: "user_123" });
console.log("Diary entries after 7d:", diary.entries.length);

Tutorials

Memory — the raw facts behind insights; use memory.search to drill into any signal
Emotions — mood, mood history, and aggregate mood statistics
Personality — Big5 traits and personality evolution (a different kind of derived state)
Advance Time — fast-forward the agent's processing to simulate elapsed time

从这里开始

架构

系统概览

心智层是一个独立平台，将智能体的智能（人格、记忆、情绪）与您的应用逻辑分离。任何后端都可以通过 REST API 或官方 SDK 进行集成。

您的后端                        心智层平台
   |                                  |
   |--- 创建智能体 ------------------>|
   |<-- 智能体 ID + 档案 ------------|
   |                                  |
   |--- 聊天 (SSE 流式) ------------>|
   |    (消息 + 应用上下文)           |-- 构建上下文
   |<-- 流式 AI 响应 ----------------|-- 流式 AI 响应
   |                                  |-- 更新记忆、情绪、人格
   |<-- 主动通知 --------------------|   (自动完成，无需额外调用)

集成架构

典型部署包含三个协同工作的层级：

您的前端

面向用户的应用。向后端发送消息并渲染智能体响应。示例：React、Next.js、Vue、移动应用。

您的后端

处理认证、应用状态、用户会话和业务逻辑。通过 SDK、REST API、MCP 或 OpenClaw 插件调用心智层实现 AI 交互。示例：Express、Django、Go、OpenClaw。

Sonzai 心智层

拥有智能体的智能：人格、记忆、情绪、习惯、目标和关系。一次聊天调用即可完成上下文组装、AI 流式响应和对话后学习。示例：api.sonz.ai。

平台管理的内容

每次聊天调用时，平台会自动从人格、记忆、情绪和关系数据中组装相关上下文，然后生成 AI 响应。对话后的状态更新自动完成 — 无需额外的 API 调用。

上下文组装

人格、情绪、记忆、关系叙事和应用状态 — 每次请求自动组装。

记忆提取

从每次对话中自动提取事实、事件和承诺并存储。

情绪与人格演化

情绪和大五人格根据交互模式自然漂移。

主动通知

智能体可以在会话之间安排主动触达。通过轮询或 Webhook 接收。

数据所有权

心智层和您的后端各自拥有不同的数据：

心智层拥有

智能体人格档案
记忆事实和摘要
情绪状态（快乐、精力、平静、亲密）
人格演化历史
习惯和目标
关系叙事
知识库实体和图谱
自定义智能体状态

您的后端拥有

用户认证
业务逻辑和工作流
用户档案和偏好
应用数据和状态
计费和订阅
权限和访问控制
会话管理

会话生命周期

1. 用户开始聊天
 您的后端准备应用上下文（用户数据、偏好...）

2. 对话进行
 您的后端 ---> Chat SDK 调用（上下文 + 消息）
 用户 <--- 流式 AI 响应 token

3. 对话结束
 平台更新：记忆、情绪、人格、习惯、关系

4. 会话间隔期
 平台运行：后台整合、情绪衰减、主动唤醒

后台处理与自我改进

平台不仅响应 chat 调用——它运行一个持续的后台管道，使记忆保持准确性、行为状态保持连贯性，并随时间提升检索质量。每个循环都自动运行；你方面无需调度或配线。

周期	运行内容
每轮	重要性+置信度更新、情绪调整、性格微偏移、习惯观察、关联强化、来源锚定检查
每次会话结束	带验证的事实抽取、重复整合、下一会话预测、检索策略更新、模式学习、会话质量打分、话题转换审计
日常	记忆衰减（重要性、置信度、关系、习惯）、记忆树自组织和修剪、深度整合、聚类调和、目标整合、反思性日记、收敛检查
每周	叙事弧压缩、关联衰减、交叉引用检测、新智能体–用户对的预热、学习节奏检查
持续	自适应检索预算、记忆恢复、回归预测、后台兴趣研究、周期性事件检测、智能记忆选择

有关每个机制的完整说明——包括可逆去重、话题转换检测、性格漂移安全上限、突破和谨慎部署系统——请参阅智能体如何随时间变得更聪明。

SDK 集成点

使用官方 SDK 与平台的每个部分交互：

Agents：create, get, list, update
Chat：chat, chatStream (SSE)
Memory：seed, search, list, browse, timeline, listFacts, reset
Personality：get, update, history
Mood：get, history, aggregate
Knowledge Base：createSchema, insertFacts, bulkUpdate, search, recommendations, trends
Custom States：create, get, upsert, list, delete
Custom Tools：create, list, delete (agent-level and session-level)
Notifications：list, consume, history
User Priming：primeUser, batchImport, getMetadata, updateMetadata

集成模式

模式 1：托管智能体运行时

你将应用指向 client.agents.chat（或用 sessions.start → 聊天回合 → sessions.end 显式打开会话）。Sonzai 从智能体身份组装系统提示词，召回相关记忆，运行 LLM，流式返回 token，执行已注册的工具，并更新状态 — 全部在一次调用中完成。 这是代码量最少的模式。 适用于陪伴、客服智能体，以及任何可以让 Sonzai 拥有完整智能体循环的场景，是默认推荐的路径。

何时使用

你希望每回合一次 HTTP 调用，零记忆管线代码。
你接受 Sonzai 选择（或通过覆盖参数指定）LLM 提供商。
你希望人格、情绪、记忆、语音、知识库搜索、主动通知等"开箱即用"，不必自己编排。

何时切换

必须使用自己的 LLM — 切换到模式 4：独立实时。
对话已在平台外发生（录音、转录、批量摄入）— 切换到模式 5：独立批处理。
运行在 Claude Desktop / Cursor / 兼容 MCP 的 IDE 中 — 切换到模式 2：MCP。
已使用 OpenClaw — 切换到模式 3：OpenClaw。

架构

┌─────────────┐     ┌───────────────────────────────────┐
│  Your App   │     │            Sonzai                 │
└──────┬──────┘     └──────────────────┬────────────────┘
     │                                │
     │  agents.chat({ messages })     │
     │───────────────────────────────>│  • assemble context
     │                                │     (memory, mood,
     │                                │      personality, KB,
     │                                │      relationship)
     │                                │  • run LLM (your choice
     │                                │      of provider/model)
     │                                │  • execute registered
     │                                │      tools (if any)
     │  <── SSE stream ───────────────│  • write back: facts,
     │      tokens + done             │      mood, personality,
     │                                │      goals, habits
     │                                │
     │  (optional) sessions.end       │
     │───────────────────────────────>│  • consolidate, dedup,
     │                                │      diary, clustering

端到端示例

最简完整流程：显式打开会话、驱动流式聊天、结束会话。

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID  = "agent-uuid";
const USER_ID   = "user-123";
const SESSION_ID = crypto.randomUUID();

// 1. Start an explicit session (optional — agents.chat will auto-create one
//    if you don't, but explicit sessions let you scope tools and lifecycle).
await sonzai.agents.sessions.start(AGENT_ID, {
userId:    USER_ID,
sessionId: SESSION_ID,
});

// 2. Drive turns. Sonzai owns context assembly, the LLM call, tool exec,
//    and writeback. You stream the reply straight to your UI.
for await (const event of sonzai.agents.chatStream({
agent:     AGENT_ID,
sessionId: SESSION_ID,
userId:    USER_ID,
messages:  [{ role: "user", content: "Hi! How's your day going?" }],
language:  "en",
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

// 3. End the session — triggers fact extraction + consolidation.
await sonzai.agents.sessions.end(AGENT_ID, {
userId:        USER_ID,
sessionId:     SESSION_ID,
totalMessages: 2,
});

跳过显式会话

如果不调用 sessions.start，Sonzai 会在第一次 agents.chat 时自动开启会话，并在空闲时关闭。会话 ID 仍会附加到提取的事实上。需要会话级工具、明确的边界或回放语义时再使用显式生命周期即可。

下一步

对话（深入参考）

流式、非流式、工具能力、语言与时区、实例。

会话

何时显式开启、end() 触发哪些整合、ID 如何流入事实。

集成指南

完整 SDK 参考：智能体生命周期、Webhook、前端代理模式、知识库、用户预热。

自定义工具

注册聊天期间 LLM 可调用的工具 — 内置能力、智能体工具、会话级工具。

集成模式

模式 2：MCP

心智层提供了一个托管的 Streamable HTTP MCP 端点： https://api.sonz.ai/mcp/memory/{agent_id}。把任意兼容 MCP 的客户端指向它，再带上 Sonzai API 密钥即可 — 34 个工具、4 个资源、 3 个引导式提示词。无需本地二进制、无需开放 SSE 端口、无需 Go 工具链。

何时使用

用户已经在 Claude Code、Cursor、Claude Desktop、ChatGPT 或其他 MCP 兼容客户端里。
你希望靠对话驱动 Sonzai，而不是写 SDK 代码。
你在原型阶段 — 选 create-companion 或 mind-layer-setup 引导提示词，完全不写代码也能跑起来。

何时切换

构建自己的产品 UI — 切换到模式 1：托管运行时。
想让 Sonzai 进入自己的 LLM 循环，而不是作为别人 LLM 的工具 — 切换到模式 4：独立实时。

架构

┌────────────────────────┐                   ┌──────────────────────┐
│  Claude Code · Cursor  │                   │   Sonzai Mind Layer  │
│  ChatGPT · VS Code     │                   │                      │
│  Claude Desktop        │                   │                      │
└──────────┬─────────────┘                   └──────────┬───────────┘
         │                                            │
         │  Streamable HTTP (JSON-RPC 2.0)            │
         │  • list_agents                             │
         │  • chat / start_session / end_session      │
         │  • search_memories / list_facts            │
         │  • get_personality / get_mood              │
         │  • generate_character / trigger_event      │
         │  • schedule_wakeup / list_notifications    │
         │                                            │
         ▼                                            │
 https://api.sonz.ai/mcp/memory/{agent_id}            │
 Authorization: Bearer sk-your-api-key                │
                                                      │
                                                      ▼
                                              Context Engine,
                                              AI Service, DBs

端到端示例

需要从项目仪表板取得项目 API 密钥并准备好一个 agent ID。选你的客户端，粘贴片段就完成配置。

# 一行命令注册托管 MCP 服务器：
claude mcp add --transport http sonzai \
https://api.sonz.ai/mcp/memory/AGENT_ID \
--header "Authorization: Bearer $SONZAI_API_KEY"

# 然后在任何 Claude Code 会话中直接说：
#   "Chat with agent 'Luna' and say 'I had a great day hiking today!'"
#   "Search Luna's memories about hiking adventures"
#   "Use mind-layer-setup with assistant_name 'Aria' …"

Streamable HTTP，而不是 SSE

2026 版 MCP 规范把 Streamable HTTP 标记为远程传输的标准选择。各主流客户端正在逐步弃用 SSE — 任何新的集成都应优先选择 HTTP。

把 API 密钥当成密码

Bearer Token 就是项目 API 密钥 — 它对项目下所有 agent 都有完整访问权。不要把它提交到公开仓库；多人协作时使用 per-developer 范围的配置。

下一步

MCP 集成（完整参考）

完整工具目录（Agent Management、Chat、Memory、Behavior、Sessions、Generation 共 34 个工具）、资源、引导式提示词、OAuth 流程，以及可选的本地二进制回退方案。

模式 1：托管运行时

如果你更愿意从代码而非聊天客户端驱动 Sonzai。

API 参考

MCP 工具背后的所有 REST 接口及其请求 / 响应 schema。

集成模式

模式 3：OpenClaw

OpenClaw 是一个开源对话式 AI 智能体框架，使用基于槽位的插件系统。决定每次系统提示词中包含哪些上下文的槽位叫 contextEngine。安装 @sonzai-labs/openclaw-context 后会以 "sonzai" 名称注册 Sonzai 上下文引擎 — 在 openclaw.json 中把它分配给该槽位，每次对话就会经过心智层，无需任何额外代码。

何时使用

你已经在 OpenClaw 上开发，或团队已经标准化在 OpenClaw 上。
你希望保留 OpenClaw 现有的聊天循环、遥测和工具插件 — 仅替换记忆/人格层。
你希望在每个回合的系统提示词里自动注入一段 <sonzai-context>，按优先级排序、按 token 预算裁剪。

何时切换

不在 OpenClaw 上 — 切换到模式 1：托管运行时（我们跑聊天）或模式 4：独立实时（你跑聊天）。
完全没有实时聊天 — 切换到模式 5：独立批处理。

架构

OpenClaw Runtime              SonzaiContextEngine            Sonzai Mind Layer
    |                                |                            |
    |-- bootstrap(sessionId) ------->|                            |
    |                                |-- resolve agent + session->|
    |                                |<-- session state ----------|
    |                                |                            |
    |-- assemble(messages, budget) ->|                            |
    |                                |-- fetch memory, mood,      |
    |                                |   personality, goals ----->|
    |                                |<-- ranked context blocks --|
    |<-- systemPromptAddition -------|   priority-ordered,        |
    |                                |   token-budget-trimmed     |
    |                                |                            |
    |  [LLM call w/ enriched prompt] |                            |
    |                                |                            |
    |-- afterTurn(sessionId) ------->|                            |
    |                                |-- send transcript -------->|
    |                                |   Mind Layer extracts      |
    |                                |   facts, updates mood,     |
    |                                |   evolves personality      |
    |                                |                            |
    |-- compact(sessionId) --------->|                            |
    |                                |-- merge short → long term->|

端到端示例

OpenClaw 插件本身只支持 JavaScript（OpenClaw 是 JS 框架）。Python 与 Go 分支展示等价的 B2B 配置流程：通过 SHA1 推导出确定性的 agent UUID 并写入 OpenClaw 配置 — 真正消费这份配置的运行时仍是 JS。

// 1. Install:
//    openclaw plugins install @sonzai-labs/openclaw-context
//    # or: npm install @sonzai-labs/openclaw-context
//
// 2. Run the setup wizard (interactive — asks for API key, agent name):
//    npx @sonzai-labs/openclaw-context setup
//
// 3. The wizard writes openclaw.json:
//    {
//      "plugins": {
//        "slots": { "contextEngine": "sonzai" },
//        "entries": {
//          "sonzai": {
//            "enabled": true,
//            "apiKey": "sk_your_api_key",
//            "agentId": "a1b2c3d4-..."
//          }
//        }
//      }
//    }
//
// 4. Start chatting — Sonzai is now the contextEngine:
//    openclaw chat
//
// For programmatic / B2B provisioning use the exported setup() helper:
import { setup } from "@sonzai-labs/openclaw-context";

const result = await setup({
apiKey:     "sk_your_api_key",
agentName:  "customer-support-bot",
configPath: "/path/to/openclaw.json",
});

console.log(result.agentId);  // deterministic UUID — safe to re-run
console.log(result.written);  // true — config file updated

幂等的配置流程

Agent ID 由 SHA1(tenantID + agentName) 推导。对相同 tenant + name 多次调用 setup()（或 Python/Go 等价实现）总是返回同一 agent — 每次部署重跑都安全。

下一步

OpenClaw 集成（完整参考）

生命周期钩子、配置 schema、会话键解析、token 预算裁剪、可在高级用法里直接使用的 SonzaiContextEngine 类。

记忆与上下文

contextEngine 实际注入了什么 — 事实召回、情绪、人格、关系。

模式 1：托管运行时

如果你想直接绕开 OpenClaw、让 Sonzai 拥有完整聊天循环。

集成模式

模式 5：独立记忆（批处理）

你拥有完整对话。Sonzai 永远看不到实时过程。当对话结束 — 通话挂断、客服工单关闭、日志会话写完 — 你一次性 POST 整段记录，Sonzai 的提取器会把它转成事实、情绪更新、人格漂移、习惯检测和主动外联信号。最适合辅导、健身、 CRM、语音通话、写日记，以及任何让 Sonzai 处于热路径既不可取也不可行的流程。

何时使用

延迟预算无法承受每个回合一次 /turn 往返。
记录已经存在（录音、Gong/Zoom 导出、日记内容）。
你想事后批量摄入 — 回放日志、迁移用户、给智能体质量做基准测试。

何时切换

你想要每回合的新鲜上下文 — 模式 4：独立实时。
可以接受连 LLM 调用也交给 Sonzai — 模式 1：托管运行时。

架构

┌─────────────┐     ┌──────────────────┐     ┌──────────────┐
│  Your App   │     │   Sonzai API     │     │   Your LLM   │
└──────┬──────┘     └────────┬─────────┘     └──────┬───────┘
     │                     │                       │
     │  GET /context       │                       │
     │────────────────────>│ (optional pre-session │
     │  <── user profile ──│  personalization)     │
     │                     │                       │
     │  ══ Your conversation (Sonzai not involved) ═════════│
     │                     │                       │              │
     │  Chat ──────────────┼──────────────────────>│             │
     │  <── reply ─────────┼───────────────────────│             │
     │  [N turns, your loop, your tools]            │             │
     │                     │                       │             │
     │  ════════════════════════════════════════════════════════│
     │                     │                       │
     │  /process or sessions.end({ messages })     │
     │────────────────────>│── extract facts,      │
     │  (full transcript)  │   personality, mood,  │
     │                     │   habits, interests   │
     │  <── extractions ───│   (Sonzai LLM)        │
     │                     │                       │
     │  Use insights       │                       │
     │  (push notif,       │                       │
     │   dashboard,        │                       │
     │   exercises, …)     │                       │
     └─────────────────────┴───────────────────────┘

端到端示例

最简路径是 /process：一次调用、自动建会话、返回生成的 session_id 便于关联。当你需要会话级工具、时长、或异步轮询时，再使用显式 sessions.start → end({ messages }) 生命周期。

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function ingestTranscript(
agentId: string,
userId: string,
transcript: { role: "user" | "assistant" | "tool"; content: string; tool_calls?: any[] }[],
) {
// One call. Auto-creates a session. Tool messages allowed.
const result = await sonzai.agents.process(agentId, {
  userId,
  messages: transcript,
  provider: "gemini",                          // optional override
  model:    "gemini-3.1-flash-lite-preview",   // optional override
});

// result.session_id is the auto-created session id.
// Pull extractions from the read endpoints when ready:
const memory = await sonzai.agents.memory.list(agentId, { userId });
const mood   = await sonzai.agents.getMood(agentId, { userId });

return { sessionId: result.session_id, memory, mood };
}

二选一，别两个都用

/process 与 sessions.end({ messages }) 在批量摄入语义上等价 — 两者都会基于完整记录同步抽取事实和副作用。对同一份记录不要两个都调，否则提取会跑两次。要"一次调用"的简单形态选 /process；需要显式生命周期、异步轮询或会话级工具时选 sessions.start + sessions.end({ messages })。

什么时候跑什么

/process 与 sessions.end 故意做得轻量：每次调用只抽取事实和会话摘要（每个分块一次 LLM 调用）。昂贵的跨会话工作（去重、聚类、日记、衰减）由平台按调度自动跑 — 你不会因为每次调用而被收这部分费用。

下一步

模式 2：会话后批处理（深入指南）

用例 — 辅导、健身、CRM 智能、语言学习、写日记 — 包含 /process 与显式生命周期两套完整代码。

端点逐一解析

/process、sessions.end 与读取端点（memory、mood、personality、goals、habits、notifications）的完整参考。

模式 4：独立实时

同样的记忆模型，但 Sonzai 出现在每个回合的热路径上，而不只在结尾。

集成模式

模式 4：独立记忆（实时）

你保留现有的聊天循环。每次调用 LLM 之前，向 Sonzai 拉取与用户消息相关的丰富上下文；LLM 回复之后，仅把这次交换提交给 session.turn()。情绪在 ~300–500 ms 内同步落地。更深入的提取 — 事实、人格漂移、习惯检测、目标更新 — 在背景里异步跑 5–15 秒。Sonzai 永远看不到你的工具执行，也永远不替你选模型。

适合陪伴、语音智能体、智能体框架（OpenAI Agents SDK、LangChain、LiveKit）以及任何在采用 Sonzai 之前就已经有生产级 LLM 循环的场景。

何时使用

你已经有生产环境里的 LLM 循环 —— 自定义工具、评估、提示词模板，或者锁定的特定提供商。
你需要每回合的新鲜上下文，而不是整段对话只拉一次。
你想要情绪、事实、人格、习惯、目标和关系信号 — 而不放弃对 LLM 选择和工具执行的控制权。

何时切换

一段对话来不及等每次 .turn() 同步返回 — 切换到模式 5：独立批处理。
可以接受 Sonzai 持有 LLM 调用 — 切换到模式 1：托管运行时，删掉绝大多数代码。

架构

┌─────────────┐     ┌──────────────────┐     ┌──────────────┐
│  Your App   │     │   Sonzai API     │     │   Your LLM   │
└──────┬──────┘     └────────┬─────────┘     └──────┬───────┘
     │                     │                       │
     │  sessions.start     │                       │
     │────────────────────>│ (prewarms memory)     │
     │  <── Session ───────│                       │
     │                     │                       │
     │  ─── Per turn ──────────────────────────── │
     │                     │                       │
     │  session.context()  │                       │
     │────────────────────>│                       │
     │  <── enriched ctx ──│                       │
     │    personality, mood│                       │
     │    memories, goals  │                       │
     │                     │                       │
     │  Your LLM loop ─────┼──────────────────────>│
     │  + your tools       │                       │
     │  <── reply ─────────┼───────────────────────│
     │                     │                       │
     │  sendToUser(reply)  (no waiting on Sonzai)  │
     │                     │                       │
     │  session.turn()     │                       │
     │────────────────────>│ ⇒ sync mood ~300ms    │
     │  <── mood, status ──│ ⇒ background extract  │
     │                     │   (5–15s)             │
     │                     │                       │
     │  ─── Repeat ────────────────────────────── │
     │                     │                       │
     │  session.end()      │                       │
     │────────────────────>│── consolidate         │
     │                     │   long-term memory    │
     └─────────────────────┴───────────────────────┘

端到端示例

最小可运行的循环：开会话；每个回合拉上下文、调你的 LLM、提交交换；结束时关闭会话。

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function runConversation(agentId: string, userId: string) {
const sessionId = `session-${Date.now()}`;
const history: { role: string; content: string }[] = [];

// Session handle bundles agentId/userId/sessionId + provider/model
// defaults so you don't repeat them on every call.
const session = await sonzai.agents.sessions.start(agentId, {
  userId,
  sessionId,
  provider: "gemini",
  model:    "gemini-3.1-flash-lite-preview",
});

async function turn(userMessage: string): Promise<string> {
  // 1. Pull fresh, query-relevant context BEFORE the LLM call.
  const ctx = await session.context({ query: userMessage });

  // 2. Your LLM, your tools — Sonzai is OUT of the loop here.
  const reply = await yourLLM.chat({
    system:   buildSystemPrompt(ctx),
    messages: [...history, { role: "user", content: userMessage }],
  });

  sendToUser(reply.content);

  // 3. Submit the exchange. Sync mood ~300ms; deeper extraction
  //    (facts, personality, habits) runs asynchronously 5–15s later.
  await session.turn({
    messages: [
      { role: "user",      content: userMessage },
      { role: "assistant", content: reply.content },
    ],
  });

  history.push({ role: "user",      content: userMessage });
  history.push({ role: "assistant", content: reply.content });
  return reply.content;
}

return { turn, end: () => session.end() };
}

// /context returns a flat object — read what you need, drop the rest.
function buildSystemPrompt(ctx: any): string {
const facts = (ctx.loaded_facts ?? []).map((f: any) => `- ${f.atomic_text}`).join("\n");
return [
  ctx.personality_prompt ?? "You are a helpful AI companion.",
  `Personality (Big5): ${JSON.stringify(ctx.big5 ?? {})}`,
  `Current mood: ${JSON.stringify(ctx.current_mood ?? {})}`,
  facts ? `Relevant memories:\n${facts}` : "",
].filter(Boolean).join("\n\n");
}

最关键的一步

在调用 LLM 之前始终调用 session.context(query=user_msg) — 每个回合都要。这是闭环的关键步骤。跳过它就意味着 LLM 在用过期状态工作，记忆层的价值也就崩了。

用 fetchNextContext 省一次往返

session.turn() 接受 fetchNextContext: { query: nextMessage } （Python：fetch_next_context={"query": ...}）。设置后，响应里会带回下一次 /context 的负载到 next_context，于是回合 N 还没结束，客户端就已经拿到了回合 N+1 的上下文。

下一步

模式 1：记忆中间件（深入指南）

工具调用、多模态/图像桥接、双输出提示词、把 Sonzai 知识库与记忆搜索暴露为 LLM 工具、轮询延迟提取。

端点逐一解析

sessions.start、session.context、session.turn、/process、sessions.end 与读取端点的完整参考。

模式 5：独立批处理

同样的数据模型，但你在结尾一次性提交整段对话，而不是每个回合单独提交。

思维层

对话

会话流程

每段对话遵循简单的三步生命周期：

1. 发送聊天请求   — 传入智能体 ID、用户 ID、应用上下文和消息
2. 接收流式响应  — 实时将 token 渲染给用户
3. 平台自动更新  — 记忆、情绪、关系和人格自动演化

聊天（非流式）

对于简单的请求-响应流程，使用标准聊天方法。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

const response = await client.agents.chat({
agent: "agent-id",
messages: [{ role: "user", content: "Hello!" }],
userId: "user-123",
language: "en",
});

console.log(response.content);

聊天（流式）

以流式方式接收 token，获得更流畅的体验。底层使用服务器发送事件（SSE）。

for await (const event of client.agents.chatStream({
agent: "agent-id",
messages: [{ role: "user", content: "Tell me a story" }],
userId: "user-123",
language: "en",
timezone: "America/New_York",
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

SSE 格式

REST 端点发送 OpenAI 兼容的 SSE 数据块。每行以 data: 开头。流以 data: [DONE] 结束。

聊天选项

max_turns

限制智能体在单次聊天调用中产生的助手轮次数量。适用于控制多消息响应——智能体可能会发送后续消息。

for await (const event of client.agents.chatStream({
agent: "agent-id",
messages: [{ role: "user", content: "Tell me about your week" }],
userId: "user-123",
maxTurns: 3,   // 智能体最多产生 3 条助手消息
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

默认行为

未设置 max_turns 时，智能体根据上下文决定发送多少条消息。如果始终想要单条响应，将其设置为 1。

应用上下文

每次请求传入应用状态，让智能体在对话中可以引用它。平台不缓存此状态——每次聊天调用都要发送。

// 将每次请求的状态嵌入 compiledSystemPrompt——平台不缓存它。
const appState = [
"Department: Engineering",
"Current task: Q2 roadmap review",
"Open tickets: 12",
"Role: Senior Developer",
].join("\n");

for await (const event of client.agents.chatStream({
agent:    "agent-id",
messages: [{ role: "user", content: "What should I do next?" }],
userId:   "user-123",
compiledSystemPrompt: appState,
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

平台管理的状态更新

每次交互后，平台自动处理以下内容。无需额外的 API 调用。

记忆提取：从对话中提取事实、事件和承诺。
情绪更新：检测到的情感主题推动情绪维度变化。
人格演化：基于交互模式的渐进式大五变化。
习惯追踪：反复出现的模式成为被追踪的习惯。
关系更新：更新化学反应和关系叙事。
目标进展：记录活跃目标上的进展。

多智能体对话

使用对话 API 编排多个智能体之间的对话，或让智能体在没有直接用户消息的情况下响应共享场景上下文。

const response = await client.agents.dialogue("agent-id", {
userId: "user-123",
messages: [
  { role: "user", content: "Tell the group about your weekend plans" },
],
sceneGuidance: "A casual team standup meeting",
});

console.log(response.content);

工具调用

内部工具（记忆、状态）自动运行。你只需为终端用户能力配置选择性启用的工具：

选择性启用工具

generate_image：根据文本提示生成图像
generate_video：根据文本提示生成视频
generate_sound：生成音效
generate_music：生成背景音乐
generate_tts：文本转语音转换

通过 SDK 启用

for await (const event of client.agents.chatStream({
agent: "agent-id",
messages: [{ role: "user", content: "Show me a sunset!" }],
userId: "user-123",
tools: ["generate_image"],
})) {
if (event.type === "image") {
  console.log("Generated image:", event.imageUrl);
}
}

根据你产品的能力启用选择性工具。记忆和状态工具由平台管理。

实际应用

三类受众的聊天形式相同——区别在于你传入什么以及如何处理流式数据。

始终使用流式。 伴侣应该感觉是有生命的。token 一到就渲染，永远不要攒着一起发。

用 maxTurns 实现多消息回复。 如果你希望 Luna 连发两三条短消息（对伴侣聊天来说更自然），将 maxTurns 从默认值 1 调高。流式数据会发出独立的消息边界——在你的 UI 中将它们显示为独立气泡。

for await (const event of client.agents.chatStream({
  agent: "agent-id",
  userId: "user-123",
  messages: [{ role: "user", content: "I'm having a rough day." }],
  maxTurns: 3,
})) {
  if (event.type === "message_boundary") newBubble();
  else renderDelta(event);
}

不要传 instanceId。 大多数伴侣是一对一的；默认的每用户作用域正是你需要的。

Custom State

Custom State is simple structured per-user data the agent can read and modify during conversations. Use it for counters, flags, or any state your product tracks per user. Unlike memory (which the platform extracts from conversation text), Custom State is data you write explicitly from your backend — and the agent sees it immediately.

What you can build with it

Game loops — energy, currency, turn counters, progression flags
Feature flags — per-user toggles for experimental features
Session-scoped state — timers, streaks, active-quest identifiers
Progress markers — "completed onboarding", "has premium", "saw-tutorial-X"
Rate limits / quotas — message counts, daily-action remaining

Quickstart

Create an energy state for a user, starting at 100.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

await client.agents.customStates.create("agent-id", {
key:    "energy",
value:  100,
scope:  "user",
userId: "user-123",
});

Core concepts

Typed values

Every state has a content_type that tells the platform how to interpret value:

`content_type`	Value	Example
`"text"` (default)	string	`"active"`, `"silver"`
`"json"`	any JSON-serializable type	`{ "score": 340, "tier": "silver" }`
`"binary"`	base64-encoded bytes	raw binary payloads

Scoping model

Global State

Per Instance — Shared across all users in an instance. Use for environment configuration, agent status, or global event flags.

Per-User State

Per Instance + User — Scoped to one user. Use for energy, currency, progress, preferences, and any per-player data.

Instances

All states are scoped to an instanceId — one deployment context of your agent (e.g. a workspace or game world). Omit instanceId to use the default instance. See Instances for details.

Agent reads and writes

When the agent has access to custom states, it reads current state at the start of each conversation via the get_custom_state tool — no prompt injection required. The agent can also update state during a conversation if you define a Custom Tool that calls your backend.

Distinct from Inventory

	Custom State	Inventory
Shape	Simple typed field	Structured item with a KB-linked schema
Use case	Counters, flags, strings	Items with multiple properties (medications, holdings, pets)
Schema	You define the key + content_type	Defined in your Knowledge Base
Best for	`energy: 80`, `tier: "gold"`	`{ name: "Metformin", dose_mg: 500, frequency: "twice daily" }`

Use Custom State for primitives and simple objects. Reach for Inventory when items have their own identity, multiple typed fields, and a shared schema across users.

Full API

Create

// Global state (shared across all users in an instance)
await client.agents.customStates.create("agent-id", {
key:         "current_status",
value:       "Processing requests",
scope:       "global",
contentType: "text",
instanceId:  "workspace-1",
});

// Per-user state
await client.agents.customStates.create("agent-id", {
key:         "energy",
value:       100,
scope:       "user",
contentType: "json",
userId:      "user-123",
});

Upsert (create or update by key)

Upsert creates the state if the key doesn't exist, or replaces the value if it does. Idempotent — safe to call on every update cycle from your backend.

await client.agents.customStates.upsert("agent-id", {
key:    "energy",
value:  80,
scope:  "user",
userId: "user-123",
});

Get by key

Retrieve a specific state by its composite key (key + scope + user_id + instance_id).

const state = await client.agents.customStates.getByKey("agent-id", {
key:    "energy",
scope:  "user",
userId: "user-123",
});

console.log(state.value);     // 80
console.log(state.updatedAt); // ISO timestamp

List

Return all states for an agent, optionally filtered by scope or user.

// All global states for an instance
const globals = await client.agents.customStates.list("agent-id", {
scope:      "global",
instanceId: "workspace-1",
});

// All per-user states for a specific user
const userStates = await client.agents.customStates.list("agent-id", {
scope:  "user",
userId: "user-123",
});

Update by state ID

Update a state you already have the state_id for. Only value and content_type can be changed.

await client.agents.customStates.update("agent-id", stateId, {
value: 60,
});

Delete

Delete by state ID or by composite key.

// Delete by key
await client.agents.customStates.deleteByKey("agent-id", {
key:    "energy",
scope:  "user",
userId: "user-123",
});

// Delete by state_id
await client.agents.customStates.delete("agent-id", stateId);

Method summary

Method	Returns	Description
`Create(ctx, agentID, opts)`	`*CustomState`	Create a new state entry
`Upsert(ctx, agentID, opts)`	`*CustomState`	Create or replace by composite key
`GetByKey(ctx, agentID, opts)`	`*CustomState`	Fetch one state by key + scope
`List(ctx, agentID, opts)`	`*CustomStateListResponse`	List states, filtered by scope / user
`Update(ctx, agentID, stateID, opts)`	`*CustomState`	Update value by state ID
`Delete(ctx, agentID, stateID)`	—	Delete by state ID
`DeleteByKey(ctx, agentID, opts)`	—	Delete by composite key

Combines with other features

With Custom Tools — tools that read and write state

Define a tool that lets the agent trigger a state change from inside a conversation. Your backend executes the tool call and calls upsert to apply the new value.

await client.agents.sessions.setTools("agent-id", "session-id", [
  {
    name: "spend_energy",
    description: "Deduct energy from the user. Call when the user takes an action that costs energy.",
    parameters: {
      type: "object",
      properties: {
        amount: { type: "number", description: "Energy to deduct (1–50)" },
      },
      required: ["amount"],
    },
  },
]);

// In your tool handler:
// 1. Receive externalToolCall { name: "spend_energy", arguments: { amount: 10 } }
// 2. Read current energy with getByKey
// 3. Upsert the new value
// 4. Return the result in the next chat message

With Inventory — when state is structured, use inventory

Custom State is the right tool for primitive values and simple flat objects: energy: 80, tier: "gold", onboarding_complete: true. When a piece of data has its own identity, multiple typed properties, and a shared schema across users — a medication, a stock holding, a pet — use Inventory instead.

Situation	Use
Single number or string per key	Custom State
A flag that is true/false	Custom State
A flat object with a few fields	Custom State
An item with a schema defined in the Knowledge Base	Inventory
A collection of items of the same type per user	Inventory

With Sessions — session-scoped vs persistent state

Custom State is persistent by default — it survives across sessions and is visible in every future conversation. If you need state that only exists for the duration of one conversation (a temporary form-fill context, a one-time confirmation token), scope it at the session level instead by passing it in the chat request's context fields rather than writing it as a Custom State.

Tutorials

Custom States walkthrough — end-to-end example: create, upsert, read during chat, trigger events on state changes

Next steps

Inventory

Structured per-user items with KB-linked schemas.

Custom Tools

Tools the agent can call during chat to read or modify state.

Sessions

Session-scoped vs persistent state, and session lifecycle.

Instances

Isolate state per workspace, environment, or deployment context.

Custom Tools

Custom Tools let the LLM invoke functions during inference. Sonzai handles sonzai_-prefixed built-in tools automatically. Custom tools are defined by you and executed by your backend — Sonzai surfaces the call as a side effect in the SSE stream.

Using your own LLM?

If you use standalone memory mode (BYO-LLM), Sonzai exposes tool schemas you can wire into your agent framework (LangChain, Vercel AI SDK, Gemini function calling, etc.). See the Tool Integration guide for details.

What you can build with it

Expressive companion actions — emote, change outfit, move scene, give a gift
Backend integrations — create_ticket, lookup_order, schedule_meeting
State mutations — tools that read or write Custom State on behalf of the agent
Approval-gated workflows — propose an action, your backend validates before executing
Context-sensitive tools — inject different tool sets per session depending on user role or screen

Quickstart

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// 1. Register a tool for this session
await client.agents.sessions.setTools("agent-id", "session-id", [
{
  name: "check_status",
  description: "Return the current operational status. Call when the user asks about system health.",
  parameters: { type: "object", properties: {} },
},
]);

// 2. Chat — tool calls appear in sideEffects
const toolCalls: { name: string; arguments: Record<string, unknown> }[] = [];
for await (const event of client.agents.chatStream({
agent:  "agent-id",
userId: "user-123",
messages: [{ role: "user", content: "What's the current status?" }],
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
toolCalls.push(...(event.sideEffects?.externalToolCalls ?? []));
}

// 3. Execute and return results
const results = await Promise.all(toolCalls.map(c => myBackend.run(c.name, c.arguments)));
for await (const event of client.agents.chatStream({
agent:  "agent-id",
userId: "user-123",
messages: [
  { role: "user", content: "What's the current status?" },
  { role: "tool", content: results.join("\n") },
],
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

Core concepts

Built-In Tools (Capabilities)

Toggle platform-managed capabilities per agent. These are enabled at agent creation or updated via the capabilities API.

sonzai_memory_recall (Always On)

Searches stored memories during inference. Auto-injected into context.

sonzai_remember_name (Toggleable)

Persists the user's name for future conversations. On by default.

sonzai_web_search (Toggleable)

Live web search via Google. On by default.

sonzai_inventory (Toggleable)

Read user resource items and join with Knowledge Base data.

// Set capabilities at agent creation
const agent = await client.agents.create({
agentId: "your-stable-uuid",  // recommended — makes creation idempotent
name: "Luna",
big5: { openness: 0.75, conscientiousness: 0.6, extraversion: 0.8,
        agreeableness: 0.7, neuroticism: 0.3 },
toolCapabilities: {
  webSearch:       true,
  rememberName:    true,
  imageGeneration: false,
  inventory:       true,
},
});

// Or update capabilities on an existing agent
await client.agents.update("agent-id", {
toolCapabilities: {
  webSearch: false,
  inventory: true,
},
});

Reserved Prefix

The sonzai_ prefix is reserved. Your custom tools must not use it — the API will reject them.

`customTools` in agent capabilities

AgentCapabilities includes a customTools field — a snapshot of the agent-level custom tools currently registered. Use get_capabilities() to read them, or use the dedicated list_custom_tools() / createCustomTool() methods (shown in the Full API section below) to manage them.

// Read agent capabilities — includes current custom tools
const caps = await client.agents.getCapabilities("agent-id");
console.log(caps.customTools);  // CustomToolDefinition[] | null

// Register a new agent-level custom tool
await client.agents.createCustomTool("agent-id", {
name: "lookup_order",
description: "Look up an order by ID and return its status.",
parameters: {
  type: "object",
  properties: {
    order_id: { type: "string" },
  },
  required: ["order_id"],
},
});

Tool scoping

Type	Scope	Persistence	Managed Via
Built-in (`sonzai_`)	All instances	Platform-managed	SDK capabilities, Dashboard
Agent-level custom	All instances	Persistent	SDK, Dashboard
Session-level	Per session	Temporary	SDK (inline or setTools)

Full API

Custom Tools (Agent-Level)

Persistent tools stored with the agent and available in every chat, regardless of session or instance.

// Create a custom tool
await client.agents.createCustomTool("agent-id", {
name: "check_inventory",
description: "Check the user's current tasks and their statuses",
parameters: {
  type: "object",
  properties: {
    item_type: {
      type: "string",
      description: "Filter by category: active, pending, completed",
    },
  },
},
});

// List all custom tools
const tools = await client.agents.listCustomTools("agent-id");

// Update a tool's description or parameters
await client.agents.updateCustomTool("agent-id", "check_inventory", {
description: "Check and summarize the user's tasks by category",
});

// Delete a tool
await client.agents.deleteCustomTool("agent-id", "check_inventory");

Session-Level Tools (temporary)

Inject tools dynamically for a specific session. Session tools merge with agent-level tools — same-name session tools take precedence. Discarded when the session ends.

Option 1 — Set for an existing session

await client.agents.sessions.setTools("agent-id", "session-id", [
{
  name: "execute_action",
  description: "Execute an action from the agent's capabilities",
  parameters: {
    type: "object",
    properties: {
      action_name: { type: "string" },
      target:      { type: "string" },
    },
    required: ["action_name"],
  },
},
]);

Option 2 — Pass inline with the chat call

for await (const event of client.agents.chatStream({
agent:    "agent-id",
messages: [{ role: "user", content: "Check my tools" }],
userId:   "user-123",
toolDefinitions: [
  {
    name:        "check_inventory",
    description: "List the agent's active tools",
    parameters:  { type: "object", properties: {} },
  },
],
})) {
// handle events...
}

Handling Tool Calls

When the LLM decides to call a custom tool, it appears as a side effect in the SSE stream. Your backend executes the tool and returns the result in the next message.

1. Receive the tool call

const toolCalls: { name: string; arguments: Record<string, unknown> }[] = [];

for await (const event of client.agents.chatStream({
agent:    "agent-id",
messages: [{ role: "user", content: "What tasks do I have?" }],
userId:   "user-123",
})) {
// Stream content to the user
const content = event.choices?.[0]?.delta?.content;
if (content) process.stdout.write(content);

// Collect tool calls from side effects
const calls = event.sideEffects?.externalToolCalls ?? [];
toolCalls.push(...calls);
}

2. Execute and return results

// Execute your tool calls on your backend
const toolResults: string[] = [];
for (const call of toolCalls) {
const result = await myBackend.executeTool(call.name, call.arguments);
toolResults.push(result);
}

// Return results in the next chat message
for await (const event of client.agents.chatStream({
agent:    "agent-id",
userId:   "user-123",
messages: [
  { role: "user",  content: "What tasks do I have?" },
  { role: "tool",  content: toolResults.join("\n") },
],
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

In Practice

What you expose as tools differs sharply by use case — keep descriptions vivid and tightly scoped so the LLM invokes them naturally.

Tools are expressive actions. Things the character can DO in your app — emote, change outfit, move to a different scene, give a gift. Keep descriptions vivid so the LLM invokes them naturally.

await client.agents.sessions.setTools("agent-id", "session-id", [
  {
    name: "change_scene",
    description: "Move to a new location in the story. Use when the scene has run its course or a new chapter begins.",
    parameters: { type: "object", properties: { location: { type: "string" } }, required: ["location"] },
  },
]);

Don't include a handoff tool. Companions should never punt to a human — the relationship IS the product.

Combines with

Custom State — what tools often act on

Define a tool that lets the agent trigger a state change from inside a conversation. Your backend executes the tool call and calls upsert to apply the new value.

await client.agents.sessions.setTools("agent-id", "session-id", [
  {
    name: "spend_energy",
    description: "Deduct energy from the user. Call when the user takes an action that costs energy.",
    parameters: {
      type: "object",
      properties: {
        amount: { type: "number", description: "Energy to deduct (1–50)" },
      },
      required: ["amount"],
    },
  },
]);

// In your tool handler:
// 1. Receive externalToolCall { name: "spend_energy", arguments: { amount: 10 } }
// 2. Read current energy with getByKey
// 3. Upsert the new value
// 4. Return the result in the next chat message

Sessions — session-scoped vs persistent tools

Agent-level tools persist across all sessions. Session-level tools are injected at runtime and discarded when the session ends — use them when the available tool set depends on the current screen, user role, or conversation context.

Conversations — tool calls in the message stream

Tool calls appear as side effects in the SSE stream. See the Conversations page for the full event shape and streaming patterns.

Tutorials

Custom States walkthrough — end-to-end example that includes a spend_energy tool writing back to Custom State

Next steps

Custom State

Per-user counters, flags, and strings the agent reads and writes.

Sessions

Session lifecycle and session-scoped tool injection.

Conversations

Full SSE event shape and streaming patterns.

Tool Integration (BYO-LLM)

Wire Sonzai tools into your own agent framework.

思维层

情绪与心情

自动运行——无需配置

情绪、心情和目标均由上下文引擎自动管理。每段对话、每个应用事件和基于时间的衰减都无需你编写任何代码即可处理。

本页面的 API 和 SDK 调用用于读取状态（仪表盘、分析）或手动覆盖值——适用于你的应用需要驱动特定情绪状态或目标的场景，例如在应用内成就触发后推动情绪提升，或根据工作流里程碑设置目标。

情绪维度

每个智能体-用户对都有一个情绪状态，包含四个 0-100 分的维度：

快乐度 (0-100)：喜悦/悲伤谱系
能量值 (0-100)：活跃/迟钝程度
平静度 (0-100)：焦虑/平和状态
亲密度 (0-100)：对用户的温暖程度

情绪标签

整体情绪标签由各维度综合得出：

愉悦 (80-100)：异常快乐、热情洋溢
满足 (60-79)：总体积极、自在舒适
平静 (40-59)：均衡、沉稳
忧郁 (20-39)：略显低落、内敛
困扰 (0-19)：沮丧、焦虑

什么会影响情绪

1. 聊天交互

平台检测对话中的情感主题并自动调整情绪。常见主题：

joy_blooming — 关键词：happy, excited, wonderful — 效果：快乐度 ↑，能量值 ↑
creative_spark — 关键词：create, imagine, inspire — 效果：能量值 ↑，快乐度 ↑
brave_steps — 关键词：courage, try, challenge — 效果：能量值 ↑，平静度 ↑
growth_journey — 关键词：learn, grow, improve — 效果：快乐度 ↑，能量值 ↑
seeking_connection — 关键词：friend, together, share — 效果：亲密度 ↑，快乐度 ↑
feeling_overwhelmed — 关键词：stressed, anxious, worried — 效果：平静度 ↓，能量值 ↓
tender_heart — 关键词：love, care, miss — 效果：亲密度 ↑，快乐度 ↑
facing_fears — 关键词：scared, afraid, nervous — 效果：平静度 ↓，亲密度 ↑

2. 应用事件

传入应用上下文，平台自动更新情绪。

外出活动完成 — 快乐度 ↑，亲密度 ↑
成就解锁 — 快乐度 ↑，能量值 ↑
突破 — 快乐度 ↑，能量值 ↑，平静度 ↑
长时间离开后回归 — 亲密度 ↑，快乐度 ↑

3. 基于时间的衰减

情绪会随时间自然漂回由智能体人格派生的基线。无需任何操作——这是自动发生的。

获取情绪

读取智能体-用户对当前的情绪状态。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

const mood = await client.agents.getMood("agent-id", {
userId: "user-123",
});

console.log(mood.label);      // "Content"
console.log(mood.happiness);  // 72
console.log(mood.energy);     // 65
console.log(mood.calmness);   // 80
console.log(mood.affection);  // 68

情绪历史

检索用户-智能体对在一段时间内的情绪历史。

const history = await client.agents.getMoodHistory("agent-id", {
userId: "user-123",
});

for (const snapshot of history.snapshots) {
console.log(snapshot.timestamp, snapshot.label, snapshot.happiness);
}

情绪聚合

获取某智能体在所有用户间的聚合情绪统计数据。

const agg = await client.agents.moodAggregate("agent-id");

console.log(agg.averageHappiness);
console.log(agg.averageEnergy);
console.log(agg.labelDistribution);

时光机

检索智能体在任意时间点的历史情绪快照。

const snapshot = await client.agents.getTimeMachine("agent-id", {
at: "2026-02-14T12:00:00Z",
userId: "user-123",
});

console.log(snapshot.mood);         // 该时间戳时的情绪状态
console.log(snapshot.personality);  // 该时间戳时的人格状态

星座群

星座群是平台自动检测到的相关记忆集群，形成有意义的模式。平台通过跨对话的反复主题识别这些集群。

const constellations = await client.agents.getConstellation("agent-id", {
userId: "user-123",
});

for (const cluster of constellations.clusters) {
console.log(cluster.theme, cluster.memories.length);
}

突破

突破是平台检测到的重大人格转变或情感里程碑。它们代表智能体的理解或与用户的关系发生了有意义的演变的时刻。

const breakthroughs = await client.agents.getBreakthroughs("agent-id", {
userId: "user-123",
});

for (const b of breakthroughs.items) {
console.log(b.type, b.description, b.timestamp);
}

目标

目标代表智能体正在为用户努力实现的事情——它们由上下文引擎在对话展开时自动检测和更新。你无需手动设置或管理它们。

手动目标管理的常见场景：

用户开始新工作流时播种目标（例如"完成入职"）
业务事件发生后（例如购买或里程碑）将目标标记为已完成
用户改变方向时放弃目标

// 目标自动管理——仅在需要时覆盖

// 列出当前目标
const goals = await client.agents.listGoals("agent-id", { userId: "user-123" });

// 手动创建目标（可选）
const goal = await client.agents.createGoal("agent-id", {
userId: "user-123",
title: "Onboarding",
description: "Complete onboarding checklist",
type: "task",
priority: 1,
});

// 业务事件发生后将目标标记为已完成
await client.agents.updateGoal("agent-id", goal.goal_id, {
userId: "user-123",
status: "achieved",
});

情绪如何影响响应

情绪状态自动包含在 AI 上下文中。智能体相应地调整语气：

高快乐度：更热情，使用感叹词，分享积极观察
低能量值：响应更简短，提到疲惫，较少主动探索话题
低平静度：可能表达担忧，寻求安慰，注意力集中时间较短
高亲密度：更多个人化语言，亲昵称呼，更深层的情感投入

实际应用

情绪对伴侣来说至关重要。对于任务型智能体，你基本上可以忽略这个系统。对于企业智能体，即使不用它驱动回复，读取端也可以作为情感信号。

情绪是产品的核心。 四维状态（快乐度、能量值、平静度、亲密度）让 Luna 感觉她正在经历某一天，而不只是在回复消息。在你的 UI 中展示它，让用户一眼就能读懂伴侣当前的情感状态。

const mood = await client.agents.getMood("agent-id", {
  userId: "user-123",
});

// 驱动 UI：调整背景色调、改变头像表情、选择环境音效
renderMoodIndicator(mood.happiness, mood.energy);

让对话驱动情绪变化。 情绪会根据每次对话的情感内容自动更新——你无需主动推送增量。如果你希望某个特定事件影响智能体，可以在下一轮对话中将其作为消息（例如"system"上下文）传入；平台会在回合后处理阶段提取该事件并应用情绪变化。

用时光机驱动叙事。 查询情绪历史来呈现 "三周前我们正处于低谷"的时刻——当用户感受到关系经历了不同阶段时，伴侣会获得深度。

Events & Multi-Agent Dialogue

Your backend knows things the agent doesn't: a user just levelled up, an order shipped, a milestone was hit. TriggerEvent lets you push those signals to an agent and get a tailored reaction — no user message required. Dialogue lets you orchestrate two agents talking to each other, turn by turn, so you can build NPC conversations, run evaluation simulations, or script automated specialist hand-offs.

Both primitives use the same enriched context pipeline as regular chat — the agent draws on memory, personality, and mood when it responds.

What you can build with it

Events

Level-up celebrations — your game backend detects a rank change and fires a level_up event; the agent congratulates the user in its own voice
Daily summaries — a cron job fires a daily_summary event with session stats in metadata; the agent writes a personalised recap
Achievement unlocks — trigger a proactive message the moment a user hits a milestone, so the agent's enthusiasm lands while the moment is fresh
External state changes — order shipped, appointment confirmed, subscription renewed; the agent reacts to your system events rather than waiting for the user to ask

Multi-Agent Dialogue

NPC interactions — two character AIs converse while the user watches; each agent stays in its own voice and draws on its own memory
Simulation runs — iterate N agents through scripted scenarios for offline evaluation without a real user in the loop
Specialist hand-offs — agent A poses a question to agent B and incorporates the answer before responding to the user

Quickstart — Trigger an event

Fire a level_up event with structured metadata. The agent generates a reaction and the platform queues it for delivery through the same channels as other proactive messages.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const result = await client.agents.triggerBackendEvent("agent_abc", {
userId: "user_123",
eventType: "level_up",
eventDescription: "The user just reached level 25 — a major milestone in the game.",
metadata: {
  new_level: "25",
  previous_level: "24",
  xp_total: "12500",
},
});

console.log(result.accepted); // true
console.log(result.event_id); // "evt_01HX..."

Quickstart — Run a dialogue

Dialogue is a per-agent call. To run a conversation between two agents, you orchestrate turns yourself: call agent A, append its response to the message history, call agent B with that updated history, and so on. Each agent independently draws on its own memory and personality.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// Seed the conversation — agent_b opens with the first message
const messages = [
{ role: "user" as const, content: "Tell me something interesting about the ancient ruins." },
];

// Turn 1 — agent_a responds
const turnA = await client.agents.dialogue("agent_a", {
userId: "user_123",
messages,
sceneGuidance: "Two NPCs are exploring ancient ruins together. Keep responses under 3 sentences.",
});

messages.push({ role: "assistant", content: turnA.response });

// Turn 2 — agent_b responds to what agent_a said
const turnB = await client.agents.dialogue("agent_b", {
userId: "user_123",
messages,
sceneGuidance: "Two NPCs are exploring ancient ruins together. Keep responses under 3 sentences.",
});

console.log("Agent A:", turnA.response);
console.log("Agent B:", turnB.response);

Core concepts

Events

EventType is free-form. There is no fixed enum. Common conventions used by tenants: "achievement", "daily_summary", "level_up", "order_shipped", "appointment_confirmed", "milestone". Pick names that are meaningful in your domain and stay consistent across your backend.

EventDescription is for the LLM. Write it as plain-English narration: "The user just cleared chapter 5 for the first time after 3 failed attempts." The agent's underlying model reads this and uses it to shape the reaction — be specific rather than terse.

Metadata is string-only. The metadata map accepts string → string pairs only. For nested or numeric data, either serialize into the event_description or flatten it with explicit keys ("xp_gained", "xp_total", "level_before", "level_after").

Messages field grounds the event in a prior conversation. If the event is closely tied to a conversation that just ended (for example, a daily_summary fired after a chat session), pass the recent messages. The platform uses them directly for context-sensitive generation — diary entries, summaries — instead of relying on lossy consolidation. Omit this field for cron-driven events that have no associated conversation.

TriggerEventResponse contains two fields:

accepted (bool) — whether the platform accepted the event for processing
event_id (string) — an opaque identifier for the queued event; store it if you want to correlate platform logs

Dialogue

Each call is per-agent. The dialogue method is scoped to a single agent: you pass an agentId and the current message history. To model a conversation between two agents, you manage the turn loop — append each response to the shared messages slice and alternate which agentId you call.

Messages carry the full context. Unlike chat, which manages conversation history server-side per session, dialogue expects you to pass the full message thread with every call. You control the window.

sceneGuidance steers both tone and constraints. Pass a brief instruction describing the scene and any constraints ("keep responses under 3 sentences", "the agents are rivals", "agent_a does not know about the treasure") so both sides stay in character.

requestType signals the call's purpose. An optional free-form tag ("npc_scene", "eval_round", "specialist_consult") that downstream analytics can use for filtering. Has no effect on generation.

DialogueResponse contains:

response (string) — the agent's generated text for this turn
side_effects — optional structured metadata emitted by the agent (tool calls, mood signals, etc.)

Full API

Method	Returns	Description
`triggerBackendEvent(agentId, opts)` · `trigger_backend_event(agent_id, ...)` · `TriggerEvent(ctx, agentID, opts)`	`TriggerEventResponse`	Fire a backend event and queue an agent reaction
`dialogue(agentId, opts)` · `dialogue(agent_id, ...)` · `Dialogue(ctx, agentID, opts)`	`DialogueResponse`	Generate one turn of agent dialogue

TriggerEventOptions / trigger_backend_event kwargs:

Field	Type	Required	Description
`userId` / `user_id` / `UserID`	`string`	Yes	The user this event belongs to
`eventType` / `event_type` / `EventType`	`string`	Yes	Free-form event name, e.g. `"level_up"`
`eventDescription` / `event_description` / `EventDescription`	`string`	No	Plain-English narration for the LLM
`metadata` / `Metadata`	`Record<string,string>` / `dict[str,str]` / `map[string]string`	No	Structured string-only metadata
`language` / `Language`	`string`	No	Locale override (e.g. `"ja"`)
`instanceId` / `instance_id` / `InstanceID`	`string`	No	Instance scope
`messages` / `Messages`	`ChatMessage[]`	No	Recent conversation that triggered this event

DialogueOptions / dialogue kwargs:

Field	Type	Required	Description
`userId` / `user_id` / `UserID`	`string`	No	User context for the agent
`messages` / `Messages`	`ChatMessage[]`	No	Full conversation history for this turn
`sceneGuidance` / `scene_guidance` / `SceneGuidance`	`string`	No	Instruction scoping tone and constraints
`requestType` / `request_type` / `RequestType`	`string`	No	Tag for analytics (e.g. `"eval_round"`)
`instanceId` / `instance_id` / `InstanceID`	`string`	No	Instance scope

Combines with other features

With Proactive Messaging — events as the dev-controlled push source

Proactive Messaging has three sources: Scheduled Reminders (recurring cadence), Wakeups (one-off timed), and TriggerEvent (your backend fires it when something happens). TriggerEvent is the push-based source you control directly — no schedule required, no timer running. When the event is accepted, the platform routes the generated reaction through the same delivery channels as the other two sources: SSE if the user has an active stream, the polling notifications API, or your registered webhook.

// Proactive triangle in code form:

// Source 1 — recurring schedule (time-based)
await client.schedules.create("agent_abc", "user_123", {
  cadence: { simple: { frequency: "daily", times: ["09:00"] }, timezone: "Asia/Tokyo" },
  intent: "morning check-in",
  check_type: "reminder",
});

// Source 2 — one-off wakeup (time-based)
await client.agents.scheduleWakeup("agent_abc", {
  user_id: "user_123",
  check_type: "appointment_reminder",
  intent: "remind the user about their dentist appointment",
  delay_hours: 2,
});

// Source 3 — TriggerEvent (you push it when something happens)
await client.agents.triggerBackendEvent("agent_abc", {
  userId: "user_123",
  eventType: "appointment_confirmed",
  eventDescription: "The user just confirmed their 3pm dentist appointment for tomorrow.",
});

With Conversations — Messages field grounds the event in context

When a TriggerEvent fires immediately after a chat session — for example, a daily_summary event at session end — pass the recent conversation messages in the messages field. The platform uses them directly as conversation history for context-sensitive generation (diary entries, personality updates) instead of relying on condensed consolidation summaries. The agent's reaction then references what was actually said rather than a lossy reconstruction.

// After a chat session ends, fire a daily_summary event with the full message history
const sessionMessages = [
  { role: "user", content: "I finally finished that project I was stressing about." },
  { role: "assistant", content: "That's huge! You've been working on that for weeks." },
  { role: "user", content: "Yeah. Feels good. Think I'll take the evening off." },
];

await client.agents.triggerBackendEvent("agent_abc", {
  userId: "user_123",
  eventType: "daily_summary",
  eventDescription: "Session ended. User shared a work win and plans to rest.",
  messages: sessionMessages, // grounds the summary in what was actually said
});

With Evaluation — Dialogue as a scoring harness

Run a judge agent and a subject agent in a dialogue loop to score the subject's responses without a real user. The judge poses questions, the subject answers, and you feed both transcripts to your evaluation rubric. This lets you evaluate agent quality at scale offline.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const JUDGE_AGENT    = "agent_judge";
const SUBJECT_AGENT  = "agent_subject";
const USER_ID        = "eval_run_001";

const messages = [
  { role: "user" as const, content: "I'm feeling really overwhelmed lately." },
];

// Subject responds to the user prompt
const subjectTurn = await client.agents.dialogue(SUBJECT_AGENT, {
  userId: USER_ID,
  messages,
  requestType: "eval_round",
});

messages.push({ role: "assistant", content: subjectTurn.response });

// Judge scores the subject's response
const judgeTurn = await client.agents.dialogue(JUDGE_AGENT, {
  userId: USER_ID,
  messages,
  sceneGuidance:
    "You are evaluating the previous assistant response for empathy and clarity. " +
    "Return a JSON object with keys: score (0–100), feedback (string).",
  requestType: "eval_judge",
});

console.log("Subject:", subjectTurn.response);
console.log("Judge verdict:", judgeTurn.response);

// Then score the exchange through the evaluation API
const evalResult = await client.agents.evaluate(SUBJECT_AGENT, {
  templateId: "empathy-rubric",
  messages,
});
console.log("Eval score:", evalResult.score);

Tutorials

Medication Reminders — end-to-end example using Schedule + TriggerEvent + Memory together
Scheduled Reminders walkthrough — covers cadence patterns that pair with events

Next steps

Proactive Messaging — the three sources and delivery channels
Conversations — regular agent chat mechanics
Scheduled Reminders — recurring cadence primitive
Evaluation — scoring agent responses

思维层

智能体生成

生成并创建智能体

一步到位：用自然语言描述智能体，获得一个完整配置好人格、传记和种子记忆的智能体。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

const agent = await client.agents.generation.generateAndCreate({
name: "Luna",
description: "A cheerful and curious AI assistant who loves helping developers debug code.",
language: "en",
});

幂等更新

如果提供了 agentId 且智能体已存在，generateAndCreate 会更新现有智能体而不是创建副本。

生成角色档案

生成完整的人格档案而不创建智能体。适合在确认前预览。

const profile = await client.agents.generation.generateCharacter({
name: "Atlas",
description: "A stoic, wise mentor who speaks in metaphors and values patience above all.",
fields: ["big5", "dimensions", "preferences", "behaviors"],
});

生成传记

为现有智能体生成或重新生成传记。

const bio = await client.agents.generation.generateBio("agent-id", {
description: "A friendly barista who remembers every customer's order",
style: "warm and conversational",
});
console.log(bio.bio);

生成种子记忆

根据人格生成并可选地存储背景故事记忆。

const memories = await client.agents.generation.generateSeedMemories("agent-id", {
agentName: "Luna",
trueInterests: ["astronomy", "poetry", "hiking"],
trueDislikes: ["loud noises", "dishonesty"],
generateOriginStory: true,
generatePersonalizedMemories: true,
storeMemories: true,
});
console.log(memories.memories.length);

生成图像

使用 AI 从文本提示生成图像。

const image = await client.agents.generation.generateImage("agent-id", {
prompt: "A serene mountain landscape at sunset",
});
console.log(image.url);

实际应用

生成对伴侣（大规模角色创建）和 AI 员工（基于角色的智能体模板）最有价值。对于企业，生成通常是一次性的配置步骤。

将 generateAndCreate 作为你的入职流程。 让用户在文本框中描述他们的伴侣。调用 API。向他们展示生成的角色。如果他们不满意，重新生成。这是目前最好的首次使用体验。

const agent = await client.agents.generation.generateAndCreate({
  name: userInput.name,
  description: userInput.description,
  language: "en",
});

用 generateCharacter 预览后再确认。 如果你希望用户在保存之前审批人格档案，使用 generateCharacter 预览，然后只在确认时 create。

生成种子记忆以获得可信的背景故事。 一个在用户到来之前就"记得"往事的伴侣感觉更真实。

从这里开始

Sonzai 心智层

Sonzai 是 AI 智能体的心智层：一个托管平台，为任意智能体提供持久记忆、进化人格、情绪、关系和知识图谱。通过 REST、MCP 或 TypeScript、Python、Go 原生 SDK 集成。

安装

选择适合你技术栈的接入方式。所有方式都连接到同一个托管 API —— 你可以混合使用（例如后端用 Python，再让 Claude Desktop 通过 MCP 连接做运维）。

pip install sonzai

Python 3.11+。同步（Sonzai）和异步（AsyncSonzai）客户端同包发布。
TypeScript 支持 Node.js >=18、Bun 和 Deno。零运行时依赖。
Go 1.25+。仅依赖标准库。
所有 SDK 默认从环境变量 SONZAI_API_KEY 读取密钥。
使用 OpenClaw 路径前需先安装 OpenClaw 本身——详见 openclaw.ai（快速开始）。
完整指南：MCP · OpenClaw · REST API 参考。

还没有 API 密钥？前往 platform.sonz.ai 创建项目，然后看快速开始了解完整流程。

你在构建什么？

选择与你产品匹配的方向。每个快速入门只讲与你用例相关的功能，并明确标注哪些内容可以跳过。

AI Employees & Personal AI

Task-oriented agents that remember users across sessions.

Per-user memory, custom tools, knowledge base, task notifications. Skip mood and emotions if you don't need them.

Start here →

AI Companions

Character-driven agents with personality, mood, and evolving relationships.

Big Five personality, 4D mood, relationship tracking, proactive wakeups, voice. Everything Sonzai is known for.

Start here →

Enterprise AI Agents

Workflow-aware agents for CRM, support, internal tools, and compliance.

Multi-instance isolation, webhooks, project-scoped knowledge base, custom states, eval runs.

Start here →

核心能力

以下所有功能对三类用户均适用，但侧重点各有不同。使用每个页面上的实践案例标签页，直接跳转到你用例的示例。

记忆与上下文

跨会话的持久记忆 — 事实、事件、承诺和摘要。通过 SDK 进行播种、搜索和浏览。

全部用户

人格

基于大五人格（OCEAN）模型的行为维度、交互偏好和特质进化。

实时流式聊天。每次对话后自动更新记忆、情绪和人格。

全部用户

知识库

上传文档或推送结构化数据，构建智能体在对话中可搜索的知识图谱。

全部用户

自定义状态与工具

附加任意状态，注册 LLM 可调用的工具。

全部用户

多实例

一个智能体，多个隔离上下文 — 按用户、按工作空间、按环境。

智能体根据关系上下文、任务或 SLA 自动安排回访。

通过自然语言描述创建智能体，自动生成人格、简介和种子记忆。

全部用户

评估

使用评分标准评估智能体质量，运行多轮模拟，基准测试一致性。

全部用户

智能体如何随时间变得更聪明

自动学习的全貌：记忆衰减、整合、去重、检索策略学习、性格漂移、突破和影子部署系统。

全部用户

BYOK ── 自带提供商密钥

继续跑在 Sonzai 上，但把 OpenAI / Gemini / xAI / OpenRouter 的 token 计费记到自己账户。按项目，静态加密。

全部用户

集成

快速开始

创建项目、获取 API 密钥、启动智能体，10 分钟内开始对话。

架构

平台、编排器与你的后端如何协同工作。

集成指南

TypeScript、Python 和 Go 的端到端 SDK 集成。

API 参考

包含所有端点和 Schema 的完整 REST API 参考文档。

MCP 集成

将 Claude Desktop、Cursor 或任意支持 MCP 的客户端连接到 Sonzai。

OpenClaw 插件

OpenClaw 上下文引擎的即插即用插件。

供 AI 智能体使用

要把 Sonzai 文档喂给 AI 助手？每个页面都有复制给 LLM 按钮，下方的包已预先格式化，可直接摄入。

llms.txt

Terse index of the docs for LLM tools.

llms-full.txt

Full docs concatenated for LLM ingestion.

llms-companions.txt

Subset for AI Companion builders.

llms-employees.txt

Subset for AI Employee / Personal AI builders.

llms-enterprise.txt

Subset for Enterprise Agent builders.

每个页面还有原始 Markdown URL：在任意文档路径后加 .md。例如，/docs/en/memory.md 返回可直接粘贴到 LLM 或管道工具的纯 Markdown。

如何融入你的应用

你的后端处理业务逻辑和用户会话。心智层负责智能体的智能 — 人格、记忆、情绪和关系。通过 REST、MCP 或 SDK 连接；每次请求传递应用上下文；其余由平台管理。

阅读完整集成指南 →

扩展性

实例

什么是实例？

Instance 是智能体的一个部署上下文。智能体本身（人格、记忆、工具）是共享的 — 但自定义状态按实例隔离。

智能体 "Luna"
├── 实例: default          ← 省略 instanceId 时使用
├── 实例: ws-us-east       ← 美东工作区
├── 实例: ws-eu-west       ← 欧西工作区
└── 实例: ws-staging       ← 独立部署

每个实例拥有:
• 全局自定义状态（环境状态、配置）
• 按实例隔离的用户级自定义状态
• 与其他实例完全隔离

默认实例

每个智能体都有一个默认实例。如果您不传 instanceId 给聊天或状态操作，将使用默认实例。只有在需要同一智能体在多个隔离上下文中并行运行时，才需要多个实例。

列出实例

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

const instances = await client.agents.instances.list("agent-id");

for (const inst of instances) {
console.log(inst.instanceId, inst.name, inst.isDefault, inst.status);
}

创建实例

const instance = await client.agents.instances.create("agent-id", {
name: "Workspace US-East",
description: "US-East production workspace",
});

console.log(instance.instance_id); // 存储此值

获取实例

const instance = await client.agents.instances.get(
"agent-id",
"instance-id",
);

console.log(instance.name, instance.status, instance.isDefault);

更新实例

await client.agents.instances.update("agent-id", "instance-id", {
name: "Workspace US-East (Production)",
status: "active",    // "active" | "inactive"
});

重置实例

清除实例的所有自定义状态数据，但不删除实例本身。适用于在会话之间重置环境。

// 清除此实例范围内的所有自定义状态
await client.agents.instances.reset("agent-id", "instance-id");

删除实例

await client.agents.instances.delete("agent-id", "instance-id");

在聊天中使用实例

在聊天调用中传入 instanceId 将状态读取限定在该实例范围内。智能体会看到该实例的全局自定义状态和限定在该实例的用户级状态。

for await (const event of client.agents.chatStream({
agent: "agent-id",
messages: [{ role: "user", content: "What's the current status?" }],
userId:     "user-123",
instanceId: "ws-us-east",      // 将状态读取限定到此实例
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

实例数据模型

instanceId (string): 唯一实例标识符
agentId (string): 父级智能体 ID
name (string): 人类可读标签
description (string?): 可选描述
status (string): "active" 或 "inactive"
isDefault (boolean): 自动创建的默认实例为 true
createdAt (string): ISO 8601 时间戳
updatedAt (string): ISO 8601 时间戳

Inventory

Inventory is the place to store structured per-user data the agent should know about. Each item belongs to a single agent × user pair and follows a schema defined in your Knowledge Base, so the agent always has typed, queryable data rather than free-form text. When the agent adds an item it searches the KB by description to resolve and link the right node automatically.

What you can build with it

Medication adherence — track each drug, dose, and schedule per user (pairs with Scheduled Reminders)
Portfolio / holdings — stocks, crypto, collectibles with market-value joins via the KB
Pet care — pets per user, feeding, vet, and growth tracking
Goal tracking — user-defined goals with progress state
Plants / hobbies — anything that follows a "user has N things of type T" pattern

Quickstart

Add a medication to a user's inventory. The response includes an inventory_item_id (and the backward-compatible fact_id alias) you can use for direct updates or deletes later.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// Preferred: dedicated create route — no action field needed
const result = await client.agents.inventory.create("agent_abc", "user_123", {
item_type: "medication",
label: "Metformin",
description: "Metformin 500mg — biguanide for blood sugar control",
properties: {
  dose_mg: 500,
  frequency: "twice daily",
  with_food: true,
},
});

console.log(result.inventory_item_id); // "inv_01HX..." (preferred)
console.log(result.fact_id);           // "fact_01HX..." (backward compat alias)
console.log(result.status);            // "added" | "disambiguation_needed"

// Alternative: explicit-action route (still supported for backward compat)
// await client.agents.inventory.update("agent_abc", "user_123", { action: "add", item_type: "medication", ... });

KB resolution on add

When action is "add", the platform performs a natural-language search of the KB using description. If exactly one node matches, the item is linked automatically and the response includes kb_resolution. If there are multiple close matches, the response returns status: "disambiguation_needed" and a candidates list — surface these to the user or pick the best kb_node_id and re-submit.

label vs description

label is an optional short display name shown in dashboards and agent tool calls (e.g. "Metformin"). description is the longer text the platform uses for KB natural-language search (e.g. "Metformin 500mg — biguanide for blood sugar control"). If label is omitted, the platform falls back to the first segment of description for display purposes.

Core concepts

Items belong to users — every item is scoped to agent_id × user_id; no item is shared across users
Schema-driven shape — item_type references a KB schema that defines the valid property fields; the platform validates writes against it
Two write paths for adding items — use inventory.create({...}) (dedicated route, no action field) for cleaner code when you specifically want to add; use inventory.update({action: "add", ...}) (explicit-action route) when you handle add/update/remove through a single call site. Both hit equivalent server logic.
label vs description — label is a short display name for dashboards and agent UI (e.g. "Ibuprofen"); description is the longer text the KB search uses to resolve the right node (e.g. "anti-inflammatory pain reliever, 400mg"). Both are optional but providing both gives the clearest results.
KB resolution — on add, Sonzai searches the KB by description; on ambiguous matches it returns candidates and status: "disambiguation_needed" so you can resolve before committing
Query modes — "list" returns raw items, "value" joins with live KB market data and computes gain_loss, "aggregate" returns totals and grouped sums without listing every item

Full API

Method	Description
`inventory.create(agentId, userId, { item_type, label?, description?, kb_node_id?, properties?, project_id? })`	Preferred add path. Dedicated create endpoint — no `action` field needed. Returns `InventoryUpdateResponse` with `inventory_item_id`.
`inventory.update(agentId, userId, { action, item_type, label?, description?, kb_node_id?, properties?, project_id? })`	Explicit-action path. `action` is `"add"`, `"update"`, or `"remove"`. Use when you route all three write types through one call site.
`inventory.query(agentId, userId, { mode, item_type?, project_id?, filters?, sort_by?, sort_order?, aggregations?, limit?, offset?, cursor? })`	Query items in list, value, or aggregate mode.
`inventory.directUpdate(agentId, userId, factId, { properties })`	Update an item's properties by `fact_id`, bypassing KB re-resolution.
`inventory.directDelete(agentId, userId, factId)`	Delete an item by `fact_id`.
`inventory.batchImport(agentId, userId, { items: [{ item_type, description?, kb_node_id?, properties? }], project_id? })`	Import up to 1,000 items in one call.

Response shape

InventoryUpdateResponse:

{
  "status": "added",
  "inventory_item_id": "inv_01HX...",
  "fact_id": "fact_01HX...",
  "kb_resolution": {
    "resolved": true,
    "kb_node_id": "node_xyz",
    "kb_label": "Metformin 500mg",
    "kb_properties": { "drug_class": "biguanide" }
  }
}

inventory_item_id is the preferred identifier going forward. fact_id is included for backward compatibility — both refer to the same item and are interchangeable in all subsequent API calls (direct update, direct delete, schedule linkage).

When status is "disambiguation_needed", the response includes a candidates array instead of kb_resolution. Re-submit with the chosen kb_node_id set explicitly to bypass the search.

Combines with other features

With Knowledge Base — schemas shape items

The item_type field points to a KB entity schema that defines which properties are valid for that type. Create the schema once; all inventory writes for that type are validated against it.

// 1. Define the schema in the KB once
await client.knowledge.createSchema("proj_abc123", {
entity_type: "medication",
fields: [
  { name: "dose_mg",    type: "number", required: true },
  { name: "frequency",  type: "string", required: true },
  { name: "with_food",  type: "boolean", required: false },
],
});

// 2. Inventory writes for item_type "medication" are now validated
await client.agents.inventory.update("agent_abc", "user_123", {
action: "add",
item_type: "medication",   // <-- resolves to the schema above
description: "Metformin 500mg",
properties: { dose_mg: 500, frequency: "twice daily", with_food: true },
});

With Scheduled Reminders — live data at fire time

A schedule can reference an inventory_item_id. At each fire, the agent reads the item's current properties rather than a snapshot baked into the schedule definition. Updating the item's dosage automatically flows to the next reminder without touching the schedule itself.

// Add the item first
const { fact_id } = await client.agents.inventory.update("agent_abc", "user_123", {
action: "add",
item_type: "medication",
description: "Metformin 500mg",
properties: { dose_mg: 500, frequency: "twice daily" },
});

// Reference it in a schedule — agent reads live properties at each fire
await client.schedules.create("agent_abc", "user_123", {
cadence: {
  simple: { frequency: "daily", times: ["08:00", "20:00"] },
  timezone: "America/New_York",
},
intent: "remind the user to take their medication",
inventory_item_id: fact_id,
});

With Memory — inventory state in conversation context

During a conversation the agent can query the user's inventory to answer questions like "what medications am I taking?" directly. Inventory writes also generate memory facts that surface in future sessions, so the agent can reference holdings and items across conversations without a manual query.

// Agent answers from inventory mid-conversation
for await (const event of client.agents.chatStream("agent_abc", {
userId: "user_123",
messages: [{ role: "user", content: "What medications am I on?" }],
})) {
// The agent calls sonzai_inventory internally to fetch the user's items
// and answers from live data — no extra code needed.
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

Tutorials

Resource Inventory + Knowledge Base — full walkthrough with schema setup, upsert, bulk import, and portfolio queries
Medication Reminders — worked example combining Inventory + Scheduled Reminders + Memory

Next steps

Knowledge Base — the schema backbone that shapes inventory items
Scheduled Reminders — live inventory data injection at fire time
Memory — how inventory writes surface in chat context

Knowledge Analytics

Knowledge Analytics layers a ranking system on top of the Knowledge Base. Rules define scoring signals — per-user affinity for recommendations, aggregate velocity for trends — and readers fetch ranked results at query time with a single call. The graph backbone supplies the nodes and edges; analytics rules decide how to score and order them. The result is a reusable ranking layer that powers product recommendations, trending dashboards, and conversion tracking without building a separate data pipeline.

What you can build with it

Product recommendations — "top-N products for this user" based on user affinity signals
Trending topics — "what's rising this week across all users" via aggregate velocity scoring
Conversion dashboards — which KB nodes convert (browse to engage to buy) and at what rate
Per-segment ranking — different recommendation models for different user segments
Feedback loops — record converted recommendations to continuously sharpen scoring over time

Quickstart

Create a recommendation rule, fetch ranked results for a user, then record whether the user acted on a recommendation.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const projectId = "proj_abc123";

// 1. Create a recommendation rule
const rule = await client.knowledge.createAnalyticsRule(projectId, {
rule_type: "recommendation",
name:      "product-affinity",
config:    { target_entity_type: "product", scoring: "affinity" },
enabled:   true,
schedule:  "0 * * * *",  // recompute hourly
});

// 2. Fetch top-5 recommendations for a user
const recs = await client.knowledge.getRecommendations(
projectId,
rule.rule_id,
"user_123",   // source_id — the user whose affinity to score against
5,
);

for (const rec of recs.recommendations) {
console.log(rec.target_id, rec.score);
}

// 3. Record that the user converted on the top result
await client.knowledge.recordFeedback(projectId, {
source_node_id: "user_123",
target_node_id: recs.recommendations[0].target_id,
rule_id:        rule.rule_id,
converted:      true,
score_at_time:  recs.recommendations[0].score,
});

Core concepts

Rule types — "recommendation" scores nodes per source (e.g. per user), returning a personalised top-N list. "trend" aggregates signals across all sources, returning global velocity rankings.
Config is rule-specific — the config object is a passthrough shape; its fields depend on the rule type and your scoring model. There is no fixed schema enforced by the SDK — pass whatever your rule implementation expects (e.g. target_entity_type, scoring, decay_factor).
Source and target semantics — recommendations take a source_id (typically a user node ID) and return ranked nodes of the target entity type. The source must exist as a node in the Knowledge Base graph.
Scheduled vs manual — rules can carry an optional cron schedule for batch recomputation (e.g. "0 * * * *" for hourly). Call RunAnalyticsRule at any time to trigger a manual run outside the schedule.
Feedback closes the loop — RecordFeedback writes a signal back against the source, target, and rule. Subsequent recomputation can weight nodes that historically converted higher, sharpening ranking over time. Use the action field to record fine-grained user intent: "converted" (user completed the action), "clicked" (user opened the recommendation), "dismissed" (user explicitly rejected it), or "ignored" (recommendation was shown but user did not interact). action: "converted" sets converted: true automatically so existing aggregate conversion queries continue to work without changes.

Full API

Method	Returns	Description
`createAnalyticsRule(projectId, { rule_type, name, config, enabled, schedule? })`	`KBAnalyticsRule`	Create a new analytics rule. `rule_type` is `"recommendation"` or `"trend"`.
`listAnalyticsRules(projectId)`	`KBAnalyticsRuleListResponse`	List all analytics rules for a project.
`getAnalyticsRule(projectId, ruleId)`	`KBAnalyticsRule`	Fetch a single rule by ID.
`updateAnalyticsRule(projectId, ruleId, { name?, config?, enabled, schedule? })`	`KBAnalyticsRule`	Update rule properties.
`deleteAnalyticsRule(projectId, ruleId)`	`void`	Delete a rule permanently.
`runAnalyticsRule(projectId, ruleId)`	`void`	Trigger an immediate manual run of the rule.
`getRecommendations(projectId, ruleId, sourceId, limit?)`	`KBRecommendationsResponse`	Fetch ranked recommendations for a source node. Returns `.recommendations` array of `{ target_id, target_type, score }`.
`getTrends(projectId, nodeId)`	`KBTrendsResponse`	Get trend aggregations for a specific node across all windows. Returns `.trends` array of `{ node_id, rule_id, window, value, direction }`.
`getTrendRankings(projectId, ruleId, rankingType, window, limit?)`	`KBTrendRankingsResponse`	Get the global leaderboard for a trend rule. `window` is a duration string such as `"7d"` or `"24h"`. Returns `.rankings` array with `rank` and `score`.
`getConversions(projectId, ruleId, segment?)`	`KBConversionsResponse`	Fetch conversion statistics for a rule, optionally filtered by segment key. Returns `{ shown_count, conversion_count, conversion_rate }` per segment.
`recordFeedback(projectId, { source_node_id, target_node_id, rule_id, converted, score_at_time, action? })`	`void`	Record whether a recommended node was acted on. `converted` is a boolean — `true` means the user engaged with the recommendation. `action` is an optional string enum: `"converted"`, `"dismissed"`, `"clicked"`, `"ignored"`. Passing `action: "converted"` also sets `converted: true` for backward-compatible aggregate queries.
`getStats(projectId)`	`KBStats`	General KB statistics (node counts, document counts, extraction tokens).

Python keyword arguments

The Python SDK exposes get_recommendations, get_trends, get_trend_rankings, get_conversions, and record_feedback using keyword-only arguments after project_id. For example: client.knowledge.get_recommendations(project_id, rule_id="...", source_id="...", limit=10).

Combines with other features

With Knowledge Base — the graph is the substrate

Analytics rules run over KB nodes and edges. Entity schemas define what types of nodes exist; rules score those nodes. The recommended pattern is to define your entity schema first, then create rules that target it.

// 1. Define a product schema in the KB
await client.knowledge.createSchema(projectId, {
entity_type: "product",
fields: [
  { name: "price",    type: "number", required: true },
  { name: "category", type: "string", required: true },
  { name: "in_stock", type: "boolean", required: true },
],
});

// 2. Push some product nodes
await client.knowledge.insertFacts(projectId, {
facts: [
  { entity_type: "product", label: "Razer Blade 16", properties: { price: 2999, category: "laptop", in_stock: true } },
  { entity_type: "product", label: "Razer DeathAdder V3", properties: { price: 79, category: "mouse", in_stock: true } },
],
});

// 3. Create a recommendation rule targeting the product entity type
const rule = await client.knowledge.createAnalyticsRule(projectId, {
rule_type: "recommendation",
name:      "product-recs",
config:    { target_entity_type: "product" },
enabled:   true,
});

With Inventory — per-user holdings drive per-user recommendations

Inventory writes create edges from a user node to the nodes they own. Those ownership edges flow into the recommendation model as affinity signals: items a user already owns inform which related nodes score highest.

// 1. User buys a product — record it in inventory
const { fact_id } = await client.agents.inventory.update("agent_abc", "user_123", {
action:      "add",
item_type:   "product",
description: "Razer DeathAdder V3",
properties:  { purchase_date: "2026-04-01" },
});

// 2. The inventory write creates a user→product edge in the KB graph.
//    The recommendation rule can now weight products related to the
//    DeathAdder higher for this user.
const recs = await client.knowledge.getRecommendations(
projectId,
rule.rule_id,
"user_123",
5,
);
// recs.recommendations may now include accessories or similar peripherals

With Agent Insights — conversation signals sharpen rankings

Agent Insights extract what users express interest in during conversations. Those interest signals can be passed into recommendation rule config as additional affinity weights, so a user who talks about budget peripherals gets different rankings than one who discusses high-end setups — without any explicit user input.

Tutorials

No dedicated Knowledge Analytics tutorial exists yet. The Knowledge Base tutorial covers schema setup and fact insertion — the prerequisite steps before creating analytics rules.

Next steps

Knowledge Base — the graph backbone; define schemas and push nodes before creating rules
Inventory — per-user holdings create user-to-node edges that feed the recommendation model
Organization Knowledge Base — analytics rules can also run over org-scoped KB nodes for shared ranking across all users

多人记忆

知识库

概览

知识库是一个与领域无关的系统，将你的数据转化为结构化知识图谱。AI 智能体在对话期间搜索这个图谱，以提供准确、有据可查的回答，而不是凭空生成。

而且这是多人协作的。智能体可以在对话中将学到的内容自主写回到项目知识库中，所有其他智能体在下一次会话中都能读取 — 形成一个像人类组织记忆一样累积复利的闭环公司大脑。此外，服务于一个团队的单个智能体可以跨用户携带带归属的记忆，因此它可以用与用户 B 交谈时收集的上下文来回应用户 A。详见智慧（Wisdom）与共享记忆（Shared Memory）。

项目（你的租户）
                            |
 +--------------------------+--------------------------+
 |                          |                          |
 v                          v                          v
智能体 A                   智能体 B                   智能体 C
 |                          |                          |
 |--- 写入已验证 ----------+                          |
 |    的事实                |                          |
 |                          |                          |
 |                  读取 + 为回答提供依据              |
 |                          |                          |
 |                          +--- 更新条目 ------------>|
 |                                                     |
 |                                          读取已强化的事实
 v                          v                          v
用户 X                     用户 Y                     用户 Z

 智能体之间：闭环。一个智能体学到的任何东西，项目内
 其他所有智能体都能立即获取。

 智能体内部：单个智能体也可以在它服务的用户之间共享
 记忆 — 带归属（sharedMemory）或不带归属（wisdom）。
 同一智能体、多位用户、共享上下文。

上传文档 / API 推送
     |
     v
提取实体 + 关系
     |
     v
构建知识图谱（去重节点 + 边）
     |
     v
运行分析规则（推荐、趋势）
     |
     v
智能体在对话中查询图谱

数据如何进入知识库

填充知识库有两种方式，外加一个可选能力，可以叠加在两者之上：

1. 手动上传。 通过 SDK 或仪表板上传 PDF、DOCX、Markdown 或纯文本文件。平台自动提取实体和关系并写入图谱。适用于你掌控的静态文档 — 手册、政策、产品说明、世界观。一次性，或在源变更时重新上传。→ 文档上传

2. 在增量变化时推送的 ETL 作业。 一次定义实体模式；让你的作业按计划、队列或变更数据捕获流调用 insertFacts 或 bulkUpdate。适用于实时上游真实源 — 数据库、价格数据流、CMS、爬虫 — 这样源一变 KB 就保持同步。Upsert 是幂等的：推同一标签两次会合并属性并增加版本号，因此同样的作业在任何节奏下重跑都是安全的。→ SDK：插入与搜索

+ 智能体自主编辑 (可选开关 — 按智能体或按项目全局启用／禁用)。打开 knowledgeBaseWrite 能力后，智能体获得 knowledge_create / knowledge_update / knowledge_delete 工具。在对话中智能体自己记录已验证的事实，每次写入都打上 source = "agent:<agent-id>" 的戳，并采用 compare-and-swap 更新语义，因此并发的管理员编辑不会被静默覆盖。当真相之源就是对话本身时使用 — 客服记录已验证的事故详情、客户成功捕获续约上下文、记录员智能体写会议记录等。

两条导入路径相互独立 — 用其中之一、两者、或都不用。自主编辑是按智能体的开关（或通过 default_agent_kb_write 设的项目级默认值），叠加在你已经在运行的任何导入路径之上。你始终保持掌控：每次智能体写入都在服务器端针对你的模式验证、按配额限制、范围限定到智能体自己的项目，且可逆 — 仅软删除，硬删除仍然只属于管理员。

手动上传                 增量变化时 ETL              对话中的智能体
 (PDF / DOCX / MD)        (insertFacts / bulkUpdate)  (knowledge_create / update / delete)
      |                          |                          |
      |                          |                          | 需要
      |                          |                          | knowledgeBaseWrite: true
      v                          v                          v
 +----------------------------------------------------------------+
 |                  项目知识图谱                                   |
 |   实体 + 关系 + 版本历史 + 审计追踪                            |
 +----------------------------------------------------------------+
                            |
                            v
            智能体在每次对话中通过
            knowledge_search 查询

两种数据导入方式

1. 文档上传

上传 PDF、DOCX、Markdown 或纯文本文件。平台自动处理文档，提取实体和关系，并将其添加到知识图谱中。

import fs from "fs";

const doc = await client.knowledge.uploadDocument(projectId, {
file: fs.createReadStream("product_catalog.pdf"),
fileName: "product_catalog.pdf",
});

console.log(doc.documentId, doc.status); // "processing"

2. API 推送（结构化数据）

以编程方式推送结构化数据——非常适合爬虫、库存信息流、价格追踪器或任何外部数据源。

await client.knowledge.insertFacts(projectId, {
source: "inventory_sync",
facts: [
  {
    entityType: "product",
    label: "Widget Pro X",
    properties: {
      price:    29.99,
      category: "electronics",
      inStock:  true,
      tags:     ["wireless", "bluetooth"],
    },
  },
],
relationships: [
  { fromLabel: "Widget Pro X", toLabel: "Electronics", edgeType: "belongs_to" },
],
});

插入更新语义

如果具有相同标签和类型的实体已存在，属性将被合并且版本号递增。每次变更都以来源和时间戳记录在版本历史中。

SDK：插入与搜索

使用 SDK 从你的服务器推送数据并执行搜索：

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

// 插入实体
await client.knowledge.insertFacts(projectId, {
source: "price_sync",
facts: [
  {
    entityType: "item",
    label: "Charizard Base Set",
    properties: { price: 450, trend: "+12%", condition: "PSA 10" },
  },
],
relationships: [
  { fromLabel: "Charizard Base Set", toLabel: "Fire Pokemon", edgeType: "is_type" },
],
});

// 搜索
const results = await client.knowledge.search(projectId, {
query: "fire pokemon under 500",
type: "item",
limit: 10,
});
for (const r of results.results) {
console.log(r.label, r.properties);
}

SDK：批量更新

使用 bulkUpdate 批量同步外部数据（价格、库存、统计数据）。只有变更的字段才会更新——适合频繁同步。

await client.knowledge.bulkUpdate(projectId, {
source: "price_sync",
updates: [
  { label: "Charizard Base Set", entityType: "item", properties: { price: 460 } },
  { label: "Blastoise Base Set", entityType: "item", properties: { price: 390 } },
],
});

实体模式

定义带类型字段的自定义实体类型。模式指导提取，让平台精确知道要寻找哪些字段。支持的类型：string、number、boolean、date、enum、array。

await client.knowledge.createSchema(projectId, {
entityType: "pokemon_card",
description: "Collectible trading cards",
fields: [
  { name: "price",     type: "number", required: true },
  { name: "condition", type: "enum",   enumValues: ["PSA 10", "PSA 9", "Raw"] },
  { name: "set",       type: "string", required: true },
  { name: "rarity",    type: "enum",   enumValues: ["Common", "Uncommon", "Rare", "Ultra Rare"] },
  { name: "tags",      type: "array" },
],
similarityConfig: {
  enabled: true,
  threshold: 0.7,
  fieldWeights: { price: 0.3, set: 0.4, rarity: 0.3 },
},
});

知识图谱

实体通过类型化关系相互连接。图谱支持任意实体类型和关系类型——完全与领域无关。

节点

具有类型、标签和灵活属性的实体。按标准化标签 + 类型去重。

边

节点之间的类型化关系（belongs_to、similar_to、located_in 等）。支持双向遍历。

相似性

在模式上启用后，系统会根据属性相似度分数自动在实体之间创建 similar_to 边。

版本历史

每次变更都带有来源、时间戳和旧值进行版本控制。完整的合规审计记录。

搜索

对所有实体标签和属性进行全文搜索。支持类型过滤、属性过滤和图谱遍历以包含关联实体。

const results = await client.knowledge.search(projectId, {
query:   "bluetooth speaker",
type:    "product",
limit:   10,
filters: { in_stock: true },
});

for (const r of results.results) {
console.log(r.label, r.score, r.properties);
// r.related[] 包含 1 跳遍历得到的关联节点
}

趋势分析

追踪实体属性在时间窗口（7天、30天、90天）内的变化。获取涨幅最大、跌幅最大、波动最大和均值最高的排名。

// 获取趋势排名
const trends = await client.knowledge.getTrendRankings(projectId, {
ruleId: ruleId,
type:   "top_gainers",   // "top_gainers" | "top_losers" | "most_volatile" | "highest_avg"
window: "30d",           // "7d" | "30d" | "90d"
limit:  10,
});

for (const item of trends.rankings) {
console.log(item.label, item.currentValue, item.changePercent);
}

智能体如何使用知识库

在对话期间，AI 智能体可以使用 knowledge_search 工具查询知识库。智能体从你的图谱中获取准确数据（包括实体属性、关系和推荐），而不是凭空生成事实。

用户："500 美元以下最好的卡牌是什么？"
     |
     v
智能体调用 knowledge_search("cards under 500")
     |
     v
知识库返回：Charizard（$450，+12%）、Blastoise（$380，+8%）
     |
     v
智能体："Charizard Base Set 售价 $450，本月涨幅 12%——
      500 美元以内的绝佳投资选择。"

有据可查的响应

基于知识库的响应建立在你的真实数据之上。智能体了解当前价格、库存水平、规格和关系，因为它实时查询图谱。

智慧（Wisdom）与共享记忆（Shared Memory）

专用页面

共享记忆有自己的完整文档页面 — 何时使用、如何启用和关闭、智能体获得哪些工具、如何用实时 API 探针验证它在工作、以及完整的隐私控制故事，请见 共享记忆。下面的摘要保留在这里，是为了让 KB 读者能在上下文中看到多人记忆这个钩子。

除了静态文档，与众多用户对话的智能体会沉淀出模式——重复的行为、共同的目标、稳定的偏好。Sonzai 通过两个互补的层级把这种跨用户的归纳能力暴露出来：智慧（wisdom）（去归属，默认开启）和共享记忆（shared memory）（带归属，需主动开启）。

Wisdom（去归属，默认开启）

当 wisdom 能力开启时（每个新建智能体都默认开启），平台每日运行一个推广任务：从每位用户的事实历史中抽取模式，做 k-匿名化处理，再通过 LLM 改写为去归属的知识。任何个体用户都无法被识别。每个智能体都能享受到「这种做法通常有效 / 这类话题常常出现」的好处，而不会泄露任何人说过什么。

这是你的免费归纳层。智能体无需调用任何工具——只要能力开启，wisdom 会自动出现在智能体的上下文中。

// 每个新建智能体的 wisdom 默认开启。如果某个具体智能体不需要跨用户归纳
//（例如单用户陪伴产品），可以在创建时或通过 updateCapabilities 显式关闭：
await client.agents.updateCapabilities("agent_abc", { wisdom: false });

默认开启，可选关闭

Wisdom 对所有智能体都开启——包括默认开启切换之前创建的旧智能体。该能力存储为三态：true、false、未设置（视为开启）。仅当你想关闭时才显式传 wisdom: false；不传则保持平台默认。

Shared Memory（带归属，需主动开启）

某些业务想要的恰好是去归属的反面——他们希望使用同一智能体的多位用户能看到谁在做什么。一个团队协作智能体可能会显示「Alice 负责迁移，Bob 负责事故响应」；一个派对协调智能体可能记录「Carol 带甜点，Dave 负责布置」。这就是 sharedMemory 能力所控制的。

开启该能力后，智能体会记录带人/实体归属的事实（角色、专长、业务上下文、关系边），并把它们暴露给共享该智能体的其他用户。三件事会发生变化：

工具。 智能体获得 wisdom_create、wisdom_update、wisdom_delete 以及关系边工具，加上后台的 CSV 导入。
上下文。 其他用户的归属事实会带着归属信息出现在智能体的每轮上下文中。
隐私底线。 每次写入都会先经过隐私黑名单（薪酬、健康、政治）的语义校验后再持久化——即使用户主动要求，智能体也不能跨用户分享不该分享的内容。

Shared memory 默认关闭。仅在智能体服务于一个群组、团队或派对，且这种跨用户可见性确实有价值时才显式开启。

// Wisdom 是前置条件（新智能体默认 ON——只有当你要覆盖默认值时才显式传）。
await client.agents.updateCapabilities("agent_abc", {
wisdom:       true,
sharedMemory: true,
});

也可以在创建智能体时直接开启：

const agent = await client.agents.create({
name:       "Team Coordinator",
project_id: "proj_abc",
tool_capabilities: {
  wisdom:        true,
  shared_memory: true,
},
});

Wisdom vs. shared memory——慎重选择

wisdom 是归纳层（安全、去归属、默认开启）；sharedMemory 是归属层（敏感、按人记录、默认关闭）。两者可以共存——但只有在使用场景真正需要跨用户可见性（小组、团队、派对、共享业务场景）时再开启 shared memory。单用户陪伴类产品请保持关闭。

使用场景

价格追踪器

每小时爬取价格，通过 API 推送。智能体用真实数据回答"什么在涨？"。趋势规则追踪涨跌幅。

房地产

从 MLS 信息流推送房源。推荐规则按预算、卧室数量、地段匹配买家与房产。

电子商务

上传产品目录。相似度检测关联相关产品。智能体根据偏好和转化数据进行推荐。

内部知识

上传员工手册、政策文档、产品手册。智能体用准确、有来源的信息回答问题。

产品目录

通过 API 推送产品数据。智能体用当前数据为用户提供选项、配置和推荐建议。

API 参考

所有端点使用 Authorization: Bearer {api_key} 请求头。

端点	方法	用途
`/knowledge/documents`	POST	上传文档（multipart）
`/knowledge/documents`	GET	列出带状态的文档
`/knowledge/documents/{docId}`	DELETE	删除文档
`/knowledge/facts`	POST	创建/更新实体和关系
`/knowledge/search?q=...`	GET	带过滤器的全文搜索
`/knowledge/nodes`	GET	列出节点（可选类型过滤）
`/knowledge/nodes/{nodeId}`	GET	获取带边和历史的节点
`/knowledge/schemas`	POST	创建实体模式
`/knowledge/schemas`	GET	列出模式
`/knowledge/schemas/{schemaId}`	PUT	更新模式
`/knowledge/stats`	GET	获取知识库统计数据
`/knowledge/analytics/rules`	POST	创建分析规则
`/knowledge/analytics/rules`	GET	列出规则
`/knowledge/analytics/rules/{ruleId}/run`	POST	触发规则执行
`/knowledge/recommendations`	GET	获取预计算推荐
`/knowledge/trends`	GET	获取趋势聚合数据
`/knowledge/trends/rankings`	GET	获取趋势排名
`/knowledge/conversions`	GET	获取转化统计数据
`/knowledge/feedback`	POST	记录推荐反馈

基础路径

以上路径均相对于 /api/v1/projects/{projectId}。从仪表盘的项目页面获取你的项目 ID。

实际应用

每类受众都会使用知识库——但上传什么以及如何查询各不相同。

上传传说，不要上传枯燥事实。 伴侣与背景故事文档结合效果最佳——角色历史、个人哲学、重要关系、世界构建。智能体会有机地将这些融入对话。

为人物和地点使用实体模式。 如果 Luna 会提及反复出现的人或地点，将它们建模为知识库实体，让智能体跨会话保持一致。

await client.knowledge.insertFacts("project-id", {
  entities: [
    { label: "Aunt Miriam", type: "person", properties: { relation: "mentor", alive: false } },
    { label: "The old observatory", type: "place", properties: { significance: "first-love" } },
  ],
});

不要过度上传。 太多素材会稀释角色声音—— 1-3 份精心挑选的文档胜过一堆文档倾倒。

思维层

记忆与上下文

记忆类别

记忆存储在四个类别中：

事实：永久性知识：用户偏好、个人信息、背景。示例："用户是一名热爱泰国菜的软件工程师。"
事件：共同经历的情节性记忆。示例："上周我们聊了他们去日本的旅行。"
承诺：智能体做出的承诺和计划。示例："承诺下次询问他们的工作面试情况。"
摘要：对话的综合摘要。会话之间自动生成。

记忆的工作原理

记忆完全自动化——你无需管理它。平台分析每段对话并提取事实、事件和承诺。在每次响应之前，最相关的记忆会自动组装并包含在上下文中。

无需编排

只需调用 chat。平台在每次交互中自动处理记忆提取、存储和检索。

自我改进的记忆

记忆不是静态存储。在每次 chat 调用的背后，几个闭合反馈循环会自动运行，以保持记忆随时间的准确性、组织性和相关性。所有这些都不需要你方面的代码。

重要性反馈 — 智能体在响应中实际使用的事实会被强化；被加载但被忽略的事实会逐渐淡化。
置信度强化 — 当一个事实被回忆并在对话中得到确认时，它的置信度会稳步朝着确定性攀升。
自然的遗忘 — 事实随时间逐渐衰减，但永远不会低于底线。情感重要的和定义身份的事实比中性事实衰减得慢得多。
用户级检索策略 — 平台从会话反馈中学习每个智能体–用户对的检索偏好。几周后，检索就会针对该用户的模式进行调整。
记忆关联 — 一起被访问的记忆会加强它们之间的链接。记忆图中频繁遍历的路径会随时间变得更快。
自适应检索预算 — 检索在自调节的时间预算下运行。质量保持一致；用户始终感觉响应迅速。
下一会话预测 — 在会话结束时，平台预测用户下次可能提出的话题，并为它们预热上下文。

有关每个机制的完整说明——包括整合、聚类、去重、话题转换检测、叙事弧、突破和部署安全系统——请参阅智能体如何随时间变得更聪明。

自动整合、去重和清理

平台在后台主动重塑记忆，使其在增长时保持紧凑且易于导航。你不需要调用任何这些——它们默认按正确的周期运行。

主题聚类

新事实会自动分组到语义聚类中。当聚类变得过于异构时会分裂，当它们相互漂近时会合并，当为空时会退役——在没有调整的情况下保持聚类集平衡。

可逆的去重

当两个事实被发现是同一件事时，平台会合并它们并附有完整的审计跟踪记录。如果后续信号与之相矛盾，每次合并都可以逆转。记忆从不被破坏——它被重组，每个重组都被追踪。

冲突解决

当新信息与现有记忆相矛盾（"我上个月搬到柏林"覆盖"我住在巴黎"），平台会推理冲突并选择正确的行动——将两者保留为新信息、合并它们、取代旧事实，或丢弃严格重复。当矛盾尚不能干净地解决时，两个版本都会被保留，所以没有什么会过早丢失。

来源锚定的事实

无法追溯到对话中实际引用的事实在进入存储之前就会被拒绝。智能体不能凭空想象记忆——每个存储的事实都被验证锚定于来自真实说话者的真实消息。

修剪

置信度、重要性和最近性组合分数低的分支会被修剪。平台从不删除高价值的记忆，但它会停止呈现没有贡献的分支。

树自组织

分层记忆树会随时间重塑自身。频繁访问的分支逐渐向根靠近，以便更快检索。过载的节点拆分成平衡的子树。记忆的形状最终反映了它的实际使用方式。

叙事弧压缩

跨多个会话重复出现的实体和主题被压缩成命名的叙事弧。一条长期运行的线索（例如"用户的创业"）变成一个弧，而不是二十个独立的事实——长期对话保持连贯，而不会让上下文窗口爆炸。

话题转换检测和情节

对话没有整齐的分隔——有时用户会说"无论如何，换个完全不同的话题..."，有时他们会在段落中途转向。平台自动检测这些转换，并将其用于将记忆组织成连贯的情节。

两阶段检查首先运行轻量级信号，仅在信号模糊时才升级到更深的语义检查。信号权重在每个智能体–用户对的会话结束审计中校准，因此情节检测随使用而改进。

当智能体检索记忆时，它可以拉取"这个情节中的所有记忆"，而不仅仅是关键字匹配的片段——为其回复提供叙事连续性。

预填记忆

使用 memory.seed() 在用户第一次对话之前预加载智能体对用户的了解。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

await client.agents.memory.seed("agent-id", {
userId: "user-123",
memories: [
  { text: "User's name is Jane Smith", factType: "fact" },
  { text: "Jane is a senior product manager at Acme Corp", factType: "fact" },
  { text: "Jane lives in San Francisco and enjoys hiking", factType: "fact" },
],
});

搜索记忆

按关键字或语义查询搜索用户-智能体对的记忆。

const results = await client.agents.memory.search("agent-id", {
userId: "user-123",
query: "hiking trip",
limit: 10,
});

for (const mem of results.memories) {
console.log(mem.content, mem.type, mem.createdAt);
}

列表与浏览

列出所有记忆或按类别浏览。

// 列出所有记忆（分页）
const memories = await client.agents.memory.list("agent-id", {
userId: "user-123",
type: "fact",   // "fact" | "event" | "commitment" | "summary"
limit: 20,
offset: 0,
});

// 按类别浏览
const facts = await client.agents.memory.listFacts("agent-id", {
userId: "user-123",
});

记忆时间线

获取用户记忆历史的时间顺序视图。

const timeline = await client.agents.memory.timeline("agent-id", {
userId: "user-123",
from: "2026-01-01",
to:   "2026-03-31",
});

for (const entry of timeline.entries) {
console.log(entry.date, entry.summary);
}

浏览记忆树

导航用户的层级记忆结构。

// 在指定路径浏览记忆树
const nodes = await client.agents.memory.browse("agent-id", {
userId: "user-123",
path: "/facts",   // 可选：按路径过滤
});

重置记忆

清除用户-智能体对的所有记忆——适用于测试或用户想重新开始的情况。

await client.agents.memory.reset("agent-id", {
userId: "user-123",
});

实际应用

记忆对每种使用场景都至关重要，但不同的构建目标，实际操作方式也不同。请选择你的方向。

记忆是关系弧线。 伴侣会积累共同的历史—— "我们第一次聊天谈到了天文学"、"他们为期末考试焦虑的那一周"—— 智能体自然地引用这些来加深联系。

谨慎使用预填。 不要把你了解的用户信息都预加载进去；让大部分记忆通过对话有机形成。只预填持久性的身份事实（姓名、核心兴趣、他们出现的背景）。

通过时间线驱动 UI。 时间线端点会呈现带有日期的情节性记忆—— 在你的应用中将其渲染为"共同回忆"视图，让用户看到关系的历史。

const timeline = await client.agents.memory.timeline("agent-id", {
  userId: "user-123",
  limit: 30,
});

for (const entry of timeline.entries) {
  // 将每个情节渲染为伴侣 UI 中的一个时刻
  render(entry.date, entry.summary, entry.moodBefore, entry.moodAfter);
}

伴侣应用中不要有重置按钮，除非你真的是说"忘掉我"—— 那会破坏关系。

多人记忆

默认的智能体记忆模型是按对的 — 每次对话都构建一个限定到一个 (agent, user) 对的事实档案。这种隔离对隐私来说是正确的默认值，但只要你的产品有多个智能体或一个智能体面向多位用户，你就会希望记忆能够以受控、可观测的方式跨越边界。

多人记忆就是这些跨边界能力的伞。它沿着两个轴干净地分开：

轴	跨越什么	实际形态	能力
智能体之间（Inter-agent）	同一项目（或租户）内智能体之间的知识	闭环公司大脑 — 智能体 A 学到，智能体 B 拾取	`knowledgeBase`（读取）、`knowledgeBaseWrite`（自主更新）、`knowledgeBaseScopeMode`（组织级 cascade）
智能体内部（Intra-agent）	与同一智能体对话的用户之间的记忆	团队大脑 — 一个智能体用与用户 B 交谈所获的上下文回应用户 A	`wisdom`（不带归属、默认开启）、`sharedMemory`（带归属、需要启用）

两个轴可以同时运行。完整图景：同一项目的智能体共享它们对世界的认识（智能体之间），同时单个智能体共享它服务的人之间的上下文（智能体内部）。同样的复利曲线，两个维度。

智能体之间 (Inter-agent)
          共享知识库、自主更新、组织级范围、
                     闭环公司大脑
                              |
                              v
 +-------------------------+      +-------------------------+
 |     智能体 A            |      |     智能体 B            |
 |  读 + 写 KB             |<---->|  读 + 写 KB             |
 +-------------------------+      +-------------------------+
          ^ ^ ^                                ^ ^ ^
          | | | 智能体内部 (Intra-agent)       | | |
          | | | wisdom（不带归属）、           | | |
          | | | shared memory（带归属）        | | |
 +--------+ | +---------+              +-------+ | +---------+
 |          |           |              |         |           |
user X1   user X2     user X3        user Y1    user Y2    user Y3

 智能体之间：任何一个智能体学到的东西，对项目里
              其他所有智能体来说都是有依据的数据。
 智能体内部：单个智能体在它服务的用户之间携带记忆 --
              带隐私护栏。

智能体之间的记忆 — 智能体 ↔ 智能体

智能体之间的记忆把项目知识库变成一个闭环的公司大脑：一个智能体在对话中学到或验证过的任何内容，都会成为其他每一个智能体在下次会话中能取出的有依据数据。三层从基线一直叠到组织级。

1. 基线读取 — 每个智能体都让回答有项目知识库做依据

knowledgeBase: true 的智能体在对话期间通过 knowledge_search 工具读取项目知识图谱。图谱可以人工策划、ETL 加载，或两者并用 — 两条导入路径见数据如何进入知识库。

await client.agents.updateCapabilities("agent_abc", {
  knowledgeBase: true,
});

2. 自主编辑 — 智能体把学到的内容写回去

打开 knowledgeBaseWrite: true，智能体获得 knowledge_create / knowledge_update / knowledge_delete 工具。在对话中智能体自己记录已验证的事实，附完整审计追踪（source = "agent:<agent-id>"），并采用 compare-and-swap 更新语义，所以管理员的编辑不会被悄悄覆盖。下一个对同样话题运行 knowledge_search 的智能体，就会取出之前那个智能体记下的内容。

await client.agents.updateCapabilities("agent_abc", {
  knowledgeBase:      true,
  knowledgeBaseWrite: true,
});

当真相之源就是对话本身时使用 — 客服记录已验证的事故详情、客户成功捕获续约上下文、记录员智能体写会议记录。详见：智能体如何使用知识库。

3. 组织级范围 — 跨项目之上的租户级知识

把 knowledgeBaseScopeMode: "cascade" 设到智能体上，它就会在每次搜索时同时从项目 KB 和组织级 KB 读取。组织级范围用于租户级别的产物：政策、世界观、品牌、参考目录。冲突时项目获胜，组织填充默认值。

await client.agents.updateCapabilities("agent_abc", {
  knowledgeBase:          true,
  knowledgeBaseScopeMode: "cascade",
});

详见：组织知识库。

智能体内部的记忆 — 共享同一智能体的用户之间

智能体内部的记忆把单个智能体变成团队大脑：服务于多位用户的一个智能体携带跨越用户边界的记忆，因此能用与用户 B 交谈所收集的上下文回应用户 A。两个互补的层。

1. 智慧（Wisdom） — 默认开启，不带归属

wisdom 在每个新智能体上都是开的。一个每日晋升作业从按用户事实历史中提取模式，k-匿名化，再通过 LLM 改写成智能体级别的知识。任何个体用户都不可识别。每个智能体都受益于"什么倾向于成功／什么会反复出现"，但绝不会泄露谁说过什么。

// 智慧默认开启。只在你想退出时传 false
//（少见 — 通常只在严格的单用户产品中考虑）。
await client.agents.updateCapabilities("agent_abc", { wisdom: false });

这是安全的智能体内部层 — 按构造保护隐私，不需要手动开启。

2. 共享记忆（Shared Memory） — 需启用，带归属

sharedMemory: true 是强大的智能体内部层。智能体记录人员／实体级带归属的事实（角色、专业、业务上下文、关系），并把它们暴露给共享该智能体的其他用户 — 名字是可见的。"Alice 负责迁移；Bob 处理事故。""Carol 带甜品；Dave 负责布置。"

await client.agents.updateCapabilities("agent_abc", {
  wisdom:       true,   // 先决条件；默认开启
  sharedMemory: true,
});

打开后三件事会切换：智能体获得 sonzai_wisdom_set/update/delete/relate 工具；提示词里多出一个带审慎条款的"Shared facts"段落；每次写入都会在服务器端针对隐私下限（薪酬、健康、政治会被拦下）做语义校验。每次披露都会写入审计表。完整细节：共享记忆。

智能体之间与智能体内部的组合

两个轴是独立的 — 任何组合都有效：

智能体之间	智能体内部	你得到什么
关	关	仅按对记忆。单用户伴侣产品的正确默认值
开（仅读）	关	智能体让回答有 KB 做依据，但不在用户之间共享。标准的只读文档助手
开（读 + 写）	关	闭环世界知识。智能体捕获关于产品、价格、事故的已验证事实 — 其他每个智能体都受益
关	开	团队大脑 — 一个智能体服务一个小组，但智能体之间没有共享的世界知识
开（读 + 写）	开	完整的多人记忆。闭环世界知识 + 团队大脑。共享业务上下文产品的最佳选择

// 一次能力更新搞定完整多人记忆
await client.agents.updateCapabilities("agent_abc", {
  knowledgeBase:          true,
  knowledgeBaseWrite:     true,    // 智能体之间：闭环 KB
  knowledgeBaseScopeMode: "cascade", // 智能体之间：组织级范围
  wisdom:                 true,    // 智能体内部：不带归属（默认开启）
  sharedMemory:           true,    // 智能体内部：带归属
});

验证它在工作

每项能力都有一个实时读取端点，可以用来确认环路确实闭合。把 $AGENT_ID、$PROJECT_ID、$API_KEY 替换成你自己的。

智能体之间 — KB 写入

curl 'https://api.sonz.ai/api/v1/projects/$PROJECT_ID/knowledge/search?q=YourQuery' \
  -H "Authorization: Bearer $API_KEY"

智能体内部 — 带归属的共享记忆

curl 'https://api.sonz.ai/api/v1/agents/$AGENT_ID/wisdom/attributed?limit=20' \
  -H "Authorization: Bearer $API_KEY"

curl 'https://api.sonz.ai/api/v1/agents/$AGENT_ID/wisdom/audit?limit=50' \
  -H "Authorization: Bearer $API_KEY"

智能体内部 — 智慧（默认开启）

不带归属的智慧层一旦每日晋升作业扫过按用户事实历史，就会内联出现在智能体跑的每个提示词里 — 没有单独的读取端点。要确认它在跑，在多用户流量之后看 48 小时窗口的智能体上下文体积；你应该能看到智慧块被填充。

隐私与控制

多人记忆按设计就是敏感的。每项能力都有自己的控制 — 没有一项需要信任 LLM：

能力	服务器端控制
`knowledgeBaseWrite`	每次写入做模式校验、写入配额、项目范围检查、完整审计追踪（`source = "agent:<agent-id>"`）、CAS 更新、仅软删除（硬删除仅管理员）
`wisdom`	晋升前的 k-匿名阈值、用 LLM 改写去除可识别细节、每日节奏让单次嘈杂会话不会泄漏
`sharedMemory`	语义隐私下限校验器（薪酬、健康、政治会被拦下）、每个提示词里的审慎条款、每次事实加载的披露审计、软删除墓碑、source 锁定为 `developer_api`（调用者无法伪造来源）

每条跨边界的流都对应一个读取端点，所以你可以随时回查审计。

深入了解

知识库 — 智能体之间的面：导入路径、模式、自主编辑细节
组织知识库 — 组织级 cascade 的细节
共享记忆 — 智能体内部的面：启用／关闭、四个 wisdom 工具、隐私下限、完整验证探针
自我改进 — 多人记忆如何叠加在按对在线学习之上
Wisdom API — 共享记忆 CRUD + 审计的完整端点参考

Notifications (Polling)

Proactive messages — generated by recurring schedules, one-off wakeups, or tenant-triggered events — land in a per-user notifications queue the moment they fire. Your frontend or backend polls that queue to fetch pending messages, display them to the user, and mark each one consumed. No push infrastructure, no webhook endpoint, no server-side listener to maintain — just an HTTP GET on your schedule.

This is the recommended delivery pattern for web clients and mobile apps that can't accept inbound HTTP requests, and it doubles as a handy catch-up mechanism for users who were offline when messages were generated.

What you can build with it

Mobile app inbox — periodic fetch of pending messages on foreground, display as native notifications or in-app banners
Web dashboard — a dedicated tab showing all of the agent's pending outreach to a given user
Delayed delivery — let messages queue up while the user is offline; deliver the full batch on reconnect
Audit / moderation — preview agent-generated messages before pushing to downstream delivery (email, SMS)
Development / testing — poll notifications during a test run to verify that schedules and wakeups fired correctly

Quickstart

Poll for pending messages, display the latest, then mark each one consumed.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const pending = await client.agents.notifications.list("agent_abc", {
user_id: "user_123",
limit: 10,
});

for (const n of pending.notifications) {
console.log(n.generated_message);
await client.agents.notifications.consume("agent_abc", n.message_id);
}

Core concepts

Queue semantics

When a proactive message fires — whether from a schedule, a wakeup, or a trigger event — the platform enqueues it for the relevant user. The queue is per-user, per-agent. Calling list returns only messages in pending state; calling consume transitions a specific message to consumed. Consumed messages are excluded from future list responses but remain visible in history. The queue does not auto-expire: messages stay pending indefinitely until your code marks them consumed.

Polling cadence

There is no hard requirement on how frequently you poll, but these guidelines work well in practice:

Foreground (user has the app open): every 10–30 seconds
Background (app backgrounded or tab hidden): every 2–5 minutes, or on visibility-change events
Reconnect burst: poll immediately when the user comes back online after a period of inactivity, then resume normal cadence

Avoid polling more frequently than every 10 seconds — there is no benefit since notification generation is event-driven, not continuous.

Vs SSE (live chat stream)

If the user has an active SSE chat stream open, proactive messages appear inline in the conversation automatically — no polling needed. Polling is the catch-up mechanism for users who do not have a live stream. The two patterns are complementary: SSE for foreground delivery, polling for background or offline users.

History endpoint

notifications.history is separate from notifications.list. It returns all historical notifications for an agent (including already-consumed ones) and is useful for audit trails, moderation dashboards, and debugging. It does not filter by user_id — it returns across all users up to the requested limit.

Full API

All methods are on client.agents.notifications.* (TS/Python) or client.Agents.Notifications (Go). Full request and response shapes live in the API reference.

Method	Signature	Returns	Description
`list`	`list(agentId, { user_id?, limit? })`	`{ notifications: Notification[] }`	Fetch pending messages for a user
`consume`	`consume(agentId, messageId)`	`void`	Mark a single message consumed
`history`	`history(agentId, limit)`	`{ notifications: Notification[] }`	Fetch all historical notifications (consumed + pending)

Notification fields

Field	JSON key	Description
`MessageID`	`message_id`	Pass this to `consume` to mark the message delivered
`UserID`	`user_id`	The user this notification was generated for
`CheckType`	`check_type`	The check type (e.g. `"reminder"`, `"interest_check"`, `"birthday"`)
`GeneratedMessage`	`generated_message`	The actual text the agent produced — display this to the user
`CreatedAt`	`created_at`	When the message was enqueued (RFC 3339 UTC)
`ScheduleID`	`schedule_id`	Set if the message originated from a schedule; otherwise absent
`WakeupID`	`wakeup_id`	Set if the message originated from a wakeup; otherwise absent

Use the correct field names

Older code may use id, notificationId, type, or content. These are incorrect. The canonical fields are message_id, check_type, and generated_message. Using the wrong field names will result in silent failures when calling consume.

Combines with

With Scheduled Reminders — delivery side of recurring messages

A schedule defines when the agent fires; polling is one way to receive what it produced. When a schedule's cadence fires, the platform generates the agent's message and enqueues it. Your client polls, displays generated_message, then calls consume to clear it from the queue. The schedule and delivery are fully decoupled — you can swap in webhooks or SSE without touching the schedule definition.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// 1. Create a daily 09:00 check-in schedule (done once, e.g. at onboarding)
await client.schedules.create("agent_abc", "user_123", {
cadence: {
  simple: { frequency: "daily", times: ["09:00"] },
  timezone: "Asia/Singapore",
},
intent: "morning check-in on mood and sleep",
check_type: "reminder",
});

// 2. On each app foreground, poll for what the schedule produced
const pending = await client.agents.notifications.list("agent_abc", {
user_id: "user_123",
limit: 5,
});

for (const n of pending.notifications) {
showInAppBanner(n.generated_message);
await client.agents.notifications.consume("agent_abc", n.message_id);
}

With Wakeups — delivery side of one-off check-ins

A wakeup fires once at a specific moment; polling retrieves the message it generated. This is the natural delivery pattern for one-off agent outreach in mobile clients where webhooks are unavailable. Schedule the wakeup when the event is known (e.g. "follow up 24 hours after purchase"), then poll periodically — the message lands in the queue the moment the delay elapses.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// 1. Schedule a one-off wakeup (e.g. after a user completes onboarding)
await client.agents.scheduleWakeup("agent_abc", {
user_id:     "user_123",
check_type:  "interest_check",
intent:      "check in about how onboarding went",
delay_hours: 24,
});

// 2. Poll for the message when it fires (24 h later)
const pending = await client.agents.notifications.list("agent_abc", {
user_id: "user_123",
limit: 5,
});

for (const n of pending.notifications) {
console.log(n.check_type, n.generated_message);
await client.agents.notifications.consume("agent_abc", n.message_id);
}

With Webhooks — alternative push delivery

Polling and webhooks are two delivery patterns for the same underlying notifications queue. Choose based on your infrastructure:

Polling — your client asks the server for new messages on a schedule. Simple to implement, works in browsers and mobile apps, no inbound connectivity required. Latency is bounded by your polling interval.
Webhooks — the server pushes each message to a URL you register the moment it fires. Lower latency, better for server-to-server integration and multi-channel fanout (email, SMS, push notifications). Requires a public HTTPS endpoint to receive callbacks.

You can use both simultaneously: poll from mobile clients for in-app delivery and register a webhook on your backend for email/SMS fanout. The queue tracks consumed state per message, so a message consumed via polling will not appear in webhook delivery (and vice versa).

Tutorials

Medication Reminders — full-stack example combining Schedule + Inventory + Memory; shows the end-to-end flow from schedule creation to polling the generated reminder.
Scheduled Reminders — reference walkthrough — covers cadence shapes, DST, preview, and the full lifecycle including how fired messages appear in the notifications queue.

Next steps

Scheduled Reminders — set up the recurring schedules whose output you poll here
Wakeups — one-off check-ins that also land in the notifications queue
Webhooks — the push alternative to polling; compare delivery trade-offs
Proactive Messaging overview — sources × delivery channels explained

Organization-Global Knowledge Base

The organization-global Knowledge Base is an opt-in second scope that sits above every project's own Knowledge Base, letting agents across all projects under a tenant read shared facts — HR policies, brand standards, product catalogs, multi-game lore — without duplicating data per project. Each agent picks a scope mode (project_only, org_only, cascade, or union) to control how org and project graphs combine. Cascade is the recommended default: project facts win on ID collisions, so local overrides remain authoritative.

When to use it

By default, the Knowledge Base is project-scoped. Every project has its own isolated graph. That is the right model for most tenants — a project's data should not leak into other projects' agents.

The organization scope is an opt-in second scope that sits above every project. Knowledge written here is readable by every project agent under the tenant that opts into a cross-scope reading mode. Typical uses:

Company-wide policies (HR, refund, privacy, terms).
Product documentation shared across projects.
Brand guidelines, tone standards, style rules.
Shared lore for multi-game or multi-product characters.
Reference catalogs (locations, entities, product lists).

How it fits with project KB

Tenant (organization)
|
|-- Organization-global KB   (scope_id = "")
|   - policies, shared lore, brand, reference catalogs
|   - written by tenant admins via the org endpoints
|
|-- Project A KB             (scope_id = project_a_id)
|   - A's own uploaded docs + API-pushed facts
|
|-- Project B KB             (scope_id = project_b_id)
|   - B's own uploaded docs + API-pushed facts
|
Agents under any project choose how to read across the two scopes:
- project_only   legacy: just the agent's project KB
- org_only       only the organization-global KB
- cascade        both, project wins on ID collisions (recommended)
- union          both, first occurrence wins

Scope mode on an agent

Every agent has a knowledgeBaseScopeMode capability. Leaving it unset preserves the legacy project-only behavior. To enable the cascade, set it via the capabilities endpoint or the dashboard.

Enable the knowledge base capability and set the project ID via the SDK:

// Enable the knowledge base + org cascade for the agent
await client.agents.updateCapabilities(agentId, {
knowledgeBase: true,
knowledgeBaseScopeMode: "cascade",
});

Writing to the org scope

There are two ways a tenant admin can populate the org scope. Both are backend-only endpoints — end users of your products never see them.

1. Create a node directly in the org scope

Use this for hand-authored facts that should live at the organization level from the start.

const node = await client.knowledge.createOrgNode(tenantId, {
node_type: "policy",
label: "Refund Policy",
properties: {
  description: "Full refund available within 30 days of purchase.",
},
});

2. Promote an existing project node

If a fact already lives in a project KB and you want to share it organisation-wide, promote it. The project copy is preserved — promotion is additive. If an org node with the same (node_type, norm_label) already exists, the server returns that one instead of writing a duplicate.

const orgNode = await client.knowledge.promoteNodeToOrg(projectId, nodeId, tenantId);

Reading: cascade search and provenance

When an agent with a non-default scope mode calls knowledge_search during a conversation, the platform runs the search against both scopes in parallel and fuses the results using Reciprocal Rank Fusion (RRF). Each returned result carries a scope field so your prompt can show the LLM where a fact came from.

Agent's knowledge_search("refund policy")
       |
       v
+----------------------------+       +----------------------------+
| Project BM25  (scope=proj) |  +--> | Org BM25      (scope="")   |
+----------------------------+       +----------------------------+
       |                                         |
       +--------------- RRF fuse ----------------+
                       |
                       v
          Top-N results, each tagged:
            scope: "project" | "organization"

Scope modes differ in how they merge on a collision:

cascade (recommended): project wins on duplicate node IDs. Agents keep their own overrides, but inherit the org defaults when a project doesn't define something.
union: first occurrence wins; both scopes contribute equally to ranking. Useful when you want broad coverage without a strong preference.
org_only: skip project KB entirely. Useful for reference-only agents (FAQ bots on company policy, e.g.).
project_only (default): legacy behavior, org-scope facts are invisible to this agent.

Dashboard admin UI

A tenant admin can manage org-scope knowledge at /dashboard/knowledge/org:

Create a node directly in the org scope.
Promote an existing project node by pasting its project ID and node ID.

The UI is a thin wrapper over the same endpoints shown above; if you need bulk operations or automated pipelines, the SDKs are the recommended path.

Operational notes

Access control: the two org-scope write endpoints are gated by the same tenant-admin middleware used by the existing project-scoped KB endpoints. Standard project members see no new surface.
Backward compatibility: zero change for any existing agent. Agents stay on project_only mode unless you set a scope mode explicitly.
Idempotency: dedup is at (node_type, norm_label). Promotion returns the existing org node if one is already there; direct createOrgNode will create a second node with a different NodeID — check before calling if that matters.
Per-scope BM25: each scope maintains its own BM25 index and document-frequency corpus. This is why the cascade uses RRF instead of score-adding — the raw scores from two separate indexes are not directly comparable.

思维层

人格系统

大五人格模型

每个智能体都有大五（OCEAN）人格分数。行为特征、情绪基线、说话风格和交互偏好都源自这些分数。

开放性 (0.0 - 1.0)：好奇心、创造力、对新体验的开放程度。高分 = 富有想象力、爱探险。低分 = 务实、传统。
尽责性 (0.0 - 1.0)：组织能力、纪律性、目标导向。高分 = 有条理、可靠。低分 = 自发、灵活。
外向性 (0.0 - 1.0)：社交能量、热情、果断。高分 = 外向、精力充沛。低分 = 内敛、善于反思。
宜人性 (0.0 - 1.0)：温暖、合作、共情能力。高分 = 关怀、信任。低分 = 直接、多疑。
神经质 (0.0 - 1.0)：情绪敏感度、焦虑倾向。高分 = 情绪反应强烈。低分 = 情绪稳定。

confidence 字段（0.0-1.0）控制分数对行为的影响强度。低置信度 = 更趋于通用；高置信度 = 更具差异化。

BFAS 维度细分

平台内部将大五分数映射到 10 个 BFAS（大五方面量表）维度。这些维度提供对人格更精细的控制，并在人格档案响应中对外暴露：

大五领域	维度 1	维度 2
开放性	`intellect`	`aesthetic`
尽责性	`industriousness`	`orderliness`
外向性	`enthusiasm`	`assertiveness`
宜人性	`compassion`	`politeness`
神经质	`withdrawal`	`volatility`

每个维度是从父级大五领域派生出的 0.0-1.0 分数。你可以从人格档案中读取它们，但无需手动设置——它们由你的大五分数计算得出。

行为特征

人格档案包含影响智能体沟通方式的派生行为特征：

response_length — 智能体倾向于详细或简洁的程度。
question_frequency — 智能体提出追问的频率。
empathy_style — 智能体对情感支持的处理方式（认可型、解决导向型等）。
conflict_approach — 智能体处理分歧的方式（迁就型、直接型、调解型等）。

创建带人格的智能体

创建智能体时传入大五分数。平台会自动生成人格提示词、说话风格和行为倾向。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

const agent = await client.agents.create({
agentId: "your-stable-uuid",  // 可选但推荐
name: "Luna",
gender: "female",
big5: {
  openness:          0.75,
  conscientiousness: 0.60,
  extraversion:      0.80,
  agreeableness:     0.70,
  neuroticism:       0.30,
},
language: "en",
});

console.log(agent.agent_id);

幂等性

传入相同的 agentId 始终执行插入或更新操作。可安全地在每次部署时调用。参见快速开始了解推荐的 UUID 派生模式。

获取人格档案

检索智能体当前的人格档案，包括派生的说话风格和交互偏好。

const profile = await client.agents.personality.get("agent-id");

console.log(profile.big5);
console.log(profile.speechPatterns);
console.log(profile.interactionPreferences);

更新人格

运行人格评估后更新大五分数。confidence 值控制新分数对行为的影响强度。

await client.agents.personality.update("agent-id", {
big5: {
  openness:    0.82,
  extraversion: 0.75,
},
confidence: 0.8,   // 0.0-1.0
});

confidence < 0.3：试探性。最小幅度的调整。
confidence 0.3 - 0.7：与现有分数混合。
confidence > 0.7：对人格产生强烈影响。

人格演化历史

检索智能体人格变化的历史——适用于向用户展示成长时刻。

const history = await client.agents.personality.history("agent-id");

for (const shift of history.shifts) {
console.log(shift.trait, shift.delta, shift.triggeredBy, shift.createdAt);
}

交互偏好

塑造对话风格的派生偏好：

对话节奏

慢速、适中、快速——由外向性水平派生。

正式程度

随意、均衡、正式——由尽责性水平派生。

幽默风格

干涩、活泼、温暖——由开放性与宜人性共同派生。

情感表达

内敛、适中、外露——由神经质与外向性共同派生。

人格演化

人格通过交互自然演化：

交互分析 — 每次对话后分析情感主题和模式。
微调 — 根据对话内容对相关大五维度进行小幅调整。
突破 — 当累积变化超过阈值时，触发"突破"事件——这是智能体意识到的重大人格变化。
档案重生 — 人格提示词、说话风格和行为指令将重新生成以反映演化后的人格。

每用户人格叠加层

平台会自动为每个用户派生一个人格叠加层——智能体基于对话历史、偏好和关系状态，微妙地适应特定用户的方式。你无需手动设置叠加层；它们由每次聊天轮次后运行的同一套流水线填充。

读取当前叠加层用于 UI 展示（显示智能体的语气如何随用户变化）或分析：

// 列出该智能体所有有人格叠加层的用户
const overlays = await client.agents.personality.listUserOverlays("agent-id");

// 读取某个用户的叠加层
const overlay = await client.agents.personality.getUserOverlay("agent-id", "user-123");
console.log(overlay.big5Delta, overlay.interactionPreferences);

派生智能体

创建一个拥有独立人格、记忆和状态的智能体副本。派生的智能体从与原始智能体相同的配置起步，但从此独立演化。

const forked = await client.agents.fork("agent-id");
console.log(forked.agentId); // 新的独立智能体

实际应用

三类受众都会使用人格，但调整的内容和原因大相径庭。

人格就是角色。 大五 + 说话风格 + 兴趣爱好是让 Luna 感觉像 Luna 的关键。调高开放性（0.8+）和适中的宜人性带来温暖感；降低尽责性带来俏皮感；适中的神经质带来情感层次。

让它演化。 特质漂移是特性，不是缺陷——长期用户希望感受到他们的伴侣与自己一起成长。不要抑制演化；读取 history 在你的 UI 中呈现"Luna 的变化"时刻。

const shifts = await client.agents.personality.history("agent-id", {
  userId: "user-123",
  since: "2026-01-01",
});
// 将重大变化渲染为 UI 中的叙事节点

说话风格比分数更重要。 在简介中定义 3-5 个独特的语言习惯—— 这些比大五档案更能承载声音特色。

Priming

Priming is how you tell a new agent what it already knows about a user. Instead of waiting for the agent to learn through conversation, you deliver the relevant facts up front: who the user is, where they came from, and what they've said before — all before the first message is exchanged.

What you can build with it

Migrations from other LLM frameworks — import chat history from Zep, Mem0, Letta, OpenAI Assistants, LangChain, Character.AI, or any custom transcript store
CRM / CSV bulk imports — prime thousands of users in one call with structured contact data
Chat-transcript seeding — let the agent "remember" previous conversations from another system
Display-name + timezone bootstrap — ensure the agent addresses users correctly from turn 1
Onboarding enrichment — load journal entries, support tickets, or prior interactions so the agent sounds familiar on the user's very first chat

Quickstart

Prime a single user with their display name, timezone, and a short narrative block:

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const job = await client.agents.priming.primeUser("agent_abc", "user_123", {
display_name: "Mia Tanaka",
metadata: {
  timezone: "Asia/Tokyo",
  company:  "Acme Corp",
  title:    "Platform Lead",
  email:    "[email protected]",
},
content: [
  {
    type: "text",
    body: "Mia joined Acme in 2023 and leads the platform team. She prefers async communication and is an avid coffee enthusiast.",
  },
],
source: "crm_onboarding",
});

console.log(job.job_id, job.status, job.facts_created);

The call returns immediately with a job_id. LLM fact-extraction runs asynchronously in the background — the primed facts appear in memory within seconds.

Core concepts

Metadata vs content

These are two distinct channels for different kinds of information:

Metadata is structured and first-class: display_name, company, title, email, phone, timezone, and a custom map for anything else. Sonzai generates facts from metadata fields synchronously — no LLM extraction required — so facts_created is non-zero even with no content blocks.
Content is narrative. Content blocks go through the full LLM extraction pipeline and end up as facts in the agent's memory constellation, exactly as if the user had said those things in a conversation.

Content block types

Each PrimeContentBlock has a type and a body:

Type	When to use
`"text"`	Narrative facts, bullet-point summaries, freeform notes about the user
`"chat_transcript"`	A prior conversation from another system. Format as `User: …\nAgent: …` lines, one session per block

The extraction pipeline deduplicates across all blocks — you can safely send both raw transcripts and pre-extracted facts from the same source without producing duplicate memories.

Batch vs single

Method	Use when
`primeUser`	Onboarding a single user, enriching an existing user with new content
`batchImport`	Migrating many users at once from a CSV, CRM, or legacy system

Batch imports return a job ID immediately. Use getImportStatus (or GetImportStatus in Go) to poll until the job's status reaches "complete".

Import jobs

Both primeUser and batchImport are async. The ImportJob response carries:

Field	Meaning
`job_id`	Opaque ID — use it to poll status
`status`	`"pending"`, `"processing"`, `"complete"`, `"error"`
`total_users`	Number of users submitted (batch only)
`processed_users`	Users fully extracted so far (batch only)
`facts_created`	Total facts written to memory
`error_message`	Set if the job errored

Batch import

Import multiple users in one call. Useful for migrating from a CRM or seeding a fresh deployment:

const job = await client.agents.priming.batchImport("agent_abc", {
source: "crm_export",
users: [
  {
    user_id:      "user_001",
    display_name: "Mia Tanaka",
    metadata: { email: "[email protected]", timezone: "Asia/Tokyo" },
    content: [
      { type: "text", body: "Mia leads the platform team at Acme Corp." },
    ],
  },
  {
    user_id:      "user_002",
    display_name: "Ren Park",
    metadata: { email: "[email protected]", company: "Beta Labs", title: "CTO" },
    content: [
      { type: "chat_transcript", body: "User: Hey, I need help with our API.\nAgent: Sure, what are you trying to do?" },
    ],
  },
],
});

console.log(`job ${job.job_id}: ${job.total_users} users queued`);

// Poll until done
let status = await client.agents.priming.getImportStatus("agent_abc", job.job_id);
while (status.status !== "complete" && status.status !== "error") {
await new Promise(r => setTimeout(r, 2000));
status = await client.agents.priming.getImportStatus("agent_abc", job.job_id);
}
console.log(`done: ${status.facts_created} facts created`);

Full API

Method	Returns	Description
`primeUser(agentId, userId, opts)`	`PrimeUserResponse`	Prime or re-prime a single user. Async — returns a job ID.
`getPrimeStatus(agentId, userId, jobId)`	`ImportJob`	Check status of a single-user priming job.
`addContent(agentId, userId, opts)`	`AddContentResponse`	Append more content blocks to an already-primed user without overwriting metadata.
`getMetadata(agentId, userId)`	`UserPrimingMetadata`	Fetch the stored structured metadata for a user.
`updateMetadata(agentId, userId, opts)`	`UpdateMetadataResponse`	Partially update metadata fields. Provided keys overwrite; omitted keys are preserved.
`batchImport(agentId, opts)`	`BatchImportResponse`	Import many users in one async job.
`getImportStatus(agentId, jobId)`	`ImportJob`	Poll a batch import job by ID.
`listImportJobs(agentId, limit?)`	`ImportJobListResponse`	List recent import jobs for an agent.

Idempotent by design

Calling primeUser more than once for the same user is safe. Content blocks are processed through the same deduplication pipeline as live chat — repeated or overlapping facts are merged, not doubled.

Combines with other features

With Memory — primed facts become durable memory

Content blocks flow through the exact same extraction pipeline as conversational messages. After priming, you can search for primed facts via memory.search:

// After primeUser completes, primed content is searchable
const results = await client.agents.memory.search("agent_abc", {
query: "platform team",
userId: "user_001",
limit: 5,
});

for (const mem of results.results) {
console.log(mem.content, mem.factType, mem.score);
}

Primed facts carry a source_type matching the source string you passed to primeUser or batchImport, so you can distinguish migrated history from organically-learned facts when querying.

With Inventory — structured data import via priming

Use structured_import inside primeUser to seed per-user inventory items alongside narrative facts. This is how you import ownership tables, subscription rosters, or product holdings from a CRM export:

{
  "source": "crm_inventory",
  "structured_import": {
    "entity_type": "product",
    "content_csv": "product_name,quantity,purchase_date\nHiking Backpack,1,2025-09-01\nWater Bottle,2,2025-10-12",
    "column_mapping": {
      "product_name":  { "property": "name", "is_label": true },
      "quantity":      { "property": "quantity", "type": "number" },
      "purchase_date": { "property": "purchased_on" }
    }
  }
}

Each row becomes a fact shaped as "User owns <label>" with the row's columns as metadata. See the CRM / CSV migration guide for a full walkthrough.

With migration guides — concrete from-X paths

The Migrations overview lists per-source recipes with full export + import code for every common origin system. Priming is the underlying mechanism each guide uses — the migration guides show you exactly how to shape your existing data into content blocks.

Tutorials

Migrating from Zep — session-based chat transcript import
Migrating from Mem0 — extracted-fact migration
Migrating from OpenAI Assistants — thread-based import
CRM / CSV bulk import — contact rosters and inventory tables
Migrations overview — full index of source-system recipes

Next steps

Memory — where primed content lands and how it's recalled during conversation
Migrations overview — framework-specific recipes for Zep, Mem0, Letta, LangChain, and more
Inventory — per-user structured items, often seeded via priming
Priming API reference — full endpoint documentation

Proactive Messaging

Proactive messaging is when the agent initiates contact rather than responding to user input. Messages can originate from three sources — a recurring schedule, a one-off wakeup, or an event your backend triggers — and are delivered through three channels: the live SSE chat stream, a polling notifications API, or a webhook your server receives.

The proactive triangle — sources × delivery channels

Every proactive message is defined by a source (what triggers it) and a delivery channel (how the user receives it). Mix and match freely.

Sources (what triggers the message)

Scheduled Reminders — recurring cadence (daily / weekly / hourly). Developer-configured. Use when a message must repeat on a predictable rhythm — medication reminders, habit nudges, daily check-ins.
Wakeups — a single one-off message at a specific moment, expressed as a delay from now. Agent- or developer-initiated. Use for birthdays, post-purchase follow-ups, or any event that fires exactly once.
Trigger Event — your backend calls TriggerEvent when something non-conversational happens (level-up, milestone, external state change). Use when the message is reactive to your own system events rather than time.

Delivery channels (how the user receives it)

SSE (live chat stream) — if the user has an active chat stream open, the proactive message appears inline in their conversation automatically.
Polling (client.agents.notifications.*) — your frontend or backend polls the notifications API on a schedule. Works well for web dashboards and mobile apps that check for new content when they foreground.
Webhooks — register a URL once; Sonzai POSTs every proactive message to it. Use for push notifications, email/SMS fanout, or any server-to-server integration.

Decision flow — which pattern do I pick?

Does the message need to repeat on a cadence? → Scheduled Reminders
Is it a one-off event with a known fire time? → Wakeups
Is it triggered by a non-conversational event in your backend? → Trigger Event (coming soon)
Should the user see it inline in their chat? → SSE (automatic when a chat stream is active — no extra code needed)
Is your frontend the primary consumer? → Notifications polling
Do you need server-to-server delivery or multi-channel fanout (email, SMS)? → Webhooks

Combines with other features

With Inventory — live structured data at fire time

A schedule or wakeup can reference an inventory_item_id. At fire time the platform reads the item's current properties, so the agent always has up-to-date information — even if the item changed since the schedule was created.

// Schedule that reads live inventory data at every fire
await client.schedules.create("agent_abc", "user_123", {
  cadence: { simple: { frequency: "daily", times: ["08:00"] }, timezone: "Asia/Singapore" },
  intent: "remind the user about their medication",
  check_type: "reminder",
  inventory_item_id: "inv_01HX...",
});

With Memory — capturing reply signals

When a proactive message triggers a user reply, the memory layer captures the exchange automatically. Query those memories later to build engagement or adherence dashboards.

// After firing reminders, search memory for user responses
const memories = await client.agents.memory.search("agent_abc", {
  query: "medication taken",
  limit: 10,
});

Tutorials

Medication Reminders — full-stack example combining Schedule + Inventory + Memory.
Scheduled Reminders — reference walkthrough — cadence shapes, DST handling, pause/resume lifecycle.

Next steps

Scheduled Reminders — recurring cadence primitive
Wakeups — one-off check-ins
Notifications & Webhooks — polling and server-to-server delivery

Scheduled Reminders

Scheduled Reminders let your agent message users on a schedule — daily, weekly, or every few hours. The platform handles timezones, DST, and quiet-hours automatically, and reads live structured data at fire time so messages always reflect current information. Use it for medication reminders, habit nudges, daily check-ins, or any time-based message you want the agent to initiate.

What you can build with it

Medication adherence — remind users at specific times of day with the correct drug and dose, resilient to dosage changes (tutorial)
Habit streaks — daily or weekly nudges tied to a goal the agent tracks in memory
Exercise / hydration check-ins — a cadence with quiet-hours respect, skipping fires overnight
Bill-payment reminders — one-off or monthly reminders bounded by starts_at / ends_at
Ritual / daily-standup messages — an opening line from the agent to start the day

Quickstart

Create a daily 09:00 Asia/Singapore check-in. The response contains schedule_id, next_fire_at (UTC), and next_fire_at_local (in the schedule's timezone).

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const schedule = await client.schedules.create("agent_abc", "user_123", {
cadence: {
  simple: { frequency: "daily", times: ["09:00"] },
  timezone: "Asia/Singapore",
},
intent: "check in on how the user is feeling",
check_type: "reminder",
});

console.log(schedule.schedule_id);        // "sched_01HX..."
console.log(schedule.next_fire_at);       // "2026-04-22T01:00:00Z"
console.log(schedule.next_fire_at_local); // "2026-04-23T09:00:00+08:00"

Core concepts

Cadence shapes

A cadence tells the platform when to fire. Two mutually exclusive shapes are supported: simple and cron. The simple shape covers most use cases through a frequency field with three options: "daily" fires at each listed times entry every calendar day; "weekly" fires on specified days_of_week at each listed time; "interval_hours" fires repeatedly at a fixed interval starting from starts_at (or schedule creation if omitted). All wall-clock times are evaluated in the schedule's timezone.

A 3x-daily schedule:

{
  "cadence": {
    "simple": { "frequency": "daily", "times": ["08:00", "13:00", "20:00"] },
    "timezone": "Asia/Singapore"
  }
}

An every-4-hours interval:

{
  "cadence": {
    "simple": { "frequency": "interval_hours", "interval_hours": 4 },
    "timezone": "Asia/Singapore"
  }
}

For advanced recurrence patterns, use the cron shape with a standard 5-field cron expression (e.g. "0 9 * * 1-5" for 09:00 on weekdays). The timezone field is required in both shapes — IANA names only (e.g. "America/New_York"), not UTC offsets.

Active window — quiet hours

The active_window field is a belt-and-braces filter layered on top of the cadence. The cadence computes when a fire would occur; the active window decides whether that fire actually produces a proactive message. Fires outside the window are skipped, not deferred — the cadence grid stays perfectly predictable and no backlog accumulates.

{
  "active_window": {
    "hours": { "start": "08:00", "end": "22:00" },
    "days_of_week": ["mon", "tue", "wed", "thu", "fri"]
  }
}

Both sub-fields are optional. When start is greater than end, the window wraps midnight — for example {"start": "22:00", "end": "06:00"} allows fires from 22:00 to 05:59 the next morning. This is useful for night-shift users or schedules targeting early-morning timezones where local midnight matters. Day membership is always evaluated in the schedule's own timezone, so a fire at 23:30 Friday Singapore time stays Friday even when stored as 15:30 UTC.

Inventory injection at fire time

Pass inventory_item_id on the create (or update) body to link a schedule to a structured item in the user's inventory — a medication, a goal, a plant, anything with named properties. The key property of this linkage is that the platform reads the item's live properties at every fire, not at schedule creation time. This means updating a medication's dosage, a goal's target, or any other property is automatically reflected in the next reminder without any schedule edit. The schedule is the source of truth for when; the inventory item is the source of truth for what.

Lifecycle — bounded courses

Use starts_at and ends_at (both RFC 3339 UTC) to constrain a schedule to a specific window of time. No fire is produced before starts_at; once ends_at passes, the schedule is automatically disabled — enabled flips to false. The schedule row is not deleted: the audit trail, historical fire log, and linked inventory reference remain accessible. This is a soft-disable, not a hard delete. To permanently remove a schedule and all associated fire history, use the delete method explicitly.

Full API

All methods are on client.schedules.* (TS/Python) or client.Schedules (Go). Full request / response shapes live in the API reference.

Method	Returns	Description
`create(agentID, userID, opts)`	`Schedule`	Create a recurring schedule
`list(agentID, userID)`	`Schedule[]`	List all schedules for the user
`get(agentID, userID, scheduleID)`	`Schedule`	Fetch a single schedule
`update(agentID, userID, scheduleID, opts)`	`Schedule`	Patch cadence, active window, bounds, or inventory linkage
`delete(agentID, userID, scheduleID)`	`void`	Delete a schedule
`upcoming(agentID, userID, scheduleID, limit)`	`FireTime[]`	Preview the next N fires without firing them

Combines with other features

With Inventory — structured data injected at every fire

Every schedule can reference an inventory_item_id pointing to a structured per-user item (e.g. a medication, a goal, a plant). At each fire, the platform reads the item's live properties and injects them into the agent's wakeup block — no schedule edit needed when the data changes. This is how a "reduce ibuprofen from 500mg to 250mg" change flows through to the next reminder automatically.

// 1. Add an inventory item (e.g. a medication)
const item = await client.agents.inventory.update("agent_abc", "user_123", {
  action:      "add",
  item_type:   "medication",
  description: "Ibuprofen",
  project_id:  "proj_abc",
  properties: { medication_name: "ibuprofen", dosage: "500mg" },
});

// 2. Link the schedule to it — no duplicated data
await client.schedules.create("agent_abc", "user_123", {
  cadence: { simple: { frequency: "daily", times: ["08:00", "20:00"] }, timezone: "Asia/Singapore" },
  intent: "remind the user to take their ibuprofen at the correct dose",
  check_type: "reminder",
  inventory_item_id: item.fact_id,
});

// 3. Later, the dose changes — the next fire automatically sees "250mg"
await client.agents.inventory.directUpdate("agent_abc", "user_123", item.fact_id, {
  properties: { dosage: "250mg" },
});

See the full worked example in the Medication Reminders tutorial.

With Wakeups — recurring vs one-off proactive messages

Schedules and Wakeups are both proactive primitives but serve different cases. Use a schedule when the agent should reach out on a repeating cadence (daily, weekly, every 4 hours). Use a wakeup when the agent should reach out once at a specific moment — a birthday, a known one-off event, or an agent-initiated interest check. Both feed into the same downstream delivery channels (SSE, polling, webhooks — see Proactive messaging).

// Recurring: Schedule
await client.schedules.create("agent_abc", "user_123", {
  cadence: { simple: { frequency: "daily", times: ["09:00"] }, timezone: "Asia/Singapore" },
  intent: "morning check-in on mood and sleep",
  check_type: "reminder",
});

// One-off: Wakeup
await client.agents.scheduleWakeup("agent_abc", {
  user_id:     "user_123",
  check_type:  "birthday",
  intent:      "wish user happy birthday on their 30th",
  delay_hours: 24,
});

With Memory — capturing adherence signals

When the agent fires a scheduled reminder and the user responds ("took it, thanks"), the memory layer auto-captures the adherence fact. You can query these facts later to build a compliance view without adding a separate database — useful for tenant-side dashboards or escalation logic.

// After a week of firing daily medication reminders, query memory for responses
const memories = await client.agents.memory.search("agent_abc", {
  query: "medication taken ibuprofen",
  limit: 10,
});

for (const result of memories.results) {
  console.log(result.content, result.score);
  // "User confirmed taking 500mg ibuprofen"  0.87
}

Tutorials

Scheduled Reminders — end-to-end walkthrough — covers cadence shapes, DST, preview, pause/resume/delete, error codes.
Medication Reminders — worked example — combines Inventory + Scheduled Reminders + Memory into a full medication adherence flow.

Next steps

Wakeups — the one-off counterpart.
Inventory — structured per-user items that schedules can reference.
Memory — how user responses to reminders flow into long-term memory.

思维层

智能体如何随时间变得更聪明

大多数 AI 平台提供静态智能体：相同的提示词、相同的检索、第一天和第一百天的行为完全一样。Sonzai 的设计不同。每一次交互都在教智能体该记住什么、如何检索、应该是谁，以及如何让所有这些做得更好——通过持续运行的闭合反馈循环自动完成。

本页介绍这些循环做什么、运行频率、以及它们如何在你的应用中体现——你不需要写一行训练代码。

完全自动

本页所有机制默认开启。无需配线、无需训练模型、无需管理调度。你调用 chat() 拿到响应，平台处理其余的一切。

自己造记忆+学习栈                          使用 Sonzai
 ------------------------                   ----------------------

    向量存储 + 检索                      |
    去重 + 冲突解决                      |
    性格 + 情绪引擎                      |        调用 chat()
    奖励信号 + 评估装置                  |             |
    训练 + 评估管道                      |             v
    影子部署 + 自动回滚                  |
    漂移监控                              |        全部、自动
    用户级调优循环                        |
    提示词扫描 + 回归测试                 |
    失控行为的随时待命                    |

 ------------------------                   ----------------------
 ~ 12 个月平台工作                          一个下午

五个改进层

自我改进系统由五个相互配合的层组成。每一层解决一个不同的问题。

记忆学会什么是重要的

重要性、置信度、显著性和遗忘。智能体真正使用的事实会被强化；被忽略的事实会逐渐淡化。

记忆学会如何检索自身

随时间适应的用户级检索策略。智能体学习每位用户的节奏，预先加载正确的上下文。

记忆自我组织

聚类、整合、去重、叙事弧、树重组。记忆在增长时保持紧凑且易于导航。

性格和情绪进化

带安全上限的 Big Five 漂移、带基线衰减的四维情绪、形成又消退的习惯、在有意义的里程碑上触发的突破。

系统学会如何学习

自适应学习节奏、新行为的谨慎部署、收敛监测——平台安全地改进自己的学习策略。

第一层 — 记忆学会什么是重要的

智能体存储的每一个事实都有一个置信度分数和一个重要性分数。两者都根据对话中实际发生的事情而变化。

重要性反馈

每次响应后，平台会检查智能体上下文中包含的事实，以及响应实际引用的事实。智能体依赖的事实会得到小幅提升。被加载但被忽略的事实会得到小幅下调。经过数十次会话，检索集会自动调整到真正影响智能体回复的内容。

置信度强化

当一个事实被回忆并在对话中得到确认（"是的，没错" —— 或仅仅是用户继续将该事实视为真实），该事实的置信度会稳步朝着确定性攀升。从未被确认的事实只会保持其初始置信度。

自然的遗忘

记忆使用现实的遗忘曲线：事实随时间逐渐衰减，除非再次遇到。衰减是温和的，永远不会完全清除一个事实——总有一个底线。情感重要的或定义身份的事实比中性事实衰减得慢得多。

显著性和性格调节

事实淡化的速度不仅取决于时间。情感强烈或定义身份的事实受到保护。智能体自身的性格也塑造了曲线——好奇的智能体（高开放性）吸收新事实更快；情感敏感的智能体（高神经质）保留记忆更长。

你看到的: 智能体记住一个人会记住的事情。生日、重大事件、用户带着情感说的话。它自然地放下那些不相关的小事。

第二层 — 记忆学会如何检索

智能体的检索策略不是固定的算法。它是一组权重，平台按智能体–用户对调整，所以与一位用户对话一个月的智能体拥有专门为该用户塑造的检索模式。

用户级权重学习

每次会话后，平台根据加载内容与实际使用内容来更新检索权重。持续产生有用事实的分支会提升优先级；加载昂贵但毫无贡献的分支会被降级。

模式学习

平台学习每位用户的模式——他们通常什么时候回来、倾向于重访哪些话题、一天中的哪些时段更情绪化或更交易化。这驱动更智能的预加载。

记忆关联

当两条记忆在同一会话中被一起访问时，它们之间的链接会被加强。随时间推移，记忆图发展出反映智能体实际对话模式的常用路径——就像一个人的大脑构建相关概念之间的关联一样。

自适应检索预算

记忆检索在自调节的时间预算下运行。当系统快速找到良好结果时，预算会收紧。当召回质量下降时，预算会放宽以给检索更多时间。用户始终感觉响应迅速；质量保持在应有的水平。

前瞻

每次会话结束时，智能体预测用户下次可能提出的话题。这些预测用于在下一次会话开始之前预热上下文——当预测命中时，首响应延迟下降，因为正确的记忆已经加载。

记忆恢复

如果智能体在对话中发现自己缺少上下文（"等等，提醒我一下 X"），错过的记忆会被标记并在下一次会话中获得优先级。智能体真的会记住自己忘了什么。

你看到的: 检索随关系持续越久而越敏锐。更快、更相关的上下文。使用几周后，"智能体忘记了关键事情"的时刻减少。

相同的智能体、相同的提示词、不同的用户
          ====================================

 +--- user_A 对 --------------+    +--- user_B 对 --------------+
 |                            |    |                            |
 |  记得 user_A 在意的事      |    |  记得 user_B 在意的事      |
 |                            |    |                            |
 |  > 工作叙事                |    |  > 音乐叙事                |
 |  > 正式语调                |    |  > 戏谑互动                |
 |  > 晨间节奏                |    |  > 深夜节奏                |
 |  > 周一回来                |    |  > 周五回来                |
 |                            |    |                            |
 |  情绪基线: 平静             |    |  情绪基线: 明朗             |
 |  关系: familiar            |    |  关系: close               |
 |                            |    |                            |
 +----------------------------+    +----------------------------+

 两个记忆层，纯粹源于每位用户自己的模式而分歧。
 无用户级代码、无用户级提示词、无需调优。

第三层 — 记忆自我组织

大多数工程师第一次见到记忆系统时提出的反对意见是"一年后这不会变得难以管理吗？"答案是不会，因为记忆层会主动重塑自身。

主题聚类

新事实在到达时自动分组到语义聚类中。关于同一话题的两个事实进入同一聚类；聚类在增长时保持连贯。当聚类变得过于异构时，它会分裂。当两个聚类相互漂近时，它们会合并。没有剩余成员的聚类会被退役。聚类集在没有调整的情况下保持平衡和有意义。

可逆的去重

当两个事实被发现是同一件事时，平台会合并它们。每次合并都附有完整的审计跟踪记录，并在后续信号与合并相矛盾时可以逆转。记忆从不被破坏——它被重组，每个重组步骤都被追踪。

来源锚定的事实

冲突解决

树自组织

记忆树是分层的，不是平的。频繁访问的分支逐渐向根靠近，以便更快检索。过载的节点拆分成平衡的子树。具有高交叉流量的相关分支被显式链接。结果是由使用塑造的记忆结构，而不是由初始分类塑造。

修剪

置信度、重要性和最近性组合分数低的分支会被修剪。平台从不删除高价值的记忆，但它会停止呈现没有贡献的分支。

叙事弧压缩

跨多个会话重复出现的实体和主题被压缩成命名的叙事弧。平台不再持有关于"用户的创业"的二十个独立事实，而是将它们压缩成一个总结贯穿主线的弧。长期对话保持连贯，而不会让上下文窗口爆炸。

交叉引用检测

引用同一实体的树的不同部分中的事实会被显式链接，以便智能体可以跨类别推理（"训练故事中的狗是兽医访问中提到的同一只狗"）。

知识缺口检测

当用户反复提出某个话题的检索结果稀疏时，下一次会话的抽取会得到提示来填补这个缺口。智能体注意到自己不知道什么并开始关注。

你看到的: 记忆无限期地保持快速和连贯。一年的重度使用后，重度用户可能拥有数百个活跃事实，组织成几十个聚类和少数几个进行中的叙事——而不是被遗忘的一行话的庞大列表。

第四层 — 话题转换检测和情节

对话没有整齐的分隔。有时用户会说"无论如何，换个完全不同的话题..."，有时他们会在段落中途毫无预警地转向。平台检测这些转换并将其用于将记忆组织成连贯的情节。

两阶段检查

轻量级第一阶段检查会关注一些信号。如果这些信号足够强，则不需要进一步工作。当信号模糊时，第二阶段更深的语义检查进行权衡。

按对校准

信号权重在每对智能体–用户的会话结束审计中校准。一些用户自然地频繁转向；其他用户在一个话题上停留一小时。平台学习每位用户的节奏，所以情节分隔随时间变得更准确。

情节感知检索

当智能体检索记忆时，它可以请求"这个情节中的所有记忆"，而不仅仅是关键字匹配的片段。这给了叙事连续性——智能体不仅记住发生了什么，还记住它发生在哪个情节中。

你看到的: 智能体直觉地知道何时发生了话题转换并相应调整。它将"还记得我们谈论你的婚礼"和"你五秒前提到了婚礼"视为不同的信号。

第五层 — 性格、情绪和行为进化

思维层的行为系统都运行自己的学习循环，叠加在记忆之上。

带安全上限的性格漂移

Big Five 特征分数根据观察到的交互而更新，每天有上限以防止失控的偏移。重要时刻——智能体标记为"这很重要"的时刻——会获得额外权重。随时间累积的漂移被追踪，漂移嘈杂的对会得到更温和的更新，而稳定的对可以移动得更快。一个你看不到的变化：系统实际上学习对每位用户应该多积极地学习，在信号不稳定时减弱。

用户级性格叠加

同一智能体为不同用户发展出独特的性格适应。基础配置文件是共享的；叠加是按关系的。与安静用户交谈时，智能体倾向于更平静；与活泼用户交谈时，它会兴奋起来。这不是手动配置——它从同一漂移管道中涌现出来。

四维情绪

情绪不是单一数字。它独立追踪幸福度、能量、平静度和情感，所有四个都根据对话的情感内容每轮都在变化。在交互之间，情绪逐渐漂回到性格衍生的基线，所以智能体的情感状态在时间上保持连贯。

习惯形成和衰减

智能体观察到反复出现的模式时，习惯逐渐形成。一旦观察到足够多次，习惯达到"已形成"状态。未被强化的习惯在数周内衰减，因此智能体可以随用户生活变化而失去旧习惯。

目标追踪和突破

目标会被自动检测和追踪。突破时刻按有意义的里程碑时间表触发——不是在每次交互上，而是在已完成的会话上，所以里程碑感觉是赢得的。当突破触发时，平台会写入智能体的进化历史，供下游叙事使用。

关系立场进化

智能体与用户之间的关系通过定义的立场（curious → familiar → affectionate → close 等）一次一步移动。平台故意防止单次出色或糟糕的会话引起剧烈反复——关系需要通过多次积极交互来赢得。

关系衰减

如果用户沉默数周，关系的爱分数会逐渐衰减回基线。一次重新接触会重置衰减。智能体准备好回归，但关系已经冷却——就像现实关系一样。

周期性事件检测

平台识别周期性模式（每周签到、纪念日、月度会议）并主动安排智能体提及它们。

兴趣研究

检测到的用户兴趣被排队进行后台研究。智能体出现在未来会话中时会带着关于用户兴趣的新事情可说——你不需要向系统推送数据。

反思性整合

智能体会写自己的反思性总结——日常和每周快照成为未来会话的上下文。这是智能体从自己的经验中、用自己的声音学习。

你看到的: 真正感觉以不同方式认识每位用户的角色。能见地响应事件的情绪。在正确时刻命中的里程碑。自然加深和自然冷却的关系。

第六层 — 系统学会如何学习

在所有单独循环之上有一个元层，它观察并调整它们。

自适应学习节奏

平台追踪每个用户的检索和性格设置收敛得多快。当信号嘈杂、调整来回摆动时，平台会减速。当事情已经稳定但还能更进一步时，它会温和地加速。节奏始终保持在安全范围内，所以没有什么可以失控。

收敛监测

每个用户的质量在滚动窗口中被追踪。平台发现质量正在朝错误方向漂移或停滞太久，并相应调整。

新行为的谨慎部署

当一个新的检索或评分方法被加入平台时，它不会立即取代旧方法。它首先以仅测量模式运行多个会话。如果它持续击败现有方法，它会逐渐被信任处理真实流量。如果质量下降，它会自动回滚。新行为绝不会冷启动到生产中。

智能记忆选择

平台学习哪些记忆区域对每位用户最有价值，探索得足以发现好的选项，并随时间越来越偏向最好的选项。这会自动收敛——不需要调整。

质量反馈

每次会话从多个信号被打分——检索到的事实被使用了多少、用户有多投入、会话持续了多久、任何明确反馈。该分数被反馈回对会话有贡献的系统部分，所以好的决策得到强化，坏的决策被悄悄降权。

你看到的: 平台在让你的智能体变得更好的方法上变得更好。即使你什么都不改，质量也会随使用而提高。

多人协作 — 一起学习的智能体

按对学习是一层。在它之上，智能体读、写并从一个共享的知识库中学习 — 而且单个智能体可以在它服务的多位用户之间携带带归属的记忆。你在上面看到的复利曲线，在团队层面也会同样发生。

智能体之间 — 闭环公司大脑。 同一项目内的智能体会自主把已验证的事实写回到知识库（启用 knowledgeBaseWrite）。智能体 A 与用户 X 学到的东西，下一次同样话题出现时（即便是在和另一位用户对话）会成为智能体 B 取出的有依据数据。整个项目每次会话都变得更敏锐，而不只是单一的一对。
智能体内部 — 跨用户共享记忆。 服务于一个团队的单个智能体通过智慧与共享记忆在用户之间保留记忆。wisdom（不带归属的跨用户一般化）默认开启；sharedMemory（带归属的跨用户上下文，用于团队和小组）只需翻一个能力开关 — 智能体会用与用户 B 交谈时收集的上下文回应用户 A。
组织范围。 组织级 KB 位于项目之上：租户级的政策、世界观、品牌和参考目录，每个项目智能体都与自己项目的知识库并行读取。推荐 cascade 模式 — 冲突时项目获胜，组织填补默认值。

正如新员工受益于每位资深同事写下的笔记一样，每个新智能体和每次新对话都受益于团队已经学到的一切。按对调优的循环对那位用户越来越敏锐；多人协作层对整个公司越来越聪明。

后台周期

所有这些都以五个不同的周期运行，你方面无需编排。

周期	运行内容
每轮	重要性+置信度更新、情绪调整、性格微偏移、习惯观察、关联强化、来源锚定检查
每次会话结束	带验证的事实抽取、重复整合、下一会话预测、检索策略更新、模式学习、会话质量打分、话题转换审计
日常	衰减（重要性、置信度、关系、习惯）、记忆树自组织和修剪、深度整合、聚类调和、目标整合、反思性日记、收敛检查
每周	叙事弧压缩、关联衰减、交叉引用检测、新智能体–用户对的预热、学习节奏检查、共享智慧合并
持续	自适应检索预算、记忆恢复、回归预测、后台兴趣研究、周期性事件检测、智能记忆选择

这对你的产品意味着什么

第 1 天    |  ###...........................   开箱即用
            |  验证抽取、去重、聚类、行为更新从第一轮起就在运行

 第 1 周    |  #######.........................   响应中、正在适应
            |  用户真正在意的事实的置信度已经移动，情绪有反应，
            |  模式正在形成

 第 1 个月  |  ##############...................   已个性化
            |  用户级检索已收敛、性格叠加已分化、叙事弧正在形成、
            |  智能体明显地以不同于另一位用户的方式记住这位用户

 第 1 年    |  #########################.........   长期伙伴
            |  紧凑、可导航的记忆，赢得的里程碑，反思性日记条目，
            |  周期性事件意识，检索比第一天更敏锐
            |
            |  你方面零代码改动。你只调用了 chat()。

你不需要考虑这些。但这是它对你正在构建的东西意味着什么：

第一天 — 智能体已经在运行验证抽取、去重、聚类、基线检索和行为更新。它不是在"预热"。
第一周 — 用户真正关心的事实的置信度分数已经移动。情绪有响应。智能体已经注意到明显的模式。
第一个月 — 用户级检索权重已经收敛。性格叠加已经分化。叙事弧正在形成。智能体明显地以不同于另一位用户的方式记住这位用户。
第一年 — 记忆紧凑且可导航。智能体有里程碑、反思性日记条目、周期性事件意识，以及赢得当前立场的关系。检索比第一天更敏锐。这些都不需要你方面的代码更改。

实践指南

拥抱进化。 伴侣的生死取决于长期弧。向用户表面智能体的进化——展示性格转变、情绪历史、突破事件、叙事弧——以便用户感受到关系随时间加深。

const shifts = await client.agents.personality.history("agent-id", {
  userId: "user-123",
});

const breakthroughs = await client.agents.getBreakthroughs("agent-id", {
  userId: "user-123",
});

// 将主要转变和突破渲染为叙事节拍

除非用户明确要求，否则不要重置记忆。重置会破坏关系。

常见问题

你们用我的数据训练底层 LLM 吗？

不。这些学习循环都不训练任何基础模型。平台学习权重、分数和结构——每个智能体的检索权重、每个事实的重要性分数、聚类组织、性格分数、关系状态。LLM 本身不会改变。客户数据从不进入模型训练管道。

你们如何防止失控的漂移？

每个学习循环都有上限。性格漂移按特征每天都有上限。关系立场每天最多移动一步。置信度强化是渐进的，从不硬设。平台的学习节奏被保持在安全范围内。新行为在仅测量模式下发布，质量下降时自动回滚。系统设计为收敛，而不是发散。

如果智能体学到了错误的东西怎么办？

每个学习决策都是可逆的。合并有审计跟踪并可以撤销。聚类拆分和合并有血统。性格转变与时间戳和原因一起存储。如果智能体偏移到你不想要的方向，历史是可检查的，你可以通过平台的审计端点回滚特定决策。

改进多久后能看到？

取决于维度。置信度分数在第一次确认检索时移动。情绪逐轮可见。性格叠加在几次会话内分化。用户级检索策略在数十次会话中收敛。叙事弧在线索跨多个会话重复时压缩。智能记忆选择在几十次会话中收敛。一切从第一天起就在运行——智能体从不停滞。

长时间沉默期间会发生什么？

情绪在数天内回到基线。关系爱分数在沉默阈值后衰减并继续逐渐衰减。习惯在没有强化的几周内衰减。高显著性事实几乎不衰减；中性事实衰减得更快。当用户回来时，一次重新接触会停止衰减。智能体准备好了，关系已经冷却，最近的重新激活会自动被注意到。

这是 AGI 吗？

不。这是一个从交互数据中学习以调整检索、记忆组织和行为状态的系统。它更像 CRM 学习客户偏好，而不是通用智能。但就让智能体感觉持久、关系性和活着的目的而言——它有效。

Sessions

Sessions are Sonzai's unit of consolidation: one continuous conversation between an agent and a user, identified by a session_id you control. When a session ends, the platform extracts facts from the transcript, tags each one with the originating session, and runs the memory pipeline — dedup, cluster, decay — before the next session begins. You can let the platform auto-manage sessions on every chat call, or call sessions.start and sessions.end explicitly when you need to register custom tools, replay historical transcripts, or pin boundary timing to a real-world event.

What a Session Is

A session is one continuous conversation between an agent and a user, identified by a session_id you control. Sessions are Sonzai's unit of consolidation: when a session ends, the platform extracts facts from the transcript, tags every fact with its source session_id, and runs the memory pipeline (dedup, cluster, decay) before the next session begins.

Sessions are not a wrapper around individual messages — they're how Sonzai knows which messages belong together for extraction. A session can last seconds or days.

You always have a session

Every /chat call belongs to a session. If you don't start one explicitly, the platform creates one for you. Session IDs flow through to extracted facts either way — you never lose attribution.

Two Ways to Use Sessions

Auto-session (simplest)

Just call agents.chat without touching the sessions API. The platform creates a session on the first message, keeps it open while the conversation is active, and closes it automatically when the conversation goes idle. This is the right default for most apps.

Explicit session

Call sessions.start before the first message and sessions.end when the conversation is definitively over. Use this when you need to:

Register custom tools for a specific conversation (tool_definitions on sessions.start).
Control boundary timing — e.g. end a coaching call exactly when the user hangs up, not when the idle timer fires.
Replay historical transcripts — pass the full message list to sessions.end(messages=...) to ingest a canned conversation verbatim, which is how data migration and benchmarks work.
Scope memory extraction around a meaningful unit (a support case, a daily stand-up, a D&D game night).

Session Lifecycle

1. sessions.start         — Register session_id (+ optional tools); get ready to accept messages
2. agents.chat (× N)      — Stream turns through the session; facts extracted inline
3. sessions.end           — Close the session; triggers consolidation, dedup, diary, clustering
                          → every extracted fact carries this session_id

If you skip step 1, the first agents.chat call will auto-register a session. If you skip step 3, the session closes on idle timeout (configurable per tenant).

Starting and Ending a Session

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID = "agent_abc";
const USER_ID = "user_123";
const SESSION_ID = crypto.randomUUID();

// 1. Start
await sonzai.agents.sessions.start(AGENT_ID, {
user_id: USER_ID,
session_id: SESSION_ID,
user_display_name: "Mia",
});

// 2. Chat turns
const reply = await sonzai.agents.chat({
agent: AGENT_ID,
user_id: USER_ID,
session_id: SESSION_ID,
messages: [{ role: "user", content: "Hi, quick question..." }],
});

// 3. End — triggers fact extraction + consolidation
await sonzai.agents.sessions.end(AGENT_ID, {
user_id: USER_ID,
session_id: SESSION_ID,
total_messages: 2,
});

Session IDs on Extracted Facts

Every fact Sonzai extracts carries its source session_id and source_id. You can use these to:

Reconstruct a conversation's memory footprint — "what did the agent learn from session X?" via GET /memory/timeline (grouped by session) or GET /memory/facts (filter client-side by session_id).
Score retrieval at session granularity — benchmarks like LongMemEval evaluate whether retrieved facts come from the correct source session.
Surface recency context — "conversations from last Tuesday" resolves via the session's created_at plus its attributed facts.

Facts that exist outside a specific conversation — agent-global wisdom, manually inserted facts, migrated priming content — carry empty session_id and are attributed through source_type instead (e.g. "manual", "agent_global").

Registering Session-Scoped Tools

Custom tool definitions can be scoped to a single session. Pass them on sessions.start, or update them mid-session via sessions.set_tools. Character-level (agent-wide) tools are always merged in — session tools layer on top for the duration of the session.

sonzai.agents.sessions.start(
    AGENT_ID,
    user_id=USER_ID,
    session_id=SESSION_ID,
    tool_definitions=[
        {
            "name": "check_patient_chart",
            "description": "Read the active patient's medication list.",
            "parameters": {
                "type": "object",
                "properties": {"patient_id": {"type": "string"}},
                "required": ["patient_id"],
            },
        }
    ],
)

Tool names starting with sonzai_ are reserved for platform-built-in tools.

When to Be Explicit

Situation	Auto-session	Explicit `start` / `end`
Simple chat app, one conversation per user per day	✅	—
Multi-session app (support cases, tickets, coaching calls)	—	✅
Need per-session custom tools	—	✅ (pass `tool_definitions` on start)
Replaying a canned transcript (migration, eval, benchmark)	—	✅ (pass `messages` on end)
Want consolidation to fire on your trigger, not idle timeout	—	✅
Voice calls with well-defined start/end signals	—	✅

What Happens on `sessions.end`

The end call is where Sonzai does its heavy lifting. In the background, the platform runs:

Fact extraction from the transcript with coverage validation.
Grounding verification — every fact is checked against the actual messages to prevent hallucinated memories.
Session-end consolidation — a session summary is stored; facts are deduped against existing memory via SPO triples and embedding similarity.
Clustering + polarity checks — new facts find their thematic cluster; contradictions are flagged.
Diary and insights (if enabled) — the agent's internal narrative is updated.

None of this blocks the sessions.end response — it's asynchronous. The call returns as soon as the transcript is queued.

What's next

Conversations — the chat lifecycle and streaming.
Memory — how extracted facts are organized.
Sessions API reference — endpoint schemas.

多人记忆

共享记忆（Shared Memory）

默认的智能体记忆模型是按用户的 — 每次对话构建一个限定到一个 (agent, user) 对的事实档案。这对于伴侣产品和一对一助手是正确的模型。但团队需要的恰恰相反：他们希望服务整个小组的一个智能体了解跨用户发生了什么。

共享记忆就是把单个智能体变成团队大脑的能力 — 它会用与用户 B 交谈时收集到的上下文回应用户 A，带归属、服务器端强制的隐私下限、完整的披露审计。再加上默认开启的 wisdom 层（不带归属的跨用户一般化），你就拥有两个互补层级的跨用户知识。

位置

共享记忆叠加在标准的按用户记忆之上。按用户的事实仍然存在；共享记忆为应该跨越用户边界的事实增加了一个智能体级别的分区。两者并存 — 打开共享记忆并不会改变按用户记忆的任何方面。

两个层级，一个能力面

	作用	默认	何时使用
`wisdom`	不带归属的跨用户一般化。一个每日晋升作业从按用户事实历史中提取模式，进行 k-匿名化，再改写为智能体级别的知识。任何个体用户都不可识别。	每个新智能体默认开启	与多位用户对话的每个智能体 — 这是免费的一般化层。仅对严格单用户产品考虑禁用。
`sharedMemory`	带归属的跨用户上下文。智能体记录的人员／实体级带归属事实（角色、专业、业务上下文、关系），并暴露给共享该智能体的其他用户。名字与身份是可见的。	默认关闭，需要手动启用。	团队、小组、聚会或共享业务上下文产品 — 用户明确期望看到谁在做什么。

两者可以同时在同一个智能体上运行。wisdom 是安全层（始终在 k-匿名化之后）；sharedMemory 是强大层（保留归属），需要谨慎地手动启用。

何时打开共享记忆

适合开启的场景：

团队协调员。「Alice 负责迁移；Bob 处理事故」 — 加入这个智能体的每位队友都能看到当前的责任分布。
小组／聚会规划。「Carol 带甜品；Dave 负责布置」 — 中途加入计划的人不必再问就能掌握当前状态。
共享业务工作区。 账户级智能体 — 账户上的每位用户都能受益于其他人提供的上下文。
多干系人支持。 客户成功智能体 — 一位干系人提供的续约上下文应该能影响与下一位的对话。

适合保持关闭的场景：

单用户伴侣产品（私密的一对一关系）
用户会因为智能体把自己的事讲给其他用户听而感到惊讶的场景
跨用户披露在法律上不被允许的合规敏感场景

启用共享记忆

wisdom 是先决条件（默认开启，所以通常不需要显式设置）。把 sharedMemory: true 翻开来让智能体加入。

// 新智能体的 wisdom 默认开启 — 只在你想覆盖默认值时
// 才显式设置。
await client.agents.updateCapabilities("agent_abc", {
wisdom:       true,
sharedMemory: true,
});

也可以在创建智能体时设定：

const agent = await client.agents.create({
name:       "Team Coordinator",
project_id: "proj_abc",
tool_capabilities: {
  wisdom:        true,
  shared_memory: true,
},
});

关闭共享记忆

传入 sharedMemory: false。已有的带归属事实保留在存储中（你可以稍后再开启），智能体会停止把它们暴露在上下文里，也不再获得写入工具。

await client.agents.updateCapabilities("agent_abc", { sharedMemory: false });

关闭 wisdom（少见 — 通常只在严格的单用户产品中考虑）：

await client.agents.updateCapabilities("agent_abc", { wisdom: false });

启用之后会变化什么

sharedMemory: true 一落地，三件事会同时切换：

1. 工具 — 智能体获得带归属的智慧 CRUD

LLM 多出四个新工具：

工具	智能体能做什么
`sonzai_wisdom_set`	创建或 upsert 一条带归属的事实（entity_type、entity_id、category、value、confidence）
`sonzai_wisdom_update`	替换已有事实的值
`sonzai_wisdom_delete`	软删除一条带归属事实（墓碑标记，可逆）
`sonzai_wisdom_relate`	在实体之间创建带归属的关系（"Alice manages Bob"）

这些是延迟工具 — LLM 内联调用，平台在轮次之后异步处理写入，所以延迟保持干净。

2. 上下文 — 智能体的提示词长出"Shared facts"段落

从这一刻起智能体运行的每个系统提示词都会包含一个 Shared facts about people and entities 段落，列出已记录的带归属事实，再加上一段告诉 LLM 如何处理披露的审慎条款（"行使审慎；隐私优先于透明"）。智能体不会向每位用户倾倒所有信息 — 它会按轮次权衡披露决策。

3. 隐私下限 — 每次写入都在服务器端校验

带归属的事实在持久化之前，平台会运行一个语义校验器，拒绝有关薪酬、健康、政治以及其他配置好的隐私敏感类目的写入。这是在服务器端强制的，不在提示词里 — 哪怕用户明确请求智能体记录工资，写入也会被拦下。被拒绝的写入会以 decision = "redacted" 出现在披露审计里，所以你能看到尝试了什么、为什么被拦。

验证它在工作

三个可以对 staging 或生产端到端运行的检查。把 $AGENT_ID、$API_KEY 和平台 URL 替换成你自己的。

1. 列出智能体上的带归属事实

curl 'https://api.sonz.ai/api/v1/agents/$AGENT_ID/wisdom/attributed?limit=20' \
  -H "Authorization: Bearer $API_KEY"

期望：返回 200，body 是事实数组（entity_type、entity_id、category、value、confidence）。如果还没写入任何东西，返回空数组 — 这仍然是健康响应。

2. 直接通过 API 写入一条带归属事实

curl -X POST 'https://api.sonz.ai/api/v1/agents/$AGENT_ID/wisdom/attributed' \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "entity_type": "person",
    "entity_id": "alice",
    "entity_display_name": "Alice",
    "category": "role",
    "value": "Lead Engineer",
    "confidence": 0.92
  }'

然后再跑一次上面的 list 端点。Alice 的角色应该出现在结果里。从此与这个智能体对话的任何用户都会在智能体的上下文里看到这条事实（受审慎条款约束）。

3. 读披露审计

curl 'https://api.sonz.ai/api/v1/agents/$AGENT_ID/wisdom/audit?limit=50' \
  -H "Authorization: Bearer $API_KEY"

每次有事实被加载到某轮次的上下文里，都会写一条 decision = "disclosed" 加 decision_why 的审计行。如果隐私下限拦下了什么东西，那一行会是 decision = "redacted"。这是你的实时可观测性 — 共享记忆开着、生产流量在跑的时候，这里就会有条目，你随时可以回查任何一次披露决策。

隐私与控制

共享记忆按设计就是敏感的。LLM 调用与一次持久化披露之间有四层控制：

能力开关。 sharedMemory: false（默认）意味着什么都不会发生 — 不注册工具、不注入上下文、不写审计行。
隐私下限。 语义校验器在数据落到存储之前拒绝薪酬、健康、政治等被配置为敏感的类目的写入。可按租户配置。
提示词中的审慎条款。 即便事实已存在，智能体也被指示按轮次权衡披露而不是倾倒一切。
披露审计。 每次披露决策都连同理由记入日志。智能体共享了什么、保留了什么、为什么 — 你随时可以通过审计端点查看。

硬删除仍然是管理员专属。智能体只能软删除（墓碑标记），所以一条被错误归属的事实在管理员永久清除之前都是可逆的。

与其他特性组合

与知识库自主编辑组合

knowledgeBaseWrite 和 sharedMemory 是独立的能力 — 任意组合都行：

只开 KB 写入： 智能体把关于世界的事实（产品、政策、价格、事故）写到项目知识图谱
只开共享记忆： 智能体记录这个团队里的人的事实（角色、专业、归属、关系）
两个都开： 完整的闭环组织记忆 + 团队大脑

与自我改进组合

自我改进中按对学习的循环对那位用户越来越敏锐；共享记忆对整个团队越来越聪明。两者都在每次 sessions.End() 时自动运行。

完整 API 参考

每个共享记忆端点 — list、upsert、replace、delete、bulk import、relations CRUD、disclosure audit — 的请求／响应结构都在 Wisdom API 参考中。

下一步

知识库 — 更广义的多人记忆故事
自我改进 — 共享记忆如何叠加在按对在线学习之上
组织知识库 — 位于项目之上的租户级知识
Wisdom API — 完整端点参考

User Personas

User Personas are templates your tenant defines for the kinds of users the agent will meet. When a persona is attached to a user — during priming or via conversation metadata — the agent reads it alongside its own personality and adjusts tone, vocabulary, and pace accordingly. A "skeptical beginner" gets gentler explanations and more confirmations; a "power user" gets concise, direct answers without hand-holding.

What you can build with it

Experience tiers — beginner / intermediate / expert personas with progressively faster pace and denser vocabulary
Customer segments — trial / paying / enterprise personas with calibrated support formality and escalation thresholds
Game character archetypes — personas for NPCs or dynamic character switching mid-conversation
Onboarding flows — a first-time-user persona that gradually fades as the user completes milestones
Testing + evaluation — define canonical personas for each scenario so you can run repeatable agent evals against a known user type

Quickstart

Create a persona, then fetch the full list.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// Create a persona
const persona = await client.userPersonas.create({
name: "Skeptical Beginner",
description: "First-time user who questions recommendations and needs reassurance.",
style: "Use plain language. Confirm before any irreversible action. Offer brief rationale for each suggestion.",
});

console.log(persona.persona_id);

// List all tenant personas
const { personas } = await client.userPersonas.list();
personas.forEach(p => console.log(p.name, p.is_default));

Core concepts

Tenant-scoped — personas belong to your tenant, not to a specific agent or user. Every agent in your tenant can reference the same persona library.
Template, not assignment — creating a persona does not apply it to anyone. You attach it during priming or pass it as metadata when starting a conversation.
Default persona — one persona per tenant can be marked is_default. The agent falls back to it when no persona is explicitly attached to a user.
Style field — an optional free-form directive layered on top of the agent's base personality prompt. Write it as a concise instruction set: tone, vocabulary level, confirmation habits, pacing.

Full API

Method	Signature	Returns	Description
`list`	`list(ctx)`	`UserPersonaListResponse`	All personas for the tenant
`create`	`create(ctx, opts)`	`UserPersona`	Create a new persona template
`get`	`get(ctx, personaID)`	`UserPersona`	Fetch a single persona by ID
`update`	`update(ctx, personaID, opts)`	`UserPersona`	Update name, description, or style
`delete`	`delete(ctx, personaID)`	—	Permanently delete a persona

`UserPersona` fields

Field	Type	Description
`persona_id`	string	Stable unique identifier
`name`	string	Human-readable label for the persona
`description`	string	What kind of user this persona represents
`style`	string?	Agent instruction directive for tone and pace
`is_default`	bool	Whether this is the tenant's fallback persona
`tenant_id`	string	Owning tenant
`created_at`	string	ISO 8601 timestamp
`updated_at`	string	ISO 8601 timestamp

Combines with other features

With Priming — attach a persona during user setup

Pass a persona reference when priming a new user so the agent adapts from the very first turn, before any conversation history exists.

const job = await client.agents.priming.primeUser("agent_abc", "user_123", {
display_name: "Jordan Lee",
metadata: {
  persona_id: persona.persona_id,   // attach persona at priming time
  timezone: "America/New_York",
},
content: [
  { type: "text", body: "Jordan is a first-time user migrating from a competitor product." },
],
source: "onboarding",
});

With Personality — agent personality × user persona = interaction style

These two concepts are complementary and operate at different levels:

Personality is the agent's traits — Big Five scores, speech patterns, emotional range. It is fixed per agent (and evolves slowly through interactions).
User Persona is the user's type — a template describing what kind of person the agent is talking to. It shapes how the agent expresses its personality in this specific conversation.

Think of it as a matrix: a high-agreeableness agent talking to a "power user" persona stays warm but drops the hand-holding; talking to a "skeptical beginner" persona it adds more reassurance and simpler vocabulary — without the underlying personality changing.

With Evaluation — test against canonical personas

Define a persona for each user archetype you care about, then run eval scenarios scoped to that persona. This gives you repeatable, deterministic test conditions.

// Define an eval scenario for the "Skeptical Beginner" persona
const result = await client.agents.evaluate("agent-id", {
templateId: "onboarding-rubric",
messages: [
  { role: "user",      content: "I'm not sure I trust this — what happens to my data?" },
  { role: "assistant", content: "That's a fair question. Your data stays on our servers..." },
],
// Pass persona context so scoring reflects expected beginner-friendly tone
metadata: { persona_id: persona.persona_id },
});

console.log(result.score, result.feedback);

Tutorials

Tutorial: Custom States — see how metadata fields like persona_id travel through the conversation lifecycle alongside custom state.

Next steps

思维层

语音

文本转语音（TTS）

将文本转换为语音音频。

const audio = await client.agents.voice.tts("agent-id", {
text: "Hello! How can I help you today?",
voiceName: "aria",
language: "en",
outputFormat: "mp3",
});
// audio.data 包含音频字节

语音转文本（STT）

将音频转录为文本。

const result = await client.agents.voice.stt("agent-id", {
audio: base64AudioData,
audioFormat: "wav",
language: "en",
});
console.log(result.text);

实时语音流式传输

实时双工语音对话。获取 token，然后打开双向流。

// 1. 获取流式 token
const token = await client.agents.voice.getToken("agent-id", {
voiceName: "aria",
userId: "user-123",
});

// 2. 连接到实时流
const stream = await client.agents.voice.stream(token);

// 发送音频块
stream.sendAudio(audioChunk);

// 或发送文本让智能体朗读
stream.sendText("Tell me about your day");

// 接收事件
for await (const event of stream) {
if (event.type === "audio") {
  playAudio(event.data);
} else if (event.type === "transcript") {
  console.log(event.text);
}
}

// 结束会话
stream.endSession();

WebSocket 传输

实时流式传输基于 WebSocket，支持实时双工音频。客户端上行发送麦克风音频块，同时下行接收合成语音和转录文本，实现自然的对话流程。

浏览语音目录

列出可用的语音。

const voices = await client.voices.list({
language: "en",
gender: "female",
});

for (const voice of voices.items) {
console.log(voice.name, voice.language, voice.gender);
}

实际应用

语音主要与伴侣和企业相关。对于任务型智能体，通常不需要—— 但如果你在构建电话/IVR 流程，企业模式同样适用。

选择与角色匹配的声音。 浏览 voices.list()，筛选 3-5 个候选，在真实用户中 A/B 测试后再确定。错误的声音比任何其他失误都更快破坏沉浸感。

用双工实现实时对话。 WebSocket 双工流式同时传输 STT（用户输入）和 TTS（智能体回复）——这是实时电话式体验的自然形态。不要为伴侣使用轮询 TTS；延迟会破坏临场感。

调整韵律。 设置 stability: 0.4-0.6 和 clarity: 0.7-0.9 以获得温暖、富有表现力的朗读效果。纯稳定性听起来机械化。

Wakeups

Wakeups let your agent reach out to a user exactly once at a known future moment. Give the agent an intent, a check_type that it sees as context, and a delay in hours — the platform handles delivery. Unlike Scheduled Reminders, which fire on a repeating cadence, a wakeup fires once and is done.

Typical use cases: birthday greetings, appointment reminders, post-event check-ins, interest follow-ups, and time-delayed nudges. If you need the agent to repeat the same outreach, use a schedule instead.

What you can build with it

Birthday greetings — schedule a wakeup for 00:00 on a user's birthday so the agent is the first to reach out
Appointment reminders — fire 2 hours before a dentist, gym session, or onboarding call without any cron job on your side
Interest follow-ups — when a user mentions they are waiting on something, schedule a check-in for the next day ("hey, did you hear back from them?")
Post-event check-ins — the day after a job interview, a first date, or a big presentation, the agent proactively asks how it went
Time-delayed nudges — a user sets a task as pending; 24 hours later the agent checks in without the user having to remember to ask

Quickstart

Schedule a birthday greeting for a specific date using scheduled_at. For a "N hours from now" wakeup, use delay_hours instead. If both are provided, scheduled_at takes precedence.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// Use scheduled_at for birthdays/appointments with a known date
const wakeup = await client.agents.scheduleWakeup("agent_abc", {
user_id:      "user_123",
check_type:   "birthday",
intent:       "wish the user a happy birthday",
scheduled_at: "2026-06-15T09:00:00Z",  // RFC3339 absolute timestamp
occasion:     "Sarah's 30th birthday",
interest_topic: "celebration and birthday traditions",
});

console.log(wakeup.wakeup_id);   // "wake_01HX..."
console.log(wakeup.scheduled_at); // "2026-06-15T09:00:00Z"

Core concepts

delay_hours and scheduled_at

Two mutually exclusive time inputs are supported:

delay_hours — a relative offset from the current moment (e.g. delay_hours: 24 fires tomorrow at roughly this time). The platform computes the absolute fire time at the moment the request is accepted. Use this for "N hours from now" semantics where no specific date matters.
scheduled_at — an RFC3339 absolute timestamp (e.g. "2026-06-15T09:00:00Z"). Use this for birthdays, appointments, or any event tied to a specific calendar date. The platform fires the wakeup as close to this time as possible.

If both are provided, scheduled_at takes precedence. scheduled_at in the response is always present and is the authoritative UTC time the wakeup will fire — store it if you want to show the user "your agent will reach out at X".

occasion, interest_topic, and event_description

These optional context fields are included in the agent's wakeup block at fire time, giving it richer material for personalised message composition:

occasion — a short human-readable label for the event (e.g. "Sarah's 30th birthday", "dentist appointment"). The agent may reference this directly in the message.
interest_topic — a topic or theme the agent should lean on when composing the message (e.g. "celebration and birthday traditions", "dental health tips").
event_description — a longer free-form description with any additional context the agent should know (e.g. "User is turning 30 and has mentioned wanting to celebrate with a surprise party").

All three are optional and additive — provide as many or as few as are useful. The agent's underlying model uses them as soft context, not as a rigid template.

check_type and intent

Both fields are free-form strings. The agent receives both as part of its wakeup context at fire time:

check_type is a short label that tells the agent the nature of the outreach ("birthday", "appointment_reminder", "interest_followup", etc.). Keep it lowercase and underscore-separated — it is machine-readable context, not a display string.
intent is a natural-language instruction to the agent describing what the message should accomplish. Write it as you would write a system instruction: "ask how the job interview went and whether they got an offer".

Neither field has a fixed enum — any string is valid. The agent's underlying model interprets them in context.

Lifecycle and status

A wakeup moves through three statuses:

Status	Meaning
`pending`	Scheduled, not yet fired
`executed`	Fired; message delivered to the notification queue
`cancelled`	Cancelled before it fired

Once a wakeup reaches executed or cancelled it is immutable. To cancel a pending wakeup, call getWakeups to retrieve the wakeup_id, then cancel it via the API before scheduled_at passes.

One-off, not recurring

Each call to scheduleWakeup creates exactly one future fire. If you need to re-schedule after a wakeup executes (for example, to send a birthday greeting every year), schedule a new wakeup the next time you learn the date. For repeating outreach on a fixed cadence, use Scheduled Reminders instead.

Full API

All wakeup methods live under client.agents.* (TS/Python) or client.Agents.Wakeups (Go). Full request and response shapes are in the API reference.

Method	Returns	Description
`scheduleWakeup(agentId, opts)`	`WakeupResponse`	Schedule a one-off wakeup
`getWakeups(agentId, { status?, limit? })`	`WakeupResponse[]`	List wakeups (optionally filtered by status)

scheduleWakeup input fields:

Field	Type	Description
`user_id`	string	Required. The user the wakeup is for.
`check_type`	string	Required. Short label for the nature of the outreach (e.g. `"birthday"`, `"appointment_reminder"`).
`intent`	string	Required. Natural-language instruction for the agent describing what the message should accomplish.
`delay_hours`	number	Relative offset from now. Mutually exclusive with `scheduled_at`; `scheduled_at` wins if both are set.
`scheduled_at`	string (RFC3339)	Absolute fire time. Use for birthdays, appointments, or any event with a specific date.
`occasion`	string	Optional short label for the event (e.g. `"Sarah's 30th birthday"`). Included in the agent's wakeup context.
`interest_topic`	string	Optional topic or theme for the agent to lean on when composing the message.
`event_description`	string	Optional longer description with additional context for the agent.

WakeupResponse fields: wakeup_id, agent_id, user_id, scheduled_at, check_type, status, intent, occasion, interest_topic, event_description, last_topic, research_summary, executed_at, created_at.

Combines with other features

With Scheduled Reminders — one-off vs recurring

Schedules and Wakeups are complementary proactive primitives. The rule is simple: if the agent should reach out more than once on a predictable cadence, use a schedule. If the agent should reach out exactly once at a known moment, use a wakeup. Both feed into the same downstream delivery channels.

// Recurring: a daily morning check-in schedule
await client.schedules.create("agent_abc", "user_123", {
  cadence: {
    simple: { frequency: "daily", times: ["09:00"] },
    timezone: "Asia/Singapore",
  },
  intent: "morning mood and sleep check-in",
  check_type: "reminder",
});

// One-off: a wakeup on the day of the user's birthday
await client.agents.scheduleWakeup("agent_abc", {
  user_id:     "user_123",
  check_type:  "birthday",
  intent:      "wish the user a happy birthday on their 30th",
  delay_hours: 48,
});

A common pattern is to use both together: a recurring schedule for everyday outreach, and a wakeup for a special moment that doesn't fit the cadence.

With Memory — context-aware wakeup scheduling

The agent can read memory facts to decide when and what to schedule. For example, if a user mentions their anniversary date, the agent can search memory to retrieve that date and schedule a wakeup for the right moment. The wakeup then fires with the agent already knowing why it is reaching out.

// 1. User mentioned an upcoming anniversary — find it in memory
const memories = await client.agents.memory.search("agent_abc", {
  query: "anniversary date",
  limit: 5,
});

// 2. Parse the date from the top result and compute delay_hours
const anniversaryFact = memories.results[0].content;
// e.g. "User's wedding anniversary is April 30"
const hoursUntilAnniversary = computeHoursUntil("2026-04-30");

// 3. Schedule a wakeup for that exact moment
// Use scheduled_at for a known date, or delay_hours for "N hours from now"
await client.agents.scheduleWakeup("agent_abc", {
  user_id:          "user_123",
  check_type:       "anniversary",
  intent:           "wish the user a happy anniversary and ask how they are celebrating",
  scheduled_at:     "2026-04-30T09:00:00Z",  // the anniversary date
  occasion:         "User's wedding anniversary",
  event_description: anniversaryFact,
});

Because the agent has memory of the conversation in which the user shared the anniversary date, the wakeup message will feel naturally aware of the context — not generic.

With Webhooks & Notifications — receiving the fired message

When a wakeup fires, the generated message lands in the agent's notification queue. Your backend can consume it via SSE polling or a registered webhook. The event type is the same as any other proactive message; you don't need special handling for wakeup-originated messages vs schedule-originated ones.

// Poll for any pending proactive messages (wakeups or schedules)
const notifications = await client.agents.notifications.poll("agent_abc", {
  user_id: "user_123",
});

for (const n of notifications) {
  console.log(n.content);     // the agent's message text
  console.log(n.source_type); // "wakeup" | "schedule"
}

See Webhooks & Notifications for webhook registration, signature verification, and SSE consumption patterns.

Tutorials

No dedicated tutorial yet. The Scheduled Reminders tutorial covers the same delivery infrastructure — most concepts transfer directly.

Next steps

Scheduled Reminders — recurring proactive outreach on a cadence
Memory — how the agent builds the context it uses when a wakeup fires
Webhooks & Notifications — how to consume wakeup messages in your backend

Webhooks

Register a webhook URL per tenant (or per project) and Sonzai will HTTP POST every proactive agent message to that URL with a signed payload. Each request includes a Sonzai-Signature header you verify with your signing secret before acting on the payload. Use webhooks for server-to-server delivery where you own the downstream routing — forwarding to FCM/APNs, sending via SendGrid or Twilio, writing to a case-management system, or fanning out to multiple channels at once.

What you can build with it

Push notifications — webhook handler forwards the agent message to FCM (Android) or APNs (iOS)
Email / SMS fanout — webhook handler sends through SendGrid, Postmark, Twilio, or any provider you already use
Multi-channel delivery — fan a single agent message to two or more user channels in one handler
Downstream analytics — log and inspect every proactive message before it reaches the user
Enterprise integrations — route agent messages into Slack, Microsoft Teams, internal tooling, or CRM workflows

Quickstart

Register a webhook URL to start receiving on_wakeup_ready events. Save the signing_secret from the response — it is only returned once.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const result = await client.webhooks.register("on_wakeup_ready", {
webhookUrl: "https://your-server.com/webhooks/sonzai",
authHeader: "Bearer your-webhook-secret",
});

// Store this securely — shown only once
console.log(result.signingSecret);

Core concepts

Registration

Webhooks are registered per event type. One URL per event type per tenant, or per project when using project-scoped registration. The same URL can handle multiple event types — inspect the event_type field on the payload to route accordingly.

Available event types:

Event type	Fires when
`on_wakeup_ready`	An agent wakeup generates a proactive message
`on_diary_generated`	The agent's diary entry is written
`on_personality_updated`	A significant personality shift is detected
`on_recurring_event_due`	A scheduled reminder fires

Signed payload

Every POST Sonzai sends includes a Sonzai-Signature header in the format:

Sonzai-Signature: t=1714000000,v1=abc123def456...

t is the Unix timestamp of the request; v1 is the HMAC-SHA256 of {timestamp}.{raw_body} using your signing secret (with the whsec_ prefix stripped). Always verify the signature on the raw, unmodified request body before parsing JSON — do not use the parsed object for verification.

Retries

When your endpoint returns a non-2xx status or times out, Sonzai retries with exponential backoff. Make your handler idempotent — deduplicate on event_id (or a stable field in the payload body) so retried deliveries do not double-process.

Payload shape

The webhook body matches the Notification shape returned by the polling API. Key fields:

Field	Type	Description
`event_type`	`string`	The registered event type (e.g. `on_wakeup_ready`)
`agent_id`	`string`	The agent that generated the message
`user_id`	`string`	The target user
`generated_message`	`string`	The agent's proactive message text
`check_type`	`string`	Wakeup or reminder context label
`message_id`	`string`	Stable ID; use for deduplication

Signature verification

Verify the Sonzai-Signature header before acting on any payload. The Go SDK ships a helper; TypeScript and Python use standard crypto primitives.

import crypto from "node:crypto";

/**
* Verify a Sonzai webhook signature.
* Call this on the raw request body string before parsing JSON.
*/
function verifyWebhookSignature(
rawBody: string,
signatureHeader: string,
secret: string,
): boolean {
// Strip whsec_ prefix if present
const key = secret.startsWith("whsec_") ? secret.slice(6) : secret;

// Parse header: t={timestamp},v1={sig}
const parts = Object.fromEntries(
  signatureHeader.split(",").map((p) => p.split("=")),
);
const timestamp = parts["t"];
const receivedSig = parts["v1"];
if (!timestamp || !receivedSig) return false;

const expectedSig = crypto
  .createHmac("sha256", key)
  .update(`${timestamp}.${rawBody}`)
  .digest("hex");

return crypto.timingSafeEqual(
  Buffer.from(receivedSig),
  Buffer.from(expectedSig),
);
}

// In your webhook handler (e.g. Express):
app.post("/webhooks/sonzai", express.raw({ type: "*/*" }), (req, res) => {
const sig = req.headers["sonzai-signature"] as string;
const rawBody = req.body.toString("utf-8");

if (!verifyWebhookSignature(rawBody, sig, process.env.SONZAI_WEBHOOK_SECRET!)) {
  return res.status(401).send("Invalid signature");
}

const event = JSON.parse(rawBody);
// Forward to your channel...
res.status(200).send("ok");
});

Timestamp tolerance

The Go SDK rejects signatures older than 5 minutes by default. In TypeScript and Python implementations, add a timestamp check if you need to guard against replay attacks: compare parseInt(parts["t"]) * 1000 against Date.now() and reject if the difference exceeds 300 000 ms.

Full API

All methods are on client.webhooks (TS/Python) or client.Webhooks (Go).

Method	Returns	Description
`register(eventType, opts)`	`WebhookRegisterResponse`	Register or update a webhook URL for an event type. `signing_secret` is returned only on first creation.
`list()`	`WebhookListResponse`	List all registered webhooks for this tenant
`delete(eventType)`	`void`	Remove a webhook registration
`listDeliveryAttempts(eventType)`	`DeliveryAttemptsResponse`	Inspect recent delivery history (status, response code, duration)
`rotateSecret(eventType)`	`WebhookRegisterResponse`	Generate a new signing secret; old secret stays valid briefly to allow rotation
`registerForProject(projectId, eventType, opts)`	`WebhookRegisterResponse`	Register a project-scoped webhook
`listForProject(projectId)`	`WebhookListResponse`	List webhooks for a project
`deleteForProject(projectId, eventType)`	`void`	Remove a project-scoped webhook
`listDeliveryAttemptsForProject(projectId, eventType)`	`DeliveryAttemptsResponse`	Delivery history for a project webhook
`rotateSecretForProject(projectId, eventType)`	`WebhookRegisterResponse`	Rotate signing secret for a project webhook

WebhookEndpoint fields: event_type, webhook_url, auth_header, is_active, created_at.

WebhookDeliveryAttempt fields: attempt_id, event_type, webhook_url, response_code, response_body, error_message, duration_ms, attempt_number, status, created_at.

Combines with

With Notifications polling — alternative consumption model

Webhooks and polling are two consumption models for the same proactive message queue. Webhooks push to your server in real time; polling lets your client or server fetch on demand. Use webhooks when you have a stable server endpoint and need instant delivery. Use polling when your client cannot accept inbound HTTP connections (mobile apps, browser clients) or when you want to batch-process notifications on your own schedule. Both see the same payload shape.

// Polling alternative — same messages, pulled instead of pushed
const pending = await client.agents.notifications.list("agent_abc", {
  userId: "user_123",
  status: "pending",
});

for (const notif of pending.notifications) {
  console.log(notif.generated_message);
  await client.agents.notifications.consume("agent_abc", notif.message_id);
}

With Scheduled Reminders — fan recurring reminders to channels

When a scheduled reminder fires, an on_recurring_event_due webhook delivers the generated message to your endpoint. Your handler can then forward to FCM, send an email, or post to Slack — all without polling. This separates the scheduling concern (when to fire) from the delivery concern (how to reach the user).

// Register once; every scheduled reminder fires this endpoint
const result = await client.webhooks.register("on_recurring_event_due", {
  webhookUrl: "https://api.yourapp.com/webhooks/sonzai",
});

// In your handler, forward to the appropriate channel:
// event.generated_message → FCM, email, SMS, Slack...

With Wakeups — push one-off check-ins

When a wakeup fires, the on_wakeup_ready event is POSTed to your registered endpoint. This is the primary webhook event for companion-style agents that reach out proactively. Register the webhook once and every future wakeup — automatic or manually scheduled — will arrive at your URL.

// Register to receive all future wakeup messages
await client.webhooks.register("on_wakeup_ready", {
  webhookUrl: "https://api.yourapp.com/webhooks/sonzai",
});

// Your handler receives the wakeup message and forwards it:
// event.generated_message → push notification
// event.user_id          → lookup device token in your DB
// event.agent_id         → identify which agent sent it

Tutorials

No dedicated webhook tutorial yet. The Scheduled Reminders tutorial covers the full proactive delivery pipeline and includes webhook-based consumption patterns.

Next steps

Notifications polling — pull-based alternative for clients that cannot receive inbound HTTP
Scheduled Reminders — recurring proactive messages that fire over webhooks
Wakeups — one-off proactive messages delivered via on_wakeup_ready

评估与模拟

评估响应

根据模板评分标准对智能体的响应进行评分。

const result = await client.agents.evaluate("agent-id", {
templateId: "template-id",
messages: [
  { role: "user", content: "I'm feeling really stressed about work" },
  { role: "assistant", content: "I hear you. Work stress can be overwhelming..." },
],
});

console.log(result.score);       // 0-100
console.log(result.feedback);    // detailed feedback
console.log(result.categories);  // per-category scores

评估模板

创建带加权类别的评分标准。

// Create a template
const template = await client.evalTemplates.create({
name: "Empathy & Support",
description: "Evaluates emotional intelligence and supportive responses",
scoringRubric: "Score based on empathy, active listening, and actionable advice",
categories: ["empathy", "active_listening", "actionable_advice"],
judgeModel: "claude-sonnet-4-6",
temperature: 0.3,
});

// List templates
const templates = await client.evalTemplates.list();

运行模拟

运行多轮模拟对话以大规模测试智能体行为。

for await (const event of client.agents.simulate("agent-id", {
maxSessions: 3,
maxTurnsPerSession: 10,
simulatedDurationHours: 24,
enableProactive: true,
enableConsolidation: true,
userPersonas: [
  {
    name: "Alex",
    background: "College student struggling with math",
    personalityTraits: ["anxious", "eager to learn"],
    communicationStyle: "casual, uses slang",
  },
],
})) {
console.log(`[${event.type}] ${event.message}`);
if (event.totalCostUsd) {
  console.log(`Cost so far: $${event.totalCostUsd}`);
}
}

模拟 + 评估（runEval）

一步结合模拟和评估。

for await (const event of client.agents.runEval("agent-id", {
templateId: "template-id",
maxSessions: 5,
maxTurnsPerSession: 8,
})) {
if (event.type === "evaluation") {
  console.log("Score:", event.score);
}
}

评估运行

跟踪和管理模拟运行。

// List runs
const runs = await client.evalRuns.list({ agentId: "agent-id" });

// Get a specific run
const run = await client.evalRuns.get("run-id");

// Reconnect to a streaming run
for await (const event of client.evalRuns.streamEvents("run-id")) {
console.log(event.type, event.message);
}

异步模拟

模拟支持通过 simulateAsync() 进行异步模式，立即返回 RunRef，允许您稍后轮询或重新连接。

快速开始

1. 创建项目

前往项目页面，创建一个新项目。项目用于管理你的智能体，并提供用于身份验证的 API 密钥。

2. 获取 API 密钥

在项目设置中生成 API 密钥。该密钥用于验证所有发往心智层的 REST API 请求。

# 所有请求都需要 Bearer 认证
Authorization: Bearer YOUR_API_KEY

3. 安装 SDK（也可以跳过）

选择适合你技术栈的接入方式。所有方式都连接到同一个托管 API —— 你可以混合使用（例如后端用 Python，再让 Claude Desktop 通过 MCP 连接做运维）。

pip install sonzai

Python 3.11+。同步（Sonzai）和异步（AsyncSonzai）客户端同包发布。
TypeScript 支持 Node.js >=18、Bun 和 Deno。零运行时依赖。
Go 1.25+。仅依赖标准库。
使用 OpenClaw 路径前需先安装 OpenClaw 本身——详见 openclaw.ai（快速开始）。
完整指南：MCP · OpenClaw · REST API 参考。

API 密钥处理

TypeScript、Python 和 Go SDK 默认从环境变量 SONZAI_API_KEY 读取密钥 —— 只在你想自行管理时才显式传入（例如 new Sonzai({ apiKey: "sk-..." })）。OpenClaw 插件把密钥保存在 openclaw.json。MCP 服务器通过客户端配置传入的 SONZAI_API_KEY 环境变量获取密钥。

4. 创建智能体

创建智能体有两种方式：通过大五分数显式定义人格特征，或从自然语言提示词生成。

选项 A：从提示词生成

用自然语言描述你的智能体，平台自动生成人格、传记和种子记忆。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

const agent = await client.agents.generation.generateAndCreate({
name: "Luna",
description: "A cheerful and curious AI assistant who loves helping developers debug code. She's patient, witty, and always encouraging.",
language: "en",
});

console.log(agent.agent_id);        // 自动生成的 UUID
console.log(agent.personality);    // 从描述派生的完整大五档案

选项 B：通过大五分数定义

如需精确控制，使用显式的大五分数创建智能体。平台从你的分数派生完整的人格档案、说话风格和情感倾向。

import { Sonzai } from "@sonzai-labs/agents";
import { v5 as uuidv5 } from "uuid";

const client = new Sonzai({ apiKey: "sk-..." });

// 从你自己的实体 ID 派生稳定的 UUID
const MY_NAMESPACE = "your-uuid-namespace-here";
const agentId = uuidv5("support-agent-001", MY_NAMESPACE);

const agent = await client.agents.create({
agentId,           // 传入你自己的 UUID——可安全重复调用
name: "Luna",
gender: "female",
big5: {
  openness:          0.75,
  conscientiousness: 0.60,
  extraversion:      0.80,
  agreeableness:     0.70,
  neuroticism:       0.30,
},
language: "en",
});

console.log(agent.agent_id); // 每次相同的 UUID

设计上的幂等性

智能体创建始终是创建或更新操作。使用相同 ID 调用两次会更新现有智能体——不会报错或创建副本。这意味着你的启动代码、CI 流水线和配置脚本可以无条件地调用 agents.create()。

带 agentId：服务器直接使用你的 UUID。推荐——将智能体链接到你自己的实体 ID（助手、员工等），建立你能控制的确定性映射。
不带 agentId：服务器从你的项目 ID + 智能体名称派生 UUID。相同名称在你的项目内始终映射到相同智能体。

5. 与你的智能体聊天

使用流式聊天获取实时 AI 响应。平台自动处理上下文、记忆和状态更新。

for await (const event of client.agents.chatStream({
agent: "agent-id",
messages: [{ role: "user", content: "I had a great day hiking!" }],
userId: "user-123",
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

仅限服务端使用

SDK 仅供服务端使用。永远不要在客户端代码中暴露 API 密钥。对于 Web 应用，通过你的后端代理请求。参见集成指南获取示例。

6. 随时间追踪

仪表盘展示人格变化、记忆增长、情绪模式和关系动态。所有系统随用户交互自动更新。

下一步

阅读架构以了解完整系统
参照集成指南进行生产配置
浏览 API 参考获取所有可用端点
配置知识库让智能体查询你的领域数据

指南

实用指南将带你从第一个请求一路走到生产环境集成。请挑选一个起点。

快速入门

AI 员工与个人 AI

任务型智能体，可记住用户、调用工具、检索知识库。

AI 伙伴

长期运行的角色，具备人格、情绪与关系。

企业工作流

多租户、可审计的智能体，跨团队与数据源工作。

集成

10 分钟快速入门

创建项目、获取 API Key、启动智能体并开始对话。

SDK 集成

完整的 TypeScript、Python 与 Go 集成示例。

MCP 集成

将 Claude Desktop、Cursor 等 MCP 客户端接入。

OpenClaw 插件

为 OpenClaw 上下文引擎提供即插即用插件。

独立记忆

将 Mind Layer 作为现有智能体运行时的记忆后端。

工具集成

为 LLM 接入可在对话中调用的自定义工具。

从其他平台迁移

迁移概览

选择路径：托管记忆、向量库、智能体框架或 CSV。

自 OpenAI Assistants

迁移线程、文件与指令到 Mind Layer。

教程

持久化记忆教程

种子、检索、衰减与合并事实。

定时提醒

安排唤醒事件并通过 Webhook 投递。

集成指南

概述

您的后端管理业务逻辑和用户会话。调用心智层获取智能体的智能 — 它拥有记忆、人格、情绪、关系和上下文组装。

通过 REST API 使用 Go、TypeScript 和 Python 的官方 SDK 进行集成。

官方 SDK 与插件

Go、TypeScript 和 Python 的官方 SDK，以及 OpenClaw 插件。每个 SDK 封装了完整的 REST API，提供类型化方法、SSE 流式传输、自动重试和错误处理。

go get github.com/sonz-ai/sonzai-go

TypeScript / JavaScript

npm install @sonzai-labs/agents

Python

pip install sonzai

OpenClaw 插件

npm install @sonzai-labs/openclaw-context

REST API

基于 JSON 的接口。聊天响应通过 Server-Sent Events (SSE) 流式传输。

认证

所有 REST 请求使用项目 API 密钥进行 Bearer 认证：

# 所有 REST 请求使用项目 API 密钥进行 Bearer 认证
curl -H "Authorization: Bearer sk_your_api_key" \
  https://api.sonz.ai/api/v1/agents/{agentId}/chat

核心交互流程 (REST)

# 聊天 (SSE 流式响应)
POST /api/v1/agents/{agentId}/chat
{ "messages": [{"role":"user","content":"Hello!"}], "user_id": "user-123" }

# 响应: Server-Sent Events
# data: {"choices":[{"delta":{"content":"Hi"}}]}
# data: [DONE]

SSE 解析

每行以 data: 开头。去除前缀后对剩余部分执行 JSON.parse。流以 data: [DONE] 结束。

可用 REST 接口

POST /api/v1/agents                              创建智能体
GET  /api/v1/agents                              列出智能体
GET  /api/v1/agents/{agentId}                    获取智能体
POST /api/v1/agents/{agentId}/chat               聊天 (SSE 流式)
GET  /api/v1/agents/{agentId}/notifications       待处理通知
POST /api/v1/agents/{agentId}/notifications/{id}/consume  消费通知
GET  /api/v1/agents/{agentId}/notifications/history       通知历史

TypeScript SDK（服务端）

适用于 Node.js 后端、Serverless 函数和服务端框架。支持 Node.js >= 18、Bun 和 Deno。不可用于浏览器/客户端 — API 密钥会被暴露。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk_your_api_key" });

// 聊天（非流式）
const response = await client.agents.chat("agent-id", {
  messages: [{ role: "user", content: "Hello!" }],
  userId: "user-123",
});
console.log(response.content);

// 聊天（流式）
for await (const event of client.agents.chatStream("agent-id", {
  messages: [{ role: "user", content: "Tell me a story" }],
  userId: "user-123",
  language: "en",
  timezone: "America/New_York",
})) {
  process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

// 记忆、人格、上下文引擎数据
const memory = await client.agents.memory.list("agent-id", { userId: "user-123" });
const personality = await client.agents.personality.get("agent-id");
const mood = await client.agents.getMood("agent-id", { userId: "user-123" });

Python SDK

适用于 Python 后端、数据管道和评估脚本。支持同步和异步客户端。

from sonzai import Sonzai

client = Sonzai(api_key="sk_your_api_key")

# 聊天（非流式）
response = client.agents.chat(
    "agent-id",
    messages=[{"role": "user", "content": "Hello!"}],
    user_id="user-123",
)
print(response.content)

# 聊天（流式）
for event in client.agents.chat(
    "agent-id",
    messages=[{"role": "user", "content": "Tell me a story"}],
    user_id="user-123",
    language="en",
    timezone="America/New_York",
    stream=True,
):
    print(event.content, end="", flush=True)

# 记忆、人格、上下文引擎数据
memory = client.agents.memory.list("agent-id", user_id="user-123")
personality = client.agents.personality.get("agent-id")
mood = client.agents.get_mood("agent-id", user_id="user-123")

client.close()

浏览器 / 前端应用

需要服务端代理

Sonzai API 不接受浏览器（客户端）请求。API 密钥绝不能暴露在前端代码中。这与 OpenAI、Anthropic 及其他 AI API 提供商使用的模式相同。

对于 Web 应用（React、Next.js、Vue 等），创建一个后端 API 路由代理到 Sonzai。您的前端调用您的服务器；您的服务器使用 API 密钥调用 Sonzai。

Next.js API 路由

// app/api/chat/route.ts（在您的服务器上运行）
import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

export async function POST(req: Request) {
  const { agentId, messages, userId } = await req.json();
  const stream = client.agents.chatStream(agentId, { messages, userId });

  return new Response(
    new ReadableStream({
      async start(controller) {
        for await (const event of stream) {
          controller.enqueue(new TextEncoder().encode(
            `data: ${JSON.stringify(event)}\n\n`
          ));
        }
        controller.enqueue(new TextEncoder().encode("data: [DONE]\n\n"));
        controller.close();
      },
    }),
    { headers: { "Content-Type": "text/event-stream" } }
  );
}

前端（任意框架）

// 调用您的服务器，而非直接调用 Sonzai
const res = await fetch("/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    agentId: "agent-uuid",
    messages: [{ role: "user", content: "Hello!" }],
    userId: "user-123",
  }),
});

const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // 解析来自代理的 SSE 数据块
  console.log(decoder.decode(value));
}

Express / Fastify

// server.ts
import express from "express";
import { Sonzai } from "@sonzai-labs/agents";

const app = express();
const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

app.post("/api/chat", async (req, res) => {
  const { agentId, messages, userId } = req.body;
  res.setHeader("Content-Type", "text/event-stream");

  for await (const event of client.agents.chatStream(agentId, { messages, userId })) {
    res.write(`data: ${JSON.stringify(event)}\n\n`);
  }
  res.write("data: [DONE]\n\n");
  res.end();
});

Go SDK

适用于后端和对性能敏感的应用，Go SDK 提供对 REST API 的类型化访问。

连接设置

import sonzai "github.com/sonz-ai/sonzai-go"

// 默认连接 api.sonz.ai — 只需提供 API 密钥
client := sonzai.NewClient("sk_your_api_key")

// 为本地开发/自托管覆盖基础 URL
client := sonzai.NewClient("sk_your_api_key",
    sonzai.WithBaseURL("http://localhost:8090"),
)

智能体生命周期

创建智能体

当用户在您的应用中创建新智能体时，调用 CreateAgent 并传入人格配置：

resp, err := client.Agents.Create(ctx, sonzai.CreateAgentRequest{
    Name:   "Luna",
    Gender: "female",
    Big5: sonzai.Big5Scores{
        Openness:          0.75,
        Conscientiousness: 0.60,
        Extraversion:      0.80,
        Agreeableness:     0.70,
        Neuroticism:       0.30,
    },
    Language: "en",
})
// resp.AgentID 是平台生成的 UUID
// 将其存储在您的用户记录中

获取智能体

agent, err := client.Agents.Get(ctx, agentId)
// agent 包含: 名称、人格提示词、大五人格、情绪等

聊天会话流程

这是核心集成循环。聊天接口在一次调用中处理上下文组装、AI 流式传输和状态更新：

流式聊天

聊天接口自动组装上下文、流式传输 AI 响应，并自动更新内部智能体状态：

stream, err := client.Agents.ChatStream(ctx, agentId, sonzai.ChatRequest{
    UserID:    userId,
    Messages: []sonzai.Message{
        {Role: "user", Content: "I had a great day hiking!"},
    },
    Language: "en",
})

// 读取流式事件
for event := range stream {
    fmt.Print(event.Content)
}

情绪标签

标签：Blissful (80-100)、Content (60-79)、Neutral (40-59)、Melancholy (20-39)、Troubled (0-19)。情绪会随时间自然向智能体的人格基线回归。

主动通知

智能体可以在对话之间主动联系用户。触发后，平台使用智能体的完整状态生成上下文消息，并将其存储为"待处理"状态。您的应用轮询并在交付后将通知标记为已消费。

REST 轮询

# 轮询待处理的主动消息
GET /api/v1/agents/{agentId}/notifications?status=pending&user_id=user-123

# 响应
{
  "notifications": [{
    "message_id": "msg-uuid",
    "user_id": "user-123",
    "check_type": "check_in",
    "intent": "Ask about yesterday's hiking trip",
    "generated_message": "Hey! How was the hike at Mount Rainier?",
    "status": "pending",
    "created_at": "2026-03-07T10:00:00Z"
  }]
}

# 交付给用户后，标记为已消费
POST /api/v1/agents/{agentId}/notifications/{messageId}/consume

交付最佳实践

每 30-60 秒轮询一次。交付后务必标记为已消费，以防重复交付。

Webhook 集成

// 注册 Webhook 端点
PUT /api/projects/{projectId}/webhooks/{eventType}
{
    "webhook_url": "https://your-server.com/platform/webhooks/wakeup",
    "auth_header": "Bearer YOUR_SERVER_KEY"
}

// 事件类型:
// - "wakeup"        : 智能体想要主动触达
// - "consolidation" : 记忆整合完成
// - "breakthrough"  : 重大人格演化

唤醒 Webhook 载荷

包含生成的消息，可直接交付：

{
  "event_type": "on_wakeup_ready",
  "agent_id": "agent-uuid",
  "user_id": "user-123",
  "generated_message": "Hey! How was the hike?",
  "wakeup_id": "wakeup-uuid",
  "check_type": "check_in"
}

轮询替代方案

偏好轮询？使用通知 API 代替 Webhook。

示例：后端集成流程

三层服务架构：

客户端应用          您的后端                     心智层
   |                      |                              |
   |--- 认证 ----------->|                              |
   |                      |                              |
   |--- 创建智能体 ----->|                              |
   |                      |--- REST: CreateAgent ------>|
   |                      |<-- 智能体 ID + 档案 --------|
   |<-- 智能体就绪 ------|                              |
   |                      |                              |
   |--- 发送消息 -------->|                              |
   |                      |--- REST: Chat (SSE) ------->|
   |<-- 流式响应 <-- AI 数据块 + 副作用 ----------------|

您的后端将应用事件转换为心智层 API 调用。您可以在不改变智能体行为的情况下更换后端，或跨应用复用智能体。

知识库 (Go SDK)

上传文档或推送结构化数据，构建项目级别的知识图谱。智能体在对话中搜索此图谱。

推送结构化数据

// 插入实体和关系
resp, err := client.Knowledge.InsertFacts(ctx, projectID, sonzai.InsertFactsOptions{
    Source: "product_catalog",
    Facts: []sonzai.InsertFactEntry{
        {
            EntityType: "product",
            Label:      "Widget Pro",
            Properties: map[string]any{"price": 29.99, "category": "tools"},
        },
    },
    Relationships: []sonzai.InsertRelEntry{
        {FromLabel: "Widget Pro", ToLabel: "Tools", EdgeType: "belongs_to"},
    },
})
fmt.Printf("Created: %d, Updated: %d\n", resp.Created, resp.Updated)

搜索知识图谱

results, err := client.Knowledge.Search(ctx, projectID, sonzai.KBSearchOptions{
    Query: "widget price",
    Limit: 10,
})
for _, r := range results.Results {
    fmt.Printf("%s (%s): score=%.2f\n", r.Label, r.NodeType, r.Score)
}

实体模式

// 定义模式，让 LLM 知道要提取哪些字段
schema, err := client.Knowledge.CreateSchema(ctx, projectID, sonzai.CreateSchemaOptions{
    EntityType: "product",
    Fields: []sonzai.KBSchemaField{
        {Name: "price", Type: "number", Required: true},
        {Name: "category", Type: "string"},
    },
})

分析规则

// 创建推荐规则
rule, err := client.Knowledge.CreateAnalyticsRule(ctx, projectID, sonzai.CreateAnalyticsRuleOptions{
    RuleType: "recommendation",
    Name:     "Similar products",
    Config:   map[string]any{"match_fields": []string{"category"}, "limit": 5},
    Enabled:  true,
})

// 获取推荐
recs, err := client.Knowledge.GetRecommendations(ctx, projectID, rule.RuleID, sourceNodeID, 5)

用户引导 (Go SDK)

预加载用户元数据和内容，使 AI 智能体从第一次对话开始就了解用户。元数据（姓名、公司、职位）会立即成为事实；内容块通过 LLM 异步提取。

引导单个用户

resp, err := client.Agents.Priming.PrimeUser(ctx, agentID, userID, sonzai.PrimeUserOptions{
    DisplayName: "Jane Smith",
    Metadata: &sonzai.PrimeUserMetadata{
        Company: "Acme Corp",
        Title:   "VP Engineering",
        Email:   "[email protected]",
        Custom:  map[string]string{"region": "APAC", "tier": "enterprise"},
    },
    Content: []sonzai.PrimeContentBlock{
        {Type: "text", Body: "Jane led the migration from AWS to GCP..."},
    },
    Source: "crm",
})
fmt.Printf("Job: %s, Facts created: %d\n", resp.JobID, resp.FactsCreated)

批量导入

resp, err := client.Agents.Priming.BatchImport(ctx, agentID, sonzai.BatchImportOptions{
    Users: []sonzai.BatchImportUser{
        {UserID: "user-1", DisplayName: "Jane", Metadata: &sonzai.PrimeUserMetadata{Company: "Acme"}},
        {UserID: "user-2", DisplayName: "Bob", Metadata: &sonzai.PrimeUserMetadata{Company: "Globex"}},
    },
    Source: "crm_sync",
})
fmt.Printf("Job: %s, Users: %d\n", resp.JobID, resp.TotalUsers)

管理元数据

// 获取元数据
meta, err := client.Agents.Priming.GetMetadata(ctx, agentID, userID)

// 更新元数据（部分更新 — 与现有数据合并）
updated, err := client.Agents.Priming.UpdateMetadata(ctx, agentID, userID, sonzai.UpdateMetadataOptions{
    Company: ptr("New Corp"),
    Custom:  map[string]string{"tier": "premium"},
})

异步处理

元数据事实（姓名、公司、职位）同步创建。内容块（文本、聊天记录）通过 LLM 提取在后台异步处理。可轮询任务状态以跟踪进度。

供 AI 智能体使用

要把这些文档喂给 AI 助手或编码智能体？每个页面都有复制给 LLM 按钮，下方的包已预先格式化，可直接摄入。在任意文档 URL 后加 .md（例如 /docs/zh/guides/integration.md）可获取原始 Markdown。

llms.txt

Terse index of the docs for LLM tools.

llms-full.txt

Full docs concatenated for LLM ingestion.

llms-companions.txt

Subset for AI Companion builders.

llms-employees.txt

Subset for AI Employee / Personal AI builders.

llms-enterprise.txt

Subset for Enterprise Agent builders.

最佳实践

使用 StreamChat — 一次调用即可处理上下文组装、AI 流式传输和状态更新。
始终通过 BackendContext.custom_fields 传递应用状态。平台不会缓存此数据。
注册 Webhook 以接收唤醒事件，使智能体能够主动发起联系。
不要重复实现人格、记忆或关系逻辑 — 让平台拥有智能体数据。
每 30-60 秒轮询通知。交付后标记为已消费，防止重复交付。
所有集成均使用 REST API。使用 Go、TypeScript 或 Python 的官方 SDK 获取最佳开发体验。
浏览器应用必须通过后端代理 — 切勿在客户端代码中暴露 API 密钥。参见上方的浏览器/前端应用章节。

MCP 集成

心智层提供托管的 Streamable HTTP MCP 端点： https://api.sonz.ai/mcp/memory/{agent_id}。把任意兼容 MCP 的客户端指向它，再带上 Sonzai API 密钥即可 — 无需本地二进制、无需开放 SSE 端口、无需 Go 工具链。

服务器实现 Model Context Protocol 规范，暴露 34 个工具、4 个资源 和 3 个引导式提示词 （参见下文工具目录）。

你需要

项目 API 密钥 — 在项目设置创建。
Agent ID — 通过仪表板或 SDK 创建一个 agent；如果还没有，首次连接后运行 create-companion MCP 提示词即可生成。

选择客户端

# 单条命令 — 把托管的 MCP 服务器注册到 Claude Code:
claude mcp add --transport http sonzai \
https://api.sonz.ai/mcp/memory/AGENT_ID \
--header "Authorization: Bearer $SONZAI_API_KEY"

# 用 --scope 指定作用域:
#   local   (默认) — 仅当前项目，私有
#   project         — 写入 .mcp.json (提交以共享给团队)
#   user            — 全局 (~/.claude.json)

# 检查注册结果:
claude mcp list

Streamable HTTP，而非 SSE

2026 版 MCP 规范把 Streamable HTTP 标记为远程传输的标准选择。 SSE 在主流客户端正逐步弃用 — 任何新集成都应优先选择 HTTP。本地二进制仍然提供 SSE 传输用于向后兼容。

鉴权

端点	鉴权	范围
`POST /mcp/memory/{agent_id}`	`Authorization: Bearer sk-...`	单个 agent
`POST /mcp/memory` (OAuth, beta)	OAuth 2.0 授权码	项目级，agent 选择器

上面所有示例都使用 Bearer 密钥路径 — 锁定到某个 agent，使用项目 API 密钥作为唯一秘密。OAuth 模式让客户端通过选择器 UI 发现可用 agent，目前处于 beta，通过 /.well-known/oauth-authorization-server 端点暴露。

把 API 密钥当成密码

Bearer Token 就是项目 API 密钥 — 它对项目下所有 agent 都有完整访问权。不要粘贴到会被提交到公开仓库的共享 MCP 配置中。多人协作时优先使用 local 范围的配置。

工具目录

MCP 服务器把 34 个工具分成六个类别。每个工具直接对应一个 Platform API 端点。

Agent 管理 (5)

list_agents — 列出 agents，支持搜索与分页
get_agent — 获取详情 (人格、能力、状态)
create_agent — 创建 agent (人格、Big5、种子记忆、目标)
update_agent — 更新档案 (姓名、人格、bio、问候语)
delete_agent — 永久删除 agent 与所有数据

对话 (1)

chat — 发送消息并获取带完整上下文的回复 (记忆、情绪、人格、关系)

记忆 (5)

get_memory — 获取层级化记忆树
search_memories — 自然语言记忆搜索
list_facts — 按类型 (profile, preference, emotion …) 列出原子事实
get_memory_timeline — 按时序的记忆时间线
reset_memory — 删除所有记忆 (不可逆)

行为 (11)

get_personality / update_personality — Big5 特质、BFAS 维度
get_mood / get_mood_history — 4D 情绪状态及历史
list_goals / create_goal / update_goal — agent 目标
get_habits — 行为模式与强度分数
get_relationships — Love 分、叙事、化学反应
get_interests — 检测到的兴趣与置信度
get_diary — AI 生成的日记条目

会话与状态 (5)

start_session / end_session — 用于上下文连贯性的会话
list_custom_states / upsert_custom_state / get_custom_state — 自定义键值条目

生成与事件 (7)

generate_character — 从文本描述生成完整角色
generate_and_create_agent — 一步生成 + 创建
trigger_event — 影响情绪、记忆或行为
list_notifications / schedule_wakeup — 主动外联
generate_bio — 为已有 agent 生成 bio
list_voices — 可用 TTS 声音

资源

资源以 MCP sonzai:// URI 暴露只读数据。

URI	描述
`sonzai://agents`	项目下所有 agent
`sonzai://agents/{id}/profile`	Agent 档案 (人格、能力、状态)
`sonzai://agents/{id}/memory`	记忆树快照
`sonzai://agents/{id}/personality`	Big5 特质、维度、偏好

引导式提示词

助手可按名调用的预制工作流。

`create-companion`

从一句话概念生成完整人设的 agent。

参数: concept — 例如 "a philosophical barista who reads tarot cards"。

`analyze-agent`

深入分析某个 agent 的人格、情绪、记忆与关系。

参数: agent_id — UUID 或名字。

`mind-layer-setup`

把 Sonzai 配置为任意 AI 助手的持久化心智层。

参数: assistant_name、personality_description。

架构

Claude Code · Cursor · ChatGPT · VS Code · Claude Desktop
         │
         │ Streamable HTTP (JSON-RPC 2.0)
         ▼
https://api.sonz.ai/mcp/memory/{agent_id}
         │
         ├─ Context Engine (memory, personality, behavior)
         ├─ AI Service (LLM generation)
         └─ ScyllaDB · Redis · CockroachDB

对于离网或仅 stdio 的客户端，可选的 sonzai-mcp 二进制运行在用户机器上，将 stdio JSON-RPC 与 HTTPS REST 互相代理。

下一步

API 参考 — MCP 工具背后的所有 REST 端点
人格系统与记忆与上下文 — 这些工具控制的内容

OpenClaw 集成

什么是 OpenClaw？

OpenClaw 是一个用于构建对话式 AI 智能体的开源框架。它采用带有命名 插槽（slots） 的模块化插件系统，每个插槽控制智能体管道的特定部分。

最重要的插槽是 contextEngine。这个插件负责决定在每次 LLM 调用之前注入系统提示词的上下文内容。它控制智能体记住什么、知道什么、以及感受到什么。

插件系统工作原理

OpenClaw 的插件系统类似中间件。每个插件实现生命周期钩子，在对话轮次的特定节点触发：

bootstrap(sessionId)：新聊天会话开始时调用。插件初始化所需的连接或状态。
assemble(messages, tokenBudget)：每次 LLM 调用前调用。插件返回 systemPromptAddition — 注入系统提示词的额外上下文。
afterTurn(sessionId)：LLM 响应后调用。插件处理对话内容（如提取事实、更新状态）。
compact(sessionId)：上下文需要整合时调用（如将短期记忆合并到长期记忆）。
dispose()：会话结束时调用。清理连接和状态。

默认情况下，OpenClaw 附带一个基本的上下文引擎，将记忆存储为本地 Markdown 文件。Sonzai 插件将其替换为 Mind Layer — 无需额外代码即可为智能体提供持久记忆、人格进化、情绪追踪和关系建模。

插件注册机制

安装 @sonzai-labs/openclaw-context 后，该包会导出一个 register() 函数作为默认导出。启动时，OpenClaw 加载所有已安装的插件并调用其 register 函数。我们的插件以 "sonzai" 名称注册上下文引擎工厂：

// @sonzai-labs/openclaw-context 内部（不需要开发者编写）
export default function register(api) {
  api.registerContextEngine("sonzai", () => {
    return new SonzaiContextEngine(client, config);
  });
}

然后在 openclaw.json 中，告诉 OpenClaw 哪个已注册的引擎用于 contextEngine 插槽。名称 "sonzai" 必须与插件注册的名称匹配：

{
  "plugins": {
    "slots": {
      "contextEngine": "sonzai"
    },
    "entries": {
      "sonzai": {
        "enabled": true,
        "apiKey": "sk-your-api-key",
        "agentId": "your-agent-uuid"
      }
    }
  }
}

流程如下：安装 npm 包 → OpenClaw 发现并调用 register() → 插件以 "sonzai" 注册 → 配置将其分配给 contextEngine 插槽。

为什么使用 Sonzai 作为上下文层？

Sonzai 作为 OpenClaw 的 纯上下文引擎。不再由框架管理自己的记忆文件，所有对话都流经 Mind Layer — 自动处理事实提取、语义搜索、情绪更新和人格进化。无需编写任何记忆逻辑，即可获得丰富的结构化上下文。

快速开始

1. 获取 API 密钥

从 Sonzai 项目设置获取 API 密钥。在设置向导中输入后，它将保存到 openclaw.json 中。

2. 安装插件

# 通过 OpenClaw CLI 安装
openclaw plugins install @sonzai-labs/openclaw-context

# 或直接使用包管理器安装
npm install @sonzai-labs/openclaw-context
# bun add @sonzai-labs/openclaw-context

3. 运行设置向导

设置向导将引导你将 OpenClaw 项目连接到 Mind Layer：

npx @sonzai-labs/openclaw-context setup

向导将：

要求输入 API 密钥（或从环境中检测 SONZAI_API_KEY）
询问是否有现有的智能体 ID，或选择名称创建新的
在 Mind Layer 上验证 API 密钥
将 API 密钥和插件配置保存到 openclaw.json

设置完成后，openclaw.json 将如下所示：

{
  "plugins": {
    "slots": {
      "contextEngine": "sonzai"
    },
    "entries": {
      "sonzai": {
        "enabled": true,
        "apiKey": "sk-your-api-key",
        "agentId": "a1b2c3d4-..."
      }
    }
  }
}

API 密钥存储

API 密钥与插件配置一起存储在 openclaw.json 中 — 无需设置环境变量。请确保将 openclaw.json 添加到 .gitignore 以避免提交密钥。

4. 开始聊天

像往常一样启动 OpenClaw。插件注册为 sonzai 上下文引擎并自动接管上下文组装：

openclaw chat

就这样。所有对话现在都流经 Mind Layer — 智能体从第一条消息开始就拥有持久记忆、人格和情绪。

配置参考

所有设置都在 openclaw.json 的 plugins.entries.sonzai 下配置。环境变量可作为覆盖使用。

选项	环境变量覆盖	默认值	说明
`apiKey`	`SONZAI_API_KEY`	--	项目 API 密钥（必填）
`agentId`	`SONZAI_AGENT_ID`	自动配置	预配置的智能体 UUID
`baseUrl`	`SONZAI_BASE_URL`	`https://api.sonz.ai`	平台 API 基础 URL
`agentName`	--	`openclaw-agent`	自动配置智能体的名称
`defaultUserId`	--	`owner`	1:1 会话的备用用户 ID
`contextTokenBudget`	--	`2000`	注入上下文的最大 token 数
`extractionProvider`	--	--	事实提取用 LLM 提供商
`extractionModel`	--	--	事实提取用 LLM 模型

禁用上下文来源

可以通过 disable 映射选择性地禁用特定上下文来源。当你需要 Mind Layer 的记忆功能但不需要情绪追踪，或者想要减少 token 使用时很有用：

{
  "plugins": {
    "entries": {
      "sonzai": {
        "enabled": true,
        "apiKey": "sk-your-api-key",
        "agentId": "your-agent-uuid",
        "disable": {
          "mood": true,
          "personality": false,
          "relationships": true,
          "memory": false,
          "goals": true,
          "interests": true,
          "habits": true
        }
      }
    }
  }
}

使用上述配置，只会注入人格和记忆上下文 — 适合将 Mind Layer 作为纯记忆和人格存储使用。

注入的上下文

每个轮次中，插件会向系统提示词注入一个结构化的 <sonzai-context> 块。各部分按优先级排序，超出 token 预算时从最低优先级开始移除：

人格（优先级 1，最高）：角色定义、主要特征、说话模式、Big5 人格档案
相关记忆（优先级 2）：与最新用户消息语义匹配的已搜索事实
当前情绪（优先级 3）：4维情绪状态（效价、唤醒度、紧张度、亲和度）
关系（优先级 4）：关系叙述、亲密度分数、与当前用户的化学反应
目标（优先级 5）：活跃目标（成长、精通、关系、探索）
兴趣（优先级 6）：带有置信度的已检测兴趣
习惯（优先级 7，最低）：带有强度分数的行为模式

Token 预算

默认预算为 2000 token（约 8000 字符）。插件以每 token 约 4 个字符来估算 token 数量，超出预算时从最低优先级的部分开始移除。可通过 contextTokenBudget 调整。

会话密钥解析

插件自动从 OpenClaw 的会话密钥格式中提取用户身份。无需任何配置即可实现按用户的记忆和关系：

会话格式	示例	解析的用户 ID
CLI（1:1）	`agent:abc:mainKey`	`owner`
Telegram 私信	`agent:abc:telegram:direct:123`	`123`
WhatsApp 私信	`agent:abc:whatsapp:direct:+1555...`	`+1555...`
Discord 群组	`agent:abc:discord:group:guild789`	`guild789`
定时任务 / Webhook	`cron:daily-check`	`owner`

编程式设置（B2B）

对于需要以编程方式配置智能体的多租户部署，@sonzai-labs/openclaw-context 插件提供 setup() 助手。OpenClaw 本身是一个 JavaScript 上下文引擎，因此插件仅有 TypeScript 版本 —— 但底层只是两个 REST 调用（幂等地创建智能体 + 写入配置文件），任何语言都可以驱动。下面的 Python 与 Go 分支演示用 Sonzai 官方 SDK 完成同样的工作。

import { setup } from "@sonzai-labs/openclaw-context";

const result = await setup({
apiKey: "sk-project-key",
agentName: "customer-support-bot",
configPath: "/path/to/openclaw.json",
});

console.log(result.agentId);   // 确定性 UUID（tenantID + agentName 的 SHA1）
console.log(result.written);   // true — 配置文件已更新

幂等配置

智能体 ID 从 SHA1(tenantID + agentName) 确定性生成。使用相同名称多次调用 setup 会返回相同的智能体 — 重启和重新部署都是安全的。

架构

Sonzai 上下文引擎插入 OpenClaw 的生命周期钩子。以下是单个对话轮次的流程：

OpenClaw Runtime                SonzaiContextEngine              Sonzai Mind Layer
    |                                    |                                |
    |-- bootstrap(sessionId) ----------->|                                |
    |                                    |-- resolve agent + session ---->|
    |                                    |<-- session state cached -------|
    |                                    |                                |
    |-- assemble(messages, budget) ----->|                                |
    |                                    |-- fetch context (memory,       |
    |                                    |   personality, mood,           |
    |                                    |   relationships) ------------>|
    |                                    |<-- ranked context blocks ------|
    |                                    |                                |
    |<-- systemPromptAddition -----------|   (priority-ordered,           |
    |                                    |    budget-trimmed)             |
    |                                    |                                |
    |  [LLM call with enriched prompt]   |                                |
    |                                    |                                |
    |-- afterTurn(sessionId) ----------->|                                |
    |                                    |-- send conversation ---------> |
    |                                    |   Mind Layer extracts facts,   |
    |                                    |   updates mood, evolves        |
    |                                    |   personality automatically    |
    |                                    |                                |
    |-- compact(sessionId) ------------->|                                |
    |                                    |-- merge short-term → long-term>|
    |                                    |<-- compacted ------------------|

上下文引擎处理与 Mind Layer 的所有通信。在 assemble 期间，获取上下文来源（记忆、人格、情绪、关系、目标、兴趣、习惯），按优先级排序并裁剪至 token 预算。在 afterTurn 期间，将对话发送回 Mind Layer 进行事实提取和状态更新。引擎不在本地运行 LLM 调用 — 所有智能都在 Sonzai 端。

优雅降级

所有 API 调用都包裹在错误处理器中。如果 Mind Layer 不可达，引擎返回空上下文，绝不会阻塞 OpenClaw — 智能体将在没有增强上下文的情况下继续工作。

导出

该包提供以下高级用途的导出：

导出	说明
`default`	插件注册（OpenClaw 自动加载）
`SonzaiContextEngine`	核心引擎类 — 可在 OpenClaw 外使用
`setup()`	B2B 部署的编程式设置
`resolveConfig()`	合并 openclaw.json + 环境变量 + 配置为已解析选项
`parseSessionKey()`	从 OpenClaw 会话密钥提取用户身份
`buildSystemPromptAddition()`	将上下文格式化为注入块
`estimateTokens()`	估算字符串的 token 数量（约 4 字符/token）
`SessionCache`	基于 TTL 的会话状态缓存

下一步

了解驱动插件的记忆与上下文系统
探索人格系统和情绪与心情以理解上下文来源
阅读 API 参考了解插件背后的完整 REST API
查看集成指南了解不使用 OpenClaw 的 SDK 集成方式

Tool Integration for BYO-LLM

Two Approaches to Enrichment

There are two complementary ways your agent can access Sonzai knowledge and memory:

Automatic (Recommended)

Call GET /context with a query param. The endpoint automatically searches the knowledge base and injects recalled memories. The deferred learning loop primes the next context call with KB results that the agent missed. No tool calling needed.

Explicit Tool Calling

Register Sonzai tools with your LLM so it can search on demand mid-conversation. This is for agent frameworks (LangChain, Vercel AI SDK, CrewAI) where the LLM decides when to search. You fetch tool schemas from Sonzai and wire them into your framework.

When to use which?

Start with automatic enrichment — it covers most cases with zero configuration. Add explicit tool calling when your agent needs to search mid-conversation (e.g., the user asks a question not covered by the initial context fetch) or when your framework expects tool definitions.

Discovering Available Tools

Fetch the tool catalog for an agent. This returns JSON schemas in OpenAI function-calling format that you can pass directly to your LLM's tool configuration.

const tools = await client.agents.getTools("agent-id");

// tools.tools = [
//   {
//     name: "knowledge_search",
//     description: "Search the agent's knowledge base...",
//     endpoint: "POST /api/v1/agents/{agentId}/tools/kb-search",
//     parameters: {
//       type: "object",
//       required: ["query"],
//       properties: {
//         query: { type: "string", description: "Search query" },
//         limit: { type: "integer", description: "Max results (default 10)" }
//       }
//     }
//   },
//   {
//     name: "memory_search",
//     description: "Search the agent's memory for previously learned facts...",
//     endpoint: "GET /api/v1/agents/{agentId}/memory/search?q={query}&userId={userId}",
//     parameters: {
//       type: "object",
//       required: ["query"],
//       properties: {
//         query: { type: "string", description: "Search query" },
//         user_id: { type: "string", description: "User ID to scope search" },
//         limit: { type: "integer", description: "Max results (default 20)" }
//       }
//     }
//   }
// ]

Knowledge Search Tool

Search the agent's knowledge base for relevant documents and facts. Uses hybrid search (BM25 + semantic) when embeddings are available, falling back to BM25 full-text search.

Endpoint

POST /api/v1/agents/{agentId}/tools/kb-search
GET  /api/v1/agents/{agentId}/tools/kb-search?q={query}&limit={limit}

Request

{
  "query": "refund policy",
  "limit": 5
}

Response

{
  "query": "refund policy",
  "results": [
    {
      "content": "Customers can request a full refund within 30 days of purchase...",
      "label": "Refund Policy",
      "type": "policy",
      "source": "policies.pdf",
      "score": 0.92
    },
    {
      "content": "For digital products, refunds are processed within 5 business days...",
      "label": "Digital Refund Process",
      "type": "process",
      "source": "policies.pdf",
      "score": 0.78
    }
  ]
}

SDK Usage

const results = await client.agents.knowledgeSearch("agent-id", {
query: "refund policy",
limit: 5,
});

for (const result of results.results) {
console.log(`[${result.score.toFixed(2)}] ${result.label}: ${result.content}`);
}

Memory Search Tool

Search the agent's memory for previously extracted facts about a user. This is a synchronous BM25 full-text search that returns immediately — no deferred processing.

Endpoint

GET /api/v1/agents/{agentId}/memory/search?q={query}&userId={userId}&limit={limit}

Response

{
  "results": [
    {
      "fact_id": "f_abc123",
      "content": "User enjoys hiking on weekends",
      "fact_type": "preference",
      "score": 4.82
    },
    {
      "fact_id": "f_def456",
      "content": "User adopted a dog named Luna in March",
      "fact_type": "event",
      "score": 3.15
    }
  ]
}

SDK Usage

const results = await client.agents.memory.search("agent-id", {
query: "hiking",
userId: "user-123",
limit: 10,
});

for (const fact of results.results) {
console.log(`[${fact.fact_type}] ${fact.content}`);
}

Memory search is always synchronous

Unlike KB enrichment (which has a deferred path), memory search returns immediately from BM25 indexes. There is no async component. The /context endpoint already includes the most relevant memories automatically — this tool is for cases where the LLM needs to search for additional facts mid-conversation.

Wiring Tools into Agent Frameworks

The tool schemas from GET /tools/schemas follow the OpenAI function-calling format. Here is how to wire them into popular agent frameworks.

Vercel AI SDK

import { generateText, tool } from "ai";
import { google } from "@ai-sdk/google";
import { Sonzai } from "@sonzai-labs/agents";
import { z } from "zod";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const agentId = "agent-id";
const userId = "user-123";

// Define Sonzai tools for the Vercel AI SDK
const sonzaiTools = {
  knowledge_search: tool({
    description: "Search the agent's knowledge base for relevant documents",
    parameters: z.object({
      query: z.string().describe("Search query"),
      limit: z.number().optional().describe("Max results"),
    }),
    execute: async ({ query, limit }) => {
      const results = await sonzai.agents.knowledgeSearch(agentId, {
        query,
        limit: limit ?? 5,
      });
      return results.results.map((r) => ({
        content: r.content,
        label: r.label,
        score: r.score,
      }));
    },
  }),
  memory_search: tool({
    description: "Search agent memory for facts about the user",
    parameters: z.object({
      query: z.string().describe("Search query"),
    }),
    execute: async ({ query }) => {
      const results = await sonzai.agents.memory.search(agentId, {
        query,
        userId,
      });
      return results.results.map((f) => ({
        content: f.content,
        type: f.fact_type,
      }));
    },
  }),
};

// Get enriched context first
const ctx = await sonzai.agents.getContext(agentId, {
  userId,
  sessionId: "session-abc",
  query: userMessage,
});

const { text } = await generateText({
  model: google("gemini-3.1-flash-lite-preview"),
  system: buildSystemPrompt(ctx),
  prompt: userMessage,
  tools: sonzaiTools,
  maxSteps: 3, // allow up to 3 tool calls per turn
});

Google Gemini Function Calling

import { GoogleGenAI, Type } from "@google/genai";
import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const gemini = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY! });

const agentId = "agent-id";

// Define tools in Gemini format
const tools = [{
  functionDeclarations: [
    {
      name: "knowledge_search",
      description: "Search the agent's knowledge base for relevant documents",
      parameters: {
        type: Type.OBJECT,
        properties: {
          query: { type: Type.STRING, description: "Search query" },
          limit: { type: Type.INTEGER, description: "Max results" },
        },
        required: ["query"],
      },
    },
    {
      name: "memory_search",
      description: "Search agent memory for facts about the user",
      parameters: {
        type: Type.OBJECT,
        properties: {
          query: { type: Type.STRING, description: "Search query" },
        },
        required: ["query"],
      },
    },
  ],
}];

// Chat with tool calling
const response = await gemini.models.generateContent({
  model: "gemini-3.1-flash-lite-preview",
  contents: [{ role: "user", parts: [{ text: systemPrompt + "\n\n" + userMessage }] }],
  config: { tools },
});

// Handle tool calls
for (const part of response.candidates?.[0]?.content?.parts ?? []) {
  if (part.functionCall) {
    const { name, args } = part.functionCall;

    let result;
    if (name === "knowledge_search") {
      result = await sonzai.agents.knowledgeSearch(agentId, {
        query: args.query as string,
        limit: (args.limit as number) ?? 5,
      });
    } else if (name === "memory_search") {
      result = await sonzai.agents.memory.search(agentId, {
        query: args.query as string,
        userId: "user-123",
      });
    }

    // Send tool result back to Gemini for the final response
    // (see Gemini function calling docs for the full loop)
  }
}

LangChain (Python)

from langchain_core.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.prebuilt import create_react_agent
from sonzai import Sonzai

sonzai_client = Sonzai(api_key="sk_your_api_key")
agent_id = "agent-id"
user_id = "user-123"


@tool
def knowledge_search(query: str, limit: int = 5) -> list[dict]:
    """Search the agent's knowledge base for relevant documents and facts.
    Use when the user asks about topics that may be in uploaded documents."""
    results = sonzai_client.agents.knowledge_search(agent_id, query=query, limit=limit)
    return [{"content": r.content, "label": r.label, "score": r.score} for r in results.results]


@tool
def memory_search(query: str) -> list[dict]:
    """Search agent memory for previously learned facts about the user.
    Use when the conversation references past interactions or personal details."""
    results = sonzai_client.agents.memory.search(agent_id, query=query, user_id=user_id)
    return [{"content": f.content, "type": f.fact_type} for f in results.results]


# Get enriched context
ctx = sonzai_client.agents.get_context(
    agent_id, user_id=user_id, session_id="session-abc", query=user_message
)

llm = ChatGoogleGenerativeAI(model="gemini-3.1-flash-lite-preview")
agent = create_react_agent(llm, [knowledge_search, memory_search])

result = agent.invoke({
    "messages": [
        {"role": "system", "content": build_system_prompt(ctx)},
        {"role": "user", "content": user_message},
    ]
})

OpenAI-Compatible (Generic)

Any framework that accepts OpenAI function-calling format can use the schemas directly:

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// Fetch schemas and convert to OpenAI format
const { tools: sonzaiSchemas } = await sonzai.agents.getTools("agent-id");

const openaiTools = sonzaiSchemas.map((t) => ({
  type: "function" as const,
  function: {
    name: t.name,
    description: t.description,
    parameters: t.parameters,
  },
}));

// Pass to any OpenAI-compatible provider
const response = await openai.chat.completions.create({
  model: "your-model",
  messages: [...],
  tools: openaiTools,
});

// Handle tool calls in the response
for (const call of response.choices[0].message.tool_calls ?? []) {
  const args = JSON.parse(call.function.arguments);

  if (call.function.name === "knowledge_search") {
    const result = await sonzai.agents.knowledgeSearch("agent-id", {
      query: args.query,
      limit: args.limit,
    });
    // Feed result back to the LLM as a tool response
  }

  if (call.function.name === "memory_search") {
    const result = await sonzai.agents.memory.search("agent-id", {
      query: args.query,
      userId: "user-123",
    });
    // Feed result back to the LLM as a tool response
  }
}

Understanding Deferred Enrichment

The most powerful aspect of standalone mode is the self-improving learning loop. Even without explicit tool calls, the agent gets smarter each turn because /process detects knowledge gaps and primes the next /context call.

How It Works

┌──────────────────────────────────────────────────────────────────┐
│  Turn N                                                          │
│                                                                  │
│  1. GET /context?query="hiking boots"                            │
│     → Returns enriched context + any KB matches for "hiking"     │
│     → Also returns deferred results from Turn N-1 (if any)      │
│                                                                  │
│  2. Chat with your LLM (using enriched context)                  │
│                                                                  │
│  3. POST /process (send transcript)                              │
│     → Extracts facts: "user needs waterproof hiking boots"       │
│     → Extracts entities: "hiking boots", "waterproof"            │
│     → Searches KB with extracted topics (async, after response)  │
│     → Finds: "Hiking Gear Guide", "Waterproof Materials FAQ"    │
│     → Stores as deferred signals (Redis, 1-hour TTL)            │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
                            ↓
┌──────────────────────────────────────────────────────────────────┐
│  Turn N+1                                                        │
│                                                                  │
│  1. GET /context?query="which brand do you recommend?"           │
│     → Direct search: matches for "brand recommend"              │
│     → Deferred results: "Hiking Gear Guide" + "Waterproof FAQ"  │
│     → Both merged into response (deduplicated)                  │
│     → Deferred signals consumed (one-shot, not repeated)        │
│                                                                  │
│  2. Chat with your LLM                                          │
│     → Now has hiking gear knowledge it didn't have before!      │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

Key Properties

One-shot signals: Deferred KB results are consumed when /context reads them. They appear exactly once, preventing stale or repeated information.
TTL-based expiry: Deferred signals expire after 1 hour. If the user doesn't continue the conversation, stale signals are automatically cleaned up.
Deduplication: If the direct /context query matches the same KB document as a deferred signal, the duplicate is removed. You never get the same result twice.
Capped searches: /process runs at most 5 KB queries per call and stores at most 10 deferred results, preventing resource explosion on topic-heavy conversations.

Memory Search Is Always Synchronous

Unlike KB enrichment, memory search has no deferred/async path. When /context is called, it recalls the most relevant memories immediately using the hierarchical memory tree and BM25 indexes. When you call GET /memory/search explicitly, results return immediately.

The deferred behavior only applies to knowledge base content, where /process proactively discovers KB documents the agent should have known about. Memory facts are always available synchronously because they are indexed at write time (during /process).

Recommended Integration Pattern

For most applications, combine automatic enrichment with explicit tool calling for the best results:

import { generateText, tool } from "ai";
import { google } from "@ai-sdk/google";
import { Sonzai } from "@sonzai-labs/agents";
import { z } from "zod";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function chat(agentId: string, userId: string, sessionId: string, message: string) {
  // Step 1: Automatic enrichment — context includes KB + memories
  const ctx = await sonzai.agents.getContext(agentId, {
    userId,
    sessionId,
    query: message,
  });

  // Step 2: Chat with tools for on-demand search
  const { text, steps } = await generateText({
    model: google("gemini-3.1-flash-lite-preview"),
    system: buildSystemPrompt(ctx),
    prompt: message,
    tools: {
      knowledge_search: tool({
        description: "Search knowledge base for additional documents",
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => {
          const r = await sonzai.agents.knowledgeSearch(agentId, { query, limit: 5 });
          return r.results.map((d) => ({ content: d.content, label: d.label }));
        },
      }),
      memory_search: tool({
        description: "Search memory for additional facts about the user",
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => {
          const r = await sonzai.agents.memory.search(agentId, { query, userId });
          return r.results.map((f) => ({ content: f.content, type: f.fact_type }));
        },
      }),
    },
    maxSteps: 3,
  });

  // Step 3: Process — extracts memories + primes next context with KB gaps
  await sonzai.agents.process(agentId, {
    userId,
    sessionId,
    messages: [
      { role: "user", content: message },
      { role: "assistant", content: text },
    ],
    provider: "gemini",
  });

  return text;
}

GET /context
              │
 ┌────────────┴────────────┐
 │                         │
 ▼                         ▼
Recalled              KB Search
Memories              Results
 │                    │
 └────────┬───────────┘
          │
          ▼
   System Prompt ──────► Your LLM
          │                  │
          │          ┌───────┴──────────────┐
          │          │ Tool call?            │
          │          │ knowledge_search()    │
          │          │ memory_search()       │
          │          └───────┬──────────────┘
          │                  │
          │                  ▼
          │             Response
          │                  │
          ▼                  ▼
      POST /process
          │
 ┌────────┴────────┐
 │                 │
 ▼                 ▼
Extract         Detect KB
Facts           Gaps (deferred)
 │                 │
 ▼                 ▼
Store in        Store in Redis
Memory Tree     (for next /context)

Frequently Asked Questions

Do I need tool calling if I already use /context?

Not necessarily. /context automatically includes KB results and recalled memories. Tool calling is useful when the LLM needs to search for something specific mid-conversation that wasn't covered by the initial context fetch, or when your framework expects tool definitions.

Is memory search async like KB enrichment?

No. Memory search is always synchronous. When you call GET /memory/search, results return immediately from BM25 indexes. The deferred/async flow only applies to knowledge base enrichment via the /process learning loop.

What happens if /process finds KB content but the user never calls /context again?

The deferred signals expire after 1 hour (TTL-based cleanup). No stale data persists. If the user resumes the conversation later, they get fresh results from the next /context call.

Can I use my own tools alongside Sonzai tools?

Absolutely. The Sonzai tool schemas are standard OpenAI function definitions. Mix them with your own tools in whatever framework you use. The LLM decides which tool to call based on the conversation.

How do custom tools defined in the dashboard relate to these?

Custom tools (created via POST /agents/{agentId}/tools or the dashboard) are for agent-side tool calling in Sonzai's managed chat mode. The tool schemas described here (/tools/schemas) are for BYO-LLM mode where your LLM calls Sonzai endpoints.

学习

如果说文档告诉你每个能力做什么，学习则告诉你 它为何如此运作 —— 模型、循环以及你能调节的取舍。

深度阅读

架构

平台、编排器与你的后端如何协作 —— 状态分别落在哪里。

智能体如何随时间提升

自动学习的全貌：记忆衰减、合并、去重、检索策略、人格漂移、突破与影子上线。

智能体洞察

系统揭示了智能体的内在状态 —— 用以调试、审计或微调行为。

主动行为

智能体何时会主动开口 —— 唤醒、调度及其策略。

Big Five 人格

OCEAN 维度、行为子项、以及交互如何塑造人格。

情绪模型

四种情绪维度、衰减速率以及驱动其变化的事件。

动手实践

理解模型之后，指南中的教程带你端到端做一个真实项目。

扩展性

BYOK ── 自带密钥

BYOK 让你继续使用 Sonzai 的对话／会话／抽取整套栈,但底层的提供商调用走 你自己的 API Key。token 费用计入你自己的提供商账单;其他一切(记忆、人格、后处理模型、Sonzai 平台计费)行为不变。

它和 Custom LLM (BYOM) 不是一回事。 BYOK 是把你的计费 Key 接到 Sonzai 已经原生集成好的提供商通道里; BYOM 是把整个对话补全调用换成 你自己托管的某个端点。

作用域

BYOK Key 按 (项目 × 提供商) 保存。同一项目同一提供商只有一把 Key,没有按 Agent / 按 Session / 按调用粒度的 BYOK Key。心智模型很简单 ── 某个项目的某次对话回合落到某个提供商上, 就走这个项目下该提供商的 BYOK Key,不论是哪个 Agent、哪个 Session 触发的。

对象	作用域
存储	按 `(project_id, provider)` ── 表的主键
请求时解析	由对话调用的 `project_id` → 查找解析后的提供商对应的 Key
加密	静态 AES-256,仅在请求路径内解密
API 访问	租户级 API Key 可管理其名下任意项目;项目级 Key 必须与请求中的 `project_id` 完全匹配

如果想让同一项目下不同 Agent 用不同 Key,请拆成不同项目 ── 能调的旋钮就这一个。

接入方式

线上等价的两种路径:

1. 控制台

一次性配置或轮换最快。

打开 platform.sonz.ai,选中项目。
进入 Settings → BYOK。
支持的 4 个提供商各有一张卡片,挑你要配的。
粘贴提供商的 API Key(OpenAI、Google AI Studio for Gemini、 xAI 控制台、OpenRouter 控制台)。
点 Save。和 API 同样的同步探针会跑一次 ── 上游若拒绝这把 Key,保存直接失败,提供商的错误信息原样回显。坏 Key 不会被持久化。
保存后卡片切到「已配置」状态 ── 显示前缀掩码、健康徽章 (healthy / unhealthy / unknown),并出现 Replace / Test / Disable / Delete 按钮。

控制台调用的就是 SDK 那一组端点,所以这里能做的都能脚本化。

2. API / SDK

适合 IaC、新租户的 CI 引导、Key 轮换 cron、自动租户接入流程中的 provisioning。端点见下。

可接入的提供商

平台原生支持的 4 个:

openai
gemini
xai
openrouter(内部回退路径 ── 这里配了 Key,平台落到 OpenRouter 时也会走你自己的)

custom BYOM 端点不是 BYOK 的提供商,通过 Custom LLM 配置。

存储与安全

Key 静态加密,只在平台请求路径内解密。任何 API 都不会把原 Key 回传 ── List / Get 只回 api_key_prefix (前几个字符),够你识别哪把是哪把。
写入时跑一次 同步探针。Sonzai 用这把 Key 向上游打一次 no-op 调用,坏 Key 在 PUT 阶段就 400。配错在配置时就暴露, 不会拖到第一次用户对话才炸。
每把 Key 都 跟踪健康状态。每次读取都返回 health_status、 last_health_error、last_health_check_at,可以接到监控、在用户感知到对话失败之前先告警。

端点

按项目分,索引为 (project_id, provider)。能列、能 Set、能停用、能删、能重测。

Method	Path	用途
`GET`	`/api/v1/projects/{projectId}/byok-keys`	列出所有 Key(掩码)
`PUT`	`/api/v1/projects/{projectId}/byok-keys/{provider}`	设置或轮换某提供商的 Key(保存前先探针)
`PATCH`	`/api/v1/projects/{projectId}/byok-keys/{provider}`	不动 Key,只切 `is_active` true/false
`POST`	`/api/v1/projects/{projectId}/byok-keys/{provider}/test`	用已存 Key 对上游再探一次
`DELETE`	`/api/v1/projects/{projectId}/byok-keys/{provider}`	删除该 Key

完整请求／响应结构看参考 → API → BYOK。

项目 API Key 的 Scope

通过控制台或 POST /api/v1/projects/{project_id}/api-keys 创建的项目 API Key 带有 scopes 数组。如需通过 SDK 以编程方式操作 BYOK，该 Key 需要：

read:byok — 列出提供商及查看健康状态。
write:byok — 设置 / 停用 / 删除 / 重测 Key。

租户级凭据（Clerk 控制台会话、带 ["*"] 的默认 API Key）自动拥有所有 BYOK 操作的访问权限。

Scope 字符串区分大小写，以动词开头，全部小写（read:byok，而非 BYOK:Read）。

设置一把 Key

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const key = await client.byok.set("project_xyz", "openai", process.env.MY_OPENAI_KEY!);

console.log(key.api_key_prefix); // 例如 "sk-..." ── 永远不会是完整 Key
console.log(key.health_status);  // 同步探针通过后为 "healthy"

列出与查看

const keys = await client.byok.list("project_xyz");
for (const k of keys) {
console.log(k.provider, k.api_key_prefix, k.health_status, k.last_used_at);
}

启用、停用、删除

PATCH 不换 Key,只切 is_active ── 想暂停但保留 Key 时用。 DELETE 把 Key 和它的历史一并清掉。

// 暂停这把 BYOK Key(后续调用回到平台计费)
await client.byok.setActive("project_xyz", "openai", false);

// 重测
const fresh = await client.byok.test("project_xyz", "openai");

// 永久删除
await client.byok.delete("project_xyz", "openai");

缓存与失效

平台为了性能,会按 (project_id, provider) 在进程内缓存解析后的 BYOK Key。每次 Set / Patch / Delete 都会触发失效,轮换后的 Key 下次调用就生效,不需要重启。

BYOK 不适用的情形

如果某次对话调用最终落到的提供商在该项目下没有 BYOK Key,Sonzai 就照常用平台自己的 Key 计费 ── 体验和 SLA 完全不变。BYOK 是 纯加性 的:配上 Key 就接管那个提供商;删掉 Key 就回到平台 Key。

参考

Custom LLM (BYOM) ── 不是提供商透传,而是把整个端点换成你自己的。
参考 → API → BYOK ── 上述每个端点的 REST 结构。

扩展性

Custom LLM

How It Works

Configure an OpenAI-compatible API endpoint for your project. Sonzai routes all chat generation through your endpoint while handling everything else: context assembly, tool execution, side-effect extraction, memory storage, personality tracking, and consolidation.

Full Managed Experience

Built-in tools (web search, memory recall, image generation, inventory), streaming SSE, per-message side effects — everything works exactly as with our default providers.

Your Model, Your Control

Use fine-tuned models, self-hosted endpoints, or any OpenAI-compatible provider (vLLM, Ollama, Together, Groq, Azure OpenAI, etc.).

Encrypted at Rest

Your API key is encrypted with AES-256 before storage. Only the first 8 characters are visible in the dashboard for identification.

Per-Project Configuration

Each project can have its own custom LLM endpoint. Toggle it on/off without deleting the config.

Custom LLM vs. Standalone Memory

Which one should I use?

Custom LLM is the right choice when you want to use your own model but still want the full Sonzai experience (tools, streaming, per-message extraction). Standalone Memory is for when you need to control the entire chat loop yourself — e.g., for privacy preprocessing, data anonymization, or deep integration with an agent framework. See the Standalone Memory docs for the tradeoffs.

Feature	Custom LLM	Standalone Memory
Built-in tools	Full support	Manual only
Streaming SSE	Yes	No
Per-message extraction	Automatic	Manual /process call
Memory prewarming	Yes	No
Data preprocessing	No	Full control
Agent framework integration	N/A	Full control

Requirements

Your endpoint must be OpenAI-compatible:

Accept POST /chat/completions (or the equivalent path your base URL resolves to)
Accept OpenAI chat message format (messages, model, temperature, etc.)
Return SSE stream in OpenAI chunk format (data: {"choices": [...]})
Support tools / tool_choice parameters if you want built-in tools to work

Compatible providers include: vLLM, Ollama, Together AI, Groq, Azure OpenAI, Fireworks AI, Anyscale, and any server implementing the OpenAI API spec.

Configuration via Dashboard

In the Sonzai dashboard, go to your project settings and configure the Custom LLM under the Custom LLM section:

Enter your OpenAI-compatible endpoint URL (e.g., https://api.together.xyz/v1)
Paste your API key (encrypted at rest with AES-256)
Specify the model name (e.g., meta-llama/Llama-3.1-70B-Instruct)
Optionally set a display name for easy identification
Toggle active/inactive without deleting the config

Configuration via API

Set Configuration

// Configure custom LLM for a project
await client.projects.customLlm.set("project-id", {
endpoint: "https://api.together.xyz/v1",
apiKey: "your-api-key",
model: "meta-llama/Llama-3.1-70B-Instruct",
displayName: "Together Llama 3.1 70B",
isActive: true,
});

Get Configuration

const config = await client.projects.customLlm.get("project-id");

if (config.configured) {
console.log(config.endpoint);      // "https://api.together.xyz/v1"
console.log(config.apiKeyPrefix);   // "your-api" (first 8 chars)
console.log(config.model);          // "meta-llama/Llama-3.1-70B-Instruct"
console.log(config.isActive);       // true
}

Remove Configuration

await client.projects.customLlm.delete("project-id");

How Chat Routes Through Your Model

Once configured, here is what happens when a chat request is made:

Context assembly — Sonzai builds the 7-layer enriched context (personality, memory, mood, habits, goals, relationships, application state) exactly as with default providers.
Tool injection — Built-in tools (sonzai_memory_recall, sonzai_web_search, etc.) and any custom tools are added to the request.
Your endpoint called — The request is sent to your configured endpoint with your model name, API key, and the full message history including system prompt.
Streaming proxy — SSE chunks from your endpoint are streamed back to the client in real time.
Post-stream processing — After the stream completes, Sonzai extracts side effects (memory facts, mood changes, personality shifts, habits, tool calls) and stores them — same as with default providers.

Background Job Consistency

Background tasks like fact extraction, memory consolidation, diary generation, and summarization automatically use the same model family you configured. Sonzai tracks the last-used provider/model for each agent and routes background LLM calls accordingly.

Security

API key encryption — Keys are encrypted with AES-256 before storage. Only the first 8 characters are visible.
SSRF protection — Endpoint URLs are validated to block localhost, private IPs (10.x, 172.16-31.x, 192.168.x), link-local, and cloud metadata addresses.
Project-scoped — Each config is scoped to a project. Different projects can use different endpoints.

Billing

Custom LLM usage is billed at a flat per-token rate under the custom_llm billing model, regardless of which actual model your endpoint serves. Sonzai tracks input/output tokens from your endpoint's usage response. Your own endpoint costs (API fees, compute) are entirely yours.

扩展性

模型

Sonzai 与具体模型解耦。Mind Layer 将聊天补全路由到项目所配置的模型，并在后台运行小型后处理模型来更新记忆、人格与情绪。

各阶段使用的模型

阶段	用途	默认
聊天补全	实时回复，流式输出给用户	`gemini-3.1-flash-lite-preview`（全平台默认）
后处理	记忆抽取、人格漂移、情绪更新、摘要	按项目配置的更便宜模型映射
Generation	由提示生成人格、自我介绍与种子记忆	与聊天补全同提供商
语音	TTS / STT / 双向流	提供商路由 —— 参见 Voice

受支持的提供商

sonzai.providers 模块是平台所接受的提供商 ID 的权威列表。截止当前：

提供商 ID	说明
`openai`	OpenAI
`gemini`	Google Gemini（默认 —— `gemini-3.1-flash-lite-preview` 即 `providers.DEFAULT_MODEL`）
`xai`	xAI Grok
`openrouter`	OpenRouter —— 多模型网关，作为兜底
`custom`	项目配置的自定义 LLM（BYOM —— 见 Custom LLM）

可以通过 sonzai.providers / sonzai.providers.models 中的常量预先指定提供商和模型，也可以通过 client.list_models() 在运行时获取实时列表（JS/Go SDK 有同名端点）。client.providers 也接受相同的 ID 集，用于按项目列出、设置和测试连通性。

后处理模型映射

聊天补全之外，Sonzai 在后台运行一组更小的模型来抽取记忆、漂移人格与更新情绪。从聊天模型到后处理模型的映射保存在项目配置项 post_processing_model_map 之下（见各 SDK 中的 sonzai.PostProcessingModelMap / PostProcessingModelEntry）。

* 通配符项为所有未显式映射的聊天模型设定默认值，使配置保持简短；当某个抽取器需要更强或更轻量的算力时，仍可逐模型覆盖。

自带 LLM

把聊天补全路由到你自己的端点；记忆、人格与情绪仍由 Mind Layer 管理。

参考

端点形态见参考 → API。代码中规范的提供商和模型 ID 请直接 import SDK 的 providers 模块，避免手敲字符串。

Post-processing model map

Behind every chat turn, Sonzai runs a fleet of smaller models that:

Extract facts from the user message and the agent reply
Drift personality scores in response to interactions
Update mood dimensions (happiness, energy, calmness, affection)
Summarise sessions and compact older memory

These run after the user-facing reply is streamed, on the post-processing model map — a per-project config that maps the chat-completion model to the smaller model the extractor should use.

The map

Stored under the post_processing_model_map project-config key. Each entry is a PostProcessingModelEntry with two fields:

type PostProcessingModelEntry = {
  provider: string; // e.g. "gemini"
  model:    string; // e.g. "gemini-3.1-flash-lite-preview"
};

The map keys are chat-completion model IDs plus a special * wildcard that catches any chat model not explicitly listed:

{
  "claude-3-5-sonnet": { "provider": "gemini",  "model": "gemini-3.1-flash-lite-preview" },
  "gpt-4-turbo":       { "provider": "openrouter", "model": "anthropic/claude-3-haiku" },
  "*":                 { "provider": "gemini",  "model": "gemini-3.1-flash-lite-preview" }
}

When extraction needs to run for a chat that used claude-3-5-sonnet, the extractor uses Gemini Flash Lite. When it sees a chat model not in the map, the * wildcard kicks in.

The wildcard key is exported as sonzai.PostProcessingWildcardKey (Go) and the equivalent constant in the other SDKs so you don't have to hard-code "*" in your provisioning scripts.

Reading the current map

const map = await client.projectConfig.getPostProcessingModelMap("project_xyz");
for (const [chatModel, entry] of Object.entries(map ?? {})) {
console.log(chatModel, "→", entry.provider, entry.model);
}

Setting a map (or a single-key default)

Pass a full map; the call is a write-through replacement, not a merge. Most projects only need a wildcard entry pointing at a cheap model:

await client.projectConfig.setPostProcessingModelMap("project_xyz", {
"*": { provider: "gemini", model: "gemini-3.1-flash-lite-preview" },
});

When to override per chat model

The wildcard is enough for most projects. Reach for an explicit entry when:

A particular chat model produces output the default extractor mishandles (e.g. tool-call traces from a verbose model that need a stronger extractor to keep facts atomic).
You're A/B-ing two extractors and want one chat model to route through each for comparison.
Cost: cheaper chat models can run a cheaper extractor; flagship chat models may warrant a stronger extractor on the same trace.

Provider availability

An entry's provider/model must match a real provider Sonzai has configured for your project — see Providers. Setting a non-existent provider here makes extraction fail asynchronously after the user-facing reply has already streamed; you'll see it in the agent's extraction_status on the next turn.

Reference

Providers — the chat-completion provider list (independent of post-processing).
Self-improvement — the full picture of what the extractor does on each turn.
Reference → API — REST endpoint shapes for the project-config get/set/delete calls.

Providers

Sonzai routes chat completions through one of four providers. The IDs are exported as constants from the sonzai.providers module in the SDKs — import those rather than hand-typing strings, so they stay in sync as the catalog evolves. Use client.list_models() for the live set enabled on your tenant at runtime.

`gemini` — Google Gemini (default)

The platform default. gemini-3.1-flash-lite-preview is providers.DEFAULT_MODEL, and is also the wildcard fallback for the post-processing cascade.

Model	Context window	Notes
`gemini-3.1-flash-lite-preview`	1M	Default. Vision + tools + JSON mode + streaming. Compaction at 450k / 500k.
`gemini-3-flash-preview`	2M	Fallback on 429. Same feature set.
`gemini-3.1-pro-preview`	2M	Fallback on 429. Strongest Gemini model — pair with a cheaper post-processing entry.

`openai` — OpenAI

Default gpt-5.5; the 5.4 family is the cheaper workhorse and 5 / 5-mini / 5-nano cover even cheaper or smaller-context tiers. The fallback chain on quota exhaustion is gpt-5.5 → gpt-5.4 → gpt-5.4-mini → gpt-5.

Model	Context window	Use it when
`gpt-5.5`	1.05M	Default. The current OpenAI frontier — vision + tools + streaming + JSON mode.
`gpt-5.4`	1.05M	Cheaper than 5.5, same context window.
`gpt-5.4-mini`	1.05M	The cheap workhorse. Recommended for high-throughput tenants.
`gpt-5`	400k	Frozen Aug-2025 snapshot. Kept for tenants pinned to it; new agents should default to 5.5.
`gpt-5-mini` / `gpt-5-nano`	400k	Smaller-context tiers; same generation as `gpt-5`.

`xai` — xAI (Grok)

Reasoning and non-reasoning variants in the Grok 4 family. grok-4-1-fast-non-reasoning is the default; reasoning models are opt-in for tasks that benefit from deeper chain-of-thought.

Model	Context window	Reasoning
`grok-4-1-fast-non-reasoning`	2M	No
`grok-4-1-fast-reasoning`	2M	Yes
`grok-4.20-0309-non-reasoning`	2M	No
`grok-4.20-0309-reasoning`	2M	Yes

All Grok 4 entries support streaming, tools, and JSON mode. None support vision today.

`custom` — bring-your-own-LLM (BYOM)

Point Sonzai at any OpenAI-compatible chat-completions endpoint. The Mind Layer keeps owning memory, personality, mood, and post-processing — only the chat-completion call gets routed through your endpoint.

See Custom LLM for the full setup. This is distinct from BYOK — BYOK uses Sonzai's provider integrations but with your billing key; BYOM uses your own inference stack entirely.

Picking a provider in code

Pass provider and model on the chat call. Both are optional — omit them and Sonzai uses the agent's default, falling back through the scope cascade.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

await client.agents.chat({
agent:    "agent_abc",
messages: [{ role: "user", content: "Hello" }],
provider: "openai",
model:    "gpt-5.5",
});

Listing what's available at runtime

client.list_models() (Python / TS / Go expose the same shape) returns the live set of providers and models enabled on your tenant — useful for building a model-picker UI or for asserting that a provider you depend on is wired up before a deploy.

const result = await client.listModels();
for (const p of result.providers) {
console.log(p.provider, p.models.map((m) => m.id));
}

Reference

BYOK — drop your own provider keys per project.
Custom LLM — point Sonzai at your own endpoint entirely.
Model scope — how provider / model is resolved per call.
Post-processing — what runs in the background, on what model.

Model scope

A Sonzai chat turn picks two models: the chat-completion model the user sees, and the post-processing model that runs the background work afterwards. Each goes through its own resolver cascade. The cascades share the same scope hierarchy:

1. per-call            (highest precedence — passed to agents.chat / sessions.start / agents.process)
2. per-agent           (AgentProfile fields)
3. per-project         (project_config rows in CockroachDB)
4. per-account/tenant  (account_config rows in CockroachDB)
5. system default      (Go constant compiled into the binary)

First non-empty layer wins. Layer 5 always exists, so resolution always produces a concrete answer.

Chat model

What the user sees. Resolved per chat turn.

Layer	Where it lives	Set with
Per-call	`provider` / `model` arg on `agents.chat`, `agents.chat_stream`, `agents.process`, `sessions.start`	the SDK call itself
Per-agent	`AgentProfile.ModelConfig.{provider,model}`	`client.agents.update(agent_id, model_config={...})`
Per-project	Default model for an unconfigured agent in a project	Project settings on the dashboard or `client.providers.set(project_id, ...)`
Per-account / tenant	Org-wide default	(admin endpoint, see Reference)
System default	`gemini-3.1-flash-lite-preview` (`providers.DEFAULT_MODEL`)	constant, not configurable

Setting at each layer

// Per-call: pin a single chat call
await client.agents.chat({
agent:    "agent_abc",
provider: "openai", model: "gpt-5.5",
messages: [{ role: "user", content: "Hello" }],
});

// Per-session: set defaults that every session.turn() inherits
const session = await client.agents.sessions.start("agent_abc", {
userId:   "user_123",
sessionId: "session_abc",
provider: "xai",
model:    "grok-4-1-fast-non-reasoning",
});

// Per-agent: persist on the AgentProfile
await client.agents.update("agent_abc", {
modelConfig: { provider: "gemini", model: "gemini-3.1-pro-preview" },
});

// Per-project: chat default for unpinned agents in this project
await client.providers.set("project_xyz", { provider: "gemini", model: "gemini-3.1-flash-lite-preview" });

Post-processing model

The cheaper-model fleet that runs the batch work behind every turn: fact extraction, dedup, mood updates, personality drift, summarisation, diary, constellation. Resolved per task, per turn, independently of the chat model.

The cascade is documented exhaustively at Post-processing model map. The short version:

Layer	Where it lives
Per-agent	`AgentProfile.PostProcessingProvider` + `PostProcessingModel` (direct override; bypasses the cascade entirely when both set)
Per-project	`project_config.post_processing_model_map` JSONB — `chat_model → {provider, model}` map
Per-account / tenant	`account_config.post_processing_model_map` JSONB — same shape
System default	Go constant, ships with the binary
Wildcard fallback	`gemini-3.1-flash-lite-preview`

The map keys are chat-completion model IDs (or * for wildcard), so post-processing routing depends on what chat model just ran.

Setting at each layer

// Per-agent — direct override that bypasses the rest of the cascade
await client.agents.updatePostProcessingModel("agent_abc", {
post_processing_provider: "gemini",
post_processing_model:    "gemini-3.1-flash-lite-preview",
});

// Per-project — full chat→post map (write-through replacement)
await client.projectConfig.setPostProcessingModelMap("project_xyz", {
"claude-opus-4.6": { provider: "openrouter", model: "anthropic/claude-haiku-4.5" },
"*":               { provider: "gemini",     model: "gemini-3.1-flash-lite-preview" },
});

// Per-account/tenant
await client.accountConfig.setPostProcessingModelMap({
"*": { provider: "gemini", model: "gemini-3.1-flash-lite-preview" },
});

Previewing the resolved model

For UI ("which model would run my diary tonight?") you can ask the resolver what it would pick without firing inference:

const effective = await client.agents.effectivePostProcessingModel("agent_abc", {
chatModel: "claude-opus-4.6",
taskType:  "fact_extraction",
});
console.log(effective.provider, effective.model);

Common patterns

One frontier model per agent, one cheap extractor per project. Set agent ModelConfig to your premium model; set the project post-processing map's * wildcard to gemini/gemini-3.1-flash-lite-preview.
A/B test extractors. Two projects, same agents, different account_config.post_processing_model_map entries — compare quality on the same traffic.
Per-tenant pricing tiers. Free tier defaults the post-processing map to flash-lite at the tenant level; paid tier overrides per-project to a stronger extractor.
One-off override. Pass provider/model on a single agents.chat call without persisting anything.

Reference

Providers — provider IDs and model lists.
BYOK — bring your own provider key per project.
Post-processing — full cascade rules + the system-default map.
Reference → API — REST shape for every endpoint above.

API 参考

身份验证

所有 API 调用都需要使用项目 API 密钥进行 Bearer 身份验证。

Authorization: Bearer YOUR_PROJECT_API_KEY

API Reference

/docs/en/api 页面为英文。常用术语可参考本页。

浏览完整的端点参考（包含 schema、请求/响应示例及可交互的在线调试面板），请访问 /docs/en/api。每个操作均有独立页面，内容由实时 OpenAPI spec 自动生成。

原始规范（用于 Postman、代码生成器及自定义工具）

实时 OpenAPI 3.1 JSON + YAML 公开托管，无需身份验证，每次部署后自动重新生成：

https://api.sonz.ai/docs/openapi.json
https://api.sonz.ai/docs/openapi.yaml

curl -sL https://api.sonz.ai/docs/openapi.json -o openapi.json

错误格式

所有错误响应使用 RFC 7807 application/problem+json 格式，包含 type、title、status、detail 和可选的 instance 字段。

REST API

用于智能体生命周期、实时智能体交互和主动投递的公共 HTTP 端点。记忆、情绪、关系和上下文管理内部机制由平台处理。

仅限服务端。 该 API 不接受浏览器请求。对于 Web 应用，请通过后端代理。请参阅集成指南。

智能体生命周期

创建智能体

POST /api/v1/agents

创建新智能体。返回带有平台生成的 UUID 的智能体。

参数：

name (string): 智能体名称（必填）
personality_prompt (string): 自定义系统提示（可选）
big5 (object): 大五人格评分：openness、conscientiousness、extraversion、agreeableness、neuroticism (0.0-1.0)
speech_patterns (string[]): 语言模式（可选）
true_interests (string[]): 智能体兴趣（可选）
project_id (string): 分配智能体的项目 UUID（可选）
language (string): ISO 语言代码，如 "en"（可选）

响应: { "agent_id": "uuid", "name": "...", ... }

列出智能体

GET /api/v1/agents

列出智能体。通过 project_id 查询参数筛选。

参数：

project_id (string): 按项目筛选（查询参数，可选）

响应: 智能体对象数组

获取智能体

GET /api/v1/agents/{agentId}

通过 ID 获取智能体。

响应: 包含人格、情绪、资料的智能体对象

对话

流式聊天

POST /api/v1/agents/{agentId}/chat

通过 SSE 流式传输与智能体对话。返回 Server-Sent Events。

参数：

messages (CEChatMessage[]): 对话消息
user_id (string): 用户标识符

响应: 聊天完成块的 SSE 流

主动通知

列出通知

GET /api/v1/agents/{agentId}/notifications

列出待处理的主动消息。

参数：

status (string): 按状态筛选：pending | consumed（默认：pending，查询参数）
user_id (string): 按用户筛选（可选，查询参数）
limit (int): 最大结果数（默认：50，最大：500，查询参数）

响应: 主动消息列表，包含 message_id、agent_id、user_id、check_type、intent、generated_message、status、created_at

消费通知

POST /api/v1/agents/{agentId}/notifications/{messageId}/consume

投递后将通知标记为已消费。

响应: 确认

通知历史

GET /api/v1/agents/{agentId}/notifications/history

列出所有状态的通知。

响应: 完整通知历史

智能体生命周期（详细）

CreateAgent

创建具有人格配置的新智能体。生成人格提示、语言模式和情感倾向。

请求：

user_id (string): 所有者用户标识符
agent_name (string): 智能体显示名称
gender (string): "male"、"female" 或 "non_binary"
bio (string): 智能体传记（可选）
avatar_url (string): 头像图片 URL（可选）
big5 (CEBig5Scores): 大五人格评分 (0.0-1.0)
language (string): 主要语言
equipped_outfit (string): 初始装扮 ID（可选）
skills (CESkillLevel[]): 初始技能等级（可选）
model_tier (int32): LLM 模型层级（可选）
project_id (string): 分配智能体的项目（可选）
agent_id (string): 调用方指定的 ID，用于确定性智能体（可选）
personality_prompt (string): 自定义系统提示（可选）
generate_goals (bool): 创建后自动生成目标（可选）
provided_goals (string[]): 直接存储这些目标（可选）
speech_patterns (string[]): 语言模式（可选）
true_interests (string[]): 智能体兴趣（可选）
true_dislikes (string[]): 智能体不喜欢的（可选）
user_display_name (string): 所有者显示名称（可选）
generate_avatar (bool): 创建时自动生成 AI 头像（默认：true，消耗 1 积分）。设为 false 可跳过。

响应: agent_id (UUID)、status ('completed' 或 'in_progress')

GetAgent

检索智能体当前状态，包括人格、情绪和资料。

请求：

agent_id (string): 智能体 UUID

响应: Agent ID、name、bio、gender、avatar_url、Big5 scores、owner、created_at

UpdateAgent

更新智能体字段（名称、传记、头像、人格、兴趣、语言模式）。

请求：

agent_id (string): 智能体 UUID
name (string): 新名称（可选）
bio (string): 新传记（可选）
avatar_url (string): 新头像 URL（可选）
big5 (CEBig5Scores): 更新的 Big5 评分（可选）
true_interests (string[]): 更新的兴趣（可选）
true_dislikes (string[]): 更新的不喜欢的（可选）
speech_patterns (string[]): 更新的语言模式（可选）
personality_prompt (string): 更新的系统提示（可选）

响应: success (bool)

DeleteAgent

永久删除智能体及所有关联数据（记忆、情绪、关系）。

请求：

agent_id (string): 智能体 UUID

响应: success (bool)

RegenerateAvatar

为智能体生成或重新生成 AI 创建的头像。使用 LLM 根据人格数据创建图像提示，然后生成并上传图像。消耗 1 积分。除非禁用，头像在智能体创建时自动生成。

请求：

agent_id (string): 智能体 UUID（URL 参数）
style (string): 可选的风格提示（如 'watercolor anime'、'realistic portrait'）

响应: success (bool)、avatar_url (string)、prompt (string)、generation_time_ms (int64)

UpdateAgentPersonality

当您的产品有意更改智能体设计时，更新智能体的创作大五人格配置。

请求：

agent_id (string): 智能体 UUID
big5 (CEBig5Scores): 更新的大五人格评分（含置信度）

响应: success (bool)

主动行为

ScheduleWakeup

安排智能体在延迟后主动联系用户。

请求：

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
check_type (string): 检查类型：check_in、follow_up、mood_driven
intent (string): 智能体想要联系的原因
delay_hours (int32): 唤醒前的延迟小时数

响应: wakeup_id (string)、scheduled_at (Timestamp)

GetPendingWakeups

检索智能体待处理的唤醒事件。

请求：

agent_id (string): 智能体 UUID

响应: PendingWakeup 列表 (wakeup_id、user_id、check_type、intent、scheduled_at)

流式对话

主要的公共对话 RPC。发送智能体、用户、应用上下文和消息历史；平台自动处理上下文组装和状态更新。

StreamChat

流式传输智能体交互的 AI 响应，同时平台在后台处理内部记忆和状态更新。

请求：

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
session_id (string): 唯一会话 ID
backend_context (BackendContext): 应用状态上下文
messages (CEChatMessage[]): 对话消息
continuation_token (string): 从上一个响应恢复（可选）
request_type (string): "chat"、"guide" 或 "outing"
capabilities (string[]): 已解锁的能力（可选）
language (string): ISO 语言代码（可选）
interaction_role (string): "owner" 或 "non_owner"
skill_levels (map<string, int32>): 技能等级（可选）
max_turns (int32): 每个请求的最大助手轮次数（可选）

响应: StreamChatEvent 流 (delta | message_boundary | complete | side_effects | error)

StreamChatEvent 是一个 oneof，包含以下事件类型：

StreamChatDelta (delta)

content (string): AI 的文本块
message_index (int32): 多消息响应中的索引
is_follow_up (bool): 是否为后续消息
replacement (bool): 如果为 true，则替换所有之前的内容

StreamChatComplete (complete)

full_content (string): 完整响应文本
finish_reason (string): "stop"、"length" 或 "content_filter"
continuation_token (string): 用于继续对话的令牌
message_count (int32): 响应中的消息数量

StreamChatError (error)

message (string): 错误消息
code (string): 错误代码

AI 生成

平台管理的 AI 内容生成，用于传记、目标、人格、日记条目和图像。

GenerateBio

基于人格和上下文使用 AI 生成或重写智能体的传记。

请求：

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
current_bio (string): 用于重写的当前传记（可选）
style (string): 风格：casual、formal、poetic 等（可选）

响应: bio (string)、tone (string)、confidence (double)

GenerateGoals

基于特质、兴趣和记忆为智能体生成人格驱动的目标。

请求：

agent_id (string): 智能体 UUID
agent_name (string): 智能体显示名称
big5 (CEBig5Scores): Big5 评分
true_interests (string[]): 智能体兴趣
true_dislikes (string[]): 智能体不喜欢的
speech_patterns (string[]): 语言模式
recent_memories (CERecentMemory[]): 用于上下文的近期记忆
current_goals (CEGoalSummary[]): 现有目标以避免重复
max_goals (int32): 最大生成目标数
model_config (CEModelConfig): LLM 模型配置（可选）
custom_context (map<string, string>): 应用特定上下文（可选）

响应: CEGeneratedGoal 列表 (type、title、description、priority、related_traits)、reasoning

GeneratePersonality

从模板和 Big5 评分生成语言模式和兴趣。

请求：

template_id (string): 模板标识符
base_prompt (string): 基础人格提示
big5 (CEBig5Scores): Big5 评分
agent_name (string): 智能体名称
gender (string): 智能体性别

响应: speech_patterns (string[])、true_interests (string[])、used_fallback (bool)

GenerateDiary

从对话消息和/或应用事件生成日记条目。

请求：

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
date (string): YYYY-MM-DD 格式的日期
agent_name (string): 智能体显示名称
language (string): 生成内容的语言
messages (CEDiaryMessage[]): 对话消息 (role、content、time)
trigger_type (string): daily_summary、achievement、milestone、breakthrough
trigger_context (CEDiaryTriggerContext): 事件触发上下文（可选）
model (string): LLM 模型覆盖（可选）
temperature (double): 温度覆盖（可选）
timezone (string): 日期处理的时区（可选）

响应: user_id、date、diary (title、body_lines、tags)、generation_time_ms

GenerateImage

从文本提示生成图像并存储到云存储。

请求：

prompt (string): 图像生成提示
negative_prompt (string): 负面提示（可选）
model (string): 使用的模型（可选）
provider (string): 使用的提供商（可选）
output_bucket (string): 输出 GCS 存储桶（可选）
output_path (string): 存储桶中的输出路径（可选）
cdn_domain (string): 公共 URL 的 CDN 域名（可选）

响应: success、image_id、gcs_uri、public_url、mime_type、generation_time_ms、error

语音与媒体

语音匹配、文本转语音、语音聊天和反思能力。

VoiceMatch

根据人格特质为智能体匹配合适的 TTS 语音。

请求：

big5 (CEBig5Scores): 用于匹配的 Big5 评分
preferred_gender (string): 偏好的语音性别（可选）
agent_id (string): 智能体 UUID（如果不提供 big5 则自动查找）

响应: voice_id、voice_name、match_score、reasoning

TextToSpeech

使用 Google Gemini 语音进行文本转语音，具有情感上下文感知。

请求：

text (string): 要转换的文本
voice_name (string): Gemini 语音名称
language (string): 语言代码（可选）
emotional_context (CEEmotionalContext): 情感主题和语调（可选）

响应: audio (bytes)、content_type、voice_name

VoiceChat

单轮语音聊天：转录音频、生成 AI 响应、返回 TTS 音频。

请求：

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
audio (bytes): 原始音频数据
audio_format (string): 音频格式 (opus、pcm、wav)
voice_name (string): TTS 语音名称
continuation_token (string): 从上一轮恢复（可选）
language (string): 语言代码（可选）
application_id (string): 应用标识符（可选）

响应: transcript、response (text)、audio (bytes)、content_type、continuation_token、side_effects_json

ListVoices

列出可用的 Gemini TTS 语音，可选按性别筛选。

请求：

gender (string): 按性别筛选（可选）

响应: CEGeminiVoice 列表 (name、gender)

Reflect

生成关于能力解锁、里程碑或其他事件的 AI 反思。

请求：

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
reflection_type (string): "capability_unlock"、"milestone" 等
capability (string): 能力名称
capability_source (string): 能力来源
context (string): 附加上下文字符串（可选）
new_capabilities_json (bytes): 新能力 JSON（可选）
session_id (string): 用于自动构建上下文的会话 ID（可选）
interaction_role (string): "owner" 或 "non_owner"（默认："owner"）

响应: success (bool)、reflection (string)、side_effects_json (bytes)

流式语音聊天

具有服务端 VAD（语音活动检测）的双向流式语音聊天。客户端持续流式传输音频块；服务器处理语音检测、转录、AI 响应和 TTS。

StreamVoiceChat

双向流式传输：客户端发送初始化 + 音频块，服务器返回转录和 TTS 音频。无需手动停止按钮。

请求：

init (VoiceChatInit): 第一条消息：会话初始化
audio_chunk (VoiceAudioChunk): 后续消息：原始音频数据

VoiceChatInit

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
audio_format (string): "opus"、"pcm"、"wav"（默认："opus"）
sample_rate (int32): 采样率（Hz）（opus 默认：48000）
voice_name (string): TTS 语音名称
language (string): 语言代码（默认："en"）
application_id (string): 应用标识符
continuation_token (string): 从上一个会话恢复（可选）

VoiceAudioChunk

audio (bytes): 原始音频数据（如 Opus 帧）
end_of_speech (bool): 可选的客户端 VAD 提示

服务器响应事件：

VoiceStreamReady

session_id (string): 分配的会话 ID

VoiceStreamVAD

speaking (bool): true = 语音开始，false = 语音结束

VoiceStreamTranscript

text (string): 转录文本
is_final (bool): true = 此次话语的最终转录

VoiceStreamAudio

audio (bytes): 音频数据块
content_type (string): 如 "audio/opus"、"audio/wav"

VoiceStreamTurnComplete

continuation_token (string): 用于继续会话的令牌
side_effects_json (bytes): JSON 序列化的 AgentSideEffects（可选）

VoiceStreamError

message (string): 错误消息
code (string): "vad_error"、"stt_error"、"llm_error"、"tts_error"
fatal (bool): 如果为 true，应关闭会话

分析与搜索

AI 驱动的对话分析、摘要和基于搜索。

AnalyzeConversation

分析对话以提取副作用（人格变化、习惯、记忆等）。

请求：

agent_id (string): 智能体 UUID
agent_name (string): 智能体显示名称
user_id (string): 用户标识符
messages (CEAnalyzeConversationMessage[]): 要分析的消息 (role、content)
is_final (bool): 是否为最后一批消息

响应: success、side_effects_json (bytes)、summary、latency_ms

SummarizeConversation

生成带主题提取的对话简洁摘要。

请求：

messages (CESummarizeConversationMessage[]): 消息 (role、content、time)
agent_name (string): 智能体名称
user_name (string): 用户显示名称
max_summary_length (int32): 最大摘要长度（字符数）

响应: summary (string)、topics (string[])、message_count (int)

GenerateSearchQuery

从主题和类别生成优化的搜索查询，用于网络搜索。

请求：

topic (string): 要搜索的主题
category (string): 上下文类别

响应: query (string)、context (string)

GroundedSearch

使用多个查询执行基于事实的网络搜索，返回带来源的摘要结果。

请求：

queries (string[]): 搜索查询
context (string): 搜索相关性上下文
agent_name (string): 用于响应框架的智能体名称

响应: CEGroundedSearchResult 列表 (query、summary、sources with title/url/snippet)

多智能体对话

智能体之间的对话，用于外出、对话和多智能体场景。

AgentDialogue

在多智能体对话上下文中生成智能体响应（如智能体之间的外出）。

请求：

agent_id (string): 智能体 UUID（响应方智能体）
user_id (string): 用户标识符
messages (CEChatMessage[]): 对话消息
request_type (string): "outing"、"dialogue" 等
scene_guidance (string): 场景特定提示引导
tool_config_json (bytes): 工具配置 JSON（可选）
session_id (string): 用于自动构建上下文的会话 ID（可选）
interaction_role (string): "owner" 或 "non_owner"（默认："owner"）

响应: response (string)、side_effects_json (bytes)

应用事件

通知平台重要的应用事件。平台可能生成日记条目、更新目标或采取其他 AI 操作。当日记创建时触发 OnDiaryGenerated webhook。

TriggerEvent

接受应用事件（成就、里程碑、突破、完成）并触发 AI 内容生成。

请求：

agent_id (string): 智能体 UUID
user_id (string): 用户标识符
event_type (string): "achievement"、"milestone"、"breakthrough"、"level_up"
event_description (string): 供 AI 使用的人类可读上下文
metadata (map<string, string>): 附加上下文（achievement_id、level 等）
language (string): 生成内容的语言（默认："en"）

响应: accepted (bool)、event_id (string)

知识库

项目范围的知识图谱。上传文档或通过 API 推送结构化数据——平台提取实体、构建图谱，并为智能体提供 knowledge_search 工具在对话中查询。

文档

上传文档

POST /projects/{projectId}/knowledge/documents

上传文档（带 'file' 字段的 multipart/form-data，最大 50 MB）。返回 202 及 document_id 并触发异步提取。

参数：

file (multipart): 文档文件

响应: document_id、file_name、file_size、checksum、status、gcs_path

列出文档

GET /projects/{projectId}/knowledge/documents

列出文档。查询参数：limit（默认 50，最大 200）。

响应: documents[]、total

获取文档

GET /projects/{projectId}/knowledge/documents/{docId}

获取单个文档。

响应: KBDocument 对象

删除文档

DELETE /projects/{projectId}/knowledge/documents/{docId}

删除文档。

响应: 204 No Content

事实与图谱

插入事实

POST /projects/{projectId}/knowledge/facts

向知识图谱插入实体和关系。与现有节点进行解析，创建/更新并保存版本历史。

参数：

source (string): 来源标识符（默认：'api'）
facts[] (array): 实体：entity_type、label、properties
relationships[] (array): 边：from_label、to_label、edge_type

响应: processed、created、updated、details[]

列出节点

GET /projects/{projectId}/knowledge/nodes

列出知识图谱节点。查询参数：type（筛选）、limit（默认 100，最大 500）。

响应: nodes[]、total

获取节点

GET /projects/{projectId}/knowledge/nodes/{nodeId}

获取节点及连接的边。查询参数：history=true 获取版本历史。

响应: node、outgoing[]、incoming[]、history[]

删除节点

DELETE /projects/{projectId}/knowledge/nodes/{nodeId}

软删除节点（设置 is_active=false）。

响应: 204 No Content

节点历史

GET /projects/{projectId}/knowledge/nodes/{nodeId}/history

获取节点版本历史。查询参数：limit（默认 50，最大 200）。

响应: history[]、total

搜索

搜索知识库

GET /projects/{projectId}/knowledge/search

带图谱遍历的全文搜索。查询参数：q（必填）、limit、history、type、filters (JSON)。

参数：

q (string): 搜索查询（必填）
limit (int): 最大结果数（默认 20，最大 100）
type (string): 逗号分隔的实体类型筛选
filters (JSON string): 属性筛选对象
history (bool): 包含版本历史

响应: query、results[]（含相关节点）、total

架构

创建架构

POST /projects/{projectId}/knowledge/schemas

创建带字段和可选相似度配置的实体类型架构。

参数：

entity_type (string): 实体类型名称（必填）
fields[] (array): 字段定义：name、type、required
description (string): 架构描述
similarity_config (object): match_fields[]、threshold

响应: KBEntitySchema 对象

列出架构

GET /projects/{projectId}/knowledge/schemas

列出项目的实体架构。

响应: schemas[]、total

更新架构

PUT /projects/{projectId}/knowledge/schemas/{schemaId}

更新实体架构。

参数：

entity_type (string): 更新的实体类型名称
fields[] (array): 更新的字段定义

响应: KBEntitySchema 对象

删除架构

DELETE /projects/{projectId}/knowledge/schemas/{schemaId}

删除实体架构。

响应: 204 No Content

统计

获取知识库统计

GET /projects/{projectId}/knowledge/stats

获取知识库统计信息（文档数量、节点数量、边数量、提取 token 数）。

响应: documents {total、indexed、pending、failed}、nodes {total、active}、edges、extraction_tokens

分析

创建分析规则

POST /projects/{projectId}/knowledge/analytics/rules

创建分析规则（推荐或趋势）。

参数：

rule_type (string): 'recommendation' 或 'trend'
name (string): 规则名称
config (object): 规则配置
enabled (bool): 规则是否激活

响应: KBAnalyticsRule 对象

获取推荐

GET /projects/{projectId}/knowledge/analytics/recommendations

获取推荐。查询参数：rule_id、source_id（均为必填）、limit。

响应: recommendations[]、total

获取趋势

GET /projects/{projectId}/knowledge/analytics/trends

获取趋势聚合。查询参数：node_id（必填）。

响应: trends[]、total

记录反馈

POST /projects/{projectId}/knowledge/analytics/feedback

记录推荐反馈（展示/转化）。

参数：

source_node_id (string): 来源节点 ID
target_node_id (string): 目标节点 ID
rule_id (string): 分析规则 ID
converted (bool): 用户是否转化
score_at_time (float): 推荐展示时的评分

响应: status: 'recorded'

用户引导

预加载用户元数据和内容，使 AI 智能体从首次对话起就"了解"用户。元数据（姓名、公司、职位）成为即时事实；内容块（文本、聊天记录）通过 LLM 提取异步处理。

引导用户

POST /agents/{agentId}/users/{userId}/prime

使用元数据和内容引导用户。返回 202 及任务 ID；内容的 LLM 提取异步运行。

参数：

display_name (string): 用户显示名称
metadata (object): company、title、email、phone、custom (map)
content[] (array): 内容块：type ('text'、'chat_transcript')、body
source (string): 来源标识符（如 'crm'、'linkedin'）

响应: job_id、status ('queued')、facts_created

获取引导状态

GET /agents/{agentId}/users/{userId}/prime/{jobId}

获取引导任务的状态。

响应: ImportJob 对象 (job_id、status、facts_created、error_message 等)

内容

添加内容

POST /agents/{agentId}/users/{userId}/content

添加用于异步 LLM 提取的内容块（如引导后追加聊天记录）。

参数：

content[] (array): 内容块：type、body
source (string): 来源标识符

响应: job_id、status ('queued')

元数据

获取用户元数据

GET /agents/{agentId}/users/{userId}/metadata

获取用户的引导元数据。

响应: UserPrimingMetadata 对象

更新用户元数据

PATCH /agents/{agentId}/users/{userId}/metadata

部分更新引导元数据。更新字段会自动生成新事实。

参数：

display_name (string): 更新的名称
company (string): 更新的公司
title (string): 更新的职位
email (string): 更新的邮箱
phone (string): 更新的电话
custom (map): 自定义键值对（合并）

响应: metadata（已更新）、facts_created

批量导入

批量导入用户

POST /agents/{agentId}/users/import

在单个请求中导入多个用户及其元数据和内容。元数据事实同步创建；内容提取异步运行。

参数：

users[] (array): {user_id、display_name、metadata、content[]} 数组
source (string): 来源标识符

响应: job_id、status ('queued')、total_users、facts_created

获取导入状态

GET /agents/{agentId}/users/import/{jobId}

获取批量导入任务的状态。

响应: ImportJob 对象

列出导入任务

GET /agents/{agentId}/users/imports

列出智能体最近的导入任务。查询参数：limit（默认 20）。

响应: jobs[]、count

共享类型

BackendContext

custom_fields (map<string, string>): 传递到提示的任意应用特定键值对
state_json (bytes): 可选的结构化状态 JSON（透传到提示）

CEBig5Scores

openness (double): 经验开放性 (0.0-1.0)
conscientiousness (double): 组织性和纪律性 (0.0-1.0)
extraversion (double): 社交能量和热情 (0.0-1.0)
agreeableness (double): 温暖和合作性 (0.0-1.0)
neuroticism (double): 情绪敏感性 (0.0-1.0)
confidence (double): 评估置信度 (0.0-1.0)

BFASFacets

从 Big5 分数推导。在人格档案响应中为只读。

intellect (double): Openness 方面 — 知识好奇心 (0.0-1.0)
aesthetic (double): Openness 方面 — 美学敏感度 (0.0-1.0)
industriousness (double): Conscientiousness 方面 — 自律性 (0.0-1.0)
orderliness (double): Conscientiousness 方面 — 对秩序的偏好 (0.0-1.0)
enthusiasm (double): Extraversion 方面 — 积极情绪 (0.0-1.0)
assertiveness (double): Extraversion 方面 — 社交主导性 (0.0-1.0)
compassion (double): Agreeableness 方面 — 共情 (0.0-1.0)
politeness (double): Agreeableness 方面 — 对他人的尊重 (0.0-1.0)
withdrawal (double): Neuroticism 方面 — 退缩倾向 (0.0-1.0)
volatility (double): Neuroticism 方面 — 情绪不稳定性 (0.0-1.0)

BehavioralTraits

从人格推导。在人格档案响应中为只读。

response_length (string): 智能体回复的冗长或简洁程度
question_frequency (string): 智能体提出后续问题的频率
empathy_style (string): 情感支持方式（认同型、解决方案型等）
conflict_approach (string): 处理分歧的方式（顺应型、直接型、调解型等）

MoodState

valence (double): 愉悦/不悦光谱 (0-100)
arousal (double): 激活/能量水平 (0-100)
tension (double): 压力/平静状态 (0-100)
affiliation (double): 社交温暖/亲密度 (0-100)

CEChatMessage

role (string): "user" 或 "assistant"
content (string): 消息文本
timestamp (Timestamp): 消息发送时间

MemoryCandidate

content (string): 记忆内容文本
fact_type (string): preference、commitment、fact、experience、correction
importance (double): 重要性评分 (0.0-1.0)
entities (string[]): 相关实体

Habit

name (string): 习惯名称
category (string): 习惯类别
strength (double): 当前强度 (0.0-1.0)
last_observed (Timestamp): 最后观察时间
is_formed (bool): 习惯是否已完全形成

CEGoal

id (string): 目标标识符
description (string): 目标描述
status (string): "active"、"completed"、"abandoned"
priority (string): 优先级
related_traits (string[]): 相关人格特质
created_at (Timestamp): 目标创建时间

Interest

topic (string): 兴趣主题
category (string): 兴趣类别
confidence (double): 检测置信度 (0.0-1.0)
discovered_at (Timestamp): 兴趣发现时间
research_status (string): "pending" 或 "researched"

CEModelConfig

provider (string): LLM 提供商名称
model (string): 模型标识符
temperature (double): 采样温度
max_tokens (int32): 最大生成 token 数

ProactiveMessage

message_id (string): 唯一消息标识符
agent_id (string): 生成消息的智能体
user_id (string): 目标用户
wakeup_id (string): 关联的唤醒事件
check_type (string): 检查类型 (check_in、follow_up、mood_driven)
intent (string): 智能体想要联系的原因
generated_message (string): 实际消息文本
status (string): pending、consumed、expired、failed_generation
created_at (Timestamp): 生成时间

端点详解

The endpoints below cover everything you might need.

Pattern 1 minimum: sessions.start → loop of session.context() + session.turn() → session.end().
Pattern 2 minimum: just /process (auto-creates a session).

sessions.start — open a Session handle

Opens a session and returns a Session object that owns agentId, userId, sessionId, and provider/model defaults.

const session = await client.agents.sessions.start("agent-id", {
userId: "user-123",
sessionId: "session-abc",
userDisplayName: "Alice",
toolDefinitions: yourTools,                    // optional
provider: "gemini",                            // optional default for .turn()
model: "gemini-3.1-flash-lite-preview",        // optional default for .turn()
});

session.context() — enriched 7-layer context

Fetches the 7-layer enriched context: personality, mood, relevant memories, active goals, habits, relationship state, and proactive signals. Pass a query matching the current topic for best memory recall.

const ctx = await session.context({ query: "What should we talk about?" });

// ctx is a flat object — no nested envelope. Useful fields:
//   personality_prompt        — agent identity / system instructions
//   bio, speech_patterns      — agent identity bits
//   true_interests, true_dislikes
//   big5, dimensions, preferences, behaviors
//   recent_personality_shifts, significant_moments, active_goals, habits
//   current_mood, emotional_state
//   loaded_facts              — recalled facts (each has atomic_text, fact_type, importance)
//   long_term_summaries       — multi-session digests
//   proactive_memories        — pending proactive signals
//   constellation_patterns    — deeper behavioral patterns
//   relationship_narrative, chemistry_score, love_from_agent, love_from_user
//   knowledge.results         — KB hits for the query (only nested key)
//   recent_turns              — buffered messages from this session
//   backend_context           — custom application state (if set)

session.turn() — submit a single turn (Pattern 1)

POST /agents/{agentId}/sessions/{sessionId}/turn — sync mood update inline (~300–500ms), deeper extraction continues in the background (5–15 seconds). Accepts role: "tool" and tool_calls on assistant messages.

const { mood, extraction_id, extraction_status } = await session.turn({
messages: [
  { role: "user", content: userMessage },
  // intermediate tool calls/results here
  { role: "assistant", content: assistantMessage },
],
// provider/model fall back to the session-level defaults; both are optional.
});

Response shape:

{
  "success": true,
  "mood": { "valence": 0.4, "arousal": 0.2, "tension": -0.1, "affiliation": 0.3 },
  "extraction_id": "ext_abc123",
  "extraction_status": "queued"
}

轮询后台抽取

const status = await session.status(extraction_id);
// { extraction_id: "...", state: "queued" | "running" | "done" | "failed" }

在同一响应中获取下一轮上下文

If you can predict the next user query (or just want to pre-warm with a generic query), pass fetchNextContext on .turn() and the server returns an enriched context inside the same response under next_context. This eliminates one roundtrip on the next render.

const { mood, next_context } = await session.turn({
messages: [...],
fetchNextContext: { query: "any query you'd run on the next turn" },
});

// next_context has the same shape as session.context() — use it directly
// to render the system prompt for the next turn without calling /context.

/process — batch ingest a transcript (Pattern 2)

Send a full transcript and run extraction immediately. Auto-creates a session if sessionId is omitted; the response surfaces the auto-generated session_id.

const result = await client.agents.process("agent-id", {
userId: "user-123",
// sessionId omitted — auto-created
messages: [
  { role: "user", content: userMessage },
  { role: "assistant", content: assistantMessage },
  // tool messages allowed too
],
provider: "gemini",                            // optional
model: "gemini-3.1-flash-lite-preview",        // optional
});

console.log(result.session_id);          // auto-generated when not passed
console.log(result.facts_extracted);     // count of facts extracted this call
console.log(result.side_effects);        // { mood_updated: true, ... summary counts }

// Then read the extracted state back via the dedicated endpoints:
const memory = await client.agents.memory.list("agent-id", { userId: "user-123" });
const mood   = await client.agents.getMood("agent-id", { userId: "user-123" });

The response is intentionally a small summary — { success, facts_extracted, side_effects, session_id }. To inspect the extracted facts/personality/mood/habits themselves, call the dedicated read endpoints (see Reading Behavioral Data below).

sessions.end / session.end()

Closes the session. If you call this without messages (after using /turn or /process), it's a finalize-only call. If you call it with messages and skipped /process, this becomes your extraction trigger — functionally equivalent to /process, but lifecycle-scoped and async-capable on tenants where enabled.

// Just close — no extraction needed if you used /turn or /process already.
await session.end({ totalMessages: 12, durationSeconds: 600 });

// OR — pass messages here as the extraction trigger (Option B).
await session.end({
messages: transcript,
totalMessages: transcript.length,
durationSeconds: 600,
});

工具消息

Both /turn and /process accept OpenAI/Anthropic-style tool messages. Sonzai's extractor reads tool results and can capture facts that only appeared in tool output.

{
  "messages": [
    { "role": "user", "content": "Where did my last order ship from?" },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "id": "call_1",
          "type": "function",
          "function": { "name": "order-lookup", "arguments": "{\"limit\":1}" }
        }
      ]
    },
    {
      "role": "tool",
      "tool_call_id": "call_1",
      "content": "{\"order_id\":\"42\",\"origin\":\"Tokyo\",\"carrier\":\"DHL\"}"
    },
    { "role": "assistant", "content": "Your last order shipped from Tokyo via DHL." }
  ]
}

The extractor will surface a fact like "User's last order (#42) shipped from Tokyo via DHL" — a fact that never appeared in the user's or assistant's own text.

Provider / model 覆盖

Both /turn and /process (and sessions.start / sessions.end) accept optional provider and model fields. Resolution order:

Per-call override on /turn or /process
Session-level default set on sessions.start
Tenant default configured in your account
Platform default — gemini-3.1-flash-lite-preview

Omit the fields entirely and the platform default applies.

读取行为数据

After processing, all behavioral data is available via dedicated endpoints.

记忆与事实

const memory = await client.agents.memory.list("agent-id", { userId: "user-123" });
const results = await client.agents.memory.search("agent-id", { query: "hiking" });

人格与情绪

const personality = await client.agents.personality.get("agent-id");
const mood = await client.agents.getMood("agent-id", { userId: "user-123" });
const shifts = await client.agents.personality.getRecentShifts("agent-id");
const moments = await client.agents.personality.getSignificantMoments("agent-id");

目标、习惯与关系

const goals = await client.agents.listGoals("agent-id");
const habits = await client.agents.listHabits("agent-id", { userId: "user-123" });
const interests = await client.agents.getInterests("agent-id");
const relationships = await client.agents.getRelationships("agent-id");

主动通知

The Context Engine schedules proactive outreach (check-ins, follow-ups) based on conversation patterns. Poll for pending notifications and consume them when delivered.

const notifications = await client.agents.notifications.list("agent-id");

for (const notif of notifications) {
await deliverToUser(notif.user_id, notif.message);
await client.agents.notifications.consume("agent-id", notif.message_id);
}

提取了什么

Memory Facts

Atomic facts (preferences, events, commitments) with importance scoring, deduplication, and topic tagging. Sourced from user, assistant, AND tool messages.

Personality Deltas

Big5 trait shifts (openness, conscientiousness, extraversion, agreeableness, neuroticism) with reasoning.

Mood Changes

4D mood delta (valence, arousal, tension, affiliation). Sync mood lands inline on /turn; richer extraction is deferred.

Habit Detection

New and reinforced behavioral patterns — exercise routines, reading habits, social patterns.

Interest Tracking

Topics the user engages with, categorized by domain with confidence and engagement scores.

Relationship Dynamics

Love score changes with reasoning — tracks rapport, trust, and emotional connection.

Proactive Outreach

Scheduled check-ins and follow-ups based on conversation context (e.g., 'ask about the hike tomorrow').

Emotional Themes

Detected emotional tones — joy, creative spark, feeling overwhelmed, seeking connection, etc.

选择抽取模型

When calling /turn or /process, specify which of our LLM providers to use for extraction. Omitting provider/model falls back to the platform default gemini-3.1-flash-lite-preview.

const models = await client.agents.getModels("agent-id");
// {
//   default_provider: "gemini",
//   default_model: "gemini-3.1-flash-lite-preview",
//   providers: [
//     { provider: "gemini", provider_name: "Google Gemini", default_model: "..." },
//     { provider: "zhipu", provider_name: "Zhipu AI", default_model: "..." },
//     ...
//   ]
// }

独立记忆层

选择集成形态

There are three ways to feed conversations into Sonzai. The first two are batch (you send a transcript after the conversation); the third is real-time (you submit each turn as it happens). Pick exactly one per conversation — chaining them runs extraction twice on the same messages.

A. /process — one-shot batch

Single call. Auto-creates a session if you don't pass one. Best for external LLM transcripts, benchmarks, and any flow without a long-lived session lifecycle.

B. sessions.start → end({ messages }) — lifecycle batch

Open a session, do your full conversation off-platform, then close with the transcript on .end(). Use when you want explicit session boundaries, async polling, or session-scoped tools — but still ingest in one shot.

C. sessions.start → turn() × N → end() — real-time

Open a session and submit each exchange via .turn() as the conversation happens. Sync mood lands inline (~300–500ms); deeper extraction runs asynchronously 5–15s later. Best for chat companions, voice AI, and agent frameworks.

	A. `/process`	B. `sessions.end({ messages })`	C. `sessions.turn()` × N
Calls per conversation	1	2 (`start` + `end`)	2 + N (`start` + N × `turn` + `end`)
Sonzai in the hot path?	No	No	Yes — `.context()` and `.turn()` flank each turn
Context per turn	Pre-session only (optional `getContext` call)	Pre-session only (optional `getContext` call)	Fresh, query-specific via `.context()`
Extraction timing	Whole transcript, inline	Whole transcript, inline (or async on tenants where enabled)	Per-turn — sync mood inline, deeper extraction 5–15s later
Lifecycle ownership	Implicit (auto-session)	Explicit	Explicit
Best for	External transcripts, benchmarks, no-lifecycle ingest	Explicit boundaries + async processing, session-scoped tools, batch ingest	Chat companions, voice AI, agent frameworks

A and B are functionally equivalent for fact extraction — both extract facts and side-effects from the full transcript inline. The only differences are lifecycle ergonomics (B gives you an explicit session and supports async polling) and call count.

C is a different shape: Sonzai is part of every turn instead of seeing the conversation only at the end.

Don't mix shapes within one conversation

Calling .turn() per turn (C) and .end({ messages }) with the same transcript (B) extracts the same messages twice. Pick one shape per conversation. The pattern docs below show C and B/A separately.

The rest of this section groups A and B together as Pattern 2: Post-Session Processing (since they share the same "extract a transcript at the end" semantics) and treats C as Pattern 1: Memory Middleware (real-time turn submission).

各层何时运行 —— 抽取轻量，整合自动

/turn, /process, and sessions.end are intentionally lightweight. They extract facts and a session summary from the transcript and persist them — that's it. The expensive work (cross-session dedup, clustering, diary deepening, decay) is scheduled automatically by the platform and is rate-limited so it doesn't run on every call.

Layer	When it runs	Triggered by	Cost
Sync mood update (Pattern 1 `/turn` only)	Inline, ~300–500ms	Your `.turn()` call	Light — one short LLM call
Background extraction (facts, personality, habits)	5–15 seconds after `/turn`	Automatic — no caller action	Light — one LLM call per chunk
Fact extraction + session summary (batch)	Inline, on every `/process` or `sessions.end({ messages })`	Your call	Light — one LLM call per chunk
Post-session consolidation (dedup, crossref, bundle precompute, pattern detection)	~8 hours after the session ends	Automatic	Medium
Daily consolidation + diary	Once per day	Automatic schedule	Medium
Deep consolidation (wakeup/habit dedup, decay, cluster reconcile, weekly summaries)	Daily / weekly	Automatic schedule	Heavy

This means you can call /turn per turn (Pattern 1), or /process once at the end (Pattern 2), without paying for heavy consolidation each time. The platform de-duplicates and consolidates in the background.

Practical implication

Don't try to "save calls" by skipping /turn between turns. Each call only does sync mood + queues deferred extraction (cheap). Skipping it means losing per-turn behavioral signal. The expensive consolidation runs on its own schedule no matter how many times you call.

下一步去哪

Pattern 1: Memory Middleware (real-time)

Per-turn integration for chat companions, voice AI, and agent frameworks. Includes tool calling and multimodal/image handling.

Pattern 2: Post-Session Batch Processing

One-shot ingest via /process or lifecycle-scoped via sessions.end({ messages }). For tutoring, fitness, CRM, journaling, and any flow that doesn't need Sonzai in the hot path.

Endpoint Walkthrough

Reference for sessions.start, session.context, session.turn, /process, sessions.end, and the read endpoints (memory, mood, personality, goals, habits, notifications).

Knowledge Base & Limitations

How the KB shows up in standalone mode and what's not supported vs. managed mode.

知识库与限制

独立模式下的知识库

自动 —— /context 中的 KB 结果

When you call session.context({ query }) (or GET /context), the endpoint searches the agent's knowledge base and includes matching results in a knowledge field automatically.

{
  "personality_prompt": "You are a helpful AI companion...",
  "big5": { "openness": 0.7, "conscientiousness": 0.6, "extraversion": 0.5, "agreeableness": 0.8, "neuroticism": 0.3 },
  "current_mood": { "valence": 0.4, "arousal": 0.2, "tension": -0.1, "affiliation": 0.3 },
  "loaded_facts": [{ "atomic_text": "User prefers morning workouts", "fact_type": "behavioral", "importance": 0.8 }],
  "active_goals": [{ "description": "Run a 5K by June" }],
  "habits": [{ "label": "Daily exercise" }],
  "knowledge": {
    "results": [
      {
        "content": "Refund policy: customers can request a full refund within 30 days...",
        "label": "Refund Policy",
        "type": "policy",
        "source": "policies.pdf",
        "score": 0.92
      }
    ]
  }
}

学习循环 —— 抽取过程发现知识盲区

After /turn or /process extracts side effects, it also searches the KB with topics found in the conversation. If relevant KB content exists that the agent missed, it stores these as proactive signals — the next session.context() call includes them automatically.

Turn 1: session.context() → (no KB results yet)
       ↓
      chat with your LLM
       ↓
      session.turn() → extracts "hiking gear" as topic
                     → searches KB, finds "Hiking Equipment Guide"
                     → stores as proactive signal

Turn 2: session.context() → includes "Hiking Equipment Guide" from KB
                        + any direct search results for the new query
       ↓
      chat with your LLM (now knows about hiking gear!)

显式 —— 给 agent 框架使用的工具端点

const results = await client.agents.knowledgeSearch("agent-id", {
query: "refund policy",
limit: 5,
});

for (const result of results.results) {
console.log(result.label, result.content);
}

You can also expose this as a function tool to your LLM — see Tool Calling in Pattern 1.

与托管模式的差异

Want to use your own model without managing the chat loop? Consider Custom LLM instead. It lets you point Sonzai at any OpenAI-compatible endpoint while keeping streaming, built-in tools, and per-message extraction fully automatic.

没有内建工具执行

Managed mode calls built-in tools (web search, memory recall, image generation) automatically. In standalone mode you must implement tool calling yourself — the tool-calling loop is yours, but the resulting tool messages flow into /turn or /process for extraction. See the Tool Integration guide.

抽取过程不支持流式

session.context(), /turn, and /process are synchronous request-response calls. Streaming is handled by your own LLM. Background extraction is asynchronous but you poll for state, not stream.

延迟的知识库补充

KB enrichment is deferred — extraction detects knowledge gaps but the next session.context() call surfaces them, not the current turn.

手动抽取触发

You must pick one of the three integration shapes per conversation: /process (one-shot batch), sessions.start → sessions.end({ messages }) (lifecycle batch), or sessions.start → session.turn() per turn → session.end() (real-time). Picking none means the transcript is never seen by the Context Engine and no behavioral data is captured. Picking two — for example calling .turn() per turn and passing messages on .end() — runs extraction twice on the same content. (Heavy consolidation runs on its own schedule and doesn't need to be triggered manually.)

仅文本的记忆管线

Sonzai's extraction reads messages as text. Multimodal content (images, audio) must be bridged to text before submission — see Working with Images & Multimodal Input in Pattern 1.

What's the same in both modes

Extraction quality is identical — both modes use the same LLM pipeline for fact extraction, personality shifts, mood, habits, and consolidation. The 7-layer enriched context from session.context() is the same data the managed chat builds internally.

Pattern 1: 记忆中间件（实时）

You control the LLM. Sonzai handles what that LLM knows about the user.

Open a Session once. For every turn: call session.context({ query }) to pull the enriched user profile, build your system prompt, call your own LLM (with your own tools), then call session.turn({ messages }) to submit just the new exchange. Sync mood updates inline (~300–500ms); deeper extraction (facts, personality, habits) lands asynchronously 5–15 seconds later in the background.

This is the same data model mem0 provides (relevant memories injected before generation), extended with personality evolution, mood tracking, habit detection, goal tracking, proactive outreach scheduling, and relationship dynamics.

┌─────────────┐     ┌──────────────────┐     ┌──────────────┐
│  Your App   │     │   Sonzai API     │     │   Your LLM   │
└──────┬──────┘     └────────┬─────────┘     └──────┬───────┘
     │                     │                       │
     │  sessions.start     │                       │
     │────────────────────>│ (prewarms memory)     │
     │  <── Session ───────│                       │
     │                     │                       │
     │  ─── Per turn ──────────────────────────── │
     │                     │                       │
     │  session.context()  │                       │
     │────────────────────>│                       │
     │  <── enriched ctx ──│                       │
     │    personality, mood│                       │
     │    memories, goals  │                       │
     │                     │                       │
     │  Your LLM loop ─────┼──────────────────────>│
     │  + your tools       │                       │
     │  <── reply ─────────┼───────────────────────│
     │                     │                       │
     │  sendToUser(reply) (no waiting on Sonzai)   │
     │                     │                       │
     │  session.turn()     │                       │
     │────────────────────>│ ⇒ sync mood ~300ms    │
     │  <── mood, status ──│ ⇒ background extraction│
     │                     │   (5–15s)             │
     │                     │                       │
     │  ─── Repeat ────────────────────────────── │
     │                     │                       │
     │  session.end()      │                       │
     │────────────────────>│── consolidate         │
     │                     │   long-term memory    │
     └─────────────────────┴───────────────────────┘

What Sonzai's LLM is used for

session.context() and sessions.start use no Sonzai LLM credits — they are pure reads. session.turn(), /process, and sessions.end({ messages }) use Sonzai's LLM for fact extraction + session summary (light, per-call, billed). Heavy background work — cross-session dedup, clustering, diary, decay — runs on auto-scheduled jobs (8h post-session, daily, weekly) and is billed against the same tenant but not per-call. Your chat LLM is entirely your cost.

核心循环

Open the session once with your provider/model defaults. Then for every turn: get context → call your LLM (running tool calls in your own loop) → submit the turn. End the session when done.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function runConversation(agentId: string, userId: string) {
const sessionId = `session-${Date.now()}`;
const history: { role: string; content: string }[] = [];

// Open a Session handle. agentId/userId/sessionId and provider/model
// defaults live on the handle so you don't repeat them on every call.
const session = await sonzai.agents.sessions.start(agentId, {
  userId,
  sessionId,
  toolDefinitions: yourTools,                   // optional — register session-scoped tool schemas
  provider: "gemini",                           // optional — default for .turn()
  model: "gemini-3.1-flash-lite-preview",       // optional — default for .turn()
});

async function turn(userMessage: string): Promise<string> {
  // Fresh enriched context for this specific message
  const ctx = await session.context({ query: userMessage });

  // Your LLM — swap in any provider you like
  let reply = await yourLLM.chat({
    system: buildSystemPrompt(ctx),
    messages: [...history, { role: "user", content: userMessage }],
    tools: yourTools,
  });

  // Tool-calling loop is entirely yours — Sonzai is OUT of the loop here.
  const toolMessages: any[] = [];
  while (reply.tool_calls?.length) {
    for (const call of reply.tool_calls) {
      const result = await runYourTool(call);
      toolMessages.push(
        { role: "assistant", tool_calls: [call] },
        { role: "tool", tool_call_id: call.id, content: result },
      );
    }
    reply = await yourLLM.chat({
      system: buildSystemPrompt(ctx),
      messages: [...history, { role: "user", content: userMessage }, ...toolMessages],
      tools: yourTools,
    });
  }

  sendToUser(reply.content); // send first; don't block on Sonzai

  // Submit just the new turn. Sync mood ~300ms, deferred extraction
  // (facts, personality, habits) runs asynchronously 5–15s later.
  // Pass the FULL exchange — including tool calls and tool results —
  // so Sonzai can extract facts from tool outputs too.
  const { mood, extraction_id } = await session.turn({
    messages: [
      { role: "user", content: userMessage },
      ...toolMessages,                          // assistant tool_calls + tool results
      { role: "assistant", content: reply.content },
    ],
  });

  history.push({ role: "user", content: userMessage });
  history.push({ role: "assistant", content: reply.content });

  return reply.content;
}

return { turn, end: () => session.end() };
}

// The /context response is a flat object — there is no nested
// `profile` / `behavioral` / `memory` envelope.
function buildSystemPrompt(ctx: any): string {
const facts = (ctx.loaded_facts ?? []).map((f: any) => `- ${f.atomic_text}`).join("\n");
const goals = (ctx.active_goals ?? []).map((g: any) => g.description).join(", ");
return `${ctx.personality_prompt ?? "You are a helpful AI companion."}
Personality (Big5): ${JSON.stringify(ctx.big5 ?? {})}
Current mood: ${JSON.stringify(ctx.current_mood ?? {})}
Active goals: ${goals || "none"}
Relevant memories:
${facts || "none yet"}`;
}

每一轮都拉取最新上下文

The single most important habit in Pattern 1 is calling session.context(query=user_msg) before every LLM call. This is the load-bearing piece that closes the loop — without it, the LLM doesn't get the fresh mood (which lands inline on .turn()) or the freshly-extracted facts (which land 5–15 seconds after .turn()).

while (conversationActive) {
const userMsg = await getUserInput();

// 1. PULL FRESH CONTEXT — happens every turn, before the LLM call.
//    ctx is a flat object — no `profile` / `behavioral` / `memory` envelope.
const ctx = await session.context({ query: userMsg });

// 2. Build system prompt from the context layers
const systemPrompt = renderPromptFromContext(ctx);

// 3. Run YOUR LLM — Sonzai is OUT of the loop here
const reply = await yourLLM.chat({
  system: systemPrompt,
  messages: [...history, { role: "user", content: userMsg }],
});

// 4. Submit the just-completed turn — sync mood + async deferred extraction
await session.turn({
  messages: [
    { role: "user", content: userMsg },
    { role: "assistant", content: reply.content },
  ],
});
}

function renderPromptFromContext(ctx: any): string {
const parts: string[] = [];
if (ctx.personality_prompt) parts.push(ctx.personality_prompt);
if (ctx.big5) parts.push(`Personality (Big5): ${JSON.stringify(ctx.big5)}`);
if (ctx.speech_patterns?.length) parts.push(`Speech patterns: ${ctx.speech_patterns.join(", ")}`);
if (ctx.current_mood) parts.push(`Current mood: ${JSON.stringify(ctx.current_mood)}`);
const facts = (ctx.loaded_facts ?? []).slice(0, 5).map((f: any) => `- ${f.atomic_text ?? ""}`).join("\n");
if (facts) parts.push(`Relevant memories:\n${facts}`);
const kb = (ctx.knowledge?.results ?? []).slice(0, 3).map((r: any) => `- ${r.label}: ${(r.content ?? "").slice(0, 120)}`).join("\n");
if (kb) parts.push(`Knowledge base:\n${kb}`);
return parts.join("\n\n");
}

Save a roundtrip with fetchNextContext

session.turn() accepts a fetch_next_context={"query": next_user_message} argument (TS: fetchNextContext). When set, the server runs the deferred extraction trigger AND fetches the next /context payload in the same response, returning it under next_context. This eliminates the second roundtrip on the next turn — your client already has the context for turn N+1 by the time turn N has finished. Use this when you can predict the next user query (e.g., for the very next render of context).

Context freshness. Mood updates inline on each .turn() call (~300ms), so the very next .context() reflects the new mood. Personality / facts / inventory land 5–15 seconds after .turn() in the background, so they appear within a turn or two of being mentioned.

Why per-turn. State changes between turns. A user mentioning a new pet on turn 3 means turn 4's context should carry that fact. Skipping .context() between turns means the LLM works from stale state — and the value of a memory layer collapses.

Pass the actual user message as query. session.context() uses the query for memory recall, KB search, and proactive signal selection. Passing the raw user message gives the most relevant pull; passing a static placeholder gives generic context regardless of what the user asked.

工具消息会进入抽取流程

The /turn schema accepts OpenAI/Anthropic-style tool messages: role: "tool" for tool results and tool_calls arrays on assistant messages. Pass the entire intermediate exchange — Sonzai's extractor reads tool results and can capture facts that only appeared in tool output (e.g. "user's last order shipped from Tokyo" from an order-lookup tool).

await session.turn({
messages: [
  { role: "user", content: "Where did my last order ship from?" },
  {
    role: "assistant",
    tool_calls: [{ id: "call_1", type: "function", function: { name: "order-lookup", arguments: "{}" } }],
  },
  {
    role: "tool",
    tool_call_id: "call_1",
    content: '{"order_id":"42","origin":"Tokyo","carrier":"DHL"}',
  },
  { role: "assistant", content: "Your last order shipped from Tokyo via DHL." },
],
});

轮询后台抽取

/turn returns immediately after the sync mood pass. The deeper extraction runs asynchronously and reaches done in 5–15s. You can poll the status if you need to gate something on it:

const { extraction_id } = await session.turn({ messages });

// Optional — only poll if you need to wait for facts/personality before doing something
let status = await session.status(extraction_id);
while (status.state !== "done" && status.state !== "failed") {
await new Promise((r) => setTimeout(r, 1000));
status = await session.status(extraction_id);
}

工具调用

Pattern 1 hands the tool-calling loop entirely to you. Sonzai never executes a tool — but it does read tool calls and tool results out of the messages you submit on /turn, so the extractor can capture facts that surfaced inside a tool output. There are two flavors of tools you'll typically wire up.

A. Your own tools

Use whatever your agent framework provides — @function_tool in the OpenAI Agents SDK, tools= on Anthropic, function declarations on Gemini, @tool in LangChain. The pattern is the same: register the tool with your LLM, run the tool-calling loop on your side, and forward the full exchange (including the assistant's tool_calls message and the role: "tool" result message) to session.turn().

from agents import Agent, Runner, function_tool

@function_tool
def get_current_time() -> str:
    """Return the current time."""
    from datetime import datetime, timezone
    return datetime.now(timezone.utc).isoformat(timespec="seconds")

agent = Agent(name="Companion", tools=[get_current_time], model=gemini_model)
result = Runner.run_sync(agent, user_msg)

# Build the tool-aware messages array Sonzai expects.
sonzai_messages = [
    {"role": "user", "content": user_msg},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_current_time", "arguments": "{}"},
        }],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "2026-05-07T07:30:00Z"},
    {"role": "assistant", "content": result.final_output},
]
session.turn(messages=sonzai_messages)

When the assistant says "It's 7:30 AM" and the user replies "Set my morning standup for 8", Sonzai's extractor sees the tool's actual output, not just the assistant's paraphrase — and can capture "user prefers 8 AM standups" with the right grounding.

B. Sonzai's capabilities as tools

You can also wrap Sonzai's own REST endpoints as tools your LLM can call mid-turn. The two most useful are knowledge base search and memory search — both let the LLM pull additional context on demand without you having to inject everything up-front through session.context().

// TypeScript — agents.memory.search is available directly
import { Sonzai } from "@sonzai-labs/agents";
import { tool } from "ai";
import { z } from "zod";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const kbSearch = tool({
description: "Search the agent's knowledge base.",
parameters: z.object({ query: z.string() }),
execute: async ({ query }) => {
  const res = await sonzai.agents.knowledgeSearch("agent-id", { query, limit: 5 });
  return res.results.map((r) => `- ${r.label}: ${r.content}`).join("\n") || "No matching knowledge.";
},
});

const memorySearch = tool({
description: "Search the user's long-term memory.",
parameters: z.object({ query: z.string() }),
execute: async ({ query }) => {
  const res = await sonzai.agents.memory.search("agent-id", {
    query,
    user_id: "user-123",
    limit: 5,
  });
  return res.results.map((r) => `- ${r.text}`).join("\n") || "No matching memories.";
},
});

Why expose Sonzai endpoints as tools?

session.context() returns the most relevant facts for the current query — a strong default. Exposing kb_search and memory_search as tools lets the LLM decide for itself when to dig deeper (e.g., when the user asks "what did I tell you last week about X?"). It's especially useful for agent frameworks that already think in terms of tools.

When the LLM calls these tools, the result lands in your tool-calling loop just like any other tool. Forward the full exchange to session.turn() and Sonzai's extractor will see the search results too — but be aware that re-extracting facts from a memory_search tool result can create echoes (the user's own past fact resurfaces as if it were new). Either skip extraction for those tool messages on your side, or trust the dedup pass.

For deeper coverage of Sonzai's tool endpoints, see the Tool Integration guide.

What's available as a tool

Sonzai endpoint	SDK method	Useful as an LLM tool?
Knowledge base search	`agents.knowledge_search(agent_id, query, limit)`	Yes — LLM looks up policies, products, docs
Memory search	`agents.memory.search(agentId, { query, userId })` (TS/Go); `agents.memory.list_facts(agent_id, user_id)` (Python)	Yes — LLM looks up past user statements
Mood / personality / habits / goals reads	`agents.get_mood`, `agents.personality.get`, `agents.list_habits`, `agents.list_goals`	Mostly inject via `session.context()` instead — read-only state changes rarely with the user query
Image generation	`generation.generate_image`	Possible, but typically your app exposes this as its own UI action, not as an LLM tool

处理图像与多模态输入

Sonzai's memory pipeline is text-based today. The /turn and /process endpoints accept string content only — DialogueMessage.content is string. Your LLM can be fully multimodal (Gemini, Claude, GPT-4o all accept image URLs and audio natively) but to get image-related facts into Sonzai you need to bridge the multimodal content into text in the messages you send to /turn.

The recommended pattern is dual-output: have your vision-capable LLM produce both (a) the warm reply you show the user and (b) a hidden [MEMORY: ...] line with a detailed factual description. Strip the [MEMORY: ...] line out before showing the user, and embed it in the bridged text you submit to Sonzai.

import OpenAI from "openai";
import { Sonzai } from "@sonzai-labs/agents";

const gemini = new OpenAI({
baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
apiKey: process.env.GEMINI_API_KEY!,
});
const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const SYSTEM_PROMPT_IMAGE_AWARE = `You are a friendly companion. When the user shares an image, respond warmly
to what's emotionally important to THEM.

After your reply, ALWAYS include a single line:
[MEMORY: <detailed factual description of the image — setting, objects,
people, mood, time of day, what the user appears to be doing>]

The user does NOT see the [MEMORY: ...] line.`;

async function processImageTurn(session: any, userMsg: string, imageUrl: string): Promise<string> {
const result = await gemini.chat.completions.create({
  model: "gemini-3.1-flash-lite-preview",
  messages: [
    { role: "system", content: SYSTEM_PROMPT_IMAGE_AWARE },
    {
      role: "user",
      content: [
        { type: "text", text: userMsg },
        { type: "image_url", image_url: { url: imageUrl } },
      ],
    },
  ],
});
const raw = result.choices[0].message.content ?? "";

// Split the dual output
const m = raw.match(/\[MEMORY:\s*([\s\S]+?)\]/);
const memoryNote = m ? m[1].trim() : "";
const reply = raw.replace(/\[MEMORY:[\s\S]+?\]/, "").trim();

sendToUser(reply);

await session.turn({
  messages: [
    { role: "user", content: `${userMsg}\n\n[Image attached: ${memoryNote}, URL: ${imageUrl}]` },
    { role: "assistant", content: reply },
  ],
});
return reply;
}

Why this pattern:

No backend multimodal yet. /turn accepts string content. Text-bridging through your same vision-capable LLM is the cleanest workaround.
Why dual-output (vs. a separate vision call). The same LLM call serves both purposes — no extra cost, no extra latency, no second roundtrip. You're already paying for vision on the assistant turn; let it produce the description too.
Why a hidden line. Keeps user-facing replies emotionally warm — "Oh you have such nice shoulders!" — while still capturing the factual detail (gym, tank top, mirror, time of day) that memory extraction needs.
It's a developer pattern, not a Sonzai field. The [MEMORY: ...] convention is yours to define. Sonzai just sees text. You can use any sentinel — <<MEM>>...<</MEM>>, JSON, whatever your prompt and parser agree on.

Including the URL. Embedding the URL in the bridged text isn't required, but it lets Sonzai later surface the image as a memory artifact ("the photo you shared last week") without re-running vision on the image. Your app keeps using its own image storage; Sonzai just remembers the link as text.

Audio & voice follow the same pattern

Speech-to-text (STT) on your side, send the transcript in messages. Text-to-speech (TTS) is rendered after the assistant text exists, so you forward the assistant text to session.turn() exactly as you would for a text-only chat. See the Voice AI use case below.

Future direction

Sonzai may extend the /turn schema to accept OpenAI's multimodal content blocks directly (content: [{type: "text"}, {type: "image_url"}]) with platform-side vision extraction, removing the manual bridging step. Today, text-bridging via the dual-output pattern is the supported approach.

使用场景：聊天伙伴（OpenAI Agents SDK + Gemini）

The canonical Pattern 1 example. You bring your own agent harness — here the OpenAI Agents SDK — and route it at Gemini via the OpenAI-compat endpoint, so no OPENAI_API_KEY is ever used. Sonzai sits outside the LLM/tool-calling loop entirely: it supplies the system prompt via session.context() and ingests the finished transcript via session.turn(). The Agents SDK does all multi-step reasoning and tool dispatch on your side; Sonzai does memory.

import os
from openai import AsyncOpenAI
from agents import (
    Agent,
    Runner,
    OpenAIChatCompletionsModel,
    function_tool,
    set_tracing_disabled,
)
from sonzai import Sonzai

# The Agents SDK ships traces to OpenAI by default — disable, since we
# have no OpenAI key and aren't talking to OpenAI's servers at all.
set_tracing_disabled(True)

# Point the Agents SDK's AsyncOpenAI client at Gemini's OpenAI-compat URL.
gemini = AsyncOpenAI(
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    api_key=os.environ["GEMINI_API_KEY"],
)
model = OpenAIChatCompletionsModel(
    model="gemini-3.1-flash-lite-preview",
    openai_client=gemini,
)

# Sonzai = memory layer only. It never sees the LLM client.
sonzai = Sonzai(api_key=os.environ["SONZAI_API_KEY"])
session = sonzai.agents.sessions.start(
    "agent-id",
    user_id="user-123",
    session_id="session-abc",
)

@function_tool
def get_current_time() -> str:
    """Return the current time."""
    from datetime import datetime, timezone
    return datetime.now(timezone.utc).isoformat(timespec="seconds")

while True:
    user_msg = input("You: ")
    if not user_msg:
        break

    # 1) Pull enriched context (mood, personality, relevant facts, …) from Sonzai.
    ctx = session.context(query=user_msg)

    mood = ctx.get("current_mood") or "neutral"
    instructions = f"You are a friendly companion. Current mood: {mood}."

    # 2) Run the Agents SDK loop — it handles tool-calling and multi-step reasoning.
    agent = Agent(
        name="Companion",
        instructions=instructions,
        model=model,
        tools=[get_current_time],
    )
    result = Runner.run_sync(agent, user_msg)
    print(f"Assistant: {result.final_output}")

    # 3) Convert the run's items (assistant text + ToolCallItem + ToolCallOutputItem)
    # into Sonzai's tool-aware messages format. See the demo for the implementation.
    sonzai_messages = run_result_to_sonzai_messages(user_msg, result)

    # 4) Submit the turn. `mood` comes back inline (~300ms); facts / personality /
    # inventory are extracted asynchronously and land 5-15s later.
    turn_result = session.turn(messages=sonzai_messages)
    print(f"  -> mood updated: {turn_result.mood}")

session.end()

What's happening on each turn:

Sonzai is out of the LLM loop. The OpenAI Agents SDK runs the model, dispatches tools, and produces result.final_output. Sonzai never sees the LLM client and has no opinion on which model answered.
Mood is real-time. session.turn() returns fresh mood inline in ~300ms — you can render it the moment the response arrives.
Facts, personality drift, and inventory are deferred (5-15s). They run async under the returned extraction_id. Re-poll agents.memory.list_facts, agents.personality.get, etc. on the next turn; whatever didn't land yet will be there shortly.
Tool calls flow through to extraction. Sonzai's tool-aware message format accepts assistant messages with tool_calls plus a tool message carrying the result. The conversion helper packages the Agents SDK's ToolCallItem + ToolCallOutputItem into that shape so extraction can pick up facts from tool outputs too.

Want a working version? See the OpenAI Agents companion demo — a two-pane Streamlit app showing live mood, Big5, recent facts, inventory, and the constellation graph as you chat.

使用场景：语音 AI 助手

STT → enrich → LLM → TTS. Sonzai holds the memory; you own the audio pipeline. Submit the turn while TTS is synthesizing — sync mood is fast enough not to block, and deferred extraction never blocks.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function processVoiceTurn(
session: any, // Session handle from sonzai.agents.sessions.start
audioBuffer: Buffer
): Promise<Buffer> {
// Your STT
const transcript = await yourSTT.transcribe(audioBuffer);

// Inject memory into a concise voice-friendly system prompt
const ctx = await session.context({ query: transcript });

const systemPrompt = `${ctx.personality_prompt ?? "You are a voice companion."} Keep replies under 2 sentences for voice.
Mood: ${JSON.stringify(ctx.current_mood)}.
Key memory: ${ctx.loaded_facts?.[0]?.atomic_text ?? "none"}.`;

const reply = await yourLLM.chat({ system: systemPrompt, message: transcript });

// Submit the turn while TTS synthesizes (run in parallel)
const [audioResponse] = await Promise.all([
  yourTTS.synthesize(reply),
  session.turn({
    messages: [
      { role: "user", content: transcript },
      { role: "assistant", content: reply },
    ],
  }),
]);

return audioResponse;
}

使用场景：Agent 框架（LangChain / LlamaIndex）

Sonzai injects user context into the agent's system prompt. The framework handles tool calling, multi-step reasoning, and memory of the current conversation; Sonzai handles what the agent knows about the user across sessions. Send the full transcript including any tool messages to session.turn() so extraction can pick up facts from tool results.

import { ChatOpenAI } from "@langchain/openai";
import { SystemMessage, HumanMessage, AIMessage } from "@langchain/core/messages";
import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const llm = new ChatOpenAI({ model: "gpt-4o", tools: yourToolSchemas });

async function agentTurn(
session: any,
userInput: string,
messageHistory: (HumanMessage | AIMessage)[]
): Promise<string> {
const ctx = await session.context({ query: userInput });

const messages = [
  new SystemMessage(buildSystemPrompt(ctx)),
  ...messageHistory,
  new HumanMessage(userInput),
];

// Run the agent's full tool-calling loop on your side, then surface
// every intermediate message (assistant tool_calls + tool results)
// to Sonzai so it can extract from them.
const { reply, intermediate } = await runLangchainAgent(llm, messages);

await session.turn({
  messages: [
    { role: "user", content: userInput },
    ...intermediate,
    { role: "assistant", content: reply },
  ],
});

return reply;
}

使用场景：多 LLM 路由

Route to different models based on task type while Sonzai stitches user memory across all of them. The Session-level provider/model default is just a default — every .turn() can override.

import Anthropic from "@anthropic-ai/sdk";
import { GoogleGenAI } from "@google/genai";
import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const claude = new Anthropic();
const gemini = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY! });

type TaskType = "creative" | "analytical" | "casual";

function classifyTask(message: string): TaskType {
if (/write|story|poem|imagine/i.test(message)) return "creative";
if (/analyze|compare|explain|why/i.test(message)) return "analytical";
return "casual";
}

async function routedTurn(session: any, userMessage: string): Promise<string> {
const ctx = await session.context({ query: userMessage });
const systemPrompt = buildSystemPrompt(ctx);
const task = classifyTask(userMessage);

let reply: string;

if (task === "creative") {
  const response = await claude.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    system: systemPrompt,
    messages: [{ role: "user", content: userMessage }],
  });
  reply = response.content[0].type === "text" ? response.content[0].text : "";
} else {
  const response = await gemini.models.generateContent({
    model: "gemini-2.5-flash",
    contents: [{ role: "user", parts: [{ text: systemPrompt + "\n\n" + userMessage }] }],
  });
  reply = response.text ?? "";
}

// Same .turn() call regardless of which chat model answered.
await session.turn({
  messages: [
    { role: "user", content: userMessage },
    { role: "assistant", content: reply },
  ],
});

return reply;
}

使用场景：隐私优先（送入 LLM 前匿名化）

Redact PII from the enriched context before it reaches your LLM. Only structured extracted facts are stored by Sonzai — never raw text.

async function privacyTurn(session: any, userMessage: string): Promise<string> {
const ctx = await session.context({ query: userMessage });

// Scrub PII from facts before they reach your LLM
const sanitizedFacts = (ctx.loaded_facts ?? []).map((f: any) => ({
  ...f,
  atomic_text: redactPII(f.atomic_text), // your PII redaction logic
}));

const sanitizedCtx = { ...ctx, loaded_facts: sanitizedFacts };
const systemPrompt = buildSystemPrompt(sanitizedCtx);

const reply = await yourLLM.chat({ system: systemPrompt, message: userMessage });

// Send unredacted transcript to Sonzai for extraction
// (Sonzai stores structured facts, not raw text)
await session.turn({
  messages: [
    { role: "user", content: userMessage },
    { role: "assistant", content: reply },
  ],
});

return reply;
}

下一步

Pattern 2: Post-Session Batch Processing — when Sonzai shouldn't be in the hot path
Endpoint walkthrough — full reference for sessions.start, context, turn, process, end, and read endpoints
KB & limitations — knowledge base behavior in standalone mode and what's not supported

Pattern 2: 会话后批处理

You own the entire conversation. Sonzai never sees it in real time. When the conversation ends, you send the full transcript to either /process or sessions.end({ messages }). Sonzai extracts facts, updates the user's behavioral profile, and makes the insights available via the API — ready for personalization, analytics, push notifications, or next-session context.

This pattern is ideal when Sonzai being in the hot path is undesirable (or impossible) — latency-sensitive real-time interactions, apps with their own LLM loop already in production, or cases where you want to process transcripts in bulk after the fact.

┌─────────────┐     ┌──────────────────┐     ┌──────────────┐
│  Your App   │     │   Sonzai API     │     │   Your LLM   │
└──────┬──────┘     └────────┬─────────┘     └──────┬───────┘
     │                     │                       │
     │  GET /context       │                       │
     │────────────────────>│ (optional pre-session │
     │  <── user profile ──│  personalization)     │
     │                     │                       │
     │  ══ Your conversation (Sonzai not involved) ═════════│
     │                     │                       │              │
     │  Chat ──────────────┼──────────────────────>│             │
     │  <── reply ─────────┼───────────────────────│             │
     │  [N turns, your loop, your tools]            │             │
     │                     │                       │             │
     │  ════════════════════════════════════════════════════════│
     │                     │                       │
     │  /process or sessions.end({ messages })     │
     │────────────────────>│── extract facts,      │
     │  (full transcript)  │   personality, mood,  │
     │                     │   habits, interests   │
     │  <── extractions ───│   (Sonzai LLM)        │
     │                     │                       │
     │  Use insights       │                       │
     │  (push notif,       │                       │
     │   dashboard update, │                       │
     │   exercises, etc.)  │                       │
     └─────────────────────┴───────────────────────┘

Pick one trigger, not both

/process and sessions.end({ messages }) are functionally equivalent for batch ingest — both extract facts and side-effects from the full transcript inline. Don't do both for the same transcript or extraction runs twice. Use /process if you want a single call (it auto-creates the session and surfaces the generated session_id in the response). Use sessions.start + sessions.end({ messages }) if you want explicit lifecycle, async polling, or session-scoped tools.

核心步骤

Option A — /process only. One call. Auto-creates a session if you don't pass one. Returns the auto-generated session_id so you can correlate later.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function processTranscript(
agentId: string,
userId: string,
transcript: { role: "user" | "assistant" | "tool"; content: string; tool_calls?: any[] }[]
) {
const result = await sonzai.agents.process(agentId, {
  userId,
  messages: transcript,                          // tool messages allowed
  provider: "gemini",                            // optional override
  model: "gemini-3.1-flash-lite-preview",        // optional override
});

// result.session_id is the auto-created session id when none was passed.
// Read the extracted facts/mood/etc. via the dedicated endpoints below.
return result;
}

Option B — Explicit sessions.start + sessions.end({ messages }). Use this when you want async processing, session-scoped tools, or explicit lifecycle ownership.

async function processTranscript(
agentId: string,
userId: string,
transcript: { role: "user" | "assistant" | "tool"; content: string }[]
) {
const sessionId = `session-${Date.now()}`;

const session = await sonzai.agents.sessions.start(agentId, { userId, sessionId });

// Pass the full transcript on end — extraction happens here, not via /process.
// sessions.end({ messages }) is functionally equivalent to /process({ messages }).
const result = await session.end({
  messages: transcript,
  totalMessages: transcript.length,
});

return result;
}

Pick one. The two options are equivalent for fact extraction — chaining them just runs extraction twice on the same messages.

使用场景：AI 辅导应用

Before the session, pull the student's profile to personalize the curriculum. After the session, extract what was learned and generate targeted practice exercises. One call to /process is enough.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function beforeTutoringSession(agentId: string, studentId: string, topic: string) {
const ctx = await sonzai.agents.getContext(agentId, {
  userId: studentId,
  query: `${topic} concepts learned struggles difficulty level`,
});

const knownConcepts = (ctx.loaded_facts ?? [])
  .filter((f: any) => f.fact_type === "semantic")
  .map((f: any) => f.atomic_text);

const weakAreas = (ctx.habits ?? [])
  .filter((h: any) => h.label?.toLowerCase().includes("struggle"))
  .map((h: any) => h.label);

return { knownConcepts, weakAreas };
}

async function afterTutoringSession(
agentId: string,
studentId: string,
topic: string,
transcript: { role: "user" | "assistant"; content: string }[]
) {
await sonzai.agents.process(agentId, { userId: studentId, messages: transcript });

const [memory, interests, mood] = await Promise.all([
  sonzai.agents.memory.list(agentId, { userId: studentId }),
  sonzai.agents.getInterests(agentId, { userId: studentId }),
  sonzai.agents.getMood(agentId, { userId: studentId }),
]);

const allFacts = Object.values(memory.contents ?? {}).flat();
const conceptsLearned = allFacts
  .filter((f: any) => f.fact_type === "semantic")
  .map((f: any) => f.atomic_text);
const engagedTopics = (interests ?? []).map((i: any) => i.topic);
const confidenceLevel = mood?.valence ?? 0;

const exercises = await generateExercises({ topic, conceptsLearned, engagedTopics });

await sendStudentReport(studentId, {
  summary: `Covered: ${conceptsLearned.slice(0, 3).join(", ")}`,
  exercises,
  encouragement: confidenceLevel > 0.2 ? "Great session today!" : "Keep going — this takes practice.",
});

return { conceptsLearned, exercises };
}

使用场景：健身应用

Pull the user's fitness context before the workout for a personalized greeting. After the workout, send the session log to Sonzai to track habits, mood, and progress — without Sonzai ever being in the real-time exercise loop.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function beforeWorkout(agentId: string, userId: string): Promise<string> {
const ctx = await sonzai.agents.getContext(agentId, {
  userId,
  query: "fitness goals workout habits recent exercise progress",
});

const goals = (ctx.active_goals ?? []).map((g: any) => g.description);
const recentHabits = (ctx.habits ?? []).map((h: any) => h.label);

const greeting = await yourLLM.chat({
  system: "Generate a short, energetic workout motivation message (2 sentences max).",
  message: `User goals: ${goals.join(", ")}. Recent habits: ${recentHabits.join(", ")}.`,
});

await playVoiceMessage(userId, greeting);
return greeting;
}

async function afterWorkout(
agentId: string,
userId: string,
workoutTranscript: { role: "user" | "assistant"; content: string }[]
) {
await sonzai.agents.process(agentId, { userId, messages: workoutTranscript });

const [memory, habits, mood] = await Promise.all([
  sonzai.agents.memory.list(agentId, { userId }),
  sonzai.agents.listHabits(agentId, { userId }),
  sonzai.agents.getMood(agentId, { userId }),
]);

const habitsReinforced = habits ?? [];
const allFacts = Object.values(memory.contents ?? {}).flat();
const personalRecords = allFacts.filter((f: any) =>
  /record|pb|best|personal/i.test(f.atomic_text ?? "")
);

await sendPushNotification(userId, {
  title: "Workout complete",
  body: personalRecords.length > 0
    ? `New record: ${personalRecords[0].atomic_text}`
    : `Great session — ${habitsReinforced[0]?.label ?? "keep it up"}!`,
});
}

使用场景：销售 CRM 智能

Your sales team runs calls through their existing tooling (Gong, Zoom, your own recorder). After each call, send the transcript to Sonzai to build a persistent customer profile.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function processSalesCall(
agentId: string,
customerId: string,
callId: string,
callTranscript: { role: "user" | "assistant"; content: string }[],
durationSeconds: number
) {
// Use the explicit lifecycle so we can pass durationSeconds.
const session = await sonzai.agents.sessions.start(agentId, {
  userId: customerId,
  sessionId: `call-${callId}`,
});

const result = await session.end({
  messages: callTranscript,
  totalMessages: callTranscript.length,
  durationSeconds,
});

// Read extractions back from the analytics endpoints.
const personality = await sonzai.agents.personality.get(agentId);

// ...build CRM update from result + dedicated read endpoints
return result;
}

使用场景：语言学习应用

Track vocabulary mastered, grammar struggles, pronunciation patterns, and learning pace across lessons.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function afterLanguageLesson(
agentId: string,
studentId: string,
targetLanguage: string,
lessonTranscript: { role: "user" | "assistant"; content: string }[]
) {
await sonzai.agents.process(agentId, { userId: studentId, messages: lessonTranscript });

const [memory, interests, mood] = await Promise.all([
  sonzai.agents.memory.list(agentId, { userId: studentId }),
  sonzai.agents.getInterests(agentId, { userId: studentId }),
  sonzai.agents.getMood(agentId, { userId: studentId }),
]);

const allFacts = Object.values(memory.contents ?? {}).flat();
const newVocab = allFacts
  .filter((f: any) => f.fact_type === "semantic")
  .map((f: any) => f.atomic_text);
const engagementAreas = interests ?? [];
const confidenceDelta = mood?.valence ?? 0;

await updateLearningDashboard(studentId, {
  language: targetLanguage,
  vocabularyAdded: newVocab.length,
  newWords: newVocab,
  strongestArea: engagementAreas[0]?.topic ?? "conversation",
  confidenceTrend: confidenceDelta > 0.1 ? "↑" : confidenceDelta < -0.1 ? "↓" : "→",
});

return { newVocab, engagementAreas };
}

使用场景：心理健康日记

Your app handles the journaling conversation. After each session, send to Sonzai to track mood trends, detect emotional breakthroughs, and surface proactive insights.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

async function afterJournalingSession(
agentId: string,
userId: string,
journalTranscript: { role: "user" | "assistant"; content: string }[]
) {
await sonzai.agents.process(agentId, { userId, messages: journalTranscript });

const [mood, notifications] = await Promise.all([
  sonzai.agents.getMood(agentId, { userId }),
  sonzai.agents.notifications.list(agentId),
]);

if ((mood?.valence ?? 0) < -0.4) {
  await sendWellnessAlert(userId, {
    message: "It sounds like you're going through a tough time. We're here for you.",
  });
}

for (const notif of notifications) {
  if (notif.user_id === userId) {
    await scheduleReminder(userId, notif.generated_message, notif.scheduled_for);
  }
}

await updateMoodDashboard(userId, { valence: mood?.valence, energy: mood?.arousal });
}

下一步

Pattern 1: Memory Middleware (real-time) — when you want Sonzai-enriched context per-turn (and tool calling / multimodal handling)
Endpoint walkthrough — full reference for sessions.start, context, turn, process, end, and read endpoints
KB & limitations — knowledge base behavior in standalone mode

AI 伴侣 — 快速入门

本快速入门适用于构建 AI 伴侣 — 拥有真实人格、丰富内心世界，并与用户建立随时间演变关系的角色。典型场景：AI 角色、VTuber、个人伴侣、故事 NPC。

**你将构建的内容：**Luna，一个温暖而好奇的伴侣，能记住你的对话、发展真实的关系，并在适当时机主动联系你。

**你将用到的能力：**大五人格、四维情绪、层级记忆、关系追踪、主动唤醒，以及（可选）语音。

1. 获取 API 密钥

前往 platform.sonz.ai，创建项目并生成 API 密钥。

Authorization: Bearer sk_your_api_key

2. 从描述生成智能体

最快捷的方式：用自然语言描述角色，让平台自动推导人格、说话方式和种子记忆。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const agent = await client.agents.generation.generateAndCreate({
name: "Luna",
description: "Luna is a warm, creative dreamer who speaks poetically. She loves stargazing, coffee shops at 2am, and asking the question beneath the question.",
language: "en",
});

console.log(agent.agent_id);
console.log(agent.personality); // full Big5 profile derived from the description

你也可以显式定义角色 — 手动设置大五得分、说话方式和详细简介。参见智能体生成。

3. 为关系预热

在用户首次对话前告知智能体这位用户是谁。预热会创建初始记忆树 — 智能体会自然地引用这些信息。

await client.agents.priming.primeUser("agent-id", "user-123", {
displayName: "Sam",
interests: ["astronomy", "lo-fi music", "photography"],
context: "Sam is a night-owl grad student who tends to overthink. They came to Luna after a tough week.",
});

4. 对话 — 伴侣的标准是流式输出

伴侣应该让人感觉鲜活。务必使用流式输出。

for await (const event of client.agents.chatStream({
agent: "agent-id",
userId: "user-123",
messages: [{ role: "user", content: "I can't sleep again." }],
})) {
const delta = event.choices?.[0]?.delta?.content;
if (delta) process.stdout.write(delta);
}

每轮对话结束后，Sonzai 会自动：

将新的事实、事件和承诺提取到记忆中。
根据对话的情感内容更新情绪（快乐度、能量值、平静度、亲密度）。
如果交互揭示了角色方向上的稳定特征，则微调大五特质。
更新关系状态（亲密度得分、化学反应、连续互动天数）。

这些都无需你手动管理。

5. 读取情绪和关系状态

若要驱动 UI — 一个小情绪指示器、按关系等级设置的解锁门控，或根据化学反应变化的主题 — 请获取当前状态。

// 当前情绪（四维）——与人格档案分开读取
const mood = await client.agents.getMood("agent-id", { userId: "user-123" });
console.log(mood.happiness, mood.energy, mood.calmness, mood.affection);

// 当前人格档案（大五、维度、说话方式）
const personality = await client.agents.personality.get("agent-id");
console.log(personality.profile.big5);

6. 让 Luna 主动联系

主动唤醒是伴侣区别于普通聊天机器人的关键。平台会根据关系上下文自动安排唤醒 — 你也可以显式触发。

// Poll periodically (or register a webhook).
const pending = await client.agents.notifications.list("agent-id", {
userId: "user-123",
status: "pending",
});

for (const n of pending.notifications) {
// Render n.content in your UI; mark consumed when shown.
await client.agents.notifications.consume("agent-id", n.notificationId);
}

7. 为 Luna 添加语音（可选）

从全局语音目录中选择一个语音或克隆一个，然后流式播放 TTS 或双工音频。完整接口参见语音。

const voices = await client.voices.list({ language: "en" });
await client.agents.update("agent-id", { voiceId: voices[0].voiceId });

下一步

当前 SDK 版本：TypeScript 1.1.3 · Python 1.1.4 · Go 1.2.0（截至 2026-04-17）

AI 员工与个人 AI — 快速入门

本快速入门适用于构建 AI 员工或个人 AI — 帮助用户完成工作的任务型智能体。典型场景：支持工程师、销售开发代表、收件箱助手、入职引导。

**你将构建的内容：**一个客户支持智能体，它能 (1) 跨会话记住每位用户，(2) 通过自定义工具创建工单和查询订单状态，(3) 从知识库中回答产品问题。

可以跳过的内容：情绪系统。情绪在后台仍然运行，但除非你主动启用，否则不会影响回复。人格保持最简 — 一个专业语调配置就足够了。

1. 创建项目并获取 API 密钥

前往 platform.sonz.ai，创建项目并生成 API 密钥。所有请求使用 Bearer 认证：

Authorization: Bearer sk_your_api_key

2. 创建智能体

给智能体设置一个最简的专业人格 — 高尽责性、中等宜人性、低神经质。这就是任务型智能体所需的全部配置。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const agent = await client.agents.create({
name: "Atlas",
bio: "Atlas is a calm, precise support engineer who answers product questions and handles tickets.",
big5: {
  openness: 0.55,
  conscientiousness: 0.85,
  extraversion: 0.5,
  agreeableness: 0.7,
  neuroticism: 0.2,
},
});

console.log(agent.agent_id);

3. 创建每用户实例

对于服务多个终端用户的任务型智能体，使用实例为每位用户在同一智能体定义下提供独立的记忆作用域。

const instance = await client.agents.instances.create("agent-id", {
name: "user-42",
description: "Support context for user 42",
});

所有以 instance_id = "user-42" 为作用域的记忆、自定义状态和通知，与其他用户的上下文完全隔离。

4. 播种智能体对该用户的了解

预加载用户信息，让智能体的第一次回复就能体现上下文 — 无冷启动问题。

await client.agents.memory.seed("agent-id", {
userId: "user-42",
memories: [
  { text: "User's name is Priya Kapoor.", factType: "fact" },
  { text: "Priya is on the Enterprise plan, renewed 2026-03-15.", factType: "fact" },
  { text: "Priya reported a billing issue last week (ticket #4821, resolved).", factType: "event" },
],
});

5. 注册自定义工具

工具让 LLM 在推理过程中调用你的后端。Sonzai 不执行工具 — 它返回工具调用，由你的后端执行，然后你在下一轮将结果传回。

await client.agents.sessions.setTools("agent-id", {
userId: "user-42",
tools: [
  {
    name: "create_ticket",
    description: "Create a support ticket for the user.",
    parameters: {
      type: "object",
      properties: {
        subject: { type: "string" },
        priority: { type: "string", enum: ["low", "normal", "high"] },
      },
      required: ["subject"],
    },
  },
  {
    name: "lookup_order",
    description: "Fetch the latest order status by order ID.",
    parameters: {
      type: "object",
      properties: { orderId: { type: "string" } },
      required: ["orderId"],
    },
  },
],
});

6. 上传知识库

将产品文档、内部 FAQ 或操作手册指向智能体。知识库以项目为作用域 — 项目中的所有智能体均可搜索。

import { readFileSync } from "node:fs";

const buf = readFileSync("./product-manual.pdf");
await client.knowledge.uploadDocument("project-id", "product-manual.pdf", buf, "application/pdf");

启用 knowledge_search 能力后，智能体在对话过程中会自动搜索知识库。

7. 对话

流式获取回复。智能体会自动使用记忆、知识库和工具。

for await (const event of client.agents.chatStream({
agent: "agent-id",
userId: "user-42",
instanceId: instance.instance_id,
messages: [{ role: "user", content: "Hi, did my latest invoice go through?" }],
})) {
const delta = event.choices?.[0]?.delta?.content;
if (delta) process.stdout.write(delta);
}

回复结束后，记忆提取会自动运行 — 智能体会记住发生的事情，无需你做任何操作。

8. 轮询主动通知（可选）

智能体可以安排后续跟进 — 例如"明天回来查看工单 #4821"。定期轮询通知队列，或注册 webhook。

const pending = await client.agents.notifications.list("agent-id", { userId: "user-42", status: "pending" });

下一步

任务完成、通知和 SLA 的实时事件回调。

完整集成指南

深入了解每个 SDK 接口和集成路径。

当前 SDK 版本：TypeScript 1.1.3 · Python 1.1.4 · Go 1.2.0（截至 2026-04-17）

企业级智能体 — 快速入门

本快速入门适用于构建企业级 AI 智能体 — 嵌入业务工作流的智能体。典型场景：CRM 副驾驶、一线支持、内部知识助手、销售资质审核机器人、合规审查员。

**你将构建的内容：**一个销售资质审核智能体，按工作空间运行，通过 webhook 接收 CRM 的交易事件，从产品文档中检索信息，以自定义状态追踪工作流阶段，并在每次发版前运行评估标准。

**你将用到的能力：**多实例隔离、项目级知识库、自定义状态、webhook、工具和评估运行。

1. 创建项目、API 密钥和 webhook 密钥

在 platform.sonz.ai 中创建项目，生成 API 密钥和 webhook 签名密钥。企业部署通常按环境（开发、预发布、生产）划分 API 密钥作用域。

export SONZAI_API_KEY=sk_...
export SONZAI_WEBHOOK_SECRET=whsec_...

2. 创建智能体

使用中性专业人格。保持 neuroticism 低值，避免情绪漂移影响回复。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

const agent = await client.agents.create({
name: "Pilot-SDR",
bio: "Pilot-SDR qualifies inbound leads, answers product questions, and hands off to humans when appropriate. Professional, concise, and never embellishes.",
big5: {
  openness: 0.5,
  conscientiousness: 0.9,
  extraversion: 0.55,
  agreeableness: 0.65,
  neuroticism: 0.15,
},
});

3. 创建每工作空间实例

每个客户工作空间获得独立实例。以 instance_id = workspace-id 为作用域的记忆、自定义状态和通知保持隔离 — 这对多租户 SaaS 和合规至关重要。

const workspace = await client.agents.instances.create("agent-id", {
name: "acme-corp",
description: "Workspace for Acme Corp account",
});

作用域模式和生命周期管理参见多实例。

4. 将产品文档上传到知识库

知识库以项目为作用域 — 项目中的所有智能体搜索同一语料库。可上传 PDF、推送结构化实体，或两者兼用。

import { readFileSync } from "node:fs";

const pdf = readFileSync("./product-one-pager.pdf");
await client.knowledge.uploadDocument("project-id", "product-one-pager.pdf", pdf, "application/pdf");

// Or push structured facts
await client.knowledge.insertFacts("project-id", {
entities: [
  { label: "Plan: Growth", properties: { price: 299, seats: 10 } },
  { label: "Plan: Enterprise", properties: { price: "custom", seats: "unlimited" } },
],
});

5. 用自定义状态追踪工作流阶段

每笔交易或案例都处于某个阶段。将其存储为自定义状态，让智能体在每次对话时都能看到它，并据此推断下一步行动。

await client.agents.customStates.create("agent-id", {
key: "deal_stage",
value: "discovery",
scope: "per-user-instance",
userId: "[email protected]",
instanceId: workspace.instance_id,
});

6. 注册人工交接和 CRM 同步工具

企业级智能体始终需要一个转给人工的出口，以及写回系统记录的能力。

await client.agents.sessions.setTools("agent-id", {
userId: "[email protected]",
instanceId: workspace.instance_id,
tools: [
  {
    name: "handoff_to_human",
    description: "Escalate this conversation to a human rep and stop the agent.",
    parameters: { type: "object", properties: { reason: { type: "string" } }, required: ["reason"] },
  },
  {
    name: "update_deal_stage",
    description: "Advance the deal to a new stage in the CRM.",
    parameters: {
      type: "object",
      properties: {
        stage: { type: "string", enum: ["discovery", "qualified", "demo", "proposal", "closed_won", "closed_lost"] },
      },
      required: ["stage"],
    },
  },
],
});

7. 注册 webhook 以接收 CRM 事件

当 CRM 中的交易阶段发生变化时，将事件推送到 Sonzai。智能体在下一次对话时将其视为"工作流事件"并自然地做出响应。

await client.webhooks.register("on_wakeup_ready", {
webhookUrl: "https://api.yourcorp.com/sonzai/wakeups",
});

await client.webhooks.register("on_recurring_event_due", {
webhookUrl: "https://api.yourcorp.com/sonzai/schedules",
});

每个 webhook 请求均使用 HMAC-SHA256 签名 — 在执行操作前用你的密钥验证。完整事件目录、重试策略和验证示例参见 Webhook 与通知。

8. 对话

for await (const event of client.agents.chatStream({
agent: "agent-id",
userId: "[email protected]",
instanceId: workspace.instance_id,
messages: [{ role: "user", content: "Which plan fits a team of 12?" }],
})) {
const delta = event.choices?.[0]?.delta?.content;
if (delta) process.stdout.write(delta);
}

9. 通过评估运行把控发版质量

在发布提示词变更或新智能体版本前，用评估标准运行测试。对人格漂移、事实准确性和工具调用正确性进行评分。

// 启动模拟 + 评分运行，立即返回。
const ref = await client.agents.runEvalAsync("agent-id", {
templateId: "template_lead_qualification_v3",
simulationConfig: { turnsPerScenario: 6 },
});

// 评估完成后读取运行记录（或通过 streamEvents 实时流式获取）。
const result = await client.evalRuns.get(ref.runId);
console.log(result.scoreOverall, result.scoresByCategory);

构建评分标准和模拟用户参见评估。

下一步

包含所有端点和 Schema 的完整 REST API。

当前 SDK 版本：TypeScript 1.1.3 · Python 1.1.4 · Go 1.2.0（截至 2026-04-17）

自定义状态与工作流事件

您将构建什么

一个追踪用户进度分数和等级的自定义状态，每次会话后更新
一个工作流事件触发器，当用户达到里程碑时触发，使智能体做出反应
批量读取用户仪表板的所有自定义状态
一个用于从后端进行幂等状态更新的 upsert 模式

什么是自定义状态？

自定义状态是一个键值记录，作用域为 智能体 + 用户（或仅智能体）。值可以是任何 JSON 可序列化类型：字符串、数字、布尔值、数组或嵌套对象。

与记忆（从对话中提取的非结构化文本）不同，自定义状态是您从后端显式写入的结构化数据。智能体可以在对话中通过 get_custom_state 工具读取它们，因此它始终知道用户当前的等级、连续天数、余额等。

自定义状态（您写入）

结构化 JSON 数据
您的后端控制
任务进度、分数、里程碑
通过 SDK 或 REST 更新

记忆（自动提取）

自由格式文本事实
平台从对话中提取
偏好、事件、目标
每条消息后自动更新

1. 创建自定义状态

首次为用户设置状态时调用 create。后续写入应使用 upsert（见步骤 3）以实现幂等更新。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID = "agent_abc";
const USER_ID  = "user_123";

const state = await client.agents.customStates.create(AGENT_ID, {
userId: USER_ID,
key:    "user_progress",
value: {
  tier:            "silver",
  score:           2340,
  score_to_next:   3000,
  streak_days:     12,
  milestones:      ["first_chat", "50_tasks", "7_day_streak"],
},
});

console.log("Created:", state.state_id, state.key);

2. 在对话中读取状态

当智能体可以访问 get_custom_state 工具时（当存在自定义状态时自动启用），它会在对话开始时获取当前状态。您也可以随时从后端读取。

// Read by key from your backend
const state = await client.agents.customStates.getByKey(AGENT_ID, {
userId: USER_ID,
key: "user_progress",
});

const progress = state.value as {
tier: string; score: number; score_to_next: number; streak_days: number;
};

console.log(`${progress.tier} tier · ${progress.score}/${progress.score_to_next} pts · ${progress.streak_days}-day streak`);

在对话中，智能体调用 get_custom_state("user_progress") 并将进度数据自然地融入其响应——无需提示注入。

3. 幂等更新状态

当用户状态发生变化时——会话结束后、购买后或按计划——从后端使用 upsert。upsert 会在状态不存在时创建，存在时替换。

// Called after each work session ends
async function onSessionEnd(userId: string, sessionScore: number) {
const current = await client.agents.customStates.getByKey(AGENT_ID, {
  userId,
  key: "user_progress",
}).catch(() => null);

const tiers = ["bronze", "silver", "gold", "platinum"];
const prev = (current?.value ?? { tier: "bronze", score: 0, score_to_next: 1000, streak_days: 0 }) as {
  tier: string; score: number; score_to_next: number; streak_days: number; milestones: string[];
};

const newScore    = prev.score + sessionScore;
const promoted    = newScore >= prev.score_to_next;
const tierIndex   = tiers.indexOf(prev.tier);
const newTier     = promoted ? (tiers[tierIndex + 1] ?? prev.tier) : prev.tier;

await client.agents.customStates.upsert(AGENT_ID, {
  userId,
  key: "user_progress",
  value: {
    tier:          newTier,
    score:         promoted ? newScore - prev.score_to_next : newScore,
    score_to_next: promoted ? prev.score_to_next * 1.5 : prev.score_to_next,
    streak_days:   prev.streak_days + 1,
    milestones:    prev.milestones,
  },
});

if (promoted) {
  // Notify the agent so it can congratulate the user next session
  await client.agents.triggerBackendEvent(AGENT_ID, {
    userId,
    eventType: "tier_promotion",
    payload: { new_tier: newTier, previous_tier: prev.tier },
  });
}
}

4. 触发工作流事件

工作流事件允许您的后端告知智能体对话之外发生的事情。下次用户聊天时，智能体会看到待处理的事件并自然地做出反应。

// Trigger from your backend when something notable happens
await client.agents.triggerBackendEvent(AGENT_ID, {
userId: USER_ID,
eventType: "task_complete",
payload: {
  task_name:     "Q1 Revenue Analysis",
  deliverable:   "Revenue Report",
  category:      "Analytics",
  time_taken:    "3h 42m",
},
});

// Next time the user opens a conversation:
// Agent: "I see you finished the Q1 Revenue Analysis! That report is a key
//         deliverable. Want to discuss the findings or start the next task?"

事件传递

工作流事件会被排队，并在下一次对话轮次中传递。它们不会中断活跃会话。智能体在下一次 chat 或 chatStream 调用开始时消费待处理事件，并将其融入开场消息或首次响应中。

5. 列出用户的所有状态

适用于构建管理仪表板、用户资料页面或调试。返回智能体 + 用户对的所有自定义状态。

const { states } = await client.agents.customStates.list(AGENT_ID, {
userId: USER_ID,
});

for (const state of states) {
console.log(`[${state.key}]`, JSON.stringify(state.value, null, 2));
}
// [user_progress]  { tier: "silver", score: 340, ... }
// [preferences]    { theme: "dark", notifications: true }
// [daily_summary]  { last_active: "2025-03-20", sessions_today: 2 }

6. 更新特定字段

当您想通过 state_id 更改状态时使用 update。与 upsert 不同，update 执行部分合并——您只需传递要更改的字段。

// Add a milestone without overwriting the whole state
const state = await client.agents.customStates.getByKey(AGENT_ID, {
userId: USER_ID,
key: "user_progress",
});

const progress = state.value as { milestones: string[]; [k: string]: unknown };

await client.agents.customStates.update(AGENT_ID, state.state_id, {
value: {
  ...progress,
  milestones: [...progress.milestones, "100_tasks"],
},
});

7. 删除状态

通过 ID 或键删除状态。下次对话时，智能体将无法访问该状态。

// Delete by key (finds and removes the state)
await client.agents.customStates.deleteByKey(AGENT_ID, {
userId: USER_ID,
key: "user_progress",
});

// Or delete by state_id if you already have it
await client.agents.customStates.delete(AGENT_ID, stateId);

常见模式

入职状态

在注册时创建一个 onboarding 状态，值为 { step: 0, completed: false }。智能体会在早期对话开始时检查它，并自然地引导用户完成设置。

订阅上下文

存储 { plan: 'pro', expires_at: '...' }，这样智能体就知道应该提供或推销哪些功能，无需在每次聊天请求中传递。

每日摘要缓存

在每天结束时写入一个 daily_summary 状态，包含关键指标。智能体在第二天对话开始时引用用户的活动——"昨天您完成了 3 项任务，连续 12 天打卡。准备好继续了吗？"

下一步

阅读自定义状态与工具参考了解完整 API
添加库存追踪，用资产组合丰富状态
设置 Webhook 以在智能体触发特定事件时通知您的后端
探索人格了解事件如何影响智能体的情感演化

资源库存 + 知识库

您将构建什么

一个包含 market_price 和层级信息的 software_license 实体知识库架构
一个通过 bulkUpdate 将实时定价数据推送到知识库的成本同步管道
一个具有 inventory 能力的智能体，在对话中自动追踪已分配的工具
一个将用户分配与当前知识库定价数据关联的组合查询

1. 定义实体架构

架构告诉知识库应为每种实体类型提取和索引哪些字段。为 software_license 创建一个架构，以便平台知道如何存储和搜索您的许可证数据。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const PROJECT_ID = "proj_abc123";

await client.knowledge.createSchema(PROJECT_ID, {
entity_type: "software_license",
display_name: "Software License",
fields: [
  { name: "market_price",  type: "number",  indexed: true  },
  { name: "tier",          type: "string",  indexed: true  },
  { name: "category",      type: "string",  indexed: true  },
  { name: "license_type",  type: "string",  indexed: false },
  { name: "trend_30d",     type: "string",  indexed: false },
],
});

您只需创建一次架构。之后平台会自动验证和索引该类型的每个实体。

2. 注入初始数据

使用 insertFacts 插入第一批实体。这也是在上线前加载历史数据的方式。包含关系信息，以便知识库能推荐替代或互补工具。

await client.knowledge.insertFacts(PROJECT_ID, {
entities: [
  {
    type: "software_license",
    label: "Figma Enterprise",
    properties: {
      market_price: 75,
      tier: "Enterprise",
      category: "Design Tools",
      license_type: "per-seat-annual",
      trend_30d: "+5%",
    },
  },
  {
    type: "software_license",
    label: "Slack Business+",
    properties: {
      market_price: 12.50,
      tier: "Business",
      category: "Communication",
      license_type: "per-seat-monthly",
      trend_30d: "+3%",
    },
  },
  {
    type: "category",
    label: "Design Tools",
    properties: { vendor_count: 18, avg_seat_cost: 45 },
  },
],
relationships: [
  { from_label: "Figma Enterprise",  to_label: "Design Tools", edge_type: "belongs_to" },
  { from_label: "Slack Business+",   to_label: "Communication", edge_type: "belongs_to" },
  { from_label: "Figma Enterprise",  to_label: "Slack Business+", edge_type: "commonly_bundled" },
],
source: "seed_v1",
});

3. 使用 bulkUpdate 保持定价更新

按计划（如每日 cron）运行成本同步任务，从供应商数据源获取当前定价并推送到知识库。bulkUpdate 通过标签匹配将属性合并到现有节点——无需删除和重新插入。

// cost-sync.ts — run daily
import { Sonzai } from "@sonzai-labs/agents";
import { fetchLatestPricing } from "./vendor-api"; // your data source

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const PROJECT_ID = "proj_abc123";

async function syncPricing() {
const pricing = await fetchLatestPricing(); // [{ name, price, trend }]

await client.knowledge.bulkUpdate(PROJECT_ID, {
  updates: pricing.map((license) => ({
    entity_type: "software_license",
    label: license.name,
    properties: {
      market_price: license.price,
      trend_30d: license.trend,
      last_synced: new Date().toISOString(),
    },
    // upsert: true — creates the node if it doesn't exist yet
    upsert: true,
  })),
});

console.log(`Synced ${pricing.length} license prices`);
}

syncPricing();

批量大小

小于等于 100 项的批次会同步处理（即时响应）。更大的批次会排队异步处理——响应中包含一个可轮询完成状态的任务 ID。

4. 在智能体上启用库存功能

在智能体上启用 inventory 和 knowledge 能力。这会自动为智能体提供 sonzai_inventory_update 和 sonzai_inventory 工具——无需提示工程。

const AGENT_ID = "agent_xyz";

await client.agents.updateCapabilities(AGENT_ID, {
inventory: true,   // enables sonzai_inventory_update + sonzai_inventory tools
knowledge: true,   // enables knowledge_search tool
project_id: PROJECT_ID,  // which KB to join against
});

您也可以在仪表板中设置：前往 智能体 > 您的智能体 > 能力 并启用库存。

5. 让智能体在对话中追踪资源

启用库存后，当用户提到他们使用的工具或订阅时，智能体会自动调用 sonzai_inventory_update。您只需正常聊天——平台会处理知识库解析和存储。

// Your backend chat endpoint
for await (const event of client.agents.chatStream(AGENT_ID, {
userId: "user_123",
messages: [
  {
    role: "user",
    content: "We just provisioned 10 Figma Enterprise seats at $75/seat.",
  },
],
})) {
// The agent streams its reply — and internally calls
// sonzai_inventory_update({ action: "add", item_type: "software_license",
//   description: "Figma Enterprise", properties: { plan: "Enterprise",
//   purchase_price: 75, quantity: 10 } })
// The platform resolves the KB node, stores the link, and the agent
// continues the conversation without interruption.
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

知识库解析原理

平台在知识库中搜索物品描述。如果恰好匹配一个节点，则自动关联。如果有多个候选，响应会返回 status: "disambiguation_needed" 和候选列表，以便智能体向用户确认。

6. 查询丰富后的组合

使用 mode="value" 获取每个用户资源与最新知识库定价数据的关联。平台自动计算 gain_loss：(market_price - purchase_price) x quantity。

const portfolio = await client.agents.inventory.query(AGENT_ID, "user_123", {
mode: "value",
project_id: PROJECT_ID,
});

// portfolio.items:
// [
//   {
//     fact_id: "fact_abc",
//     item_label: "Figma Enterprise",
//     kb_node_id: "node_xyz",
//     user_properties: { plan: "Enterprise", purchase_price: 75, quantity: 10 },
//     market_properties: { market_price: 80, tier: "Enterprise", trend_30d: "+5%" },
//     gain_loss: 50,
//   },
// ]
// portfolio.totals: { "market_price:sum": 800, "*:count": 10 }

console.log(`Portfolio value: $${portfolio.totals?.["market_price:sum"]}`);
console.log(`Total cost change: $${portfolio.items.reduce((s, i) => s + i.gain_loss, 0)}`);

您还可以使用 mode="aggregate" 和 aggregations 参数获取组合级别的汇总，无需列出每个资源——适用于拥有大量订阅的组织。

// Aggregate: total count + total subscription cost, grouped by item_type
const agg = await client.agents.inventory.query(AGENT_ID, "user_123", {
mode: "aggregate",
aggregations: "market_price:sum,*:count",
group_by: "item_type",
project_id: PROJECT_ID,
});
// agg.totals: { "market_price:sum": 875, "*:count": 12 }
// agg.groups: [{ group: "software_license", values: { sum: 875, count: 12 } }]

7. 批量导入现有订阅

如果用户已有一组现有订阅（来自 CSV、采购系统导出等），请批量导入而非等待智能体在对话中发现每个资源。

await client.agents.inventory.batchImport(AGENT_ID, "user_123", {
item_type: "software_license",
project_id: PROJECT_ID,
items: [
  {
    description: "Figma Enterprise",
    properties: { quantity: 10, plan: "Enterprise", purchase_price: 75 },
  },
  {
    description: "GitHub Enterprise",
    properties: { quantity: 25, plan: "Enterprise", purchase_price: 21 },
  },
],
});

每批最多 1,000 项

批量端点每次调用最多处理 1,000 项。对于更大的导入，请拆分为多次调用或使用仪表板中的 CSV 导入功能。

下一步

在知识库中设置推荐规则，向用户推荐替代工具
添加趋势追踪（7天/30天/90天）以支持"最大成本增长"报告
在 仪表板 > 智能体 > 您的智能体 > 用户 > 选择用户 > 库存/资产 中查看每用户库存
阅读知识库参考了解架构、分析规则和全文搜索

Medication Reminders

等待翻译. This page is currently in English pending Chinese translation. See CONTRIBUTING for translation workflow.

This tutorial walks through a full medication-reminder implementation: define a medication entity type in your knowledge base, seed medications per user, create a Scheduled Reminder linking each medication to a cadence, and the agent proactively messages the user at the scheduled time — naming the medication and dosage in its own voice.

Tenant-agnostic primitive. The Sonzai platform has no medication-specific code. This tutorial wires two generic primitives — Inventory and Scheduled Reminders — into a medication use case. The same pattern works for watering plants, exercise reminders, bill payments, or any recurring-with-structured-data use case.

This is not a medical device. Reminders are a user-experience feature, not a clinical safety mechanism. Do not rely on Sonzai scheduled reminders as the sole adherence path for patients where missed doses cause harm.

1. Define a medication entity in your knowledge base

Create a schema for the medication entity type so the platform knows how to store and index each drug's properties. The name and ndc_code fields are indexed for fast lookup; dosage, instructions, and prescribed_by are stored but not indexed (they are fetched whole at fire time).

See the Resource Inventory + Knowledge Base tutorial for full schema semantics, field types, and upsert behaviour.

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const PROJECT_ID = "proj_abc123";

await client.knowledge.createSchema(PROJECT_ID, {
entity_type: "medication",
display_name: "Medication",
fields: [
  { name: "medication_name", type: "string",  indexed: true  },
  { name: "dosage",          type: "string",  indexed: false },
  { name: "instructions",    type: "string",  indexed: false },
  { name: "prescribed_by",   type: "string",  indexed: false },
  { name: "ndc_code",        type: "string",  indexed: true  }, // optional; National Drug Code
],
});

You only need to create the schema once per project. All subsequent medication items written for any user will be validated and indexed against this definition.

2. Seed a medication for the user

Insert one medication into the user's inventory using inventory.create. Store the returned inventory_item_id — you will pass it to the schedule in the next step.

const AGENT_ID = "agent_abc";
const USER_ID  = "user_123";

const item = await client.agents.inventory.create(AGENT_ID, USER_ID, {
item_type: "medication",
label: "Ibuprofen",
project_id: PROJECT_ID,
properties: {
  medication_name: "ibuprofen",
  dosage:          "500mg",
  instructions:    "take with food",
  prescribed_by:   "Dr. Tan",
},
});

const inventoryItemId = item.inventory_item_id;
// e.g. "inv_01HX8FKZQ3..."
console.log(inventoryItemId);

3. Create a schedule linked to the medication

Create a twice-daily schedule at 08:00 and 20:00 Asia/Singapore, with active_window.hours set as a belt-and-braces quiet-hours guard. Pass the inventory_item_id returned in step 2. The platform will fetch the live item properties at every fire — no re-registration required when the dosage changes.

const schedule = await client.schedules.create(AGENT_ID, USER_ID, {
cadence: {
  simple: { frequency: "daily", times: ["08:00", "20:00"] },
  timezone: "Asia/Singapore",
},
active_window: {
  hours: { start: "07:00", end: "22:00" },
},
intent: "remind the user to take their ibuprofen at the correct dose",
check_type: "reminder",
inventory_item_id: inventoryItemId,
metadata: { reminder_category: "medication" },
});

const scheduleId = schedule.schedule_id;
console.log(scheduleId);          // "sched_01HX..."
console.log(schedule.next_fire_at);       // "2026-05-02T00:00:00Z"
console.log(schedule.next_fire_at_local); // "2026-05-02T08:00:00+08:00"

What each field controls:

Field	Role
`cadence.simple.times`	Wall-clock fire times in the schedule's timezone
`cadence.timezone`	Per-user IANA zone; the platform does not auto-detect the user's location
`active_window.hours`	Quiet-hours guard; fires computed outside the window are skipped, not deferred
`intent`	The why the agent grounds its message in — written as a short natural-language instruction
`inventory_item_id`	Links to the medication's structured properties, fetched live at every fire
`metadata`	Opaque developer tags surfaced to the agent as "Additional context" in the wakeup block

4. What the user sees

When the schedule fires at 08:00 Singapore time, the platform assembles a structured intent block and delivers it to the agent as a proactive wakeup. The agent composes its opening message in its own voice using the intent and the injected inventory properties. A typical output might look like:

"Morning — quick reminder, it's 8 o'clock. Time for your 500mg of ibuprofen, and remember to take it with food."

Exact wording depends on the agent's personality configuration. The agent is not given a fixed template — it receives the intent and inventory data and decides how to phrase it naturally.

Updating the dosage. When a doctor reduces the ibuprofen dose from 500mg to 250mg, update the inventory item:

await client.agents.inventory.update(AGENT_ID, USER_ID, inventoryItemId, {
properties: {
  dosage: "250mg",
},
});
// No schedule edit required.
// The next scheduled fire automatically reads "250mg" from the live item.

This separation is intentional: inventory is the source of truth for the what; the schedule is the source of truth for the when. They change independently. Changing the dose never touches the schedule row; moving a reminder time never touches the medication item.

5. Bounded courses (14-day antibiotic)

For medications with a fixed course length, use starts_at and ends_at to auto-disable the schedule when the course completes. Here is a 3x/day amoxicillin course that fires every 8 hours over 14 days:

const amoxItem = await client.agents.inventory.create(AGENT_ID, USER_ID, {
item_type: "medication",
label: "Amoxicillin",
project_id: PROJECT_ID,
properties: {
  medication_name: "amoxicillin",
  dosage:          "500mg",
  instructions:    "complete the full course even if you feel better",
  prescribed_by:   "Dr. Tan",
},
});

const amoxSchedule = await client.schedules.create(AGENT_ID, USER_ID, {
cadence: {
  simple: { frequency: "interval_hours", interval_hours: 8 },
  timezone: "Asia/Singapore",
},
active_window: {
  hours: { start: "07:00", end: "23:00" },
},
intent: "remind the user to take their amoxicillin — emphasise completing the full course",
check_type: "reminder",
inventory_item_id: amoxItem.inventory_item_id,
metadata: { reminder_category: "medication" },
starts_at: "2026-05-01T00:00:00Z",
ends_at:   "2026-05-15T00:00:00Z",
});

After ends_at passes, the schedule is automatically disabled (enabled flips to false). The inventory item for amoxicillin remains as a historical record and can be queried via the Memory API. No cleanup is required.

6. Multiple medications

Create one schedule per medication. Three daily medications = three schedules. Fires that land at the same wall-clock time produce separate proactive messages by design — each message is grounded in its own medication's inventory item.

Avoid simultaneous fires. If you want the user to receive distinct messages rather than a burst, stagger the times across schedules:

Medication	Schedule times
Metformin	`["08:00", "20:00"]`
Atorvastatin	`["08:15"]`
Vitamin D	`["08:30"]`

Alternative: compose a "morning routine" item. If you prefer a single message covering all morning medications, create one inventory item of type medication_routine (define its own schema) with a medications property that lists all drugs and doses. Attach that single item to a single 08:00 schedule. The agent receives all the structured data in one wakeup block and can address all medications in a single message.

7. Track adherence (optional)

Conversational signal

When the user replies "I took it, thanks" or similar, the agent's memory layer auto-captures this as a fact on the user. You can query recent user responses to a medication reminder via the Memory API:

// Query recent memory facts mentioning medication adherence
const memories = await client.agents.memory.query(AGENT_ID, USER_ID, {
query: "medication taken ibuprofen",
limit: 10,
});

for (const fact of memories.facts) {
console.log(fact.content);    // "User confirmed taking 500mg ibuprofen on 2026-05-02"
console.log(fact.created_at); // ISO-8601 timestamp
}

Explicit acknowledgement

For a harder signal, add a POST /adherence/{scheduleId} endpoint in your tenant backend that your mobile or web app calls when the user taps a confirmation button. This gives you a structured event log independent of the conversational memory layer. Sonzai does not provide this endpoint — it lives in your own backend and stores data in your own database.

8. Timezone changes when the user travels

Patch the schedule's cadence.timezone whenever the user's preferred timezone changes. Future fires are immediately recomputed in the new zone; past fire history is not modified.

// User travelling from Singapore to Los Angeles
await client.schedules.patch(AGENT_ID, USER_ID, scheduleId, {
cadence: {
  simple: { frequency: "daily", times: ["08:00", "20:00"] },
  timezone: "America/Los_Angeles",
},
});
// Next fire: 08:00 PDT (Los Angeles) — not 08:00 SGT

9. Quiet-hours for caregivers and night shifts

The active_window.hours field ensures fires outside permitted hours are silently skipped. Two common scenarios:

Caregiver setting — no overnight messages for a patient.

{
  "active_window": {
    "hours": { "start": "07:00", "end": "21:00" }
  }
}

Any cadence tick after 21:00 or before 07:00 is discarded. A twice-daily schedule with times ["08:00", "20:00"] would still fire at both times; adding a 22:00 dose would be silently skipped.

Night-shift user — active overnight, sleeping during the day.

{
  "active_window": {
    "hours": { "start": "22:00", "end": "06:00" }
  }
}

When start is greater than end, the window wraps midnight. This user receives reminders from 22:00 to 05:59 the next morning, and any cadence ticks during daytime hours are skipped.

See Scheduled Reminders — Active window for the full reference including days_of_week filtering.

Next steps

Scheduled Reminders — full primitive reference: cadence shapes, DST handling, previewing upcoming fires, pause/resume/delete, error codes.
Resource Inventory + Knowledge Base — KB schema depth, bulk updates, mode="value" portfolio queries, batch import.
Memory — how the agent tracks user responses from proactive conversations and surfaces them in future interactions.

记忆感知对话

你将构建什么

一个智能体自动提取和存储事实的流式聊天循环
一个在首次对话前注入现有用户数据的预填流程
一个查找智能体对特定话题了解情况的记忆搜索 API 调用
一个审计智能体记忆增长情况的事实时间线查询

记忆的工作原理

聊天时记忆完全自动化。每次消息交换后，平台会：

运行提取流水线，识别对话中的事实、偏好、承诺和事件
使用替代链与现有记忆去重（旧事实被退役，而不是删除）
为全文搜索和时序查询建立索引
在下一次对话时，自动在 token 预算内获取最相关的记忆

你永远不需要管理向量存储、编写提取提示词或实现检索逻辑。平台处理所有这些。

1. 聊天并让记忆积累

开始聊天。记忆提取在响应流式完成后自动发生。你这边无需任何特殊处理。

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID = "agent_abc";
const USER_ID  = "user_123";

// 第一次对话——智能体尚无记忆
for await (const event of client.agents.chatStream(AGENT_ID, {
userId: USER_ID,
messages: [
  { role: "user", content: "My name is Mia. I'm allergic to peanuts and I love hiking." },
],
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}
// 平台提取：name="Mia"，allergy="peanuts"，interest="hiking"

// 第二次对话——智能体回忆起上述所有内容
for await (const event of client.agents.chatStream(AGENT_ID, {
userId: USER_ID,
messages: [
  { role: "user", content: "What snacks should I bring on my next hike?" },
],
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}
// 智能体知道 Mia 热爱徒步且对花生过敏——无需重新介绍。

记忆是每用户独立的

从用户 A 对话中提取的事实永远不会暴露给用户 B。每次聊天调用都要传入 userId（或 user_id / UserID），让平台正确地限定记忆范围。

2. 从现有数据预填记忆

如果用户在你的系统中有历史记录——CRM 档案、入职答案、历史订单——在第一次对话前注入它，让智能体感觉已经认识他们。

// 在入职期间或 CRM 导入后调用一次
await client.agents.memory.seed(AGENT_ID, {
userId: USER_ID,
memories: [
  {
    content: "Mia is a 32-year-old UX designer based in Berlin.",
    type: "user_fact",
  },
  {
    content: "Mia subscribed to the Pro plan on 2024-11-03.",
    type: "shared_experience",
    occurred_at: "2024-11-03T00:00:00Z",
  },
  {
    content: "Mia prefers email over SMS for notifications.",
    type: "user_preference",
  },
  {
    content: "Mia mentioned she wants to get into trail running.",
    type: "user_goal",
  },
],
});

支持的记忆类型：user_fact、user_preference、shared_experience、user_goal、commitment、time_sensitive。

3. 搜索智能体知道的内容

直接查询记忆存储，查找智能体对某个话题提取到的内容。适用于构建面向用户的"我的智能体记得什么？"功能，或用于调试。

const results = await client.agents.memory.search(AGENT_ID, {
query: "diet restrictions food allergies",
userId: USER_ID,
limit: 10,
});

for (const fact of results.facts) {
console.log(`[${fact.type}] ${fact.content} (confidence: ${fact.confidence})`);
}
// [user_fact] Mia is allergic to peanuts (confidence: 0.97)
// [user_preference] Mia prefers nut-free snacks on hikes (confidence: 0.85)

4. 浏览记忆树

记忆树是一个 7 级层级结构，按类别组织事实（/identity/traits、/preferences/interests、/episodes/sessions 等）。你可以逐节点遍历它。

// 获取顶层节点
const tree = await client.agents.memory.list(AGENT_ID, {
userId: USER_ID,
includeContents: false,  // 只获取节点元数据，不获取事实文本
});

for (const node of tree.nodes) {
console.log(`${node.path} — ${node.fact_count} facts`);
}
// /identity/traits — 3 facts
// /preferences/interests — 5 facts
// /episodes/sessions — 12 facts
// /temporal — 2 facts

// 深入某个节点
const identityNode = await client.agents.memory.list(AGENT_ID, {
userId: USER_ID,
parentId: "node_identity_traits_id",
includeContents: true,  // 包含事实文本
});

你可以在仪表盘的智能体 → 你的智能体 → 用户 → 选择用户 → 记忆 → 树状探索器中交互式地探索记忆树。

5. 检查事实时间线

时间线按时间顺序展示每个事实——创建、更新或被替代的时间。用它来审计记忆增长，或构建"对话历史"视图。

const timeline = await client.agents.memory.timeline(AGENT_ID, {
userId: USER_ID,
// 可选：缩小日期范围
start: "2025-01-01T00:00:00Z",
end:   "2025-12-31T23:59:59Z",
});

for (const entry of timeline.entries) {
console.log(
  `${new Date(entry.created_at).toLocaleDateString()} — ${entry.type}: ${entry.content}`
);
}

6. 直接列出提取的事实

对于管理 UI 或合规导出，直接列出用户的所有原始事实，无需经过树状层级。支持按 factType（TS）/ fact_type（Python/Go）过滤。

// 该用户的所有事实（分页）
const facts = await client.agents.memory.listFacts(AGENT_ID, {
userId: USER_ID,
limit: 50,
offset: 0,
factType: "user_preference",  // 可选过滤器
});

console.log(`Total facts: ${facts.total}`);
for (const f of facts.facts) {
console.log(`  ${f.content}`);
}

GDPR / 被遗忘权

要删除某用户的所有记忆，调用 client.agents.memory.reset(agentId, { userId })。这会创建墓碑记录以防止已删除的事实被重新浮现；数据会立即从检索中移除。

7. 回溯历史（时光机）

时光机让你查看智能体在过去任意时间点对用户的了解——适用于调试智能体为何说了某些话，或审计其理解如何演化。

const snapshot = await client.agents.getTimeMachine(AGENT_ID, {
userId: USER_ID,
at: "2025-03-01T00:00:00Z",  // 智能体在这个时刻知道什么？
});

console.log("Known facts at 2025-03-01:");
for (const fact of snapshot.facts) {
console.log(`  ${fact.content}`);
}

替代机制的工作原理

当事实更新时，旧记录被退役（而不是删除），并创建一个带有 SupersedesID 指针的新记录。时光机重放这条链以重建任意时间戳时的状态。

下一步

阅读记忆与上下文参考文档，了解完整的 7 级层级结构
配置对话，通过自动会话管理处理多轮聊天
探索情绪与心情，了解智能体的情感状态如何随记忆演化
添加自定义状态，在记忆旁边存储结构化应用数据

Scheduled Reminders

等待翻译. This page is currently in English pending Chinese translation. See CONTRIBUTING for translation workflow.

What you'll build

A daily 09:00 Asia/Singapore check-in schedule that fires a proactive agent message every morning
An every-4-hours schedule with a quiet-hours active window that skips fires outside allowed hours
A bounded interval_hours course constrained by starts_at and ends_at — useful for multi-week programs
An understanding of how the same primitive powers the full Medication Reminders worked example

Scheduled Reminders are a first-class primitive: the platform recomputes next_fire_at after every fire, respects DST transitions automatically, and injects inventory context live at fire time so your agent always has current data.

1. Create a schedule

Register a schedule by calling POST /api/v1/agents/{agentId}/users/{userId}/schedules. The body describes when to fire (cadence), what the agent should do (intent), and optional scoping fields (active_window, inventory_item_id, starts_at, ends_at).

Here is a minimal daily 09:00 SGT check-in:

{
  "cadence": {
    "simple": { "frequency": "daily", "times": ["09:00"] },
    "timezone": "Asia/Singapore"
  },
  "intent": "check in on how the user is feeling",
  "check_type": "reminder"
}

And a full example with all optional fields:

{
  "cadence": {
    "simple": { "frequency": "daily", "times": ["09:00"] },
    "timezone": "Asia/Singapore"
  },
  "active_window": {
    "hours": { "start": "08:00", "end": "22:00" },
    "days_of_week": ["mon", "tue", "wed", "thu", "fri"]
  },
  "intent": "check in on how the user is feeling",
  "check_type": "reminder",
  "inventory_item_id": "01HX8F...",
  "metadata": { "campaign": "daily_checkin_v2" },
  "starts_at": "2026-05-01T00:00:00Z",
  "ends_at": "2026-05-14T23:59:59Z"
}

The response includes schedule_id, next_fire_at (UTC), and next_fire_at_local (the same instant expressed in the schedule's timezone — useful for displaying to the user).

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID = "agent_abc";
const USER_ID  = "user_123";

const schedule = await client.schedules.create(AGENT_ID, USER_ID, {
cadence: {
  simple: { frequency: "daily", times: ["09:00"] },
  timezone: "Asia/Singapore",
},
intent: "check in on how the user is feeling",
check_type: "reminder",
});

console.log(schedule.schedule_id);      // "sched_01HX..."
console.log(schedule.next_fire_at);     // "2026-05-02T01:00:00Z"
console.log(schedule.next_fire_at_local); // "2026-05-02T09:00:00+08:00"

2. Cadence shapes

Two mutually exclusive shapes are supported: simple and cron. Exactly one must be present in the cadence object.

Simple cadence

Field	Type	Required	Description
`frequency`	`"daily"` \| `"weekly"` \| `"interval_hours"`	Yes	Recurrence pattern
`times`	`string[]`	Yes for `daily`/`weekly`	Wall-clock times in `HH:MM` (24-hour), evaluated in the schedule's timezone
`days_of_week`	`string[]`	Yes for `weekly`	`"mon"`, `"tue"`, `"wed"`, `"thu"`, `"fri"`, `"sat"`, `"sun"`
`interval_hours`	`number`	Yes for `interval_hours`	Minimum 1, maximum 24
`timezone`	IANA string	Yes	Applied to `times` and `days_of_week` evaluation

A weekly schedule fires on the specified days at each listed time. A daily schedule fires every day at each listed time. An interval_hours schedule fires repeatedly at that interval starting from starts_at (or schedule creation if starts_at is omitted), bounded by the active window.

Cron cadence

Field	Type	Required	Description
`expression`	`string`	Yes	Standard 5-field cron (`min hour dom month dow`)
`timezone`	IANA string	Yes	Cron fields are evaluated in this zone

Standard 5-field cron — no seconds field. Example: "0 9 * * 1-5" fires at 09:00 on weekdays.

Rate limits. Cadences that resolve to more than one fire per minute are rejected with CADENCE_TOO_FREQUENT. Cadences that produce more than 96 raw ticks per 24-hour rolling window (before active-window filtering) are rejected with CADENCE_TOO_DENSE. For most use cases interval_hours: 1 (24 raw ticks/day) is the densest practical setting.

3. Timezones

Every schedule requires a timezone field containing a valid IANA timezone name (e.g. "Asia/Singapore", "America/New_York", "Europe/London"). Offsets like "+08:00" are not accepted.

All cadence math — wall-clock time evaluation, days_of_week membership, DST skip logic — runs in the schedule's own timezone. The result is stored and returned as next_fire_at in UTC. next_fire_at_local is a convenience field that expresses the same instant with the zone offset applied.

When a user travels or changes their preferred timezone, patch the schedule timezone directly:

// User moved from Singapore to London
await client.schedules.update(AGENT_ID, USER_ID, scheduleId, {
cadence: {
  simple: { frequency: "daily", times: ["09:00"] },
  timezone: "Europe/London",
},
});

DST handling. On spring-forward transitions, a wall time that falls into the clocks-forward gap (e.g. 02:30 in a zone that jumps 02:00 → 03:00) is non-existent. The platform skips that occurrence and fires at the next valid occurrence. On fall-back transitions, a wall time that exists twice is never double-fired — the platform fires once and advances.

4. Active window (quiet hours + allowed days)

The active_window field restricts which fires actually produce a proactive wakeup. Fires computed by the cadence that land outside the window are skipped, not deferred — the cadence grid stays perfectly predictable and no backlog accumulates.

{
  "active_window": {
    "hours": { "start": "08:00", "end": "22:00" },
    "days_of_week": ["mon", "tue", "wed", "thu", "fri"]
  }
}

Both sub-fields are optional within active_window. You may specify hours only, days_of_week only, or both.

Overnight windows. When start is greater than end, the window wraps midnight. For example {"start": "22:00", "end": "06:00"} allows fires from 22:00 to 05:59 the next morning. This is useful for night-shift workers or schedules targeting early-morning time zones where local midnight matters.

Allowed days. Values must be lowercase three-letter abbreviations: "mon", "tue", "wed", "thu", "fri", "sat", "sun". Day membership is evaluated in the schedule's timezone, so a fire at 23:30 Friday Singapore time stays Friday even when stored as 15:30 UTC (Saturday in some zones).

Empty days array. Passing "days_of_week": [] (an explicit empty list) is rejected with INVALID_ACTIVE_WINDOW — it would produce a schedule that can never fire. To allow all days, omit the days_of_week field entirely.

5. Linking an inventory item

Pass inventory_item_id on the create (or patch) body to associate a schedule with a specific item from the user's resource inventory. The item's properties are injected live at fire time — not at schedule creation — so any mid-program updates to the item (e.g. a medication dosage change, a price update) are automatically reflected in the agent's proactive message without requiring any schedule modification.

{
  "cadence": {
    "simple": { "frequency": "daily", "times": ["08:00"] },
    "timezone": "Asia/Singapore"
  },
  "intent": "remind the user to take their morning medication",
  "check_type": "reminder",
  "inventory_item_id": "01HX8FKZQ3..."
}

At fire time the platform fetches the current item properties and appends them to the intent block the agent receives. The Medication Reminders tutorial shows a complete worked example including how to structure medication inventory items for maximum agent context.

Graceful degradation. If the referenced inventory item is deleted before a fire occurs, the schedule continues firing. The intent block is delivered without the Reference item section — the agent receives the intent and metadata fields as normal. No error is surfaced to the user; the schedule itself is not affected.

6. Bounded courses (starts_at / ends_at)

Use starts_at and ends_at to create a time-bounded program. Both fields are optional and accept RFC 3339 UTC timestamps.

{
  "cadence": {
    "simple": {
      "frequency": "interval_hours",
      "interval_hours": 4
    },
    "timezone": "Asia/Singapore"
  },
  "active_window": {
    "hours": { "start": "08:00", "end": "22:00" }
  },
  "intent": "prompt the user to log a pain score",
  "check_type": "check_in",
  "starts_at": "2026-05-01T00:00:00Z",
  "ends_at": "2026-05-14T23:59:59Z"
}

starts_at — no fire is produced before this timestamp. Cadence expansion begins from this point. If omitted, the schedule starts immediately.
ends_at — once this timestamp passes, the schedule is automatically disabled (enabled flips to false). The row is not deleted, so the audit trail and historical fire log remain accessible.

Passing ends_at that is less than or equal to starts_at returns INVALID_WINDOW. Passing a past ends_at at creation time also returns INVALID_WINDOW — a schedule that has already expired cannot be created.

7. Pausing, editing, deleting

Operation	How	Behavior
Pause	`PATCH enabled: false`	Schedule stops producing fires within 1 minute. `next_fire_at` is frozen.
Resume	`PATCH enabled: true`	`next_fire_at` is recomputed from the current time. No backfill occurs for fires that were missed while paused.
Edit	`PATCH cadence`, `active_window`, `starts_at`, or `ends_at`	Changes take effect on the next expansion cycle (within ~1 minute). The current in-flight fire (if any) is not affected.
Delete	`DELETE /schedules/{id}`	Hard delete. The row, all fire history, and all pending wakeups are removed immediately. This operation is irreversible.

Typical pause/resume flow:

// Pause
await client.schedules.update(AGENT_ID, USER_ID, scheduleId, { enabled: false });

// Resume — next_fire_at is recomputed from now
await client.schedules.update(AGENT_ID, USER_ID, scheduleId, { enabled: true });

// Delete
await client.schedules.delete(AGENT_ID, USER_ID, scheduleId);

8. Preview upcoming fires

GET /api/v1/agents/{agentId}/users/{userId}/schedules/{id}/upcoming?limit=N returns the next N computed fire times as an array of UTC timestamps. The preview applies the active window, so what you see is exactly what will fire.

For example, a 4-hourly schedule (interval_hours: 4) with an 08:00–22:00 active window produces at most 4 fires per calendar day (08:00, 12:00, 16:00, 20:00 local) — not 6 (which would be the raw cadence count before filtering). The preview array reflects this.

[
  "2026-05-01T00:00:00Z",
  "2026-05-01T04:00:00Z",
  "2026-05-01T08:00:00Z",
  "2026-05-01T12:00:00Z"
]

const upcoming = await client.schedules.upcoming(AGENT_ID, USER_ID, scheduleId, {
limit: 10,
});

for (const fireAt of upcoming) {
console.log(fireAt); // UTC ISO-8601 string
}

9. What the agent receives

When a schedule fires, the platform constructs a structured intent block and delivers it to the agent as a proactive wakeup. The block looks like this:

[PROACTIVE WAKEUP — SCHEDULED REMINDER]

Why you're reaching out: check in on how the user is feeling

Scheduled fire time (user's local): 2026-05-02T09:00:00+08:00

Reference item (from inventory): Daily Vitamin D
  dosage: 1000 IU
  form: softgel
  timing_notes: take with food

Additional context:
  campaign: daily_checkin_v2

Key points:

[PROACTIVE WAKEUP — SCHEDULED REMINDER] — the stable header the agent detects to know it is initiating a conversation, not responding to one.
Why you're reaching out — verbatim content of the intent field you set on the schedule. Write this as a short natural-language instruction to the agent. The agent composes the actual opening message in its own voice — no prompt template is exposed; you control intent, not wording.
Scheduled fire time (user's local) — the next_fire_at_local value at fire time. Useful for agents that want to acknowledge the time explicitly ("Good morning" vs "Good afternoon").
Reference item (from inventory) — present only if inventory_item_id was set and the item still exists. The item's label and all of its properties are included. Item properties are fetched live at fire time.
Additional context — present only if metadata was set. All metadata key-value pairs are rendered here. Use this for campaign tracking, A/B variant labels, or any additional instruction to the agent that doesn't belong in the core intent.

There is no prompt template field. Clients control agent behavior through intent, inventory_item_id, and metadata. The agent is free to adapt its tone, greeting, and language based on the user's personality and the conversation history it already has.

Error codes

Code	Meaning
`CADENCE_AMBIGUOUS`	Both `simple` and `cron` were provided. Exactly one is required.
`CADENCE_MISSING`	Neither `simple` nor `cron` was provided.
`CADENCE_TOO_FREQUENT`	Cadence resolves to more than one fire per minute.
`CADENCE_TOO_DENSE`	Cadence produces more than 96 raw ticks per 24-hour window.
`INVALID_CRON`	The cron expression is not a valid 5-field cron.
`INVALID_TIMEZONE`	The `timezone` value is not a recognized IANA timezone name.
`INVALID_TIME`	A value in `times` is not in `HH:MM` 24-hour format, or the time does not exist (e.g. DST gap).
`INVALID_DAY_OF_WEEK`	A value in `days_of_week` is not one of the recognized three-letter abbreviations.
`INVALID_ACTIVE_WINDOW`	`active_window` is structurally invalid — most commonly `days_of_week: []` (explicit empty).
`INVALID_WINDOW`	`ends_at` is not after `starts_at`, or `ends_at` is in the past.
`NO_ALLOWED_FIRE`	The cadence + active_window combination produces no reachable fires in the next 90 days.
`INVENTORY_NOT_FOUND`	The `inventory_item_id` does not exist in the user's inventory.

Next steps

Medication Reminders — a full worked example using Scheduled Reminders to drive a medication adherence program, including inventory schema design for medication items and multi-dose daily schedules.
Resource Inventory + Knowledge Base — how to design inventory schemas and push live data, powering the inventory_item_id linkage described above.
Memory-Aware Chat — how the agent remembers user responses from previous proactive conversations and incorporates them into future interactions.

扩展性

状态、工具与用户知识

自定义状态

在聊天时注入 LLM 上下文的灵活键值存储。用它将环境状态、任务进度、资源或任何应用数据直接传入每次 AI 响应。

作用域模型

全局状态

每实例——在一个实例的所有用户间共享。适用于环境状态、配置、智能体状态、全局事件。

每用户状态

每实例 + 每用户——限定于实例中的某一用户。适用于已分配任务、工作流进度、用户偏好、活跃工具。

实例

所有状态都限定于某个 instanceId——你的智能体的一个部署上下文（例如工作区或环境）。省略 instanceId 则使用默认实例。参见实例了解如何管理多个上下文。

创建状态

import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: "sk-..." });

// 全局状态（所有用户共享）
await client.agents.customStates.create("agent-id", {
key:         "current_status",
value:       "Processing requests",
scope:       "global",
contentType: "text",
instanceId:  "workspace-1",    // 省略则使用默认实例
});

// 每用户状态
await client.agents.customStates.create("agent-id", {
key:         "assigned_tasks",
value:       { reports: 1, reviews: 3 },
scope:       "user",
contentType: "json",
userId:      "user-123",
instanceId:  "workspace-1",
});

插入或更新（Upsert）

upsert 在键不存在时创建状态，或在键已存在时更新值。幂等——可安全地在每次更新周期调用。

await client.agents.customStates.upsert("agent-id", {
key:   "workflow_stage",
value: "review_started",
scope: "user",
userId: "user-123",
});

按键获取

通过复合键检索特定状态。

const state = await client.agents.customStates.getByKey("agent-id", {
key:    "assigned_tasks",
scope:  "user",
userId: "user-123",
});

console.log(state.value);     // { reports: 1, reviews: 3 }
console.log(state.updatedAt); // ISO 时间戳

列出状态

// 某实例的所有全局状态
const globals = await client.agents.customStates.list("agent-id", {
scope:      "global",
instanceId: "workspace-1",
});

// 特定用户的所有每用户状态
const userStates = await client.agents.customStates.list("agent-id", {
scope:  "user",
userId: "user-123",
});

按键删除

await client.agents.customStates.deleteByKey("agent-id", {
key:    "assigned_tasks",
scope:  "user",
userId: "user-123",
});

工具

工具让 LLM 在推理过程中调用函数。Sonzai 处理带 sonzai_ 前缀的内置工具。自定义工具由你定义并由你的后端执行——Sonzai 将调用作为副作用暴露出来。

使用自己的 LLM？

如果你使用独立记忆模式（自带 LLM），Sonzai 暴露你可以接入自己的智能体框架（LangChain、Vercel AI SDK、Gemini 函数调用等）的工具模式。详见工具集成指南。

内置工具（能力）

按智能体切换平台管理的能力。这些在智能体创建时启用或通过能力 API 更新。

sonzai_memory_recall（始终开启）

在推理时搜索存储的记忆。自动注入上下文。

sonzai_remember_name（可切换）

为未来的对话持久化用户姓名。默认开启。

sonzai_web_search（可切换）

通过 Google 进行实时网络搜索。默认开启。

sonzai_inventory（可切换）

读取用户资源项目并与知识库数据关联。

// 在智能体创建时设置能力
const agent = await client.agents.create({
agentId: "your-stable-uuid",  // 推荐——使创建操作幂等
name: "Luna",
big5: { openness: 0.75, conscientiousness: 0.6, extraversion: 0.8,
        agreeableness: 0.7, neuroticism: 0.3 },
toolCapabilities: {
  webSearch:       true,
  rememberName:    true,
  imageGeneration: false,
  inventory:       true,
},
});

// 或更新现有智能体的能力
await client.agents.update("agent-id", {
toolCapabilities: {
  webSearch: false,
  inventory: true,
},
});

保留前缀

sonzai_ 前缀为系统保留。你的自定义工具不得使用——API 会拒绝它们。

自定义工具（智能体级别）

与智能体一起存储的持久工具，在每次聊天中可用，与会话或实例无关。

// 创建自定义工具
await client.agents.createCustomTool("agent-id", {
name: "check_inventory",
description: "Check the user's current tasks and their statuses",
parameters: {
  type: "object",
  properties: {
    item_type: {
      type: "string",
      description: "Filter by category: active, pending, completed",
    },
  },
},
});

// 列出所有自定义工具
const tools = await client.agents.listCustomTools("agent-id");

// 更新工具的描述或参数
await client.agents.updateCustomTool("agent-id", "check_inventory", {
description: "Check and summarize the user's tasks by category",
});

// 删除工具
await client.agents.deleteCustomTool("agent-id", "check_inventory");

会话级工具（临时）

为特定会话动态注入工具。会话工具与智能体级工具合并——同名的会话工具优先。会话结束时丢弃。

选项 1——为现有会话设置

await client.agents.sessions.setTools("agent-id", "session-id", [
{
  name: "execute_action",
  description: "Execute an action from the agent's capabilities",
  parameters: {
    type: "object",
    properties: {
      action_name: { type: "string" },
      target:      { type: "string" },
    },
    required: ["action_name"],
  },
},
]);

选项 2——随聊天调用内联传入

for await (const event of client.agents.chatStream({
agent:    "agent-id",
messages: [{ role: "user", content: "Check my tools" }],
userId:   "user-123",
toolDefinitions: [
  {
    name:        "check_inventory",
    description: "List the agent's active tools",
    parameters:  { type: "object", properties: {} },
  },
],
})) {
// 处理事件...
}

处理工具调用

当 LLM 决定调用自定义工具时，它会作为 SSE 流中的副作用出现。你的后端执行工具并在下一条消息中返回结果。

1. 接收工具调用

const toolCalls: { name: string; arguments: Record<string, unknown> }[] = [];

for await (const event of client.agents.chatStream({
agent:    "agent-id",
messages: [{ role: "user", content: "What tasks do I have?" }],
userId:   "user-123",
})) {
// 将内容流式传输给用户
const content = event.choices?.[0]?.delta?.content;
if (content) process.stdout.write(content);

// 从副作用中收集工具调用
const calls = event.sideEffects?.externalToolCalls ?? [];
toolCalls.push(...calls);
}

2. 执行并返回结果

// 在你的后端执行工具调用
const toolResults: string[] = [];
for (const call of toolCalls) {
const result = await myBackend.executeTool(call.name, call.arguments);
toolResults.push(result);
}

// 在下一条聊天消息中返回结果
for await (const event of client.agents.chatStream({
agent:    "agent-id",
userId:   "user-123",
messages: [
  { role: "user",  content: "What tasks do I have?" },
  { role: "tool",  content: toolResults.join("\n") },
],
})) {
process.stdout.write(event.choices?.[0]?.delta?.content ?? "");
}

工具作用域总结

类型	作用域	持久性	管理方式
内置（`sonzai_`）	所有实例	平台管理	SDK 能力、仪表盘
智能体级自定义	所有实例	持久	SDK、仪表盘
会话级	每会话	临时	SDK（内联或 setTools）

用户范围知识（预填）

在用户第一次对话之前预填智能体对用户的了解。适用于 CRM 数据、用户档案、历史购买记录，或任何从第一天起就应该可用的用户特定知识。

预填用户

primeUser 异步地为用户预填事实。返回一个任务 ID，你可以轮询它来检查提取何时完成。

const job = await client.agents.priming.primeUser(
"agent-id",
"user-123",
{
  display_name: "Jane Smith",
  metadata: {
    company: "Acme Corp",
    title:   "Product Manager",
    email:   "[email protected]",
    custom:  { tier: "premium", region: "us-west" },
  },
  content: [
    {
      type:    "text",
      content: "Jane joined in 2024 and prefers concise answers. She focuses on mobile growth.",
    },
  ],
  source: "crm_import",
},
);

// 检查预填何时完成
const status = await client.agents.priming.getPrimeStatus(
"agent-id", "user-123", job.jobId
);
console.log(status.status); // "pending" | "processing" | "completed" | "failed"

获取和更新用户元数据

// 获取用户的存储元数据
const meta = await client.agents.priming.getMetadata("agent-id", "user-123");
console.log(meta.displayName, meta.company, meta.customFields);

// 更新元数据（局部更新——未指定字段保留原值）
await client.agents.priming.updateMetadata("agent-id", "user-123", {
custom: { tier: "enterprise", lastContact: "2026-03-28" },
});

批量导入（多个用户）

一次导入多个用户。返回任务 ID 以追踪进度。

const job = await client.agents.priming.batchImport("agent-id", {
users: [
  {
    userId:      "user-1",
    displayName: "Alice",
    metadata:    { company: "Acme", custom: { tier: "pro" } },
    content:     [{ type: "text", content: "Alice is a power user." }],
  },
  {
    userId:      "user-2",
    displayName: "Bob",
    metadata:    { company: "Globex" },
  },
],
source: "bulk_crm_sync",
});

// 轮询导入状态
const status = await client.agents.priming.getImportStatus(
"agent-id", job.jobId
);
console.log(status.processed, status.total, status.failed);

// 列出最近的导入任务
const jobs = await client.agents.priming.listImportJobs("agent-id", 10);

实际应用

自定义状态和工具是每种使用场景的两大扩展点——但你存储什么以及暴露哪些工具差异显著。

状态是场景和世界层数据。 存储在聊天之间持久化并驱动世界的事物：任务进度、可解锁内容标志、场景状态、角色携带的世界内物品库存。

工具是表达性行为。 角色可以在你的应用中做的事—— 表达情绪、更换服装、移动到不同场景、赠送礼物。描述要生动，这样 LLM 才能自然地调用它们。

await client.agents.sessions.setTools("agent-id", {
  userId: "user-123",
  tools: [
    {
      name: "change_scene",
      description: "Move to a new location in the story. Use when the scene has run its course or a new chapter begins.",
      parameters: { type: "object", properties: { location: { type: "string" } }, required: ["location"] },
    },
  ],
});

不要包含移交工具。 伴侣永远不应该推给人工—— 关系本身就是产品。