Knowledge Base & Limitations

How the knowledge base behaves in standalone mode, and which features are unsupported compared with managed mode.

Knowledge base in standalone mode

Automatic: KB results in /context

When you call session.context({ query }) (or GET /context), the endpoint searches the agent's knowledge base and includes matching results in a knowledge field automatically.

{
  "personality_prompt": "You are a helpful AI companion...",
  "big5": { "openness": 0.7, "conscientiousness": 0.6, "extraversion": 0.5, "agreeableness": 0.8, "neuroticism": 0.3 },
  "current_mood": { "valence": 0.4, "arousal": 0.2, "tension": -0.1, "affiliation": 0.3 },
  "loaded_facts": [{ "atomic_text": "User prefers morning workouts", "fact_type": "behavioral", "importance": 0.8 }],
  "active_goals": [{ "description": "Run a 5K by June" }],
  "habits": [{ "label": "Daily exercise" }],
  "knowledge": {
    "results": [
      {
        "content": "Refund policy: customers can request a full refund within 30 days...",
        "label": "Refund Policy",
        "type": "policy",
        "source": "policies.pdf",
        "score": 0.92
      }
    ]
  }
}
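
One way to consume this payload is to fold the `knowledge.results` entries into a system-prompt fragment before calling your LLM. The sketch below assumes the payload shape shown above; the score cutoff and formatting are illustrative choices, not part of the SDK:

```typescript
// Shape of one entry in knowledge.results, matching the payload above.
interface KBResult {
  content: string;
  label: string;
  type: string;
  source: string;
  score: number;
}

// Fold KB hits into a system-prompt fragment; the 0.5 cutoff is arbitrary.
function kbToSystemPrompt(results: KBResult[], minScore = 0.5): string {
  const relevant = results.filter((r) => r.score >= minScore);
  if (relevant.length === 0) return "";
  const lines = relevant.map((r) => `[${r.label}] (${r.source}): ${r.content}`);
  return `Relevant knowledge:\n${lines.join("\n")}`;
}
```

Appending this fragment to the `personality_prompt` keeps the KB grounding and the persona in a single system message.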

Learning loop: extraction discovers knowledge gaps

After /turn or /process extracts side effects, it also searches the KB with topics found in the conversation. If relevant KB content exists that the agent missed, it stores these as proactive signals — the next session.context() call includes them automatically.

Turn 1: session.context() → (no KB results yet)
       ↓
      chat with your LLM
       ↓
      session.turn() → extracts "hiking gear" as topic
                     → searches KB, finds "Hiking Equipment Guide"
                     → stores as proactive signal

Turn 2: session.context() → includes "Hiking Equipment Guide" from KB
                        + any direct search results for the new query
       ↓
      chat with your LLM (now knows about hiking gear!)
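
The two-turn flow above can be made concrete with an in-memory stand-in for the session. This is a mock, not the SDK — only the `context`/`turn` method names mirror it, and the topic matching is deliberately naive:

```typescript
type KBEntry = { label: string; content: string };
type Msg = { role: string; content: string };

// In-memory mock of the learning loop: NOT the SDK, just its shape.
class MockSession {
  private kb: KBEntry[] = [
    { label: "Hiking Equipment Guide", content: "Choose boots by terrain..." },
  ];
  private proactive: KBEntry[] = [];

  // Turn N+1: proactive signals stored by earlier turns surface here.
  context(): { knowledge: { results: KBEntry[] } } {
    const results = this.proactive;
    this.proactive = [];
    return { knowledge: { results } };
  }

  // Turn N: "extraction" spots a topic, searches the KB, stores matches.
  turn(messages: Msg[]): void {
    if (messages.some((m) => m.content.toLowerCase().includes("hiking"))) {
      this.proactive.push(...this.kb);
    }
  }
}
```

Turn 1's `context()` comes back empty, `turn()` spots the hiking topic, and turn 2's `context()` carries the guide — the same deferred-by-one-turn behavior the real pipeline has.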

Explicit: a tool endpoint for agent frameworks

// Search the agent's knowledge base directly.
const results = await client.agents.knowledgeSearch("agent-id", {
  query: "refund policy",
  limit: 5,
});

for (const result of results.results) {
  console.log(result.label, result.content);
}

You can also expose this as a function tool to your LLM — see Tool Calling in Pattern 1.
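
A tool definition for that might look like the following. The OpenAI-style schema format is standard; the tool name and descriptions here are illustrative choices:

```typescript
// OpenAI-style function-tool schema wrapping the KB search endpoint.
// Name and descriptions are illustrative, not prescribed by the SDK.
const knowledgeSearchTool = {
  type: "function" as const,
  function: {
    name: "knowledge_search",
    description: "Search the agent's knowledge base for relevant documents.",
    parameters: {
      type: "object",
      properties: {
        query: { type: "string", description: "What to look up" },
        limit: { type: "number", description: "Max results to return" },
      },
      required: ["query"],
    },
  },
};
```

When the model emits a `knowledge_search` call, parse its arguments, forward them to `client.agents.knowledgeSearch`, and return the results as a tool message.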

Differences from managed mode

Want to use your own model without managing the chat loop? Consider Custom LLM instead. It lets you point Sonzai at any OpenAI-compatible endpoint while keeping streaming, built-in tools, and per-message extraction fully automatic.

No built-in tool execution

Managed mode calls built-in tools (web search, memory recall, image generation) automatically. In standalone mode you must implement tool calling yourself — the tool-calling loop is yours, but the resulting tool messages flow into /turn or /process for extraction. See the Tool Integration guide.
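
A minimal dispatcher for that loop might look like this. The message shapes and handler map are assumptions for the sketch; what matters is that the tool-role messages it produces go into the transcript you later pass to /turn or /process:

```typescript
type ToolCall = { id: string; name: string; arguments: string };
type ToolMessage = { role: "tool"; tool_call_id: string; content: string };

// Run the model's tool calls through a handler map, collecting the
// tool-role messages to append to the transcript (and later extract from).
async function runToolCalls(
  calls: ToolCall[],
  handlers: Record<string, (args: unknown) => Promise<unknown>>,
): Promise<ToolMessage[]> {
  const out: ToolMessage[] = [];
  for (const call of calls) {
    const handler = handlers[call.name];
    const result = handler
      ? await handler(JSON.parse(call.arguments))
      : { error: `unknown tool: ${call.name}` };
    out.push({
      role: "tool",
      tool_call_id: call.id,
      content: JSON.stringify(result),
    });
  }
  return out;
}
```

Wiring `knowledge_search` into the handler map is one line: a closure over `client.agents.knowledgeSearch`.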

No streaming for extraction

session.context(), /turn, and /process are synchronous request-response calls. Streaming is handled by your own LLM. Background extraction is asynchronous but you poll for state, not stream.

Deferred KB enrichment

KB enrichment is deferred — extraction detects knowledge gaps but the next session.context() call surfaces them, not the current turn.

Manual extraction triggering

You must pick exactly one of the three integration shapes per conversation: /process (one-shot batch), sessions.start → session.end({ messages }) (lifecycle batch), or sessions.start → session.turn() per turn → session.end() (real-time). Picking none means the Context Engine never sees the transcript and no behavioral data is captured. Picking two — for example calling .turn() per turn and also passing messages on .end() — runs extraction twice on the same content. (Heavy consolidation runs on its own schedule and does not need to be triggered manually.)
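
One way to keep this rule out of ad-hoc call sites is to encode the choice once. The helper below is illustrative, not part of the SDK, and its decision inputs are assumptions:

```typescript
// Illustrative only: encode the one-shape-per-conversation rule so a
// code path cannot accidentally pick none or two integration shapes.
type Shape = "process" | "lifecycle-batch" | "real-time";

function pickShape(opts: {
  needPerTurnContext: boolean; // enriched context needed during the chat?
  importingTranscript: boolean; // historical transcript, no live session?
}): Shape {
  if (opts.importingTranscript) return "process"; // one-shot batch
  if (opts.needPerTurnContext) return "real-time"; // session.turn() each turn
  return "lifecycle-batch"; // session.end({ messages }) once
}
```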

Text-only memory pipeline

Sonzai's extraction reads messages as text. Multimodal content (images, audio) must be bridged to text before submission — see Working with Images & Multimodal Input in Pattern 1.
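
A bridging step can normalize multimodal messages before submission. The message-part shapes below are assumptions, and `captionImage` is a placeholder for whatever captioning step you use:

```typescript
type Part = { type: "text"; text: string } | { type: "image"; url: string };
type Message = { role: string; content: string | Part[] };

// Replace image parts with text captions so the message is safe to send
// to /turn or /process. captionImage is your own captioning step.
async function toTextMessage(
  msg: Message,
  captionImage: (url: string) => Promise<string>,
): Promise<{ role: string; content: string }> {
  if (typeof msg.content === "string") {
    return { role: msg.role, content: msg.content };
  }
  const parts = await Promise.all(
    msg.content.map(async (p) =>
      p.type === "text" ? p.text : `[image: ${await captionImage(p.url)}]`,
    ),
  );
  return { role: msg.role, content: parts.join("\n") };
}
```

Plain string messages pass through untouched, so the bridge can sit in front of every submission unconditionally.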

What's the same in both modes

Extraction quality is identical — both modes use the same LLM pipeline for fact extraction, personality shifts, mood, habits, and consolidation. The 7-layer enriched context from session.context() is the same data the managed chat builds internally.
