Knowledge Base & Limitations
How the knowledge base behaves in standalone mode, and what isn't supported compared with managed mode.
Knowledge base in standalone mode
Automatic — KB results in /context
When you call session.context({ query }) (or GET /context), the endpoint searches the agent's knowledge base and includes matching results in a knowledge field automatically.
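A minimal sketch of making that call and folding any KB hits into your system prompt. The `session` object is assumed from the Sonzai SDK; the `knowledgeToPrompt` helper is hypothetical glue, not an SDK export:

```typescript
type KbResult = { label: string; content: string };

// Hypothetical helper: render KB hits as a prompt section.
function knowledgeToPrompt(results: KbResult[]): string {
  if (results.length === 0) return "";
  return (
    "Relevant knowledge:\n" +
    results.map((r) => `- ${r.label}: ${r.content}`).join("\n")
  );
}

// Assumed usage with a Sonzai session (shapes follow the response below):
// const ctx = await session.context({ query: userMessage });
// const systemPrompt =
//   ctx.personality_prompt + "\n" + knowledgeToPrompt(ctx.knowledge?.results ?? []);
```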
{
  "personality_prompt": "You are a helpful AI companion...",
  "big5": { "openness": 0.7, "conscientiousness": 0.6, "extraversion": 0.5, "agreeableness": 0.8, "neuroticism": 0.3 },
  "current_mood": { "valence": 0.4, "arousal": 0.2, "tension": -0.1, "affiliation": 0.3 },
  "loaded_facts": [{ "atomic_text": "User prefers morning workouts", "fact_type": "behavioral", "importance": 0.8 }],
  "active_goals": [{ "description": "Run a 5K by June" }],
  "habits": [{ "label": "Daily exercise" }],
  "knowledge": {
    "results": [
      {
        "content": "Refund policy: customers can request a full refund within 30 days...",
        "label": "Refund Policy",
        "type": "policy",
        "source": "policies.pdf",
        "score": 0.92
      }
    ]
  }
}

Learning loop — extraction detects knowledge gaps
After /turn or /process extracts side effects, it also searches the KB using topics found in the conversation. If relevant KB content exists that the agent missed, those matches are stored as proactive signals — the next session.context() call includes them automatically.
Turn 1: session.context() → (no KB results yet)
↓
chat with your LLM
↓
session.turn() → extracts "hiking gear" as topic
→ searches KB, finds "Hiking Equipment Guide"
→ stores as proactive signal
Turn 2: session.context() → includes "Hiking Equipment Guide" from KB
+ any direct search results for the new query
↓
chat with your LLM (now knows about hiking gear!)
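The flow above can be sketched with a stubbed session that mimics the one-turn delay of proactive signals. The real `session` comes from the Sonzai SDK; this stub only mirrors the shapes used in this example:

```typescript
type Ctx = { knowledge: { results: { label: string }[] } };

// Stub standing in for a Sonzai session, to illustrate timing only.
function makeStubSession() {
  let proactive: { label: string }[] = [];
  return {
    async context(_q: { query: string }): Promise<Ctx> {
      return { knowledge: { results: proactive } };
    },
    async turn(_t: { messages: unknown[] }) {
      // Extraction detects the "hiking gear" topic and stores a proactive signal.
      proactive = [{ label: "Hiking Equipment Guide" }];
    },
  };
}

async function demo(): Promise<[number, number]> {
  const session = makeStubSession();
  const ctx1 = await session.context({ query: "what gear do I need?" }); // turn 1: empty
  await session.turn({ messages: [] });
  const ctx2 = await session.context({ query: "anything else?" }); // turn 2: signal surfaces
  return [ctx1.knowledge.results.length, ctx2.knowledge.results.length];
}
```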
Explicit — tool endpoint for agent frameworks
const results = await client.agents.knowledgeSearch("agent-id", {
query: "refund policy",
limit: 5,
});
for (const result of results.results) {
console.log(result.label, result.content);
}

You can also expose this as a function tool to your LLM — see Tool Calling in Pattern 1.
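One way to do that is an OpenAI-style function-tool definition wrapping knowledgeSearch. The schema below is a hypothetical sketch, not an SDK export; the parameter names mirror the call above:

```typescript
// Hypothetical tool schema exposing knowledgeSearch to an LLM.
const knowledgeSearchTool = {
  type: "function" as const,
  function: {
    name: "knowledge_search",
    description: "Search the agent's knowledge base for relevant documents.",
    parameters: {
      type: "object",
      properties: {
        query: { type: "string", description: "Search terms" },
        limit: { type: "number", description: "Max results", default: 5 },
      },
      required: ["query"],
    },
  },
};
```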
Limitations vs. managed mode
Want to use your own model without managing the chat loop? Consider Custom LLM instead. It lets you point Sonzai at any OpenAI-compatible endpoint while keeping streaming, built-in tools, and per-message extraction fully automatic.
No built-in tool execution
Managed mode calls built-in tools (web search, memory recall, image generation) automatically. In standalone mode you must implement tool calling yourself — the tool-calling loop is yours, but the resulting tool messages flow into /turn or /process for extraction. See the Tool Integration guide.
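A sketch of what that self-managed loop can look like: execute the tool your LLM requested, append the tool message, and submit the transcript to session.turn() afterwards. The `executeTool` helper and message shapes are assumptions for illustration:

```typescript
type ToolCall = { name: string; arguments: string };
type Message = { role: "assistant" | "tool"; content: string; name?: string };

// Route a tool call from your LLM to a handler and produce a tool message.
async function executeTool(
  call: ToolCall,
  handlers: Record<string, (args: any) => Promise<string>>
): Promise<Message> {
  const handler = handlers[call.name];
  if (!handler) return { role: "tool", name: call.name, content: "unknown tool" };
  const content = await handler(JSON.parse(call.arguments));
  return { role: "tool", name: call.name, content };
}

// Assumed usage — the tool message stays in your chat loop, then the
// full transcript flows into extraction:
// const toolMsg = await executeTool(call, {
//   knowledge_search: async (a) =>
//     JSON.stringify(await client.agents.knowledgeSearch("agent-id", a)),
// });
// messages.push(toolMsg);
// await session.turn({ messages });
```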
No streaming on extraction
session.context(), /turn, and /process are synchronous request-response calls. Streaming is handled by your own LLM. Background extraction is asynchronous but you poll for state, not stream.
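Polling for background extraction can be as simple as a retry loop. This generic helper is a sketch; the actual state endpoint you poll is an assumption and depends on your setup:

```typescript
// Poll a state-fetching function until a done-condition holds or we give up.
async function pollUntil<T>(
  fetchState: () => Promise<T>,
  done: (state: T) => boolean,
  { intervalMs = 500, maxTries = 20 } = {}
): Promise<T> {
  for (let i = 0; i < maxTries; i++) {
    const state = await fetchState();
    if (done(state)) return state;
    await new Promise((r) => setTimeout(r, intervalMs));
  }
  throw new Error("timed out waiting for extraction");
}

// Assumed usage with some read endpoint:
// await pollUntil(() => client.agents.facts("agent-id"), (f) => f.length > 0);
```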
Deferred knowledge base enrichment
KB enrichment is deferred — extraction detects knowledge gaps but the next session.context() call surfaces them, not the current turn.
Manual extraction trigger
You must pick one of the three integration shapes per conversation: /process (one-shot batch), sessions.start → sessions.end({ messages }) (lifecycle batch), or sessions.start → session.turn() per turn → session.end() (real-time). Picking none means the transcript is never seen by the Context Engine and no behavioral data is captured. Picking two — for example calling .turn() per turn and passing messages on .end() — runs extraction twice on the same content. (Heavy consolidation runs on its own schedule and doesn't need to be triggered manually.)
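A small guard against the double-extraction mistake described above: if you called .turn() per turn, the .end() payload should not also carry the transcript. This helper is hypothetical glue, not part of the SDK:

```typescript
// Build the payload for session.end(): omit messages when per-turn
// extraction already processed the transcript.
function endPayload(
  usedPerTurn: boolean,
  transcript: unknown[]
): { messages?: unknown[] } {
  return usedPerTurn ? {} : { messages: transcript };
}

// Assumed usage:
// await session.end(endPayload(usedPerTurn, messages));
```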
Text-only memory pipeline
Sonzai's extraction reads messages as text. Multimodal content (images, audio) must be bridged to text before submission — see Working with Images & Multimodal Input in Pattern 1.
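One way to do that bridging is to replace non-text parts with captions before submission. The part shapes below are assumptions; the captions themselves would come from your own vision model:

```typescript
type Part =
  | { type: "text"; text: string }
  | { type: "image"; caption?: string };

// Flatten a multimodal message into text the extraction pipeline can read.
function toTextOnly(parts: Part[]): string {
  return parts
    .map((p) =>
      p.type === "text" ? p.text : `[image: ${p.caption ?? "no caption"}]`
    )
    .join("\n");
}
```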
What's the same in both modes
Extraction quality is identical — both modes use the same LLM pipeline for fact extraction, personality shifts, mood, habits, and consolidation. The 7-layer enriched context from session.context() is the same data the managed chat builds internally.
Endpoint Walkthrough
Reference for sessions.start, session.context, session.turn, /process, sessions.end — plus the read endpoints that surface the extracted behavioral data.
Tool Integration for BYO-LLM
When using standalone memory mode, your LLM handles chat generation but may need to search knowledge and memory on demand. Sonzai exposes tool schemas compatible with OpenAI function calling, so you can wire them into any agent framework.
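A sketch of the wiring on the framework side: keep a dispatch table from tool name to Sonzai call, and resolve whatever tool calls your LLM returns. The table contents and call shapes are assumptions:

```typescript
// Dispatch table mapping tool names to handlers; in practice the entries
// would call Sonzai endpoints, e.g.
//   knowledge_search: (a) => client.agents.knowledgeSearch("agent-id", a)
const dispatch: Record<string, (args: any) => Promise<unknown>> = {};

// Resolve OpenAI-style tool calls against the dispatch table.
async function handleToolCalls(
  calls: { function: { name: string; arguments: string } }[]
): Promise<{ name: string; result: unknown }[]> {
  const out: { name: string; result: unknown }[] = [];
  for (const c of calls) {
    const fn = dispatch[c.function.name];
    out.push({
      name: c.function.name,
      result: fn ? await fn(JSON.parse(c.function.arguments)) : null,
    });
  }
  return out;
}
```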