Post-processing model map
Configure the smaller models that run behind chat completions to extract memory, drift personality, and update mood.
Behind every chat turn, Sonzai runs a fleet of smaller models that:
- Extract facts from the user message and the agent reply
- Drift personality scores in response to interactions
- Update mood dimensions (happiness, energy, calmness, affection)
- Summarise sessions and compact older memory
These run after the user-facing reply has streamed. Which model they use is determined by the post-processing model map — a per-project config that maps each chat-completion model to the smaller model the extractor should use.
The map
The map is stored under the post_processing_model_map project-config key. Each entry is a PostProcessingModelEntry with two fields:

```ts
type PostProcessingModelEntry = {
  provider: string; // e.g. "gemini"
  model: string;    // e.g. "gemini-3.1-flash-lite-preview"
};
```

The map keys are chat-completion model IDs, plus a special * wildcard that catches any chat model not explicitly listed:
```json
{
  "claude-3-5-sonnet": { "provider": "gemini", "model": "gemini-3.1-flash-lite-preview" },
  "gpt-4-turbo": { "provider": "openrouter", "model": "anthropic/claude-3-haiku" },
  "*": { "provider": "gemini", "model": "gemini-3.1-flash-lite-preview" }
}
```

When extraction needs to run for a chat that used claude-3-5-sonnet, the extractor uses Gemini Flash Lite. When it sees a chat model not in the map, the * wildcard kicks in.
The wildcard key is exported as sonzai.PostProcessingWildcardKey (Go) and as an equivalent constant in the other SDKs, so you don't have to hard-code "*" in your provisioning scripts.
Reading the current map
```ts
const map = await client.projectConfig.getPostProcessingModelMap("project_xyz");
for (const [chatModel, entry] of Object.entries(map ?? {})) {
  console.log(chatModel, "→", entry.provider, entry.model);
}
```

Setting a map (or a single-key default)
Pass a full map; the call is a write-through replacement, not a merge. Most projects only need a wildcard entry pointing at a cheap model:
```ts
await client.projectConfig.setPostProcessingModelMap("project_xyz", {
  "*": { provider: "gemini", model: "gemini-3.1-flash-lite-preview" },
});
```

When to override per chat model
The wildcard is enough for most projects. Reach for an explicit entry when:
- A particular chat model produces output the default extractor mishandles (e.g. tool-call traces from a verbose model that need a stronger extractor to keep facts atomic).
- You're A/B testing two extractors and want one chat model to route through each for comparison (see the sketch after this list).
- Cost: cheaper chat models can run a cheaper extractor; flagship chat models may warrant a stronger extractor on the same trace.
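As a sketch of the A/B case, the map below routes two chat models through different extractors while everything else falls back to the wildcard; the chat-model IDs and extractor choices are illustrative, not recommendations. Remember the call replaces the whole map, so include the wildcard entry too.

```ts
await client.projectConfig.setPostProcessingModelMap("project_xyz", {
  // Arm A: claude-3-5-sonnet chats extract with Gemini Flash Lite
  "claude-3-5-sonnet": { provider: "gemini", model: "gemini-3.1-flash-lite-preview" },
  // Arm B: gpt-4-turbo chats extract with Claude 3 Haiku via OpenRouter
  "gpt-4-turbo": { provider: "openrouter", model: "anthropic/claude-3-haiku" },
  // Everything else keeps the cheap default
  "*": { provider: "gemini", model: "gemini-3.1-flash-lite-preview" },
});
```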
Provider availability
An entry's provider/model must match a real provider Sonzai has configured for your project — see Providers.

Setting a non-existent provider here makes extraction fail asynchronously, after the user-facing reply has already streamed; you'll see it in the agent's extraction_status on the next turn.
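If you want to catch that in code, a rough sketch follows; the client.agents.get call and the "failed" value of extraction_status are assumptions for illustration, not confirmed SDK surface.

```ts
// Hypothetical check: the method name and extraction_status shape are assumptions.
const agent = await client.agents.get("agent_abc");
if (agent.extraction_status === "failed") {
  console.warn("Post-processing failed; check the post_processing_model_map provider/model.");
}
```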
Reference
- Providers — the chat-completion provider list (independent of post-processing).
- Self-improvement — the full picture of what the extractor does on each turn.
- Reference → API — REST endpoint shapes for the project-config get/set/delete calls.
Model scope
Where the chat model and the post-processing model are configured — tenant, project, agent, session, and per-call — and which layer wins.
Custom LLM
Bring your own model while keeping the full managed experience — built-in tools, streaming, per-message extraction, personality evolution, and all behavioral systems. Sonzai calls your OpenAI-compatible endpoint instead of the default provider.