Post-processing model map

Configure the smaller models that run behind chat completions to extract memory, drift personality, and update mood.

Behind every chat turn, Sonzai runs a fleet of smaller models that:

  • Extract facts from the user message and the agent reply
  • Drift personality scores in response to interactions
  • Update mood dimensions (happiness, energy, calmness, affection)
  • Summarise sessions and compact older memory

These run after the user-facing reply has streamed, and are configured by the post-processing model map — a per-project config that maps each chat-completion model to the smaller model the extractors should use.

The map

Stored under the post_processing_model_map project-config key. Each entry is a PostProcessingModelEntry with two fields:

type PostProcessingModelEntry = {
  provider: string; // e.g. "gemini"
  model:    string; // e.g. "gemini-3.1-flash-lite-preview"
};

The map keys are chat-completion model IDs plus a special * wildcard that catches any chat model not explicitly listed:

{
  "claude-3-5-sonnet": { "provider": "gemini",  "model": "gemini-3.1-flash-lite-preview" },
  "gpt-4-turbo":       { "provider": "openrouter", "model": "anthropic/claude-3-haiku" },
  "*":                 { "provider": "gemini",  "model": "gemini-3.1-flash-lite-preview" }
}

When extraction needs to run for a chat that used claude-3-5-sonnet, the extractor uses Gemini Flash Lite. When it sees a chat model not in the map, the * wildcard kicks in.
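That resolution order can be sketched as a small lookup — the exact chat-model key wins, otherwise the wildcard entry applies. The `resolveExtractorModel` name is illustrative, not part of the SDK:

```typescript
type PostProcessingModelEntry = { provider: string; model: string };
type PostProcessingModelMap = Record<string, PostProcessingModelEntry>;

// Exact chat-model key first; fall back to the "*" wildcard entry.
// Returns undefined if the map has neither.
function resolveExtractorModel(
  map: PostProcessingModelMap,
  chatModel: string
): PostProcessingModelEntry | undefined {
  return map[chatModel] ?? map["*"];
}
```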

The wildcard key is exported as sonzai.PostProcessingWildcardKey (Go) and the equivalent constant in the other SDKs so you don't have to hard-code "*" in your provisioning scripts.

Reading the current map

const map = await client.projectConfig.getPostProcessingModelMap("project_xyz");
for (const [chatModel, entry] of Object.entries(map ?? {})) {
  console.log(chatModel, "→", entry.provider, entry.model);
}

Setting a map (or a single-key default)

Pass a full map; the call replaces the stored map outright rather than merging into it. Most projects only need a wildcard entry pointing at a cheap model:

await client.projectConfig.setPostProcessingModelMap("project_xyz", {
  "*": { provider: "gemini", model: "gemini-3.1-flash-lite-preview" },
});
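Because the write replaces the whole map, adding or changing a single entry means read, merge, and write back. A minimal sketch — the `withEntry` helper is illustrative, not an SDK method:

```typescript
type PostProcessingModelEntry = { provider: string; model: string };
type PostProcessingModelMap = Record<string, PostProcessingModelEntry>;

// Merge one entry into a copy of the current map, then write the result back:
//   const current = await client.projectConfig.getPostProcessingModelMap(projectId);
//   await client.projectConfig.setPostProcessingModelMap(
//     projectId,
//     withEntry(current, "gpt-4-turbo", { provider: "openrouter", model: "anthropic/claude-3-haiku" })
//   );
function withEntry(
  current: PostProcessingModelMap | null | undefined,
  chatModel: string,
  entry: PostProcessingModelEntry
): PostProcessingModelMap {
  return { ...(current ?? {}), [chatModel]: entry };
}
```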

When to override per chat model

The wildcard is enough for most projects. Reach for an explicit entry when:

  • A particular chat model produces output the default extractor mishandles (e.g. tool-call traces from a verbose model that need a stronger extractor to keep facts atomic).
  • You're A/B-ing two extractors and want one chat model to route through each for comparison.
  • Cost: cheaper chat models can run a cheaper extractor; flagship chat models may warrant a stronger extractor on the same trace.
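As a sketch, a cost-tiered map might route the flagship chat model through a stronger extractor while everything else stays on the cheap default (the routing choice here is illustrative):

```json
{
  "gpt-4-turbo": { "provider": "openrouter", "model": "anthropic/claude-3-haiku" },
  "*":           { "provider": "gemini",     "model": "gemini-3.1-flash-lite-preview" }
}
```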

Provider availability

An entry's provider/model must match a real provider Sonzai has configured for your project — see Providers. Setting a non-existent provider here makes extraction fail asynchronously after the user-facing reply has already streamed; you'll see it in the agent's extraction_status on the next turn.

Reference

  • Providers — the chat-completion provider list (independent of post-processing).
  • Self-improvement — the full picture of what the extractor does on each turn.
  • Reference → API — REST endpoint shapes for the project-config get/set/delete calls.
