Providers
The four supported chat-completion providers — gemini, openai, xai, custom — with model IDs, context windows, and how to pick one at runtime.
Sonzai routes chat completions through one of four providers. The IDs
are exported as constants from the sonzai.providers module in the
SDKs — import those rather than hand-typing strings, so they stay in
sync as the catalog evolves. Use client.list_models() for the live
set enabled on your tenant at runtime.
gemini — Google Gemini (default)
The platform default. gemini-3.1-flash-lite-preview is providers.DEFAULT_MODEL,
and is also the wildcard fallback for the post-processing
cascade.
| Model | Context window | Notes |
|---|---|---|
| gemini-3.1-flash-lite-preview | 1M | Default. Vision + tools + JSON mode + streaming. Compaction at 450k / 500k. |
| gemini-3-flash-preview | 2M | Fallback on 429. Same feature set. |
| gemini-3.1-pro-preview | 2M | Fallback on 429. Strongest Gemini model — pair with a cheaper post-processing entry. |
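The compaction figures in the default row are easiest to read as thresholds. A minimal sketch of one plausible reading (450k triggers compaction, 500k is the budget it works back under; the constants mirror the table, but the function itself is illustrative, not SDK API):

```typescript
// Illustrative only: Sonzai performs compaction server-side. These
// thresholds mirror the gemini-3.1-flash-lite-preview row above.
const COMPACTION_TRIGGER = 450_000; // start compacting past this point
const COMPACTION_BUDGET = 500_000;  // budget compaction aims to stay under

// True once the conversation should be compacted.
function shouldCompact(promptTokens: number): boolean {
  return promptTokens >= COMPACTION_TRIGGER;
}
```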
openai — OpenAI
The default is gpt-5.5; the 5.4 family is the cheaper workhorse, and 5 / 5-mini /
5-nano cover even cheaper or smaller-context tiers. The fallback chain on
quota exhaustion is gpt-5.5 → gpt-5.4 → gpt-5.4-mini → gpt-5.
| Model | Context window | Use it when |
|---|---|---|
| gpt-5.5 | 1.05M | Default. The current OpenAI frontier — vision + tools + streaming + JSON mode. |
| gpt-5.4 | 1.05M | Cheaper than 5.5, same context window. |
| gpt-5.4-mini | 1.05M | The cheap workhorse. Recommended for high-throughput tenants. |
| gpt-5 | 400k | Frozen Aug-2025 snapshot. Kept for tenants pinned to it; new agents should default to 5.5. |
| gpt-5-mini / gpt-5-nano | 400k | Smaller-context tiers; same generation as gpt-5. |
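The quota-exhaustion chain above can be sketched as a simple lookup. Sonzai applies this fallback server-side on 429s; the helper below is only an illustration of the order, not SDK API:

```typescript
// Fallback order on quota exhaustion (429), per the chain above.
const OPENAI_FALLBACKS: Record<string, string | undefined> = {
  "gpt-5.5": "gpt-5.4",
  "gpt-5.4": "gpt-5.4-mini",
  "gpt-5.4-mini": "gpt-5",
};

// Returns the next model to try, or undefined when the chain is exhausted.
function nextOnQuotaError(model: string): string | undefined {
  return OPENAI_FALLBACKS[model];
}
```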
xai — xAI (Grok)
Reasoning and non-reasoning variants in the Grok 4 family.
grok-4-1-fast-non-reasoning is the default; reasoning models are
opt-in for tasks that benefit from deeper chain-of-thought.
| Model | Context window | Reasoning |
|---|---|---|
| grok-4-1-fast-non-reasoning | 2M | No |
| grok-4-1-fast-reasoning | 2M | Yes |
| grok-4.20-0309-non-reasoning | 2M | No |
| grok-4.20-0309-reasoning | 2M | Yes |
All Grok 4 entries support streaming, tools, and JSON mode. None support vision today.
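Opting into a reasoning variant per task could look like the sketch below. The helper and its boolean flag are illustrative, not part of the SDK; the model IDs come from the table above:

```typescript
// Illustrative helper: pick a Grok 4 model based on whether the task
// benefits from deeper chain-of-thought.
function pickGrokModel(needsReasoning: boolean): string {
  return needsReasoning
    ? "grok-4-1-fast-reasoning"
    : "grok-4-1-fast-non-reasoning"; // the xai default
}
```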
custom — bring-your-own-LLM (BYOM)
Point Sonzai at any OpenAI-compatible chat-completions endpoint. The Mind Layer keeps owning memory, personality, mood, and post-processing — only the chat-completion call gets routed through your endpoint.
See Custom LLM for the full setup. This is distinct from BYOK — BYOK uses Sonzai's provider integrations but with your billing key; BYOM uses your own inference stack entirely.
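In practice a BYOM setup amounts to telling Sonzai where your OpenAI-compatible endpoint lives. The shape below is purely hypothetical (the field names and the example URL are invented for illustration); the real configuration is documented in the Custom LLM guide:

```typescript
// Hypothetical shape — see the Custom LLM guide for the actual fields.
interface CustomLlmConfig {
  provider: "custom";
  baseUrl: string; // any OpenAI-compatible /chat/completions endpoint
  model: string;   // whatever your inference stack serves under that name
}

const byom: CustomLlmConfig = {
  provider: "custom",
  baseUrl: "https://llm.internal.example.com/v1", // illustrative URL
  model: "my-finetune",
};
```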
Picking a provider in code
Pass provider and model on the chat call. Both are optional — omit
them and Sonzai uses the agent's default, falling back through the
scope cascade.
```typescript
import { Sonzai } from "@sonzai-labs/agents";

const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

await client.agents.chat({
  agent: "agent_abc",
  messages: [{ role: "user", content: "Hello" }],
  provider: "openai",
  model: "gpt-5.5",
});
```

Listing what's available at runtime
client.list_models() (Python / TS / Go expose the same shape) returns
the live set of providers and models enabled on your tenant — useful for
building a model-picker UI or for asserting that a provider you depend on
is wired up before a deploy.
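For the pre-deploy assertion, a small predicate over the result shape is enough. The helper below is a sketch, not SDK API; it only assumes the providers/models shape described here:

```typescript
// Minimal shape of the listModels result used by this check.
interface ModelsResult {
  providers: { provider: string; models: { id: string }[] }[];
}

// True when the given provider is enabled on the tenant.
function hasProvider(result: ModelsResult, id: string): boolean {
  return result.providers.some((p) => p.provider === id);
}
```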
```typescript
const result = await client.listModels();

for (const p of result.providers) {
  console.log(p.provider, p.models.map((m) => m.id));
}
```

Reference
- BYOK — drop your own provider keys per project.
- Custom LLM — point Sonzai at your own endpoint entirely.
- Model scope — how provider/model is resolved per call.
- Post-processing — what runs in the background, and on what model.