
Pricing

You only pay when
your agent is active.

Your Sonzai agents are always on — companions, characters, NPCs, front-of-house — ready to engage your users 24/7. We bill only for real conversations and interactions, measured in work units. No seats. No platform fees. Idle agents cost nothing.

Before we talk price

What's a work unit?

Engineers call them tokens. Same thing.

A work unit is one small piece of an interaction — roughly three-quarters of a word your agent reads or writes. Every time a user talks to your agent and the agent thinks, recalls, and replies, work units are consumed. They're the cleanest measure of “a real conversation, with a real user.” You only pay when they're ticking up.

Rule of thumb
  • ~750 words ≈ 1,000 work units.
  • A typical user message is ~30 work units.
  • A typical AI reply is ~150 work units.
  • A long back-and-forth conversation is usually 2K–10K work units — a few cents at most on a cost-efficient model.
Why “work units” instead of “tokens”: every conversation your agent has is a small piece of value for your product. A live agent in a real interaction is a profit centre. An idle agent costs you nothing.
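
If you want to size a conversation before you ship, the rule of thumb above can be sketched in a few lines. This is a rough estimator, not a Sonzai API: the function name and constant are made up for illustration, and real counts vary by model and language.

```python
# Rule of thumb from this page: ~750 words is about 1,000 work units.
# The constant and function name are illustrative, not part of any Sonzai SDK.
WORDS_PER_1000_UNITS = 750


def estimate_work_units(word_count: int) -> int:
    """Rough estimate of the work units needed to read or write `word_count` words."""
    return round(word_count * 1000 / WORDS_PER_1000_UNITS)


if __name__ == "__main__":
    for words in (750, 1500, 6000):
        print(f"{words} words is about {estimate_work_units(words)} work units")
```

Treat the result as a ballpark for budgeting. Your usage report is the source of truth.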

What you're actually paying for.

Pricing is usage-based: your bill goes up only when your agents are actively engaging your users. Here's every line your invoice can contain, and why each one exists.

Work units

(industry term: tokens)

Provider price + 25%

Every word your agent reads or writes during a conversation is a work unit. You're billed only for actual user interactions — never for an idle agent. The 25% is our service fee: it covers per-user and per-agent memory, the consolidation jobs that keep memory coherent, our proprietary self-learning models, and the ML engineers and AI specialists running it all.

  • Managed keys: provider price + 25%, one invoice
  • BYOK (bring your own key, recommended): you pay the provider directly; we still charge the 25% service fee
  • BYOK lets you use your own discounts and rate limits, and audit the bill on the provider side
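
Here is a minimal sketch of who gets paid what under each option. It assumes the 25% fee is calculated on the provider cost of the work units in both cases; this page does not say whether BYOK discounts change the base, so treat that as an assumption. The function and key names are hypothetical.

```python
from decimal import Decimal

SERVICE_FEE_RATE = Decimal("0.25")  # the 25% service fee described on this page


def split_charges(provider_cost: Decimal, byok: bool) -> dict:
    """Return what goes to the model provider and what goes to Sonzai.

    Assumption: the fee base is the provider cost of the work units in both modes.
    """
    fee = provider_cost * SERVICE_FEE_RATE
    if byok:
        # BYOK: the provider bills you directly; Sonzai invoices only the fee.
        return {"paid_to_provider": provider_cost, "paid_to_sonzai": fee}
    # Managed keys: one Sonzai invoice covering provider price plus the fee.
    return {"paid_to_provider": Decimal("0"), "paid_to_sonzai": provider_cost + fee}


if __name__ == "__main__":
    print(split_charges(Decimal("100.00"), byok=False))
    print(split_charges(Decimal("100.00"), byok=True))
```

Under this assumption the total is the same either way. What changes is who invoices you and whose rates, discounts, and rate limits apply.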

Live agents

always on, 24/7

$0.50 / agent / month

An agent is the persona — its memory, personality, tools, and rules. Once it exists, it's always on, ready to engage your users around the clock. You're charged only for personas a real user actually talked to this month. Drafts, tests, and dormant agents are free.

  • Unlimited users per agent (no per-seat charge)
  • Dormant or draft agents don't bill
  • Usage report you can hand to finance
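
As a rough illustration of the live-agent line, the sketch below bills only personas that a real user talked to in the month. The input shape (a per-agent count of real-user conversations) is an assumption for the example, not the format of the actual usage report.

```python
from decimal import Decimal

LIVE_AGENT_FEE = Decimal("0.50")  # $0.50 per live agent per month


def live_agent_charge(real_user_conversations: dict[str, int]) -> Decimal:
    """Charge for agents with at least one real-user conversation this month.

    `real_user_conversations` maps a (hypothetical) agent name to its monthly count.
    Drafts, tests, and dormant agents have a count of zero and are not billed.
    """
    billable = [agent for agent, count in real_user_conversations.items() if count > 0]
    return LIVE_AGENT_FEE * len(billable)


if __name__ == "__main__":
    month = {"companion": 1200, "guide": 3, "draft-npc": 0}
    print(live_agent_charge(month))  # two billable agents, the draft is free
```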

Onboarding

one-time, not a consultancy

Kept deliberately low

We aren't a consultancy. We're forward-deployed engineers whose only job is to get your application onto Sonzai quickly, and using it to its limit — so the relationship layer drives real outcomes inside your product. The price reflects productised onboarding, not slideware.

  • Forward-deployed engineers in your codebase, not a retainer
  • Targeted: maximise AI inside your existing application
  • We profit when your agents run — not when we bill hours
Where the 25% goes

We're not pocketing it.

The model provider charges us per work unit. We charge you the same number plus 25%. That 25% pays for everything that turns a stateless LLM into a relational agent that gets better at your users over time: per-user and per-agent memory, the background jobs that keep memory coherent, the proprietary self-learning models we train per agent-user pair, and the ML engineers and AI specialists who keep the whole stack running.

~5%
Hosting & infrastructure

Always-on API, MCP server, regional failover, monitoring, security.

~4%
Per-user & per-agent memory

Each user gets a private memory namespace; each agent has its own state too — indexed, encrypted, retrievable in milliseconds.

~4%
Consolidation & background jobs

Summarisation, importance scoring, deduplication, forgetting curves, embedding refresh — the slow work that keeps memory coherent.

~5%
Self-learning models

Sonzai trains proprietary models per agent-user relationship and stores them with the agents. This is the moat, and swapping in a fresh LLM does not replicate it.

~5%
ML engineers & AI specialists

The team that trains, tunes, and on-calls the relationship layer day-to-day. Model evals, drift, retraining, incidents.

~2%
Profit

Yes, really. Small profit so we can keep building. No SaaS-style markup.

Approximate — actual split shifts with scale, region, and the model you choose. The 25% is our service fee for the relationship layer — it applies whether you use Sonzai-managed keys or bring your own (BYOK). The difference with BYOK is that you pay the provider directly, with their rates, your discounts, and your own rate limits — and the bill is fully auditable on the provider side.

A real example.

A consumer app with about 100,000 users talking to three live agent personas — say a companion, a guide, and an NPC — running on GPT-5.5-mini. A per-seat platform would bill this north of six figures a month. Sonzai bills the work units.

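Assumptions: GPT-5.5-mini at ~$0.25/M input and ~$1.00/M output work units, with ~70M input and ~30M output work units across the month (100M total). That is 70 × $0.25 + 30 × $1.00 = $47.50 at provider price, plus our 25% service fee on that amount. No per-seat charges, no per-user charges.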

3 live agents (personas): $1.50
100M work units (GPT-5.5-mini, blended): $47.50
Sonzai service fee (25%): $11.88
Estimated monthly total: $60.88
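
To check the numbers yourself, the example can be reproduced with a short script. This is a sketch under the assumptions stated in the example (approximate per-million prices, fee applied to work units only, each line rounded to the cent). The function and field names are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

LIVE_AGENT_FEE = Decimal("0.50")    # per live agent per month
SERVICE_FEE_RATE = Decimal("0.25")  # 25% service fee on work units


def cents(amount: Decimal) -> Decimal:
    """Round a dollar amount to the nearest cent."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def monthly_invoice(live_agents: int,
                    input_millions: Decimal, output_millions: Decimal,
                    input_price: Decimal, output_price: Decimal) -> dict:
    """Build the invoice lines from this example; prices are per million work units."""
    agents = cents(LIVE_AGENT_FEE * live_agents)
    work_units = cents(input_millions * input_price + output_millions * output_price)
    fee = cents(work_units * SERVICE_FEE_RATE)
    return {
        "live_agents": agents,
        "work_units": work_units,
        "service_fee": fee,
        "total": agents + work_units + fee,
    }


if __name__ == "__main__":
    invoice = monthly_invoice(
        live_agents=3,
        input_millions=Decimal("70"), output_millions=Decimal("30"),
        input_price=Decimal("0.25"), output_price=Decimal("1.00"),
    )
    for line, amount in invoice.items():
        print(f"{line}: ${amount}")
```

Swap in your own model prices and volumes to get a first-pass estimate for your product.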
Onboarding

Forward-deployed engineers. Not a consultancy.

We don't sell discovery decks or six-month engagements. We embed forward-deployed engineers inside your codebase with one focused job: get your application onto Sonzai, fast — and maximise the AI inside your product from day one.

The price reflects that. It's calibrated for productised onboarding, not consulting hours. Our incentive is aligned — we only make money when your agents are running and your users keep talking to them, so we want you live, integrated, and getting value fast.

01

We onboard you into Sonzai, not the other way around

The engagement is scoped to one outcome: your application running on the relationship layer, with the harness wired into the parts of your product that matter. No generic engineering rebuilds.

02

Forward-deployed: engineers + ML expertise in your codebase

Architecture decisions, integrations with your proprietary systems, model and memory tuning to your domain — done with you, in your repo. On-site if you need it.

03

Targeted at maximising AI inside your application

We're not here to bill hours. We're here to make sure Sonzai's memory, personality, mood, and self-learning models are pulling their weight everywhere they should — and to leave you a team that can keep extending it.

No tricks. No hidden tiers.

One pricing model: pay for the work units your agents actually use, plus a small margin so we can keep them awake and ready 24/7. If you need procurement, a private deployment, or help scoping your first relational AI agent, talk to us — we'll quote the build transparently and separate from runtime.