Skip to main content
Sonzai LabsModel-Agnostic AI Infrastructure

THE MIND
LAYER

Persistent memory, evolving personality, and proactive behavior for any foundation model. Bring your own LLM — we make it remember, feel, and act.

Works with GPT-5.4 · Claude 4.6 · Gemini 3.1 · Llama 4 · Any OpenAI-compatible API

Scroll
Inside the Mind Layer

Four capabilities, one coherent system.

Live views from the platform — tap through to see what's actually running when your agents think.

Structured facts from conversation
Mind Layer · Memory
User fact timeline
6 facts
Session · 2 days ago
Feeling burned out from current work
emotionwellbeing
91%
Restored by a hike in nature last weekend
experiencerecovery
94%
Current role misaligned with growth expectations
beliefcareer
87%
Session · last week
Prefers work without constant digital interruption
preferenceenvironment
82%
Processes difficult feelings through physical activity
behaviorcoping
89%
Values autonomy and physical presence in work
identityvalues
78%
The Problem

EVERY AI
FORGETS

Amnesia

AI agents reset every session. Users repeat themselves. No relationship can form when memory is wiped clean.

Flat personality

Same tone, same responses, same "I'm just an AI" deflections. Personality is a system prompt, not a living model.

Purely reactive

AI agents only speak when spoken to. Real relationships require someone who reaches out, checks in, and initiates.

Model lock-in

Fine-tune on one model and you're married to it. When a better model drops next month, you start over — retrain, re-evaluate, re-deploy.

The Solution

AN ENGINE THAT MAKES
ANY LLM BELIEVABLE

The Mind Layer is the stateful layer between any foundation model and your users. It wraps any LLM with 7 parallel context layers — persistent memory, adaptive personality, constructed affect, co-regulatory relationship tracking, and autonomous behavior. The LLM is interchangeable. The agent is permanent.

THE SONZAI DIFFERENCE

Same user message. Completely different experience.

Generic AI
User

I need to follow up with John about the Acme contract.

Generic AI

I can help with that. Who is John?

John from Acme. We spoke last week.

Got it. What specifically do you need help with?

Context Lost
Memory Store
No relevant context found.
Sonzai AI Employee
User

I need to follow up with John about the Acme contract.

Sonzai AI Employee

Sure, I've prepped a follow-up for John regarding the Acme contract renewal. It includes the revised terms we discussed last Tuesday. Would you like me to send it?

Context Maintained
Memory Store
Last InteractionAcme Contract Renewal Discussion (Tuesday)
Key ContactJohn (Decision Maker)
Current TaskFollow-up Drafted
Constellation Graph & Memory Store
John(Acme Corp)AcmeContractRenewalTerms(Revised)Follow-upEmailAuto LeadQualificationZero ManualData Entrykey_contactnegotiatedregardingincludes
John (Acme Corp)Contract RenewalRevised TermsFollow-up DraftedLead QualifiedAuto-enriched
Knowledge Base

Self-Improving Intelligence

Ingest your data. AI agents get smarter. Recommendations improve with every interaction.

feedback loop — recommendations improve over timeIngestDocs, APIs, feeds→ Knowledge GraphRecommendScored resultsThompson SamplingServeAI agents surfaceinsights to usersTrackConversions, clicks→ Re-score rules

Knowledge Graph

Your data becomes typed entities and relationships — not flat documents. AI agents search the graph, not just keywords.

Learns Over Time

Every conversation extracts new facts, verifies existing ones, and updates the knowledge graph. Your AI agents get smarter automatically.

Trend Analytics

Surface what's trending — top gainers, most active, emerging patterns — across configurable time windows. Your AI agents proactively flag what matters.

Works While You Sleep

Memory consolidation, graph updates, recommendation re-scoring, and fact deduplication run automatically in the background.

BRING YOUR OWN MODEL

The LLM is the brain. The Mind Layer is the mind — memories, personality, relationships. You can upgrade the brain without losing the mind.

When a better model releases, swap it in. Your AI agents instantly get smarter — no retraining, no fine-tuning, no data migration. Same personality, same memories, better brain.

OpenAI

GPT-5.4, o4-mini

Anthropic

Claude Opus 4.6, Sonnet 4.6

Google

Gemini 3.1 Pro, Flash

Open Source

Llama 4, Qwen 3.5, DeepSeek V4

Any OpenAI-compatible API works out of the box. Self-hosted models supported via vLLM, Ollama, or TGI.

The Economics

ORCHESTRATION BEATS
RAW INTELLIGENCE

The Mind Layer does the cognitive heavy lifting outside the model — recursive LLM processing, efficient memory indexing, multi-layer context assembly, and behavioral orchestration. By the time context reaches the generation model, it has been deeply processed and refined. The result: lightweight models receiving this orchestrated context achieve comparable output quality to frontier models at a fraction of the cost.

Frontier model, no orchestration
$2.00
per conversation

The most capable model on the market, working from raw conversation history alone. The model itself handles reasoning about context, relevance, and continuity — all at inference cost. No external memory. No behavioral pipeline. Every token spent on figuring things out instead of generating quality responses.

Lightweight model + Mind Layer
$0.10
per conversation

A model at 1/20th the cost, backed by recursive LLM processing, efficient memory indexing, and multi-layer context assembly. The engine does the cognitive heavy lifting — extraction, consolidation, personality evolution, affect construction — so the model receives deeply processed context and just generates. Comparable output quality, fraction of the cost.

Conversations are processed through multiple LLM passes — extraction, consolidation, fact deduplication, personality evolution, affect construction. BM25, entity, temporal, and type indexes deliver sub-200ms retrieval across thousands of memories. The model just has to generate natural language from this deeply processed context.

No retraining. No fine-tuning. Invest in the stateful context layer once and run any model you want. As lightweight models improve every month, your AI agents automatically get better — without any cost increase.

PRODUCTION PERFORMANCE

Battle-tested in production powering Pocket Souls.

<200ms
Memory retrieval p95
Full context load, not just vector lookup
9+
Parallel context layers
Real-time through batched, collapsed at inference
20x
Model cost reduction
Lightweight models match frontier quality
0
Cold starts
AI agents never forget who you are

PLATFORM CAPABILITIES

Six production-grade capabilities engineered for scale. Each one is battle-tested across millions of conversations in Pocket Souls.

Neuroscience-Based Emotional Modeling

48 affect dimensions

Affect modeling grounded in constructed emotion theory. AI employees build emotional responses from core affect, context, and relational history — not canned sentiment labels.

Adaptive Personality

Big Five + predictive updating

Each agent maintains a predictive personality model — Big Five traits plus custom dimensions that update from social prediction errors. AI employees and characters that evolve over time.

Sub-200ms Retrieval

<200ms p95

Reasoning-based memory retrieval returns full context in under 200ms at the 95th percentile. No perceptible delay, even with years of conversation history.

Parallel State Management

Fully automated

Batched, real-time, near-real-time, windowed, and inference-time context features — all computed at different temporal intervals and collapsed into a single unified context set at the point of inference. Fully automated. No additional cost — only a markup on token usage.

Self-Improving Intelligence

Continuous feedback loop

The Mind Layer gets smarter with every interaction. Ingest your data, surface intelligent recommendations, track outcomes, and automatically re-score. The mind improves on its own, and the LLM paired with it gets smarter too.

Model-Agnostic Architecture

Zero lock-in, 20x cost reduction

Memory, personality, affect, and relational state persist independently of any LLM. Swap in a better model the day it releases — or a cheaper one. Lightweight models receiving deeply processed context achieve comparable output quality to frontier models at 1/20th the cost. No retraining, no fine-tuning, no migration.

BUILT FOR

Gaming NPCs

AI agents that remember every playthrough

NPCs with persistent memory and evolving relationships. They remember player choices across sessions, develop opinions, hold grudges, and form alliances. Every playthrough becomes unique because the AI agents actually learn.

Persistent state across sessions | Autonomous decision-making | Constructed affect responses to player actions

AI Companions

Co-regulatory bonds that deepen over months

AI friends, coaches, and mentors that build co-regulatory trust over time. They check in proactively, remember what matters, adapt their predictive models to each user, and never reset.

Proactive outreach | Predictive personality adaptation | Co-regulatory bond modeling up to 10,000 trust score

Agent Safety

Stop users exploiting your LLM

Users will try to jailbreak your AI agents into solving homework, writing code, or acting as a general-purpose assistant. The evaluation suite catches personality breaks, boundary violations, and off-brand responses before deployment. Your AI agent stays on-brand.

Automated red flag detection | Boundary violation testing | Synthetic adversarial personas | Pre-deploy safety scoring

Test. Simulate. Evaluate.

EVALUATION SUITE

Simulate multi-session conversations with synthetic user personas, then evaluate agent quality with LLM judges. Catch regressions, safety issues, and personality drift before your users do.

Quality — Evaluation Run #847
Completed
87
Overall Score
Agent maintained consistent personality across 5 sessions
Category Scores
Personality Consistency92
Memory Accuracy88
Emotional Coherence85
Engagement Quality83
Safety & Boundaries91
Red Flags
0
Best Moments
4
Retention Prediction
84/100
Would return: Yes
A-Grade
+12points improvement 1st → 2nd half

Agent adapted tone, recalled user preferences, and deepened emotional engagement across sessions.

Simulation Config
Model
gemini-3.1-flash
Sessions
5
Turns
50
Total Cost
$0.04
Key Learnings
  • +Remembered user's job change and followed up
  • +Adapted humor style after negative reaction
  • +Proactively referenced shared context
Stagnation Areas
  • -Repetitive phrasing in empathy responses
  • -Slow to recover from topic changes
User Persona
Skeptical Early Adopter
Tests boundaries, challenges personality consistency, asks probing questions

Behavior Testing

Custom evaluation templates score personality consistency, memory accuracy, emotional coherence, engagement quality, and safety compliance.

Adaptation Grading

Measure how well AI agents learn within a conversation. Compare first-half vs second-half performance with A-F grading.

Safety & Red Flags

Automated detection of boundary violations, harmful responses, personality breaks, and off-character behavior before deployment.

Synthetic Personas

Test against diverse user types — skeptics, vulnerable users, boundary-pushers — with configurable synthetic personas.

Proof — Built on the Mind Layer

POCKET SOULS

Our flagship consumer product. AI companions that remember you, check in on you, and grow alongside your journey. Every Soul is powered by the Mind Layer — proof that the technology creates AI agents people genuinely connect with.

Live on iOS & Android · Sub-200ms context retrieval · Multi-layer memory per user

Visit Pocket Souls
Pocket Souls - AI Companion App

DEPLOY ANYWHERE

Native platform adapters with unified state management. AI agents maintain consistent personality and memory across every channel.

Mobile Apps

iOS and Android via REST API with SSE streaming for real-time agent responses

Game Engines

Unity and Unreal integration for NPC dialogue, quest systems, and dynamic storytelling

Messaging

Telegram, WhatsApp, Discord bots with rich interactions and group support

REST API

Full programmatic access. OpenAI-compatible streaming format for easy integration

READY TO BUILD?

Tell us about your project and we'll help you find the right setup for your AI employees.