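You own the LLM and the chat loop. Sonzai owns memory, mood, personality, and relationships. The per-turn flow: sessions.start → a loop of session.context() + your LLM + session.turn() → sessions.end.

Your existing chat loop stays as it is. Immediately before each LLM call, ask Sonzai for context relevant to the user message; after the LLM responds, submit just that exchange to session.turn(). Mood is updated inline in about 300–500 ms. Deeper extraction (facts, personality drift, habit detection, goal updates) runs asynchronously in the background over 5–15 seconds. Sonzai never sees your tool executions and never chooses a model on your behalf.

Best suited to chat companions, voice agents, agent frameworks (OpenAI Agents SDK / LangChain / LiveKit), and any setup where an LLM loop was already running in production before you adopted Sonzai.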

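When to use it

You already have an LLM loop in production, with custom tools, evals, prompt templates, a specific provider, and so on.
You need fresh context on every turn; fetching it once at the start of the conversation is not enough.
You want the mood, fact, personality, habit, goal, and relationship signals, but you do not want to give up control over LLM selection and tool execution.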

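When to switch

You cannot afford to wait for .turn() latency on every turn within a conversation: switch to Pattern 5: Standalone Batch.
You are fine with Sonzai owning the LLM call: switch to Pattern 1: Managed Runtime and delete most of your code.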

Architecture

┌─────────────┐     ┌──────────────────┐     ┌──────────────┐
│  Your App   │     │   Sonzai API     │     │   Your LLM   │
└──────┬──────┘     └────────┬─────────┘     └──────┬───────┘
     │                     │                       │
     │  sessions.start     │                       │
     │────────────────────>│ (prewarms memory)     │
     │  <── Session ───────│                       │
     │                     │                       │
     │  ─── Per turn ──────────────────────────── │
     │                     │                       │
     │  session.context()  │                       │
     │────────────────────>│                       │
     │  <── enriched ctx ──│                       │
     │    personality, mood│                       │
     │    memories, goals  │                       │
     │                     │                       │
     │  Your LLM loop ─────┼──────────────────────>│
     │  + your tools       │                       │
     │  <── reply ─────────┼───────────────────────│
     │                     │                       │
     │  sendToUser(reply)  (no waiting on Sonzai)  │
     │                     │                       │
     │  session.turn()     │                       │
     │────────────────────>│ ⇒ sync mood ~300ms    │
     │  <── mood, status ──│ ⇒ background extract  │
     │                     │   (5–15s)             │
     │                     │                       │
     │  ─── Repeat ────────────────────────────── │
     │                     │                       │
     │  session.end()      │                       │
     │────────────────────>│── consolidate         │
     │                     │   long-term memory    │
     └─────────────────────┴───────────────────────┘
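The sequence in the diagram can be sketched as a small driver that guarantees session.end() runs even when a turn fails. This is a minimal sketch, not the SDK's own types: SessionLike, LlmCall, runScripted, and makeRecordingSession are hypothetical names, and the session shape is inferred only from the three calls the diagram shows. The in-memory session exists so the sketch runs without a network.

```typescript
// Hypothetical minimal shape of a session handle, inferred from the diagram.
// The real SDK types may differ.
interface SessionLike {
  context(args: { query: string }): Promise<Record<string, unknown>>;
  turn(args: { messages: { role: string; content: string }[] }): Promise<unknown>;
  end(): Promise<unknown>;
}

// Your own LLM call: receives the enriched context and the user message.
type LlmCall = (ctx: Record<string, unknown>, userMessage: string) => Promise<string>;

// Runs a scripted conversation and always ends the session,
// even if the LLM or a Sonzai call throws mid-conversation.
async function runScripted(
  session: SessionLike,
  callLlm: LlmCall,
  userMessages: string[],
): Promise<string[]> {
  const replies: string[] = [];
  try {
    for (const userMessage of userMessages) {
      // Per turn: fetch context first, then call the LLM, then submit the exchange.
      const ctx = await session.context({ query: userMessage });
      const reply = await callLlm(ctx, userMessage);
      replies.push(reply);
      await session.turn({
        messages: [
          { role: "user", content: userMessage },
          { role: "assistant", content: reply },
        ],
      });
    }
  } finally {
    await session.end(); // the consolidation step at the bottom of the diagram
  }
  return replies;
}

// In-memory stand-in that records call order (illustration only).
function makeRecordingSession(log: string[]): SessionLike {
  return {
    context: async ({ query }) => { log.push(`context:${query}`); return {}; },
    turn: async () => { log.push("turn"); return {}; },
    end: async () => { log.push("end"); return {}; },
  };
}
```

In a real loop you would deliver each reply to the user before awaiting session.turn(), as the diagram notes, rather than collecting replies in an array.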

End-to-end example

A minimal loop: open a session; on each turn, fetch context, call the LLM, and submit the exchange; close the session when the conversation ends.

import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });

// Placeholders for your own code: an LLM client and a function that
// delivers the reply to the user. Replace these with real implementations.
declare const yourLLM: {
  chat(req: { system: string; messages: { role: string; content: string }[] }): Promise<{ content: string }>;
};
declare function sendToUser(text: string): void;

async function runConversation(agentId: string, userId: string) {
  const sessionId = `session-${Date.now()}`;
  const history: { role: string; content: string }[] = [];

  // Session handle bundles agentId/userId/sessionId + provider/model
  // defaults so you don't repeat them on every call.
  const session = await sonzai.agents.sessions.start(agentId, {
    userId,
    sessionId,
    provider: "gemini",
    model:    "gemini-3.1-flash-lite-preview",
  });

  async function turn(userMessage: string): Promise<string> {
    // 1. Pull fresh, query-relevant context BEFORE the LLM call.
    const ctx = await session.context({ query: userMessage });

    // 2. Your LLM, your tools. Sonzai is OUT of the loop here.
    const reply = await yourLLM.chat({
      system:   buildSystemPrompt(ctx),
      messages: [...history, { role: "user", content: userMessage }],
    });

    sendToUser(reply.content);

    // 3. Submit the exchange. Sync mood ~300ms; deeper extraction
    //    (facts, personality, habits) runs asynchronously 5–15s later.
    await session.turn({
      messages: [
        { role: "user",      content: userMessage },
        { role: "assistant", content: reply.content },
      ],
    });

    history.push({ role: "user",      content: userMessage });
    history.push({ role: "assistant", content: reply.content });
    return reply.content;
  }

  return { turn, end: () => session.end() };
}

// /context returns a flat object: read what you need, drop the rest.
function buildSystemPrompt(ctx: any): string {
  const facts = (ctx.loaded_facts ?? []).map((f: any) => `- ${f.atomic_text}`).join("\n");
  return [
    ctx.personality_prompt ?? "You are a helpful AI companion.",
    `Personality (Big5): ${JSON.stringify(ctx.big5 ?? {})}`,
    `Current mood: ${JSON.stringify(ctx.current_mood ?? {})}`,
    facts ? `Relevant memories:\n${facts}` : "",
  ].filter(Boolean).join("\n\n");
}
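The prompt builder above types ctx as any, so a misspelled field name silently yields an empty section. A narrower type is one way to catch that at compile time. The sketch below is an assumption-laden variant: ContextFields and buildTypedSystemPrompt are hypothetical names, only the four fields the example reads are declared, and the value types chosen for big5 and current_mood are guesses rather than the documented payload schema.

```typescript
// Only the fields the example reads; the real /context payload contains more.
// The value types for big5 and current_mood are assumptions.
interface ContextFields {
  personality_prompt?: string;
  big5?: Record<string, number>;
  current_mood?: Record<string, unknown>;
  loaded_facts?: { atomic_text: string }[];
}

// Same logic as buildSystemPrompt, with the compiler checking field names.
function buildTypedSystemPrompt(ctx: ContextFields): string {
  const facts = (ctx.loaded_facts ?? []).map((f) => `- ${f.atomic_text}`).join("\n");
  return [
    ctx.personality_prompt ?? "You are a helpful AI companion.",
    `Personality (Big5): ${JSON.stringify(ctx.big5 ?? {})}`,
    `Current mood: ${JSON.stringify(ctx.current_mood ?? {})}`,
    facts ? `Relevant memories:\n${facts}` : "",
  ].filter(Boolean).join("\n\n");
}

// Sample payload with made-up values, for illustration only.
const samplePrompt = buildTypedSystemPrompt({
  personality_prompt: "You are Mika, a warm and curious companion.",
  loaded_facts: [{ atomic_text: "User has a dog named Pochi" }],
});
```

Sections are separated by blank lines, and the "Relevant memories" section is omitted entirely when no facts are loaded.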

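The most important habit

Always call session.context(query=user_msg) before the LLM call, on every turn. This is the step that closes the loop: skip it and the LLM runs on stale state, and the value of the memory layer collapses.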
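One way to make this habit hard to skip is to hide the LLM behind a wrapper that always fetches context for the exact message it is about to answer. The sketch below is illustrative: ContextSource and withFreshContext are hypothetical names, and the only thing assumed about the session is a context({ query }) method like the one used in the example above.

```typescript
// Hypothetical minimal shape; only context() matters for this sketch.
interface ContextSource<C> {
  context(args: { query: string }): Promise<C>;
}

// Wraps an LLM call so that every invocation first fetches context
// for that exact user message. Callers cannot reach the LLM without it.
function withFreshContext<C>(
  source: ContextSource<C>,
  callLlm: (ctx: C, userMessage: string) => Promise<string>,
): (userMessage: string) => Promise<string> {
  return async (userMessage) => {
    const ctx = await source.context({ query: userMessage });
    return callLlm(ctx, userMessage);
  };
}
```

The rest of the application then only sees the returned function, so a code path that calls the LLM with context from an earlier turn has nowhere to live.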

Save a round trip with fetchNextContext

session.turn() accepts fetchNextContext: { query: nextMessage } (Python: fetch_next_context={"query": ...}). When it is set, the response's next_context field carries the next /context payload, so by the time turn N finishes the client already holds the context for turn N+1.
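A client can hold on to that payload and skip the next /context call when it is available. The following is a minimal sketch under stated assumptions: fetchNextContext and next_context are the names given above, while PrefetchSession, makePrefetchingSession, contextFor, submit, and the nextQuery parameter are hypothetical, and the response is assumed to expose next_context as an optional top-level field.

```typescript
// Hypothetical minimal shapes. fetchNextContext and next_context come from the
// text above; everything else is an assumption for illustration.
type Ctx = Record<string, unknown>;

interface PrefetchSession {
  context(args: { query: string }): Promise<Ctx>;
  turn(args: {
    messages: { role: string; content: string }[];
    fetchNextContext?: { query: string };
  }): Promise<{ next_context?: Ctx }>;
}

function makePrefetchingSession(session: PrefetchSession) {
  let prefetched: Ctx | undefined;

  return {
    // Returns the prefetched context once if there is one; otherwise calls /context.
    async contextFor(userMessage: string): Promise<Ctx> {
      const ctx = prefetched ?? (await session.context({ query: userMessage }));
      prefetched = undefined; // never reuse a prefetched payload for a later turn
      return ctx;
    },

    // Submits the exchange. If nextQuery is given, requests the next context in the same call.
    async submit(userMessage: string, reply: string, nextQuery?: string): Promise<void> {
      const res = await session.turn({
        messages: [
          { role: "user", content: userMessage },
          { role: "assistant", content: reply },
        ],
        ...(nextQuery !== undefined ? { fetchNextContext: { query: nextQuery } } : {}),
      });
      prefetched = res.next_context;
    },
  };
}
```

The prefetched payload is only as relevant as the query it was requested with. If the user's actual next message diverges from that query, falling back to a regular session.context() call is the safer choice.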

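Read next

Pattern 1: Memory Middleware (deep-dive guide)

Tool calls, multimodal/image bridging, dual-output prompts, exposing Sonzai's KB and memory search as LLM tools, and polling for deferred extraction.

Endpoint walkthrough

Complete reference for sessions.start / session.context / session.turn / /process / sessions.end and the read endpoints.

Pattern 5: Standalone Batch

Same memory model, but the whole conversation is submitted in one batch at the end rather than turn by turn.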
