Memory & Context
Agents remember what matters. Every conversation is analyzed to extract facts, events, and commitments — stored and recalled automatically.
Memory Categories
Memories are stored in four categories:
- Facts: Long-lived knowledge about the user, such as preferences, personal details, and background. Example: "User is a software engineer who loves Thai food."
- Events: Episodic memories of shared experiences. Example: "We talked about their trip to Japan last week."
- Commitments: Promises and plans the agent made. Example: "Promised to ask about their job interview next time."
- Summaries: Consolidated conversation summaries. Auto-generated between sessions.
How Memory Works
Memory is fully automatic — you don't need to manage it. The platform analyzes each conversation and extracts facts, events, and commitments. Before each response, the most relevant memories are assembled and included in context automatically.
No orchestration needed
Just call chat. The platform handles memory extraction, storage, and retrieval on every interaction.
Self-Improving Memory
Memory is not a static store. Behind every chat call, several closed feedback loops run automatically to keep memory accurate, organized, and relevant over time. None of this requires code on your side.
- Importance feedback — facts the agent actually uses in its responses get reinforced; facts that are loaded but ignored gradually fade.
- Confidence reinforcement — when a fact is recalled and confirmed in conversation, its confidence climbs steadily toward certainty.
- Natural forgetting — facts gradually decay over time, but never below a floor. Emotionally significant and identity-defining facts decay much more slowly than neutral ones.
- Per-user retrieval policy — the platform learns each agent–user pair's retrieval preferences from session feedback. After a few weeks, retrieval is tuned specifically to that user's patterns.
- Memory association — memories that get accessed together strengthen the link between them. Frequently-traversed paths in the memory graph get faster over time.
- Adaptive retrieval budget — retrieval runs against a self-tuning time budget, so quality stays consistent and the agent stays responsive for users.
- Next-session prediction — at session end, the platform predicts which topics the user is likely to bring up next time and pre-warms context for them.
For a complete walk-through of every mechanism — including consolidation, clustering, dedup, boundary detection, narrative arcs, breakthroughs, and the rollout safety system — see How Agents Improve Over Time.
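As a way to picture the natural forgetting and confidence reinforcement loops listed above, here is a toy model. It is a sketch under stated assumptions, not the platform's implementation: the exponential form, the half-lives, the floor, and the reinforcement rate are all invented for illustration, and none of these names are part of the SDK.

```typescript
// Toy model only: every constant and formula below is an illustrative assumption.

interface ToyFact {
  text: string;
  strength: number;     // 0..1, how readily the fact is surfaced
  confidence: number;   // 0..1, how sure the store is that the fact is true
  significant: boolean; // emotionally significant or identity-defining
}

const FLOOR = 0.1;                      // assumed: strength never decays below this
const HALF_LIFE_DAYS_NEUTRAL = 30;      // assumed
const HALF_LIFE_DAYS_SIGNIFICANT = 365; // assumed: significant facts fade more slowly
const REINFORCE_RATE = 0.3;             // assumed

// Exponential decay toward FLOOR rather than toward zero.
function decay(fact: ToyFact, elapsedDays: number): ToyFact {
  const halfLife = fact.significant
    ? HALF_LIFE_DAYS_SIGNIFICANT
    : HALF_LIFE_DAYS_NEUTRAL;
  const factor = Math.pow(0.5, elapsedDays / halfLife);
  const strength = FLOOR + (Math.max(fact.strength, FLOOR) - FLOOR) * factor;
  return { ...fact, strength };
}

// Each confirmation closes a fixed fraction of the remaining gap to 1,
// so confidence climbs quickly at first and then levels off.
function reinforce(fact: ToyFact): ToyFact {
  const confidence = fact.confidence + (1 - fact.confidence) * REINFORCE_RATE;
  return { ...fact, confidence };
}
```

In this model a neutral fact loses half of its above-floor strength every 30 days, while a significant one takes a year to do the same.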
Automatic Consolidation, Dedup, and Cleanup
The platform actively reshapes memory in the background so it stays compact and navigable as it grows. You do not call any of these — they run on the right cadence by default.
Thematic clustering
New facts are automatically grouped into semantic clusters. Clusters split when they get heterogeneous, merge when they drift together, and retire when empty — keeping the cluster set balanced without any tuning.
Reversible deduplication
When two facts turn out to be the same thing, the platform merges them and records the merge with a full audit trail. Every merge can be reversed if a later signal contradicts it. Memory is never destroyed — it is reorganized, with every reorganization tracked.
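The idea of a merge that is logged and can be undone can be sketched as follows. This is a minimal illustration, not the platform's schema: `FactStore`, `StoredFact`, and `MergeRecord` are hypothetical names, and the real audit trail presumably records more than this.

```typescript
// Illustrative sketch: all names and shapes here are hypothetical.

interface StoredFact {
  id: string;
  text: string;
  active: boolean; // inactive facts are hidden from retrieval but never deleted
}

interface MergeRecord {
  keptId: string;
  absorbedId: string;
  reason: string;
  reverted: boolean;
}

class FactStore {
  private facts: StoredFact[] = [];
  readonly auditLog: MergeRecord[] = [];

  add(id: string, text: string): void {
    this.facts.push({ id, text, active: true });
  }

  private get(id: string): StoredFact | undefined {
    return this.facts.find((f) => f.id === id);
  }

  // Merge by deactivating the duplicate and logging the decision.
  // Returns the index of the audit entry so the merge can be reverted later.
  merge(keptId: string, absorbedId: string, reason: string): number {
    const kept = this.get(keptId);
    const absorbed = this.get(absorbedId);
    if (!kept || !absorbed) throw new Error("unknown fact id");
    absorbed.active = false;
    this.auditLog.push({ keptId, absorbedId, reason, reverted: false });
    return this.auditLog.length - 1;
  }

  // Undo a merge: the absorbed fact becomes active again and the log entry is kept.
  revert(mergeIndex: number): void {
    const record = this.auditLog[mergeIndex];
    if (!record || record.reverted) return;
    const absorbed = this.get(record.absorbedId);
    if (absorbed) absorbed.active = true;
    record.reverted = true;
  }

  activeTexts(): string[] {
    return this.facts.filter((f) => f.active).map((f) => f.text);
  }
}
```

Because the absorbed fact is only deactivated, reverting is a flag flip, and the log still shows that the merge happened.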
Conflict resolution
When new information contradicts existing memory ("I moved to Berlin last month" overriding "I live in Paris"), the platform reasons about the conflict and chooses the right action — keep both as new information, combine them, supersede the old one, or discard a strict duplicate. When a contradiction can't yet be resolved cleanly, both versions are preserved so nothing is lost prematurely.
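The four outcomes described above map naturally onto a discriminated union. The sketch below is only a way to visualize them: the `Relation` labels and the `resolveConflict` function are hypothetical, and the judgment of how two statements relate (which the platform reasons about) is simply passed in as an argument here.

```typescript
// Hypothetical types for illustration; not part of the SDK.

type Relation =
  | "unrelated"     // genuinely new information
  | "complementary" // both true, better stated together
  | "contradicts"   // the new statement replaces the old one
  | "identical"     // strict duplicate
  | "uncertain";    // cannot be resolved cleanly yet

type Resolution =
  | { action: "keep_both" }
  | { action: "combine"; text: string }
  | { action: "supersede"; replacedText: string }
  | { action: "discard_duplicate" };

function resolveConflict(
  existing: string,
  incoming: string,
  relation: Relation
): Resolution {
  switch (relation) {
    case "unrelated":
    case "uncertain":
      // When in doubt, preserve both versions so nothing is lost prematurely.
      return { action: "keep_both" };
    case "complementary":
      return { action: "combine", text: `${existing}; ${incoming}` };
    case "contradicts":
      return { action: "supersede", replacedText: existing };
    case "identical":
      return { action: "discard_duplicate" };
  }
}
```

With the example from the text, `resolveConflict("I live in Paris", "I moved to Berlin last month", "contradicts")` yields a `supersede` resolution that records the Paris statement as the one replaced.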
Source-anchored facts
Facts that cannot be traced back to an actual quote in the conversation are rejected before they enter storage. The agent cannot hallucinate memories — every stored fact is verified to be anchored in a real message from a real speaker.
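A simple way to picture source anchoring is a check that a candidate fact's supporting quote really appears in a message from the stated speaker. The sketch below assumes a case- and whitespace-insensitive substring match; the platform's actual verification is not documented here and may be more sophisticated. `ChatMessage`, `CandidateFact`, and `isAnchored` are hypothetical names.

```typescript
// Illustrative sketch: hypothetical names, simplified matching rule.

interface ChatMessage {
  speaker: string;
  text: string;
}

interface CandidateFact {
  text: string;    // the fact to be stored
  quote: string;   // the passage it claims to be based on
  speaker: string; // who is supposed to have said it
}

// Lowercase and collapse whitespace so trivial formatting differences do not matter.
function normalize(s: string): string {
  return s.toLowerCase().replace(/\s+/g, " ").trim();
}

// A candidate is anchored only if its quote occurs in a message by the same speaker.
function isAnchored(candidate: CandidateFact, messages: ChatMessage[]): boolean {
  const quote = normalize(candidate.quote);
  if (quote.length === 0) return false;
  return messages.some(
    (m) => m.speaker === candidate.speaker && normalize(m.text).includes(quote)
  );
}
```

A candidate with an empty quote, a quote that appears nowhere, or a quote attributed to the wrong speaker is rejected.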
Pruning
Branches with low combined confidence, importance, and recency are pruned. The platform never deletes high-value memories, but it stops surfacing branches that have nothing to contribute.
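The distinction between pruning and deleting can be illustrated with a small scoring sketch. The equal-weight average, the recency curve, and the threshold are assumptions made for this example; the platform's real scoring is not specified in this document.

```typescript
// Illustrative sketch: the scoring formula and constants are assumptions.

interface Branch {
  label: string;
  confidence: number;      // 0..1
  importance: number;      // 0..1
  daysSinceAccess: number; // larger means less recent
}

const PRUNE_THRESHOLD = 0.2;   // assumed
const RECENCY_SCALE_DAYS = 90; // assumed

// Recency falls from 1 toward 0 as the branch goes unused.
function combinedScore(b: Branch): number {
  const recency = 1 / (1 + b.daysSinceAccess / RECENCY_SCALE_DAYS);
  return (b.confidence + b.importance + recency) / 3;
}

// Pruning filters what is surfaced; the input array is left untouched,
// so nothing is deleted.
function surfaced(branches: Branch[]): Branch[] {
  return branches.filter((b) => combinedScore(b) >= PRUNE_THRESHOLD);
}
```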
Tree self-organization
The hierarchical memory tree restructures itself over time. Frequently-accessed branches gradually move closer to the root for faster retrieval. Overcrowded nodes split into balanced subtrees. The shape of memory ends up reflecting how it's actually used.
Narrative arc compression
Entities and themes that recur across multiple sessions are compressed into named narrative arcs. A long-running thread (e.g., "the user's startup launch") becomes one arc instead of twenty individual facts — long-horizon conversations stay coherent without exploding the context window.
Boundary Detection and Episodes
Conversations don't have neat boundaries — sometimes a user says "anyway, on a totally different note..." and sometimes they pivot mid-paragraph. The platform detects these shifts automatically and uses them to organize memory into coherent episodes.
A two-stage check runs lightweight signals first and only escalates to a deeper semantic check when those signals are ambiguous. The signal weights are calibrated per agent–user pair from session-end audits, so episode detection improves with use.
When the agent retrieves memory, it can pull "all memories from this episode" rather than just keyword-matched fragments — giving narrative continuity to its replies.
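The two-stage check described above can be sketched like this: a cheap lexical score decides the clear cases, and a deeper check is called only when the score falls in an ambiguous band. The marker phrases, weights, and thresholds are assumptions for illustration, and the deeper semantic check is left as a caller-supplied function because its real form is not described here.

```typescript
// Illustrative sketch: markers, weights, and thresholds are assumptions.

const TOPIC_SHIFT_MARKERS = [
  "anyway",
  "on a different note",
  "by the way",
  "changing the subject",
];

function tokens(s: string): string[] {
  return s.toLowerCase().match(/[a-z']+/g) ?? [];
}

// Stage 1: explicit shift markers plus low word overlap with the previous turn.
function lightweightScore(previous: string, current: string): number {
  const prev = tokens(previous);
  const curr = tokens(current);
  const shared = curr.filter((w) => prev.indexOf(w) !== -1).length;
  const overlap = curr.length === 0 ? 1 : shared / curr.length;
  const lower = current.toLowerCase();
  const hasMarker = TOPIC_SHIFT_MARKERS.some((m) => lower.includes(m));
  return (hasMarker ? 0.6 : 0) + 0.4 * (1 - overlap);
}

// Stage 2 is any deeper (and presumably more expensive) semantic check.
type DeepCheck = (previous: string, current: string) => boolean;

function isBoundary(
  previous: string,
  current: string,
  deepCheck: DeepCheck
): boolean {
  const score = lightweightScore(previous, current);
  if (score >= 0.7) return true;       // clearly a topic shift
  if (score <= 0.3) return false;      // clearly the same topic
  return deepCheck(previous, current); // ambiguous: escalate
}
```

The point of the design is cost: most turns are settled by stage 1, so the deeper check runs only on the minority of turns where the cheap signals disagree.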
Seed Memory
Pre-load what an agent knows about a user before their first conversation using memory.seed().
import { Sonzai } from "@sonzai-labs/agents";
const client = new Sonzai({ apiKey: "sk-..." });
await client.agents.memory.seed("agent-id", {
  userId: "user-123",
  memories: [
    { text: "User's name is Jane Smith", factType: "fact" },
    { text: "Jane is a senior product manager at Acme Corp", factType: "fact" },
    { text: "Jane lives in San Francisco and enjoys hiking", factType: "fact" },
  ],
});
Search Memory
Search memories for a user-agent pair by keyword or semantic query.
const results = await client.agents.memory.search("agent-id", {
  userId: "user-123",
  query: "hiking trip",
  limit: 10,
});
for (const mem of results.memories) {
  console.log(mem.content, mem.type, mem.createdAt);
}
List & Browse
List all memories or browse them by category.
// List memories (paginated), optionally filtered by type
const memories = await client.agents.memory.list("agent-id", {
  userId: "user-123",
  type: "fact", // "fact" | "event" | "commitment" | "summary"
  limit: 20,
  offset: 0,
});
// Browse by category
const facts = await client.agents.memory.listFacts("agent-id", {
  userId: "user-123",
});
Memory Timeline
Get a chronological view of a user's memory history.
const timeline = await client.agents.memory.timeline("agent-id", {
  userId: "user-123",
  from: "2026-01-01",
  to: "2026-03-31",
});
for (const entry of timeline.entries) {
  console.log(entry.date, entry.summary);
}
Browse Memory Tree
Navigate the hierarchical memory structure for a user.
// Browse the memory tree at a given path
const nodes = await client.agents.memory.browse("agent-id", {
  userId: "user-123",
  path: "/facts", // optional: filter by path
});
Reset Memory
Clear all memories for a user-agent pair — useful for testing or when a user wants a fresh start.
await client.agents.memory.reset("agent-id", {
  userId: "user-123",
});
In Practice
Memory is central to every use case, but how you use it depends on what you're building. Pick your track.
Memory is the relationship arc. Companions accumulate shared history — "our first chat about astronomy," "the week they were stressed about finals" — and the agent references it naturally to deepen connection.
Seed sparingly. Don't pre-load everything you know about the user; let most memory form organically through conversation. Seed only durable identity facts (name, key interests, context for why they showed up).
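One way to keep seeding sparse is to build the seed payload from an explicit allow-list of durable fields rather than passing a whole user record through. The sketch below assumes a hypothetical `UserProfile` shape from your own database; only the `{ text, factType }` entry shape is taken from the seed example earlier on this page.

```typescript
// Illustrative sketch: UserProfile and buildSeedMemories are hypothetical.

interface UserProfile {
  displayName: string;
  interests: string[];
  signupReason?: string;
  lastLoginIp?: string; // transient detail: deliberately never seeded
  cartItems?: string[]; // transient detail: deliberately never seeded
}

// Mirrors the entry shape used in the memory.seed() example above.
interface SeedMemory {
  text: string;
  factType: "fact";
}

// Picks only durable identity facts: name, a few key interests,
// and the reason the user showed up.
function buildSeedMemories(
  profile: UserProfile,
  maxInterests = 3
): SeedMemory[] {
  const memories: SeedMemory[] = [
    { text: `User's name is ${profile.displayName}`, factType: "fact" },
  ];
  for (const interest of profile.interests.slice(0, maxInterests)) {
    memories.push({ text: `User is interested in ${interest}`, factType: "fact" });
  }
  if (profile.signupReason) {
    memories.push({
      text: `User signed up because: ${profile.signupReason}`,
      factType: "fact",
    });
  }
  return memories;
}
```

The returned array can be passed as the `memories` field of the seed call shown in the Seed Memory section; everything else is left to form through conversation.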
Browse the timeline to drive UI. The timeline endpoint surfaces episodic memories with dates — render this as a "shared memories" view in your app so users can see the relationship's history.
const timeline = await client.agents.memory.timeline("agent-id", {
  userId: "user-123",
  limit: 30,
});
for (const entry of timeline.entries) {
  // Render each episode as a moment in your companion UI.
  // render() stands in for your app's own UI function.
  render(entry.date, entry.summary, entry.moodBefore, entry.moodAfter);
}
No reset button in companion apps unless you really mean "forget me". Resetting breaks the relationship.