Memory
Agents remember facts, events, and commitments extracted automatically from every conversation, recalled on demand via search, timeline, and tree browsing.
Memory is the persistence layer behind every agent relationship. Each conversation is analyzed to extract facts, events, and commitments — stored in a structured tree and recalled automatically before the next response. Memory also composes directly with Scheduled Reminders: when a reminder fires and the user replies, the reply is captured as a new memory fact. It feeds Agent Insights too — habits, goals, and interests are derived signals aggregated over memory facts.
What you can build with it
- Relationship-arc companions — agents that reference shared history ("the week you were stressed about finals") to deepen connection over months
- Context-aware work assistants — skip re-asking for role, preferences, and recent tickets by seeding from CRM data on first run
- Compliance-ready enterprise agents — every recalled fact carries source message IDs, making the agent's reasoning auditable at review time
- Adherence dashboards — query memory after reminders fire to build medication or habit compliance views without a separate database
- Shared-memories UX — render the timeline endpoint as a browsable "our story" view inside companion or wellness apps
Quickstart
Search a user's memory by semantic query, then list the top-level tree for context.
import { Sonzai } from "@sonzai-labs/agents";
const client = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
// Semantic search — scoped to a specific user
const results = await client.agents.memory.search("agent-id", {
query: "hiking trip",
userId: "user-123", // optional: omit to search across all users for this agent
limit: 10,
});
for (const mem of results.results) {
console.log(mem.content, mem.factType, mem.score);
}
// Browse the tree
const tree = await client.agents.memory.list("agent-id", {
userId: "user-123",
limit: 20,
});
Core concepts
Memory tree structure
Memory is organized as a hierarchical tree of nodes, each with a NodeID, Title, Summary, and optional child nodes. Nodes act as thematic containers — "Jane's work life," "travel experiences" — and hold atomic facts beneath them. You can navigate the tree by passing parentID to list, or fetch a subtree with includeContents: true to pull a node's facts in one call.
Key MemoryNode fields: NodeID, AgentID, UserID, ParentID, Title, Summary, Importance, CreatedAt, UpdatedAt.
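To render the tree in a UI, a flat page of nodes can be grouped by parent link on the client. The sketch below is a minimal illustration, not the SDK's own helper: the MemoryNodeLike interface and its camelCase field names (nodeId, parentId) are assumptions modeled on the MemoryNode fields listed above, and buildTree and renderTree are hypothetical names.

```typescript
// Hypothetical shape: camelCase stand-ins for the MemoryNode fields listed above.
interface MemoryNodeLike {
  nodeId: string;
  parentId?: string | null;
  title: string;
  importance: number;
}

interface TreeEntry {
  node: MemoryNodeLike;
  children: TreeEntry[];
}

// Group a flat list of nodes into a nested tree using parentId links.
function buildTree(nodes: MemoryNodeLike[]): TreeEntry[] {
  const byId: Record<string, TreeEntry> = {};
  nodes.forEach((node) => {
    byId[node.nodeId] = { node: node, children: [] };
  });
  const roots: TreeEntry[] = [];
  nodes.forEach((node) => {
    const entry = byId[node.nodeId];
    const parent = node.parentId ? byId[node.parentId] : undefined;
    // Nodes with no parent, or whose parent is not in this list, become roots.
    (parent ? parent.children : roots).push(entry);
  });
  return roots;
}

// Render the tree as indented titles, one line per node.
function renderTree(entries: TreeEntry[], depth: number = 0): string[] {
  let lines: string[] = [];
  entries.forEach((entry) => {
    const indent = new Array(depth + 1).join("  ");
    lines.push(indent + entry.node.title);
    lines = lines.concat(renderTree(entry.children, depth + 1));
  });
  return lines;
}
```

Treating a node whose parent is missing from the list as a root keeps partial pages renderable, which matters when list is called with a limit.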
Facts vs summaries
Facts are atomic, source-anchored statements ("User is a senior product manager at Acme Corp"). Every fact traces back to a specific message in a real conversation — the agent cannot hallucinate memories. Summaries are auto-generated consolidations written at session end, giving long conversations a compact digest. Both live in the tree and both appear in search results.
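Because facts are source-anchored, an audit view can be built by indexing facts under the message they came from. The following is a rough sketch under stated assumptions: the field name sourceMessageIds is hypothetical (the source only says that recalled facts carry source message IDs), and FactLike, indexBySourceMessage, and withoutSource are illustrative names rather than SDK types.

```typescript
// Hypothetical shape: sourceMessageIds is an assumed field name.
interface FactLike {
  factId: string;
  content: string;
  sourceMessageIds?: string[];
}

// Index facts by the message they were extracted from, for audit views.
function indexBySourceMessage(facts: FactLike[]): Record<string, FactLike[]> {
  const index: Record<string, FactLike[]> = {};
  facts.forEach((fact) => {
    (fact.sourceMessageIds || []).forEach((messageId) => {
      if (!index[messageId]) {
        index[messageId] = [];
      }
      index[messageId].push(fact);
    });
  });
  return index;
}

// Facts that list no source message, which a reviewer may want to inspect separately.
function withoutSource(facts: FactLike[]): FactLike[] {
  return facts.filter(
    (fact) => !fact.sourceMessageIds || fact.sourceMessageIds.length === 0
  );
}
```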
Timeline queries
timeline returns a chronological view organized by session — each TimelineSession carries session_id, facts, first_fact_at, last_fact_at, and fact_count. Use it to render episodic history in your UI or to audit what was extracted from a specific time window.
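Auditing a time window comes down to filtering sessions by their first and last fact timestamps. The sketch below uses the TimelineSession field names given above and assumes the timestamps are ISO 8601 strings; TimelineSessionLike, sessionsInWindow, and describeSession are hypothetical helper names, not part of the SDK.

```typescript
// Field names follow the TimelineSession fields named above.
// Assumption: timestamps are ISO 8601 strings.
interface TimelineSessionLike {
  session_id: string;
  first_fact_at: string;
  last_fact_at: string;
  fact_count: number;
}

// Keep sessions that overlap the window, ordered by their first fact.
function sessionsInWindow(
  sessions: TimelineSessionLike[],
  fromIso: string,
  toIso: string
): TimelineSessionLike[] {
  const from = Date.parse(fromIso);
  const to = Date.parse(toIso);
  return sessions
    .filter(
      (s) => Date.parse(s.first_fact_at) <= to && Date.parse(s.last_fact_at) >= from
    )
    .sort((a, b) => Date.parse(a.first_fact_at) - Date.parse(b.first_fact_at));
}

// One-line label for a session, suitable for a timeline list item.
function describeSession(s: TimelineSessionLike): string {
  const minutes = Math.round(
    (Date.parse(s.last_fact_at) - Date.parse(s.first_fact_at)) / 60000
  );
  return `${s.session_id}: ${s.fact_count} facts over ${minutes} min`;
}
```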
Reset and scoping
reset deletes all memory for an agent–user pair and is irreversible. Use it for testing, right-to-erasure requests, or account handoffs. Memory operations such as seed, createFact, and search accept an instanceId that scopes memory to a workspace or tenant, which prevents cross-boundary leakage in multi-tenant deployments.
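One way to avoid forgetting instanceId on a call is to stamp it onto every options object in a single place. This is only a sketch: instanceId is the option named above, while scopeTo and the Options type are hypothetical helpers that exist purely for illustration.

```typescript
// Loose options bag; the real SDK option types are not reproduced here.
type Options = { [key: string]: unknown };

// Returns a function that adds instanceId to any options object and
// refuses options that already target a different instance.
function scopeTo(instanceId: string): (opts: Options) => Options {
  return (opts) => {
    const requested = opts["instanceId"];
    if (requested !== undefined && requested !== instanceId) {
      throw new Error(
        `options target instance "${String(requested)}", expected "${instanceId}"`
      );
    }
    return { ...opts, instanceId: instanceId };
  };
}
```

A caller would create the helper once per tenant, for example const forTenantA = scopeTo("tenant-a"), and wrap the options passed to seed, createFact, or search with it.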
Sync vs async memory recall
Supplementary memory recall — the extra fact lookups that enrich each turn beyond the agent's automatic working set — runs synchronously by default: every fact lands in the current turn before generation starts. Switch to async when first-token latency matters more than completeness; recall races a deadline, and slow hits spill into the next turn.
memory_mode (memoryMode in the TypeScript SDK) is an agent-wide capability. Set it once via update_capabilities() (updateCapabilities in TypeScript); every subsequent chat uses that mode until you change it. There is no equivalent at agent-creation time: create the agent first, then flip the mode.
// Read current capabilities
const caps = await client.agents.getCapabilities("agent-id");
console.log(caps.memoryMode); // "sync" or "async"
// Switch to async for lower first-token latency
await client.agents.updateCapabilities("agent-id", { memoryMode: "async" });
// Switch back to sync
await client.agents.updateCapabilities("agent-id", { memoryMode: "sync" });
When to pick async: high-volume voice agents, mobile clients on slow networks, or any setup where missing one or two enrichment facts is preferable to a 200 ms latency spike. The agent's automatic working set still lands on every turn; only supplementary recall slips.
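The trade-off between the two modes can be pictured with a small model. The sketch below is illustrative only and does not describe the platform's actual scheduler: Lookup, RecallOutcome, and simulateRecall are hypothetical names, and the latencies and the deadline are made-up inputs.

```typescript
// Illustrative model only; not the platform's implementation.
interface Lookup {
  factId: string;
  latencyMs: number;
}

interface RecallOutcome {
  thisTurn: string[]; // facts available before generation starts
  nextTurn: string[]; // facts that missed the deadline and spill over
  waitedMs: number; // time spent waiting on supplementary recall
}

function simulateRecall(
  lookups: Lookup[],
  mode: "sync" | "async",
  deadlineMs: number
): RecallOutcome {
  const slowest = lookups.reduce((max, l) => Math.max(max, l.latencyMs), 0);
  if (mode === "sync") {
    // Sync waits for every lookup, so the slowest one sets the wait.
    return {
      thisTurn: lookups.map((l) => l.factId),
      nextTurn: [],
      waitedMs: slowest,
    };
  }
  // Async stops waiting at the deadline; late lookups spill into the next turn.
  return {
    thisTurn: lookups.filter((l) => l.latencyMs <= deadlineMs).map((l) => l.factId),
    nextTurn: lookups.filter((l) => l.latencyMs > deadlineMs).map((l) => l.factId),
    waitedMs: Math.min(slowest, deadlineMs),
  };
}
```

In this model, a single slow lookup delays a sync turn by its full latency, while an async turn waits no longer than the deadline and picks the late fact up afterwards.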
Pending capabilities
AgentCapabilities.pendingCapabilities is a list of capability changes that the platform has queued but not yet applied, for example a tier upgrade that will unlock music or video generation. Each entry carries a capability name (string) and an optional context string with human-readable detail. Read it via get_capabilities() (getCapabilities in TypeScript) to surface upgrade status in your UI.
const caps = await client.agents.getCapabilities("agent-id");
for (const pending of caps.pendingCapabilities ?? []) {
console.log(pending.capability, pending.context);
// e.g. "musicGeneration" "Scheduled for activation on plan upgrade"
}
Full API
All methods are on client.agents.memory.* (TS/Python) or client.Agents.Memory (Go). Full request/response shapes live in the API reference.
| Method | Returns | Description |
|---|---|---|
| list(agentID, opts) | MemoryTreeResponse | Browse the memory tree, optionally rooted at a parentID. Pass memory_type to filter results to a specific memory category: "factual", "episodic", "semantic", "procedural", "identity", "temporal", or "relational". This is a post-fetch filter applied on the result set; it does not reduce server-side I/O, so the limit applies before filtering. |
| search(agentID, opts) | MemorySearchResponse | Semantic/keyword search; returns Results[] with content, factType, score. Pass userId (user_id in Python/JSON) to scope results to a single user; omit to search across all users for the agent. |
| timeline(agentID, opts) | MemoryTimelineResponse | Chronological sessions with first_fact_at, last_fact_at, fact_count |
| listFacts(agentID, opts) | FactListResponse | Paginated flat list of atomic facts; response has Facts, TotalCount, HasMore |
| reset(agentID, opts) | MemoryResetResponse | Delete all memory for an agent–user pair |
| createFact(agentID, opts) | AtomicFact | Manually insert a fact tagged source_type="manual" |
| updateFact(agentID, factID, opts) | AtomicFact | Patch content, type, importance, or confidence of an existing fact |
| deleteFact(agentID, factID) | void | Remove a single fact by ID |
| seed(agentID, opts) | SeedMemoriesResponse | Bulk-import initial memories without an AI generation step |
| deleteWisdomFact(agentID, factID) | DeleteWisdomResponse | Remove a wisdom-layer fact |
| getWisdomAudit(agentID, factID) | WisdomAuditResponse | Full audit trail for a wisdom fact |
| getFactHistory(agentID, factID) | FactHistoryResponse | Version history for a specific fact |
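The note on list says memory_type filtering happens after the limit is applied, which means a page can come back with fewer entries than limit. The sketch below models that ordering on a local array to make the effect visible; it is not the server's code, and NodeLike, limitThenFilter, and filterThenLimit are hypothetical names.

```typescript
// Local stand-in for a tree entry; memoryType mirrors the memory_type filter values.
interface NodeLike {
  title: string;
  memoryType: string;
}

// Documented ordering: take `limit` entries first, then filter by type.
function limitThenFilter(nodes: NodeLike[], limit: number, memoryType: string): NodeLike[] {
  return nodes.slice(0, limit).filter((n) => n.memoryType === memoryType);
}

// Shown only for contrast: filtering first would fill the page when enough matches exist.
function filterThenLimit(nodes: NodeLike[], limit: number, memoryType: string): NodeLike[] {
  return nodes.filter((n) => n.memoryType === memoryType).slice(0, limit);
}
```

With a mixed result set and a small limit, limitThenFilter can return a short or even empty page although more matching entries exist, so callers that need a fixed number of entries of one type may want to request a larger limit.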
Combines with other features
With Scheduled Reminders — responses populate memory
When a scheduled reminder fires and the user replies, the memory layer auto-captures the reply as a fact. Query those facts later to build a compliance view or adherence dashboard without an extra database.
// After a week of daily medication reminders, query the captured replies
const memories = await client.agents.memory.search("agent-id", {
query: "medication taken ibuprofen",
userId: "user-123", // scope to one user; omit to search across all users for this agent
limit: 10,
});
for (const result of memories.results) {
console.log(result.content, result.score);
// "User confirmed taking 500mg ibuprofen at 08:14" 0.89
}
The full reminder-to-memory flow is shown in the Medication Reminders tutorial.
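Turning captured replies into an adherence view is mostly a grouping step. The following is a minimal sketch under several assumptions: each result is taken to expose a createdAt ISO 8601 UTC timestamp alongside content (the source only shows content and score), the "confirmed taking" keyword match is a crude stand-in for real classification, and CapturedReply, adherenceByDay, and adherenceRate are hypothetical names.

```typescript
// Hypothetical shape: createdAt is an assumed field, expected as an ISO 8601 UTC timestamp.
interface CapturedReply {
  content: string;
  createdAt: string;
}

// Map each expected day (YYYY-MM-DD) to whether a confirming reply was captured.
function adherenceByDay(replies: CapturedReply[], days: string[]): Record<string, boolean> {
  const view: Record<string, boolean> = {};
  days.forEach((day) => {
    view[day] = false;
  });
  replies.forEach((reply) => {
    const day = reply.createdAt.slice(0, 10); // calendar date part of the timestamp
    const tracked = Object.prototype.hasOwnProperty.call(view, day);
    // Keyword match is a placeholder for a proper "dose taken" signal.
    if (tracked && /confirmed taking/i.test(reply.content)) {
      view[day] = true;
    }
  });
  return view;
}

// Share of expected days with a confirmation, between 0 and 1.
function adherenceRate(view: Record<string, boolean>): number {
  const days = Object.keys(view);
  if (days.length === 0) {
    return 0;
  }
  return days.filter((day) => view[day]).length / days.length;
}
```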
With Conversations — every turn writes memory
Memory is fully automatic during chat — you do not call any write endpoint yourself. The platform analyzes each conversation turn, extracts facts, events, and commitments, and stores them in the tree. The next time you call chat for that agent–user pair, the most relevant memories are assembled into context automatically.
// Just call chat — memory extraction and retrieval happen on every turn
const stream = client.agents.chat.stream("agent-id", {
userId: "user-123",
messages: [{ role: "user", content: "I've been training for a half marathon." }],
});
// After the conversation, the fact "user is training for a half marathon"
// is stored automatically — no extra call needed.
const results = await client.agents.memory.search("agent-id", {
query: "running training marathon",
userId: "user-123", // same user as the chat above
limit: 5,
});
console.log(results.results[0].content);
// "User is training for a half marathon"
With Agent Insights — memory is the raw material
Habits, goals, interests, and mood trends are derived signals the context engine aggregates over memory facts. Memory is what the engine reads; Agent Insights is what the engine produces. Search memory for raw facts, then call Agent Insights to see what those facts have been distilled into.
// 1. Fetch raw memory facts about fitness
const facts = await client.agents.memory.search("agent-id", {
query: "exercise fitness running",
userId: "user-123", // match the user passed to listHabits below
limit: 10,
});
// 2. Fetch the derived habit signal the engine built from those facts
const habits = await client.agents.listHabits("agent-id", {
userId: "user-123",
});
console.log(habits.habits);
// [{ label: "Daily runner", frequency: "daily", confidence: 0.91 }]
Tutorials
- Memory — end-to-end walkthrough — covers seed, search, timeline, manual facts, and reset.
Next steps
- Agent Insights — derived signals (habits, goals, interests) built on top of memory facts.
- Scheduled Reminders — proactive messages whose user replies flow back into memory.
- Conversations — every chat turn is the primary source of memory writes.