From CRM / CSV
Migrate structured tabular data (Salesforce and HubSpot contacts, product ownership tables, subscription rosters) into Sonzai. There are two patterns: one for contact rosters and one for inventory-style facts.
Two CSV patterns, two endpoints
Structured data in Sonzai splits into two very different import shapes:
- Contact roster: one row per person, with attributes like company, title, email. You want one Sonzai user per row. → Use batch_import with metadata fields.
- Inventory table: many rows per person, each describing something the user owns, bought, subscribes to, or holds. You want many facts per user. → Use structured_import on prime_user.
They're often used together: import contacts first with batch_import, then enrich each contact with their inventory via structured_import.
Pattern 1: Contact roster → users
Salesforce / HubSpot / Pipedrive / custom CRM exports look like this:
contact_id,first_name,last_name,email,company,title,phone,last_contacted
c_001,Mia,Tanaka,[email protected],Acme,Platform Lead,+81-90-...,2025-11-02
c_002,Ren,Park,[email protected],Beta Labs,CTO,+82-10-...,2025-10-20
Map each row to a batchImportUser and send in chunks:
import csv
import os

from sonzai import Sonzai

sonzai = Sonzai(api_key=os.environ["SONZAI_API_KEY"])
AGENT_ID = "agent_abc"
BATCH_SIZE = 200  # users sent per batch_import call

def load_contacts(path: str):
    # newline="" lets the csv module handle quoted line breaks; the export must be UTF-8
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def to_sonzai_user(row):
    # Empty CSV cells become None so blank metadata is not sent as empty strings
    return {
        "user_id": row["contact_id"],
        "display_name": f"{row['first_name']} {row['last_name']}".strip(),
        "metadata": {
            "company": row.get("company") or None,
            "title": row.get("title") or None,
            "email": row.get("email") or None,
            "phone": row.get("phone") or None,
            "custom": {
                "last_contacted": row.get("last_contacted") or "",
            },
        },
    }

def migrate(path: str):
    rows = load_contacts(path)
    for i in range(0, len(rows), BATCH_SIZE):
        users = [to_sonzai_user(r) for r in rows[i : i + BATCH_SIZE]]
        job = sonzai.agents.priming.batch_import(
            AGENT_ID, source="crm", users=users,
        )
        print(f"batch {i // BATCH_SIZE}: {job.total_users} users, "
              f"{job.facts_created} facts from metadata")

Because the metadata fields (company, title, email, phone) are first-class, Sonzai generates facts from them synchronously, so the response's facts_created count is real and non-zero even without any content blocks.
Pattern 2: Inventory table → structured facts
For a table like "which products does each customer own", use structured_import. Each row becomes one fact shaped as User owns <label> with all the row's columns attached as structured metadata.
Sample CSV:
customer_id,product_sku,product_name,quantity,purchase_date,is_active
c_001,SKU-001,Hiking Backpack,1,2025-09-01,true
c_001,SKU-045,Water Bottle,2,2025-10-12,true
c_002,SKU-001,Hiking Backpack,1,2025-08-20,false
You call prime_user once per customer, passing that customer's rows as a CSV string plus a column mapping.
import fs from "fs";
import { parse } from "csv-parse/sync";
// stringify is published in the csv-stringify package, not csv-parse
import { stringify } from "csv-stringify/sync";
import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID = "agent_abc";
const COLUMNS = ["product_sku", "product_name", "quantity", "purchase_date", "is_active"];

async function importInventory(path: string) {
  const allRows: Record<string, string>[] =
    parse(fs.readFileSync(path), { columns: true, skip_empty_lines: true });

  // Group by customer
  const byCustomer = new Map<string, Record<string, string>[]>();
  for (const r of allRows) {
    if (!byCustomer.has(r.customer_id)) byCustomer.set(r.customer_id, []);
    byCustomer.get(r.customer_id)!.push(r);
  }

  for (const [userId, rows] of byCustomer) {
    // Serialize just this customer's rows back to CSV.
    // stringify quotes values that contain commas or quotes, which a manual join would corrupt.
    const csv = stringify(rows, { header: true, columns: COLUMNS }).trimEnd();

    await sonzai.agents.priming.primeUser(AGENT_ID, userId, {
      source: "crm_inventory",
      structured_import: {
        entity_type: "product",
        content_csv: csv,
        column_mapping: {
          product_sku: { property: "sku" },
          product_name: { property: "name", is_label: true },
          quantity: { property: "quantity", type: "number" },
          purchase_date: { property: "purchased_on" },
          is_active: { property: "active", type: "boolean" },
        },
      },
    });
  }
}

Column mapping rules
Each entry in column_mapping maps a CSV header to a metadata key on the resulting fact:
- property: the metadata key the column's value ends up under.
- is_label: exactly one column should be marked is_label: true. Its value becomes the human-readable name of the item (used to build the fact's text, e.g. "User owns Hiking Backpack", and to attempt knowledge base resolution).
- type: coerces values to "number" or "boolean". Default is string. Unparseable numbers fall back to string without error.
Omitted columns are ignored. Extra columns in the CSV not present in column_mapping are silently dropped.
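To make the mapping rules concrete, here is a minimal local sketch of how one CSV row could turn into a label plus metadata. It is an approximation for previewing your mapping before an import, not Sonzai's implementation: the helper names coerce and apply_mapping are hypothetical, and the accepted boolean spellings ("true"/"false", case-insensitive) are an assumption, since the rules above do not list them.

```python
from typing import Optional


def coerce(value: str, type_hint: Optional[str] = None):
    """Approximate the documented coercion rules for a single CSV cell."""
    if type_hint == "number":
        try:
            number = float(value)
        except ValueError:
            return value  # unparseable numbers fall back to string
        return int(number) if number.is_integer() else number
    if type_hint == "boolean":
        lowered = value.strip().lower()
        if lowered in ("true", "false"):  # assumption: only these spellings are recognized
            return lowered == "true"
        return value
    return value  # default type is string


def apply_mapping(row: dict, column_mapping: dict):
    """Return (label, metadata) for one row; unmapped CSV columns are dropped."""
    label = None
    metadata = {}
    for column, spec in column_mapping.items():
        if column not in row:
            continue  # mapping entry with no matching CSV column
        metadata[spec["property"]] = coerce(row[column], spec.get("type"))
        if spec.get("is_label"):
            label = row[column]
    return label, metadata
```

Running apply_mapping over a few rows of your export is a cheap way to spot a wrong is_label column or a numeric column that contains stray text before sending anything to the API.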
Optional: resolve against a knowledge base
Pass project_id inside structured_import to have Sonzai attempt to match each row's label against knowledge-base nodes of the given entity_type. Resolved rows get kb_node_id and kb_label stamped on their fact metadata, making them joinable with your product catalog.
{
  "structured_import": {
    "entity_type": "product",
    "project_id": "proj_xyz",
    "content_csv": "...",
    "column_mapping": { ... }
  }
}
The response's kb_resolved and unresolved counts tell you the match rate. Upload your product catalog to the KB first (see Knowledge Base) and it'll work on subsequent imports.
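As a small illustration of reading those counts, the match rate is the resolved share of all rows. The helper name kb_match_rate is hypothetical, and how you read kb_resolved and unresolved off the response object depends on your SDK, so the function takes plain integers.

```python
def kb_match_rate(kb_resolved: int, unresolved: int) -> float:
    """Fraction of imported rows that resolved to a knowledge-base node."""
    total = kb_resolved + unresolved
    # An import with no rows has no meaningful rate; report 0.0 rather than dividing by zero
    return kb_resolved / total if total else 0.0
```

A low rate on the first run usually means the catalog has not been uploaded yet or the label column holds IDs rather than names.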
Python SDK note
The sync Python SDK's prime_user() doesn't currently accept structured_import as a keyword. Use the async client (AsyncSonzai) which forwards extra kwargs, or fall back to an HTTP call via requests.post(). TypeScript and curl work as shown above.
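Whichever route you take from Python, the grouping and serialization step is the same as in the TypeScript example. The sketch below uses only the standard library to build one request body per customer; the helper names are hypothetical, and the body shape simply mirrors the fields shown above. Sending it (via AsyncSonzai or an HTTP call) is left out because the exact call signature and endpoint path are not given here.

```python
import csv
import io
from collections import defaultdict

INVENTORY_COLUMNS = ["product_sku", "product_name", "quantity", "purchase_date", "is_active"]

COLUMN_MAPPING = {
    "product_sku": {"property": "sku"},
    "product_name": {"property": "name", "is_label": True},
    "quantity": {"property": "quantity", "type": "number"},
    "purchase_date": {"property": "purchased_on"},
    "is_active": {"property": "active", "type": "boolean"},
}


def group_by_customer(rows):
    """Group inventory rows by customer_id, preserving first-seen order."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["customer_id"]].append(row)
    return dict(grouped)


def rows_to_csv(rows):
    """Serialize one customer's rows to CSV; csv.DictWriter quotes embedded commas."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=INVENTORY_COLUMNS,
        extrasaction="ignore",  # drop customer_id and any other unmapped columns
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().rstrip("\n")


def build_payload(rows):
    """Request body for one customer, mirroring the structured_import fields above."""
    return {
        "source": "crm_inventory",
        "structured_import": {
            "entity_type": "product",
            "content_csv": rows_to_csv(rows),
            "column_mapping": COLUMN_MAPPING,
        },
    }
```

Load the export with csv.DictReader, pass the rows to group_by_customer, and call build_payload once per customer.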
Verify
# After either pattern, inspect the agent's users
curl -s "https://api.sonz.ai/api/v1/agents/agent_abc/users?limit=10" \
  -H "Authorization: Bearer $SONZAI_API_KEY" | jq '.users[] | {user_id,display_name,metadata}'

# For structured_import, spot-check the inventory facts
curl -s "https://api.sonz.ai/api/v1/agents/agent_abc/memory/facts?user_id=c_001&source_type=structured_import" \
  -H "Authorization: Bearer $SONZAI_API_KEY" | jq '.facts[] | {content,metadata}'

Tips
- Combine patterns. Run batch_import first for all contacts (metadata), then loop with prime_user + structured_import for their inventory. The second call adds facts without disturbing the user record.
- Label choice matters. The is_label column becomes both the fact text ("User owns X") and the KB resolution query. Pick the column with human-readable names, not IDs: "Hiking Backpack" resolves; "SKU-001" doesn't.
- No LLM cost. structured_import skips LLM extraction entirely. Facts are deterministic from your CSV. That makes this the cheapest and most reliable path when you have clean tabular data.
- Encoding. UTF-8. If your CRM export is UTF-16 / Latin-1, convert first, because the CSV parser expects UTF-8.
What's next
- Knowledge base: upload a product catalog so structured_import can resolve rows to KB nodes.
- Inventory: the live API for querying and updating these facts after migration.
- Custom JSON: if your CRM has conversational notes you also want to migrate.