Migrate structured tabular data (Salesforce and HubSpot contacts, product ownership tables, subscription rosters) into Sonzai. This guide covers two patterns: one for contact rosters and one for inventory-style facts.

Two CSV patterns, two endpoints

Structured data in Sonzai splits into two very different import shapes:

Contact roster — one row per person, with attributes like company, title, email. You want one Sonzai user per row. → Use batch_import with metadata fields.
Inventory table — many rows per person, each describing something the user owns, bought, subscribes to, or holds. You want many facts per user. → Use structured_import on prime_user.

They're often used together: import contacts first with batch_import, then enrich each contact with their inventory via structured_import.
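
If it is unclear which shape an export has, a quick local check can help. The sketch below is a heuristic only, not part of the Sonzai SDK: the helper name classify_csv is hypothetical, and it assumes you know which column identifies the person. An inventory export in which every customer happens to own exactly one item would still be reported as a roster.

```python
import csv
import io
from collections import Counter


def classify_csv(csv_text: str, id_column: str) -> str:
    """Return "roster" if every id appears at most once, otherwise "inventory"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Count how many rows each person id has.
    counts = Counter(row[id_column] for row in reader)
    return "roster" if all(n == 1 for n in counts.values()) else "inventory"
```

A roster result points to batch_import; an inventory result points to prime_user with structured_import.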

Pattern 1: Contact roster → users

Salesforce / HubSpot / Pipedrive / custom CRM exports look like this:

contact_id,first_name,last_name,email,company,title,phone,last_contacted
c_001,Mia,Tanaka,[email protected],Acme,Platform Lead,+81-90-...,2025-11-02
c_002,Ren,Park,[email protected],Beta Labs,CTO,+82-10-...,2025-10-20

Map each row to a batchImportUser and send in chunks:

import csv
import os

from sonzai import Sonzai

sonzai = Sonzai(api_key=os.environ["SONZAI_API_KEY"])
AGENT_ID = "agent_abc"
BATCH_SIZE = 200  # users sent per batch_import call

def load_contacts(path: str):
    # newline="" lets the csv module handle quoted line breaks itself;
    # the export is expected to be UTF-8.
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def to_sonzai_user(row):
    # Empty CSV cells become None so blank values are not sent as metadata.
    return {
        "user_id": row["contact_id"],
        "display_name": f"{row['first_name']} {row['last_name']}".strip(),
        "metadata": {
            "company": row.get("company") or None,
            "title":   row.get("title") or None,
            "email":   row.get("email") or None,
            "phone":   row.get("phone") or None,
            "custom": {
                "last_contacted": row.get("last_contacted") or "",
            },
        },
    }

def migrate(path: str):
    rows = load_contacts(path)
    for i in range(0, len(rows), BATCH_SIZE):
        users = [to_sonzai_user(r) for r in rows[i : i + BATCH_SIZE]]
        job = sonzai.agents.priming.batch_import(
            AGENT_ID, source="crm", users=users,
        )
        print(f"batch {i // BATCH_SIZE}: {job.total_users} users, "
              f"{job.facts_created} facts from metadata")

Because the metadata fields (company, title, email, phone) are first-class, Sonzai generates facts from them synchronously — the response's facts_created count is real and non-zero even without any content blocks.
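
CRM exports often contain blank or repeated IDs. This guide does not say how batch_import treats them, so cleaning rows before sending is a cautious assumption rather than a documented requirement. The helpers below (dedupe_by_id and chunked, both hypothetical names) are a minimal sketch of that pre-flight step, using only the standard library.

```python
from typing import Dict, Iterator, List


def dedupe_by_id(rows: List[Dict[str, str]], id_column: str = "contact_id") -> List[Dict[str, str]]:
    """Drop rows with a blank id; when an id repeats, keep the last row seen."""
    latest: Dict[str, Dict[str, str]] = {}
    for row in rows:
        key = (row.get(id_column) or "").strip()
        if key:
            latest[key] = row  # later rows overwrite earlier ones
    return list(latest.values())


def chunked(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start : start + size]
```

Running the rows through dedupe_by_id before the chunking loop keeps each batch free of blank or duplicate user IDs.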

Pattern 2: Inventory table → structured facts

For a table like "which products does each customer own", use structured_import. Each row becomes one fact shaped as User owns <label> with all the row's columns attached as structured metadata.

Sample CSV:

customer_id,product_sku,product_name,quantity,purchase_date,is_active
c_001,SKU-001,Hiking Backpack,1,2025-09-01,true
c_001,SKU-045,Water Bottle,2,2025-10-12,true
c_002,SKU-001,Hiking Backpack,1,2025-08-20,false

You call prime_user once per customer, passing that customer's rows as a CSV string plus a column mapping.

import fs from "fs";
import { parse } from "csv-parse/sync";
// stringify ships in the separate csv-stringify package, not in csv-parse
import { stringify } from "csv-stringify/sync";
import { Sonzai } from "@sonzai-labs/agents";

const sonzai = new Sonzai({ apiKey: process.env.SONZAI_API_KEY! });
const AGENT_ID = "agent_abc";

// Columns forwarded to Sonzai; customer_id is carried by the userId instead
const INVENTORY_COLUMNS = [
  "product_sku", "product_name", "quantity", "purchase_date", "is_active",
];

async function importInventory(path: string) {
  const allRows: Record<string, string>[] =
    parse(fs.readFileSync(path), { columns: true, skip_empty_lines: true });

  // Group by customer
  const byCustomer = new Map<string, Record<string, string>[]>();
  for (const r of allRows) {
    if (!byCustomer.has(r.customer_id)) byCustomer.set(r.customer_id, []);
    byCustomer.get(r.customer_id)!.push(r);
  }

  for (const [userId, rows] of byCustomer) {
    // Serialize just this customer's rows back to CSV; stringify quotes
    // values that contain commas, quotes, or line breaks
    const csv = stringify(rows, { header: true, columns: INVENTORY_COLUMNS });

    await sonzai.agents.priming.primeUser(AGENT_ID, userId, {
      source: "crm_inventory",
      structured_import: {
        entity_type: "product",
        content_csv: csv,
        column_mapping: {
          product_sku:   { property: "sku" },
          product_name:  { property: "name", is_label: true },
          quantity:      { property: "quantity", type: "number" },
          purchase_date: { property: "purchased_on" },
          is_active:     { property: "active", type: "boolean" },
        },
      },
    });
  }
}
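
The group-and-serialize step from the TypeScript example can also be sketched in Python with only the standard library, which may be useful when building the content_csv strings outside the SDK. The function name csv_per_customer is hypothetical; the column names come from the sample CSV above.

```python
import csv
import io
from collections import defaultdict

INVENTORY_COLUMNS = ("product_sku", "product_name", "quantity", "purchase_date", "is_active")


def csv_per_customer(csv_text, id_column="customer_id", columns=INVENTORY_COLUMNS):
    """Return {customer_id: csv_string}, one CSV document per customer."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row[id_column]].append(row)

    result = {}
    for customer_id, rows in groups.items():
        buf = io.StringIO()
        # extrasaction="ignore" drops columns not listed in `columns` (here: the id column).
        writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore", lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)  # values containing commas are quoted automatically
        result[customer_id] = buf.getvalue()
    return result
```

Each value in the returned dictionary can be passed as content_csv for the matching user ID.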

Column mapping rules

Each entry in column_mapping maps a CSV header to a metadata key on the resulting fact:

property — the metadata key the column's value ends up under.
is_label — exactly one column should be marked is_label: true. Its value becomes the human-readable name of the item (used to build the fact's text, e.g. "User owns Hiking Backpack", and to attempt knowledge base resolution).
type — coerces values to "number" or "boolean". Default is string. Unparseable numbers fall back to string without error.

CSV columns that do not appear in column_mapping are silently dropped.
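
To preview what a mapping would produce before sending anything, the rules above can be imitated locally. This is a sketch of the documented behaviour, not Sonzai's implementation: the function names are hypothetical, and the boolean rule (only the text "true", case-insensitive, counts as true) is an assumption, since the accepted spellings are not listed here.

```python
def coerce(raw: str, type_name=None):
    """Apply the optional `type` rule to one CSV cell."""
    if type_name == "number":
        try:
            number = float(raw)
        except ValueError:
            return raw  # unparseable numbers stay strings, as described above
        return int(number) if number.is_integer() else number
    if type_name == "boolean":
        return raw.strip().lower() == "true"  # assumption: only "true" is truthy
    return raw  # default type is string


def preview_fact_metadata(row: dict, column_mapping: dict):
    """Return (label, metadata) for one CSV row under the given mapping."""
    label = None
    metadata = {}
    for column, rule in column_mapping.items():
        if column not in row:
            continue
        metadata[rule["property"]] = coerce(row[column], rule.get("type"))
        if rule.get("is_label"):
            label = row[column]
    return label, metadata
```

Columns of the row that the mapping does not mention, such as customer_id, never reach the metadata, which mirrors the drop rule.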

Optional: resolve against a knowledge base

Pass project_id inside structured_import to have Sonzai attempt to match each row's label against knowledge-base nodes of the given entity_type. Resolved rows get kb_node_id and kb_label stamped on their fact metadata, making them joinable with your product catalog.

{
  "structured_import": {
    "entity_type": "product",
    "project_id":  "proj_xyz",
    "content_csv": "...",
    "column_mapping": { ... }
  }
}

The response's kb_resolved and unresolved counts tell you the match rate. Resolution only works against nodes that already exist, so upload your product catalog to the KB (see Knowledge Base) before running the import.
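
The match rate can be computed from those two counts. The sketch below assumes that kb_resolved and unresolved together cover every imported row, which this guide implies but does not state; kb_match_rate is a hypothetical helper name.

```python
def kb_match_rate(kb_resolved: int, unresolved: int) -> float:
    """Fraction of rows resolved to a KB node; 0.0 when nothing was imported."""
    total = kb_resolved + unresolved
    return kb_resolved / total if total else 0.0
```

A low rate usually suggests checking the label column or the catalog contents before re-running the import.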

Python SDK note

The sync Python SDK's prime_user() doesn't currently accept structured_import as a keyword argument. Use the async client (AsyncSonzai), which forwards extra kwargs, or fall back to an HTTP call via requests.post(). TypeScript and curl work as shown above.
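
For the HTTP fallback, the request body can be assembled separately from the call itself. The sketch below mirrors the field names used in the TypeScript example; that the raw HTTP body has the same shape is an assumption, and the endpoint path is not given in this guide, so it is left to the caller. The function name build_prime_user_body is hypothetical.

```python
import json


def build_prime_user_body(csv_text, column_mapping, entity_type, source, project_id=None):
    """Return a JSON string shaped like the structured_import options shown above."""
    structured = {
        "entity_type": entity_type,
        "content_csv": csv_text,
        "column_mapping": column_mapping,
    }
    if project_id is not None:
        structured["project_id"] = project_id  # optional KB resolution
    return json.dumps({"source": source, "structured_import": structured})
```

The resulting string can be sent as the data argument of requests.post(), together with the Authorization header used in the curl examples and a Content-Type of application/json.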

Verify

# After either pattern, inspect the agent's users
curl -s "https://api.sonz.ai/api/v1/agents/agent_abc/users?limit=10" \
  -H "Authorization: Bearer $SONZAI_API_KEY" | jq '.users[] | {user_id,display_name,metadata}'

# For structured_import, spot-check the inventory facts
curl -s "https://api.sonz.ai/api/v1/agents/agent_abc/memory/facts?user_id=c_001&source_type=structured_import" \
  -H "Authorization: Bearer $SONZAI_API_KEY" | jq '.facts[] | {content,metadata}'
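
Because each inventory row becomes one fact, the source CSV gives the expected fact count per user, which can be compared with what the facts endpoint returns. The sketch below only does the counting and comparison; fetching the actual counts is left out, and both function names are hypothetical. It also assumes each user was imported once, since the effect of repeated imports on fact counts is not described here.

```python
import csv
import io
from collections import Counter


def expected_fact_counts(csv_text: str, id_column: str = "customer_id") -> Counter:
    """Count CSV rows per user id; one row should yield one fact."""
    return Counter(row[id_column] for row in csv.DictReader(io.StringIO(csv_text)))


def find_mismatches(expected: dict, actual: dict) -> dict:
    """Return {user_id: (expected, actual)} for users whose counts differ."""
    return {
        user_id: (count, actual.get(user_id, 0))
        for user_id, count in expected.items()
        if actual.get(user_id, 0) != count
    }
```

An empty result from find_mismatches means every user has as many structured_import facts as rows in the export.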

Tips

Combine patterns. Run batch_import first for all contacts (metadata), then loop with prime_user + structured_import for their inventory. The second call adds facts without disturbing the user record.
Label choice matters. The is_label column becomes both the fact text ("User owns X") and the KB resolution query. Pick the column with human-readable names, not IDs — "Hiking Backpack" resolves; "SKU-001" doesn't.
No LLM cost. structured_import skips LLM extraction entirely. Facts are deterministic from your CSV. That makes this the cheapest and most reliable path when you have clean tabular data.
Encoding. The CSV parser expects UTF-8. If your CRM export is UTF-16 or Latin-1, convert it first.
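
The encoding conversion from the last tip can be done with the standard library. This is a minimal sketch that assumes you already know the export's source encoding; transcode_to_utf8 is a hypothetical helper name.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from the export's encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)
    # Strip a leading byte-order mark if the decoder left one in place.
    return text.lstrip("\ufeff").encode("utf-8")
```

For a file on disk, read it in binary mode, pass the bytes through this function with "utf-16" or "latin-1", and write the result back in binary mode.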

What's next

Knowledge base — upload a product catalog so structured_import can resolve rows to KB nodes.
Inventory — the live API for querying and updating these facts after migration.
Custom JSON — if your CRM has conversational notes you also want to migrate.
