Persona Generator
Generate user personas from database schema
The PersonaGenerator analyzes your database schema to infer realistic user personas - the different types of people who would query your database. These personas can be used with SchemaSynthesizer and BreadthEvolver to generate diverse, perspective-aware training data.
Basic Usage
import { PersonaGenerator } from '@deepagents/text2sql/synthesis';
// `adapter` is your already-configured database adapter (setup not shown here)
const generator = new PersonaGenerator(adapter, { count: 5 });
const personas = await generator.generate();
// [
// {
// role: "Financial Analyst",
// perspective: "As financial analyst, I care about:\n- Revenue trends and forecasting..."
// },
// {
// role: "Customer Support Rep",
// perspective: "As customer support, I care about:\n- Quick lookups by order ID..."
// }
// ]
How It Works
PersonaGenerator examines your database schema to understand the domain and generate relevant personas:
- Introspects the schema (tables, columns, relationships)
- Analyzes table names and structure to infer the business domain
- Identifies the different user types who would query this data
- Generates a detailed perspective for each persona, including:
  - What questions they typically ask
  - What metrics/data points matter to them
  - How they prefer data formatted
  - Their priorities (speed vs accuracy, detail vs summary)
  - Domain-specific concerns relevant to their role
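The steps above can be pictured with a small, self-contained sketch. It is not the library's implementation: the `TableInfo` shape, `describeTable`, `buildPersonaPrompt`, and the prompt wording are all hypothetical, chosen only to show how introspected schema details might be turned into a persona-generation request for a model.

```typescript
// Hypothetical shapes for illustration; the real adapter output may differ.
interface ColumnInfo {
  name: string;
  type: string;
}

interface TableInfo {
  name: string;
  columns: ColumnInfo[];
  // Names of tables this table references through foreign keys.
  references: string[];
}

// Render one table as a compact line: name(columns) plus any referenced tables.
function describeTable(table: TableInfo): string {
  const cols = table.columns.map((c) => `${c.name} ${c.type}`).join(', ');
  const refs = table.references.length > 0 ? ` -> ${table.references.join(', ')}` : '';
  return `${table.name}(${cols})${refs}`;
}

// Assemble a prompt that asks a model for `count` personas (assumed wording).
function buildPersonaPrompt(tables: TableInfo[], count: number): string {
  return [
    'Schema:',
    ...tables.map(describeTable),
    '',
    `Infer the business domain, then list ${count} distinct user roles who would query this data.`,
    'For each role, describe typical questions, key metrics, preferred formatting, priorities, and domain-specific concerns.',
  ].join('\n');
}

const exampleTables: TableInfo[] = [
  {
    name: 'customers',
    columns: [{ name: 'id', type: 'integer' }, { name: 'email', type: 'text' }],
    references: [],
  },
  {
    name: 'orders',
    columns: [{ name: 'id', type: 'integer' }, { name: 'customer_id', type: 'integer' }],
    references: ['customers'],
  },
];

const examplePrompt = buildPersonaPrompt(exampleTables, 5);
```

In this sketch the schema summary carries table names, column types, and relationships, which is the information the list above says the generator relies on when inferring a domain and its user types.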
Configuration Options
interface PersonaGeneratorOptions {
/** Number of personas to generate (default: 5) */
count?: number;
/** Model to use for generation */
model?: AgentModel;
}
Number of Personas
Control diversity by adjusting count:
// Few personas - focused on main user types
const fewPersonas = await new PersonaGenerator(adapter, {
count: 3
}).generate();
// Many personas - maximum diversity
const manyPersonas = await new PersonaGenerator(adapter, {
count: 10
}).generate();
Custom Model
Override the default model:
import { groq } from '@ai-sdk/groq';
const personas = await new PersonaGenerator(adapter, {
count: 5,
model: groq('llama-3.3-70b-versatile')
}).generate();
Using with SchemaSynthesizer
Generate questions from different perspectives:
import { PersonaGenerator, SchemaSynthesizer, toPairs } from '@deepagents/text2sql/synthesis';
// Generate personas
const generator = new PersonaGenerator(adapter, { count: 5 });
const personas = await generator.generate();
// Use personas for question generation
const pairs = await toPairs(new SchemaSynthesizer(adapter, {
count: 10, // 10 questions per persona
complexity: 'medium',
personas: personas // Each persona gets their own questions
}));
// Result: 50 pairs (5 personas × 10 questions each)
Using with BreadthEvolver
Paraphrase questions from different perspectives:
import { PersonaGenerator, BreadthEvolver, toPairs } from '@deepagents/text2sql/synthesis';
const generator = new PersonaGenerator(adapter, { count: 3 });
const personas = await generator.generate();
const existingPairs = [
{ question: 'Show revenue by product', sql: 'SELECT ...', success: true }
];
// Generate variations from analyst perspective
const analystVariations = await toPairs(
new BreadthEvolver(existingPairs, {
count: 3,
persona: personas[0] // e.g. the Financial Analyst persona
})
);
// "What is the revenue breakdown by product?"
// "Display product-level revenue analysis"
// "Break down revenue by product category"
// Generate variations from support perspective
const supportVariations = await toPairs(
new BreadthEvolver(existingPairs, {
count: 3,
persona: personas[1] // e.g. the Customer Support Rep persona
})
);
// "Show me which products are bringing in money"
// "What's each product earning?"
// "List products and their sales"
Example Output
For an e-commerce database with orders, customers, and products tables:
[
{
role: "Customer Support Rep",
perspective: `As customer support, I care about:
- Quick lookups by order ID or customer email
- Order status and shipping tracking
- Return and refund history
- Customer contact details and order history
- I need fast answers, not complex analysis`
},
{
role: "Inventory Manager",
perspective: `As inventory manager, I care about:
- Current stock levels and reorder points
- Product availability across warehouses
- Slow-moving inventory identification
- Supplier lead times and pending orders
- I need accurate counts, often aggregated by location`
},
{
role: "Marketing Analyst",
perspective: `As marketing analyst, I care about:
- Customer acquisition and retention metrics
- Product performance and category trends
- Customer segmentation and lifetime value
- Campaign effectiveness and conversion rates
- I need historical trends and comparative analysis`
},
{
role: "Finance Controller",
perspective: `As finance controller, I care about:
- Revenue recognition and billing accuracy
- Payment status and accounts receivable aging
- Refund and chargeback tracking
- Period-over-period financial metrics
- I need precise numbers with audit trails`
},
{
role: "Executive",
perspective: `As executive, I care about:
- High-level KPIs and business health metrics
- Growth rates and market trends
- Performance against targets and forecasts
- Strategic insights, not operational details
- I need clear summaries with context`
}
]
Full Pipeline Example
Generate comprehensive, diverse training data:
import {
PersonaGenerator,
TeachingsGenerator,
SchemaSynthesizer,
BreadthEvolver,
toPairs
} from '@deepagents/text2sql/synthesis';
// 1. Generate personas
const personaGen = new PersonaGenerator(adapter, { count: 10 });
const personas = await personaGen.generate();
// 2. Generate teachings
const teachingsGen = new TeachingsGenerator(adapter);
const teachings = await teachingsGen.generate();
// 3. Generate base pairs with personas and teachings
const basePairs = await toPairs(new SchemaSynthesizer(adapter, {
count: 5,
complexity: ['low', 'medium', 'hard'],
personas: personas,
teachings: teachings
}));
// 150 pairs (10 personas × 3 complexities × 5 questions)
// 4. Evolve with persona-specific paraphrases
const evolvedPairs = [];
for (const persona of personas.slice(0, 3)) { // Use the first 3 personas
const variations = await toPairs(
new BreadthEvolver(basePairs.slice(0, 20), { // Evolve the first 20 pairs
count: 2,
persona: persona
})
);
evolvedPairs.push(...variations);
}
console.log(`Total: ${basePairs.length + evolvedPairs.length} pairs`);
Best Practices
- Match count to domain complexity - Simple schemas need fewer personas (3-5); complex domains benefit from more (8-12)
- Review generated personas - Verify they match actual users of your database
- Combine with teachings - Personas + teachings = contextually rich training data
- Use selectively in evolution - Don't paraphrase every question from every persona; choose strategically
- Test persona coverage - Ensure generated questions span the full range of user needs
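As a starting point for reviewing generated personas, the sketch below flags entries with duplicate roles or empty perspectives before they are passed on to question generation. The `Persona` shape mirrors the `role` and `perspective` fields shown in the example output; `reviewPersonas`, `PersonaReview`, and the issue messages are hypothetical helpers, not part of the library.

```typescript
// Shape inferred from the example output; the library's own type may carry more fields.
interface Persona {
  role: string;
  perspective: string;
}

interface PersonaReview {
  accepted: Persona[];
  issues: string[];
}

// Keep the first persona per role (case-insensitive) and report anything dropped.
function reviewPersonas(personas: Persona[]): PersonaReview {
  const seen = new Set<string>();
  const accepted: Persona[] = [];
  const issues: string[] = [];

  for (const persona of personas) {
    const key = persona.role.trim().toLowerCase();
    if (key === '') {
      issues.push('Persona with an empty role');
    } else if (persona.perspective.trim() === '') {
      issues.push(`"${persona.role}" has an empty perspective`);
    } else if (seen.has(key)) {
      issues.push(`Duplicate role: "${persona.role}"`);
    } else {
      seen.add(key);
      accepted.push(persona);
    }
  }
  return { accepted, issues };
}

const review = reviewPersonas([
  { role: 'Inventory Manager', perspective: 'As inventory manager, I care about stock levels' },
  { role: 'inventory manager', perspective: 'Duplicate entry with different casing' },
  { role: 'Executive', perspective: '' },
]);
```

A mechanical check like this only catches structural problems. Whether the roles match the actual users of your database still needs a human look.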
Schema-Specific Examples
Healthcare Database
// Generates: Doctor, Nurse, Administrator, Billing Specialist, Quality Analyst
Financial Database
// Generates: Trader, Risk Manager, Compliance Officer, Portfolio Manager, Analyst
SaaS/Product Database
// Generates: Product Manager, Engineer, Customer Success, Sales, Growth Analyst
The generator adapts to your specific schema - the personas it creates will be relevant to your actual tables and data structure.