
Embedders

Configure embedding models for vector representations

Embedders convert text into vector representations (embeddings) that enable semantic similarity search. The retrieval package includes FastEmbed with multiple model options.
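To make "semantic similarity" concrete: vector search ranks stored embeddings by how closely they point in the same direction as the query embedding. A minimal sketch of the usual scoring function, cosine similarity (not part of the package API, just the underlying idea):

// Cosine similarity between two embedding vectors:
// 1 = same direction, 0 = unrelated, -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}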

FastEmbed

FastEmbed is a lightweight, fast embedding library that runs locally. It downloads models on first use and caches them for subsequent runs.

Import

import { fastembed } from '@deepagents/retrieval';

Basic Usage

import { fastembed, nodeSQLite, similaritySearch } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';

// Default model: BGE-small-en-v1.5 (384 dimensions)
const embedder = fastembed();
const store = nodeSQLite('./docs.db', 384);

const results = await similaritySearch('authentication', {
  connector: local('**/*.md'),
  store,
  embedder,
});

Options

import { EmbeddingModel } from 'fastembed';

fastembed({
  model?: EmbeddingModel;  // Embedding model (default: BGESmallENV15)
  batchSize?: number;      // Batch size for embedding (default: varies by model)
  cacheDir?: string;       // Custom cache directory for models
})

Available Models

Model             | ID                           | Dimensions | Notes
BGE-small-en-v1.5 | EmbeddingModel.BGESmallENV15 | 384        | Default, good balance
BGE-base-en-v1.5  | EmbeddingModel.BGEBaseENV15  | 768        | Higher quality, slower
BGE-small-en      | EmbeddingModel.BGESmallEN    | 384        | Earlier version
BGE-base-en       | EmbeddingModel.BGEBaseEN     | 768        | Earlier version
BGE-small-zh      | EmbeddingModel.BGESmallZH    | 384        | Chinese language
All-MiniLM-L6-v2  | EmbeddingModel.AllMiniLML6V2 | 384        | Fast, lightweight
MLE5-Large        | EmbeddingModel.MLE5Large     | 1024       | Highest quality

Model Selection

import { fastembed } from '@deepagents/retrieval';
import { EmbeddingModel } from 'fastembed';

// High quality (slower, more memory)
const highQuality = fastembed({
  model: EmbeddingModel.BGEBaseENV15,
});

// Fast and lightweight
const fast = fastembed({
  model: EmbeddingModel.AllMiniLML6V2,
});

// Chinese content
const chinese = fastembed({
  model: EmbeddingModel.BGESmallZH,
});

Dimension Matching

The store dimension must match your embedding model's output dimension:

import { fastembed, nodeSQLite } from '@deepagents/retrieval';
import { EmbeddingModel } from 'fastembed';

// 384-dimension models
const embedder384 = fastembed(); // BGE-small-en-v1.5
const store384 = nodeSQLite('./384.db', 384);

// 768-dimension models
const embedder768 = fastembed({ model: EmbeddingModel.BGEBaseENV15 });
const store768 = nodeSQLite('./768.db', 768);

// 1024-dimension models
const embedder1024 = fastembed({ model: EmbeddingModel.MLE5Large });
const store1024 = nodeSQLite('./1024.db', 1024);

Mismatched dimensions will cause runtime errors.
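If you prefer to fail fast rather than hit a store error mid-ingest, you can probe the embedder once and assert its output dimension before creating the store. A minimal sketch, assuming the Embedder call signature documented below (the helper storeForEmbedder is hypothetical, not part of the package):

import { fastembed, nodeSQLite } from '@deepagents/retrieval';

// Hypothetical helper: embed one probe string and check the reported
// dimension against the dimension the store will be created with.
async function storeForEmbedder(
  embedder: (docs: string[]) => Promise<{
    embeddings: (number[] | Float32Array)[];
    dimensions: number;
  }>,
  path: string,
  expected: number,
) {
  const { dimensions } = await embedder(['dimension probe']);
  if (dimensions !== expected) {
    throw new Error(`Embedder outputs ${dimensions} dimensions, expected ${expected}`);
  }
  return nodeSQLite(path, expected);
}

const embedder = fastembed(); // BGE-small-en-v1.5, 384 dimensions
const store = await storeForEmbedder(embedder, './docs.db', 384);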

Batch Processing

FastEmbed processes embeddings in batches to optimize memory usage:

// Custom batch size for memory-constrained environments
const embedder = fastembed({
  batchSize: 16, // Smaller batches use less memory
});

// Larger batches for faster processing
const fastEmbedder = fastembed({
  batchSize: 64,
});

Model Caching

Models are downloaded on first use and cached locally:

// Custom cache directory
const embedder = fastembed({
  cacheDir: '/path/to/cache',
});

Default cache location varies by OS:

  • macOS: ~/Library/Caches/fastembed
  • Linux: ~/.cache/fastembed
  • Windows: %LOCALAPPDATA%\fastembed
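Because the first embedding call triggers the model download, it can be worth warming the cache at build or deploy time instead of on the first user request. A minimal sketch, assuming that running the embedder once is enough to populate the cache (the cacheDir path is only an example):

import { fastembed } from '@deepagents/retrieval';

// warm-cache.ts: run once during build/deploy so the model download
// does not land on the first production request.
const embedder = fastembed({ cacheDir: './models' });
await embedder(['cache warm-up']);
console.log('FastEmbed model downloaded and cached');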

Embedder Interface

All embedders implement this interface:

type Embedder = (documents: string[]) => Promise<{
  embeddings: (number[] | Float32Array)[];
  dimensions: number;
}>;

You can create custom embedders that follow this interface:

// Assumption: the Embedder type is exported from the package root.
import type { Embedder } from '@deepagents/retrieval';

const customEmbedder: Embedder = async (documents) => {
  // Call your embedding API
  const response = await fetch('https://api.example.com/embed', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ texts: documents }),
  });

  if (!response.ok) {
    throw new Error(`Embedding request failed: ${response.status}`);
  }

  const data = await response.json();

  return {
    embeddings: data.embeddings,
    dimensions: data.dimensions,
  };
};
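A custom embedder drops in anywhere a built-in one is accepted. Continuing the sketch above (the store dimension of 1536 is illustrative and must match whatever your API actually returns):

import { nodeSQLite, similaritySearch } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';

// Store dimension must match the remote API's output; 1536 is an example.
const store = nodeSQLite('./custom.db', 1536);

const results = await similaritySearch('authentication', {
  connector: local('**/*.md'),
  store,
  embedder: customEmbedder,
});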

Real-World Examples

Quality vs Speed Trade-off

import { fastembed, nodeSQLite, similaritySearch } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';
import { EmbeddingModel } from 'fastembed';

// For development: fast iteration
const devEmbedder = fastembed({
  model: EmbeddingModel.AllMiniLML6V2,
  batchSize: 32,
});
const devStore = nodeSQLite('./dev.db', 384);

// For production: highest quality
const prodEmbedder = fastembed({
  model: EmbeddingModel.BGEBaseENV15,
  batchSize: 16,
});
const prodStore = nodeSQLite('./prod.db', 768);

// Use environment to switch
const embedder = process.env.NODE_ENV === 'production'
  ? prodEmbedder
  : devEmbedder;

const store = process.env.NODE_ENV === 'production'
  ? prodStore
  : devStore;

Reusable Embedder Instance

import { fastembed, nodeSQLite, similaritySearch, ingest } from '@deepagents/retrieval';
import { local, github } from '@deepagents/retrieval/connectors';

// Create embedder once, reuse across operations
const embedder = fastembed();
const store = nodeSQLite('./knowledge.db', 384);

// Index local docs
await ingest({
  connector: local('docs/**/*.md'),
  store,
  embedder,
});

// Index GitHub docs
await ingest({
  connector: github.file('vercel/next.js/README.md'),
  store,
  embedder,
});

// Search with same embedder
const results = await similaritySearch('routing', {
  connector: local('docs/**/*.md'),
  store,
  embedder,
});

Memory-Efficient Processing

import { fastembed, nodeSQLite, ingest } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';

// For large codebases, use smaller batches
const embedder = fastembed({
  batchSize: 8, // Reduce memory usage
});

// The ingestion pipeline also batches internally (40 chunks at a time)
await ingest({
  connector: local('**/*.ts'),
  store: nodeSQLite('./code.db', 384),
  embedder,
});

Next Steps