
Embedders

Configure embedding models for vector representations

Embedders convert text into vector representations (embeddings) that enable semantic similarity search. The retrieval package includes FastEmbed with multiple model options.
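To make "semantic similarity" concrete: vector search ranks stored embeddings by how closely they point in the same direction as the query embedding. A minimal sketch of the usual scoring function, cosine similarity (not part of the package API, just the underlying idea):

// Cosine similarity between two embedding vectors:
// 1 = same direction, 0 = unrelated, -1 = opposite.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}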

FastEmbed

FastEmbed is a lightweight, fast embedding library that runs locally. It downloads models on first use and caches them for subsequent runs.

Import

import { fastembed } from '@deepagents/retrieval';

Basic Usage

import { fastembed, nodeSQLite, similaritySearch } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';

// Default model: BGE-small-en-v1.5 (384 dimensions)
const embedder = fastembed();
const store = nodeSQLite('./docs.db', 384);

const results = await similaritySearch('authentication', {
  connector: local('**/*.md'),
  store,
  embedder,
});

Options

import { EmbeddingModel } from 'fastembed';

fastembed({
  model?: EmbeddingModel;  // Embedding model (default: BGESmallENV15)
  batchSize?: number;      // Batch size for embedding (default: varies by model)
  cacheDir?: string;       // Custom cache directory for models
})

Available Models

Model             | ID                           | Dimensions | Notes
BGE-small-en-v1.5 | EmbeddingModel.BGESmallENV15 | 384        | Default, good balance
BGE-base-en-v1.5  | EmbeddingModel.BGEBaseENV15  | 768        | Higher quality, slower
BGE-small-en      | EmbeddingModel.BGESmallEN    | 384        | Earlier version
BGE-base-en       | EmbeddingModel.BGEBaseEN     | 768        | Earlier version
BGE-small-zh      | EmbeddingModel.BGESmallZH    | 384        | Chinese language
All-MiniLM-L6-v2  | EmbeddingModel.AllMiniLML6V2 | 384        | Fast, lightweight
MLE5-Large        | EmbeddingModel.MLE5Large     | 1024       | Highest quality

Model Selection

import { fastembed } from '@deepagents/retrieval';
import { EmbeddingModel } from 'fastembed';

// High quality (slower, more memory)
const highQuality = fastembed({
  model: EmbeddingModel.BGEBaseENV15,
});

// Fast and lightweight
const fast = fastembed({
  model: EmbeddingModel.AllMiniLML6V2,
});

// Chinese content
const chinese = fastembed({
  model: EmbeddingModel.BGESmallZH,
});

Dimension Matching

The store dimension must match your embedding model's output dimension:

import { fastembed, nodeSQLite } from '@deepagents/retrieval';
import { EmbeddingModel } from 'fastembed';

// 384-dimension models
const embedder384 = fastembed(); // BGE-small-en-v1.5
const store384 = nodeSQLite('./384.db', 384);

// 768-dimension models
const embedder768 = fastembed({ model: EmbeddingModel.BGEBaseENV15 });
const store768 = nodeSQLite('./768.db', 768);

// 1024-dimension models
const embedder1024 = fastembed({ model: EmbeddingModel.MLE5Large });
const store1024 = nodeSQLite('./1024.db', 1024);

Mismatched dimensions will cause runtime errors.
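If you prefer to fail fast rather than hit a store error mid-ingest, you can probe the embedder once and assert its output dimension before creating the store. A minimal sketch, assuming the Embedder call signature documented below (the helper storeForEmbedder is hypothetical, not part of the package):

import { fastembed, nodeSQLite } from '@deepagents/retrieval';

// Hypothetical helper: embed one probe string and check the reported
// dimension against the dimension the store will be created with.
async function storeForEmbedder(
  embedder: (docs: string[]) => Promise<{
    embeddings: (number[] | Float32Array)[];
    dimensions: number;
  }>,
  path: string,
  expected: number,
) {
  const { dimensions } = await embedder(['dimension probe']);
  if (dimensions !== expected) {
    throw new Error(`Embedder outputs ${dimensions} dimensions, expected ${expected}`);
  }
  return nodeSQLite(path, expected);
}

const embedder = fastembed(); // BGE-small-en-v1.5, 384 dimensions
const store = await storeForEmbedder(embedder, './docs.db', 384);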

Batch Processing

FastEmbed processes embeddings in batches to optimize memory usage:

// Custom batch size for memory-constrained environments
const embedder = fastembed({
  batchSize: 16, // Smaller batches use less memory
});

// Larger batches for faster processing
const fastEmbedder = fastembed({
  batchSize: 64,
});

Model Caching

Models are downloaded on first use and cached locally:

// Custom cache directory
const embedder = fastembed({
  cacheDir: '/path/to/cache',
});

Default cache location varies by OS:

  • macOS: ~/Library/Caches/fastembed
  • Linux: ~/.cache/fastembed
  • Windows: %LOCALAPPDATA%\fastembed
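Because the first embedding call triggers the model download, it can be worth warming the cache at build or deploy time instead of on the first user request. A minimal sketch, assuming that running the embedder once is enough to populate the cache (the cacheDir path is only an example):

import { fastembed } from '@deepagents/retrieval';

// warm-cache.ts: run once during build/deploy so the model download
// does not land on the first production request.
const embedder = fastembed({ cacheDir: './models' });
await embedder(['cache warm-up']);
console.log('FastEmbed model downloaded and cached');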

Embedder Interface

All embedders implement this interface:

type Embedder = (documents: string[]) => Promise<{
  embeddings: (number[] | Float32Array)[];
  dimensions: number;
}>;

You can create custom embedders that follow this interface:

// Assumption: the Embedder type is exported from the package root.
import type { Embedder } from '@deepagents/retrieval';

const customEmbedder: Embedder = async (documents) => {
  // Call your embedding API
  const response = await fetch('https://api.example.com/embed', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ texts: documents }),
  });

  if (!response.ok) {
    throw new Error(`Embedding request failed: ${response.status}`);
  }

  const data = await response.json();

  return {
    embeddings: data.embeddings,
    dimensions: data.dimensions,
  };
};
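A custom embedder drops in anywhere a built-in one is accepted. Continuing the sketch above (the store dimension of 1536 is illustrative and must match whatever your API actually returns):

import { nodeSQLite, similaritySearch } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';

// Store dimension must match the remote API's output; 1536 is an example.
const store = nodeSQLite('./custom.db', 1536);

const results = await similaritySearch('authentication', {
  connector: local('**/*.md'),
  store,
  embedder: customEmbedder,
});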

Real-World Examples

Quality vs Speed Trade-off

import { fastembed, nodeSQLite, similaritySearch } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';
import { EmbeddingModel } from 'fastembed';

// For development: fast iteration
const devEmbedder = fastembed({
  model: EmbeddingModel.AllMiniLML6V2,
  batchSize: 32,
});
const devStore = nodeSQLite('./dev.db', 384);

// For production: highest quality
const prodEmbedder = fastembed({
  model: EmbeddingModel.BGEBaseENV15,
  batchSize: 16,
});
const prodStore = nodeSQLite('./prod.db', 768);

// Use environment to switch
const embedder = process.env.NODE_ENV === 'production'
  ? prodEmbedder
  : devEmbedder;

const store = process.env.NODE_ENV === 'production'
  ? prodStore
  : devStore;

Reusable Embedder Instance

import { fastembed, nodeSQLite, similaritySearch, ingest } from '@deepagents/retrieval';
import { local, github } from '@deepagents/retrieval/connectors';

// Create embedder once, reuse across operations
const embedder = fastembed();
const store = nodeSQLite('./knowledge.db', 384);

// Index local docs
await ingest({
  connector: local('docs/**/*.md'),
  store,
  embedder,
});

// Index GitHub docs
await ingest({
  connector: github.file('vercel/next.js/README.md'),
  store,
  embedder,
});

// Search with same embedder
const results = await similaritySearch('routing', {
  connector: local('docs/**/*.md'),
  store,
  embedder,
});

Memory-Efficient Processing

import { fastembed, nodeSQLite, ingest } from '@deepagents/retrieval';
import { local } from '@deepagents/retrieval/connectors';

// For large codebases, use smaller batches
const embedder = fastembed({
  batchSize: 8, // Reduce memory usage
});

// The ingestion pipeline also batches internally (40 chunks at a time)
await ingest({
  connector: local('**/*.ts'),
  store: nodeSQLite('./code.db', 384),
  embedder,
});

Next Steps