Deep Agents

Docker Sandbox

Execute real binaries in isolated Docker containers with three creation strategies

The Docker sandbox provides isolated code execution by running commands inside Docker containers. It supports three strategies for creating sandboxes, each suited to different use cases.

Strategies

Runtime: Starts a vanilla image and runs an ordered list of installers at container startup. Best for quick experiments and when dependencies change frequently.

import { createDockerSandbox, pkg } from '@deepagents/context';

const sandbox = await createDockerSandbox({
  image: 'alpine:latest',
  installers: [pkg(['curl', 'jq', 'python3'])],
  volumes: [
    {
      type: 'bind',
      hostPath: process.cwd(),
      containerPath: '/workspace',
      readOnly: true,
    },
  ],
  resources: { memory: '512m', cpus: 1 },
});

try {
  await sandbox.executeCommand('python3 --version');
} finally {
  await sandbox.dispose();
}

Default image: alpine:latest

Installers are the unit of "put a tool in the container." Built-ins: pkg([...]), urlBinary({...}), npm(...), pip(...), githubRelease({...}). They run in array order and share state (architecture, package manager, idempotent tool ensures) via an InstallerContext.

Package manager detection: Alpine images use apk, Debian-based images (debian, ubuntu, node, python) use apt-get. pkg([...]) picks automatically.
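The detection rule above can be pictured with a small sketch. It assumes the decision is made from the image reference alone; `guessPackageManager` and `installCommand` are hypothetical names used for illustration, not part of @deepagents/context, and the library may well probe the container instead.

```typescript
type PackageManager = 'apk' | 'apt-get';

// Hypothetical helper: infer the package manager from the image reference.
// Assumption: any reference containing "alpine" is Alpine-based; everything
// else is treated as Debian-based.
function guessPackageManager(image: string): PackageManager {
  return image.toLowerCase().includes('alpine') ? 'apk' : 'apt-get';
}

// Hypothetical helper: build the shell command that installs the packages.
function installCommand(image: string, packages: string[]): string {
  const list = packages.join(' ');
  return guessPackageManager(image) === 'apk'
    ? `apk add --no-cache ${list}`
    : `apt-get update && apt-get install -y ${list}`;
}
```

Under this assumption, node:lts-alpine resolves to apk while python:3.11-slim resolves to apt-get.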

Dockerfile: Builds a custom image from a Dockerfile. Best for reproducible environments with many dependencies. Content-based hashing provides automatic image caching: the same Dockerfile content yields the same image tag, so Docker skips the rebuild.

import { createDockerSandbox } from '@deepagents/context';

// Inline Dockerfile
const sandbox = await createDockerSandbox({
  dockerfile: `
    FROM python:3.11-slim
    RUN pip install pandas numpy matplotlib
  `,
  context: '.',
  volumes: [
    {
      type: 'bind',
      hostPath: process.cwd(),
      containerPath: '/workspace',
    },
  ],
});

// Or reference a Dockerfile path
const sandbox2 = await createDockerSandbox({
  dockerfile: './Dockerfile.sandbox',
  context: '.',
});

Caching: The image tag is sandbox-<sha256-first-12-chars>. Same Dockerfile content produces the same tag, so subsequent calls skip the build entirely.

Detection: A dockerfile string that contains a newline (\n) is treated as inline content; any other string is treated as a path to a Dockerfile.
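The caching and detection rules above are easy to reproduce in a few lines. This is a minimal sketch, not the library's code: `isInlineDockerfile` and `imageTagFor` are hypothetical names, and hashing the raw, untrimmed content is an assumption.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical helper mirroring the documented rule:
// a newline means inline content, otherwise the string is a path.
function isInlineDockerfile(dockerfile: string): boolean {
  return dockerfile.includes('\n');
}

// Hypothetical helper: sandbox-<first 12 hex chars of the SHA-256 digest>.
// Assumption: the digest is taken over the Dockerfile content exactly as given.
function imageTagFor(dockerfileContent: string): string {
  const digest = createHash('sha256').update(dockerfileContent).digest('hex');
  return `sandbox-${digest.slice(0, 12)}`;
}
```

Because the tag depends only on the content, changing a single character in the Dockerfile produces a different tag and therefore a fresh build.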

Compose: Manages multi-container environments using Docker Compose. Best for applications that need databases, APIs, or other services alongside the sandbox.

import { createDockerSandbox } from '@deepagents/context';

const sandbox = await createDockerSandbox({
  compose: './docker-compose.yml',
  service: 'app',
});

try {
  // Commands run in the 'app' service
  await sandbox.executeCommand('node --version');

  // Can reach other services by name
  await sandbox.executeCommand('curl http://db:5432');
} finally {
  // Stops ALL services
  await sandbox.dispose();
}

Volumes: Must be defined in the compose file itself, not via the volumes option.

Lifecycle: dispose() runs docker compose down, stopping all services.

Stable Container Names

Runtime and Dockerfile strategies accept an optional name to reuse the same container across calls:

const sandbox = await createDockerSandbox({
  image: 'node:lts-alpine',
  name: 'analytics-indexer',
  installers: [pkg(['curl'])],
});

DeepAgents uses sandbox-<name> as the Docker container name. Reuse behavior:

  • If sandbox-<name> is already running, the sandbox attaches to it.
  • If it exists but is stopped, the sandbox starts it and attaches.
  • If it does not exist, a new container is created.

When attaching to an existing named container, installer execution, volume preparation, and env setup are skipped because the container is assumed to already be configured.

name must match /^[A-Za-z0-9_.-]+$/.
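The naming rule and reuse behavior described above can be summarized as a small decision table. The sketch below is illustrative only: `containerNameFor`, `planNamedContainer`, and the `ContainerState` type are hypothetical, and how the library actually inspects container state is not shown here.

```typescript
const NAME_PATTERN = /^[A-Za-z0-9_.-]+$/;

// Hypothetical helper: validate the user-supplied name and derive the
// Docker container name (sandbox-<name>).
function containerNameFor(name: string): string {
  if (!NAME_PATTERN.test(name)) {
    throw new Error(`Invalid sandbox name: "${name}"`);
  }
  return `sandbox-${name}`;
}

type ContainerState = 'running' | 'stopped' | 'missing';

interface ReusePlan {
  action: 'attach' | 'start-and-attach' | 'create';
  // Whether installers, volume preparation, and env setup run.
  runSetup: boolean;
}

// Hypothetical helper: map the observed container state to the documented behavior.
function planNamedContainer(state: ContainerState): ReusePlan {
  switch (state) {
    case 'running':
      return { action: 'attach', runSetup: false };
    case 'stopped':
      return { action: 'start-and-attach', runSetup: false };
    case 'missing':
      return { action: 'create', runSetup: true };
  }
}
```

Setup only runs on the create path, which is why changes to installers or env have no effect while an existing container with the same name is still around.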

dispose() still stops (and removes, because runtime and Dockerfile containers use --rm) named containers even when this call attached to an existing one. Coordinate lifecycle if multiple callers share the same name.

Container Command Overrides

Runtime and Dockerfile strategies accept an optional command to control what is appended after the image in docker run:

  • Omit command (default) to append tail -f /dev/null as a keep-alive.
  • Set command: null or command: [] to append nothing, so the image or Dockerfile CMD/ENTRYPOINT runs as declared.
  • Set a non-empty array (for example, command: ['sleep', 'infinity']) to append arguments verbatim and override image/Dockerfile CMD.

const sandbox = await createDockerSandbox({
  image: 'node:lts-alpine',
  command: null, // run the image CMD/ENTRYPOINT directly
});

Compose sandboxes do not use this option. Configure command behavior in the compose file (for example, command: or entrypoint: per service).
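For the Runtime and Dockerfile strategies, the three command cases reduce to one small function. This is a sketch of the documented behavior under the assumption that the option is typed as `string[] | null | undefined`; `trailingRunArgs` is a hypothetical name.

```typescript
const KEEP_ALIVE: readonly string[] = ['tail', '-f', '/dev/null'];

// Hypothetical helper: compute what is appended after the image in `docker run`.
function trailingRunArgs(command?: string[] | null): string[] {
  // Omitted: fall back to the keep-alive so the container stays up.
  if (command === undefined) return [...KEEP_ALIVE];
  // null or []: append nothing, so the image CMD/ENTRYPOINT runs as declared.
  if (command === null || command.length === 0) return [];
  // Non-empty array: appended verbatim, overriding the image CMD.
  return [...command];
}
```

Note that with command: null the image's own process must keep running; if it exits immediately, the container stops and there is nothing left to execute commands in.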

Volumes

Attach bind paths or Docker-managed volumes to the container. Mounts are read-only by default for security.

const sandbox = await createDockerSandbox({
  volumes: [
    {
      type: 'bind',
      hostPath: '/absolute/path/on/host',
      containerPath: '/workspace',
      readOnly: true,   // default
    },
    {
      type: 'bind',
      hostPath: process.cwd(),
      containerPath: '/project',
      readOnly: false,  // allow writes
    },
    {
      type: 'volume',
      name: 'existing-dataset',
      containerPath: '/data',
      // lifecycle defaults to 'external': the Docker volume must already exist.
    },
    {
      type: 'volume',
      name: 'sandbox-output',
      containerPath: '/output',
      lifecycle: 'managed',
      readOnly: false,
      // managed volumes are created before container start and removed on dispose.
    },
  ],
});

Bind paths are validated at creation time. A VolumePathError is thrown if a bind hostPath doesn't exist on the host.

External Docker volumes are inspected before the container starts. Managed Docker volumes are created before the container starts and removed on dispose() unless removeOnDispose: false is set.
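To make the read-only default concrete, here is a rough sketch of how a volume entry could translate into a `docker run --mount` argument. The `--mount` syntax is standard Docker; whether DeepAgents uses `--mount` or `-v` internally is an assumption, and `VolumeSpec` and `toMountArg` are hypothetical names that cover only a subset of the options.

```typescript
type VolumeSpec =
  | { type: 'bind'; hostPath: string; containerPath: string; readOnly?: boolean }
  | { type: 'volume'; name: string; containerPath: string; readOnly?: boolean };

// Hypothetical helper: build the value passed to `docker run --mount`.
function toMountArg(volume: VolumeSpec): string {
  const source = volume.type === 'bind' ? volume.hostPath : volume.name;
  const parts = [
    `type=${volume.type}`,
    `source=${source}`,
    `target=${volume.containerPath}`,
  ];
  // Read-only unless the caller explicitly opts out.
  if (volume.readOnly ?? true) parts.push('readonly');
  return parts.join(',');
}
```

A writable mount therefore always requires an explicit readOnly: false in the configuration.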

Cloud-backed storage should be configured outside DeepAgents through Docker volume drivers or host-mounted filesystems. For example, configure S3 or Azure Blob credentials in the host/daemon/plugin layer, then attach the resulting Docker volume with type: 'volume' or the host mount with type: 'bind'. Do not pass cloud credentials through driverOptions. See the GCS Cloud Storage Volumes recipe for an end-to-end example backing a volume with a Google Cloud Storage bucket on a GCP VM.

Resource Limits

const sandbox = await createDockerSandbox({
  resources: {
    memory: '512m',  // default: '1g'
    cpus: 1,         // default: 2
  },
});
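As a rough picture of what these limits mean at the Docker level, the sketch below maps the resources object to the standard `--memory` and `--cpus` flags of `docker run`, applying the documented defaults. `toResourceFlags` is a hypothetical helper, and the exact flags the library passes are an assumption.

```typescript
interface Resources {
  memory?: string;
  cpus?: number;
}

// Hypothetical helper: apply the documented defaults ('1g', 2) and emit docker run flags.
function toResourceFlags(resources: Resources = {}): string[] {
  const memory = resources.memory ?? '1g';
  const cpus = resources.cpus ?? 2;
  return ['--memory', memory, '--cpus', String(cpus)];
}
```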

Environment Variables

Set environment variables in the container at creation time. They are available to all commands executed via executeCommand.

const sandbox = await createDockerSandbox({
  env: {
    NODE_ENV: 'production',
    API_KEY: 'secret-123',
    DB_URL: 'postgresql://user:pass@host/db',
  },
});

const result = await sandbox.executeCommand('echo $NODE_ENV');
// stdout: "production"

Supported on Runtime and Dockerfile strategies. For Compose, define environment variables in your docker-compose.yml under the environment key.

Keys are validated — empty keys and keys containing = throw a DockerSandboxError.
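The key validation can be sketched as follows. This is an illustration under stated assumptions: `EnvKeyError` is a local stand-in for the library's DockerSandboxError, `toEnvFlags` is a hypothetical helper, and passing variables as `-e KEY=value` flags is an assumption about how the container receives them.

```typescript
// Local stand-in for the library's error type (assumption, for illustration only).
class EnvKeyError extends Error {}

// Hypothetical helper: validate keys, then emit `-e KEY=value` pairs for docker run.
function toEnvFlags(env: Record<string, string>): string[] {
  const flags: string[] = [];
  for (const [key, value] of Object.entries(env)) {
    // Empty keys and keys containing '=' cannot be expressed as KEY=value.
    if (key === '' || key.includes('=')) {
      throw new EnvKeyError(`Invalid environment variable key: "${key}"`);
    }
    flags.push('-e', `${key}=${value}`);
  }
  return flags;
}
```

Values may contain =, as in a connection string with query parameters; only keys are restricted.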

Command Execution APIs

executeCommand is buffered and returns all output at process exit:

const result = await sandbox.executeCommand('python3 --version');
console.log(result.stdout, result.stderr, result.exitCode);

Pass an AbortSignal to cancel a running command:

const controller = new AbortController();
const pending = sandbox.executeCommand('sleep 30', {
  signal: controller.signal,
});

setTimeout(() => controller.abort(), 1_000);
await pending;

For live byte streams, use spawn:

const proc = sandbox.spawn('python3 long-task.py', {
  cwd: '/workspace',
  env: { MODE: 'stream' },
});

const decoder = new TextDecoder();

for await (const chunk of proc.stdout) {
  process.stdout.write(decoder.decode(chunk, { stream: true }));
}

const exit = await proc.exit;
console.log(exit.code, exit.signal, exit.success);

spawn accepts cwd, env, and signal. Aborting a running process resolves proc.exit with a non-success status (for Docker backends, typically { code: null, signal: 'SIGKILL', success: false }).

spawn is Docker-only. Other sandbox backends intentionally omit it, so shared code should feature-detect with if (!sandbox.spawn) { ... }.
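Shared code that should work across backends can wrap the feature detection in one helper. The sketch below uses local, simplified interfaces (`SandboxLike`, `SpawnedProcess`, `ExecResult`) modeled on the shapes shown above; they are assumptions for illustration, not the library's exported types, and `runWithOutput` is a hypothetical name.

```typescript
interface ExecResult { stdout: string; stderr: string; exitCode: number }

interface SpawnedProcess {
  stdout: AsyncIterable<Uint8Array>;
  exit: Promise<{ code: number | null; signal: string | null; success: boolean }>;
}

interface SandboxLike {
  executeCommand(command: string): Promise<ExecResult>;
  spawn?(command: string): SpawnedProcess; // absent on non-Docker backends
}

// Streams output when spawn exists, otherwise falls back to buffered execution.
// Resolves to true when the command succeeded.
async function runWithOutput(
  sandbox: SandboxLike,
  command: string,
  onText: (text: string) => void,
): Promise<boolean> {
  if (!sandbox.spawn) {
    const result = await sandbox.executeCommand(command);
    onText(result.stdout);
    return result.exitCode === 0;
  }
  const proc = sandbox.spawn(command);
  const decoder = new TextDecoder();
  for await (const chunk of proc.stdout) {
    onText(decoder.decode(chunk, { stream: true }));
  }
  const tail = decoder.decode(); // flush any buffered partial character
  if (tail) onText(tail);
  return (await proc.exit).success;
}
```

Callers get incremental output on Docker and a single chunk elsewhere, without branching on the backend themselves.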

Installers

Installers are the polymorphic unit that puts a tool inside a running container. Mix and match them in the installers: array — they run in order and share an InstallerContext (memoised arch, idempotent tool ensures, resolved package manager).
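The idea of ordered installers sharing idempotent "ensures" can be sketched in isolation. Everything here is a simplified stand-in: `SketchContext`, `SketchInstaller`, and `runInstallers` are hypothetical, synchronous, and only record what would be installed, whereas the real InstallerContext runs commands in the container.

```typescript
// Simplified stand-in for a shared installer context.
class SketchContext {
  private readonly ensured = new Set<string>();
  readonly log: string[] = [];

  // Idempotent: a tool is only installed the first time it is requested.
  ensureTool(tool: string): void {
    if (this.ensured.has(tool)) return;
    this.ensured.add(tool);
    this.log.push(`install ${tool}`);
  }
}

type SketchInstaller = (ctx: SketchContext) => void;

// Runs installers in array order against one shared context and returns the action log.
function runInstallers(installers: SketchInstaller[]): string[] {
  const ctx = new SketchContext();
  for (const install of installers) install(ctx);
  return ctx.log;
}
```

Two installers that both need curl trigger a single installation, which is the benefit of sharing one context across the array.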

pkg([...]) — OS packages

Auto-detects apk vs apt-get from the image.

import { createDockerSandbox, pkg } from '@deepagents/context';

createDockerSandbox({ installers: [pkg(['curl', 'jq', 'python3'])] });

urlBinary({...}) — pre-built binary from a URL

Architecture-aware via uname -m. Supports raw binaries and .tar.gz/.tgz archives.

import { createDockerSandbox, urlBinary } from '@deepagents/context';

createDockerSandbox({
  installers: [
    urlBinary({
      name: 'presenterm',
      url: {
        x86_64: 'https://github.com/.../presenterm-x86_64-linux-musl.tar.gz',
        aarch64: 'https://github.com/.../presenterm-aarch64-linux-musl.tar.gz',
      },
      binaryPath: 'presenterm',  // filename inside the archive
    }),
  ],
});

urlBinary makes sure curl is available on its own, so there is no need to add it to pkg([...]) first.
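The architecture-aware lookup amounts to mapping the output of uname -m to an entry in the url object. The sketch below is a hypothetical illustration (`pickUrl` and `isArchive` are not library functions) and uses placeholder URLs on example.invalid.

```typescript
type Arch = 'x86_64' | 'aarch64';

// Hypothetical helper: choose the download URL for the container's architecture.
function pickUrl(unameOutput: string, urls: Partial<Record<Arch, string>>): string {
  const arch = unameOutput.trim(); // `uname -m` output ends with a newline
  const url = arch === 'x86_64' || arch === 'aarch64' ? urls[arch] : undefined;
  if (!url) throw new Error(`No download URL for architecture "${arch}"`);
  return url;
}

// Hypothetical helper: archives need extraction, raw binaries do not.
function isArchive(url: string): boolean {
  return url.endsWith('.tar.gz') || url.endsWith('.tgz');
}
```

For archives, binaryPath then names the file to pull out of the extracted contents.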

npm('<pkg>') — global npm install

Fails loudly with MissingRuntimeError if nodejs/npm aren't in the image. Pass { ensureRuntime: true } to auto-install them.

import { createDockerSandbox, npm } from '@deepagents/context';

createDockerSandbox({
  installers: [
    npm('prettier', { ensureRuntime: true }),       // installs node + npm too
    npm('typescript', { version: '5.4.5' }),         // pinned version
  ],
});

// Or use a node base image and skip ensureRuntime:
createDockerSandbox({
  image: 'node:lts-alpine',
  installers: [npm('prettier')],
});

pip('<pkg>') — pip install

Same shape as npm. ensureRuntime: true adds python3 + the right pip package (py3-pip on Alpine, python3-pip on Debian).

import { createDockerSandbox, pip } from '@deepagents/context';

createDockerSandbox({
  installers: [pip('requests', { ensureRuntime: true, version: '2.31.0' })],
});

githubRelease({...}) — GitHub release asset

Convenience over urlBinary for GitHub releases. The asset(arch) callback builds the asset filename per architecture.

import { createDockerSandbox, githubRelease } from '@deepagents/context';

createDockerSandbox({
  installers: [
    githubRelease({
      owner: 'mfontanini',
      repo: 'presenterm',
      version: 'v0.15.1',
      name: 'presenterm',
      asset: (arch) => `presenterm-0.15.1-${arch}-unknown-linux-musl.tar.gz`,
    }),
  ],
});
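To see what the asset callback contributes, the sketch below assembles a download URL using GitHub's standard release-asset URL layout. `ReleaseSpec` and `releaseAssetUrl` are hypothetical names, and treating version as the release tag is an assumption about how the installer resolves the URL.

```typescript
interface ReleaseSpec {
  owner: string;
  repo: string;
  version: string; // assumed to be the release tag
  asset: (arch: string) => string;
}

// Hypothetical helper: build the URL from GitHub's release download layout:
// https://github.com/<owner>/<repo>/releases/download/<tag>/<asset>
function releaseAssetUrl(spec: ReleaseSpec, arch: string): string {
  const file = spec.asset(arch);
  return `https://github.com/${spec.owner}/${spec.repo}/releases/download/${spec.version}/${file}`;
}
```

With the configuration shown earlier and arch set to x86_64, the file name resolves to presenterm-0.15.1-x86_64-unknown-linux-musl.tar.gz.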

Custom installers

Subclass Installer for anything not covered. The base class is a one-method contract.

import { Installer, type InstallerContext } from '@deepagents/context';

class CargoInstaller extends Installer {
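  readonly kind: string;

  constructor(private readonly crate: string) {
    super();
    // Assign here so the value never depends on class-field initialization order.
    this.kind = `cargo:${crate}`;
  }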

  async install(ctx: InstallerContext) {
    await ctx.ensureTool('cargo');
    const result = await ctx.exec(`cargo install ${this.crate}`);
    if (result.exitCode !== 0) throw new Error(result.stderr);
  }
}

File I/O

Read and write files inside the container using base64 encoding for binary safety:

// Write files
await sandbox.writeFiles([
  { path: '/tmp/data.json', content: '{"key": "value"}' },
  { path: '/tmp/script.py', content: 'print("hello")' },
]);

// Read files
const content = await sandbox.readFile('/tmp/data.json');

Parent directories are created automatically during writes.
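One way to picture a base64-based write is as a single shell command that creates the parent directory and decodes the payload in the container. This is a sketch under assumptions, not the library's implementation: `buildWriteCommand` and `shQuote` are hypothetical, and it presumes mkdir, printf, and base64 exist in the image.

```typescript
import { Buffer } from 'node:buffer';
import { posix } from 'node:path';

// Quote a string for POSIX sh by wrapping it in single quotes.
function shQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical helper: build a command that writes `content` to `path`.
function buildWriteCommand(path: string, content: string): string {
  // Base64 keeps quotes, newlines, and non-ASCII bytes intact through the shell.
  const encoded = Buffer.from(content, 'utf8').toString('base64');
  const dir = posix.dirname(path);
  return `mkdir -p ${shQuote(dir)} && printf %s ${shQuote(encoded)} | base64 -d > ${shQuote(path)}`;
}
```

Since the payload only contains base64 characters, the file content never needs shell escaping of its own.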

Container Lifecycle

Manual Disposal

Always wrap sandbox usage in try/finally:

const sandbox = await createDockerSandbox({ installers: [pkg(['curl'])] });
try {
  await sandbox.executeCommand('curl --version');
} finally {
  await sandbox.dispose();
}

Auto-Disposal with useSandbox

import { pkg, useSandbox } from '@deepagents/context';

const version = await useSandbox(
  { installers: [pkg(['curl'])] },
  async (sandbox) => {
    const result = await sandbox.executeCommand('curl --version');
    return result.stdout.split('\n')[0];
  },
);

Containers are created with --rm, so they're removed automatically when stopped. dispose() calls docker stop (or docker compose down for Compose).
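The auto-disposal pattern is the try/finally idiom wrapped in a function. The sketch below shows the general shape with a hypothetical `withDisposable` helper and a local `DisposableLike` interface; it is an assumption about how such a helper can be built, not the source of useSandbox.

```typescript
interface DisposableLike {
  dispose(): Promise<void>;
}

// Creates a resource, hands it to the callback, and always disposes it afterwards.
async function withDisposable<T extends DisposableLike, R>(
  create: () => Promise<T>,
  use: (resource: T) => Promise<R>,
): Promise<R> {
  const resource = await create();
  try {
    return await use(resource);
  } finally {
    // Runs on success and on failure, so the container is never leaked.
    await resource.dispose();
  }
}
```

The callback's return value passes through unchanged, and an exception from the callback still propagates after cleanup has run.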

Error Handling

All errors extend DockerSandboxError:

Error                    When
DockerNotAvailableError  Docker daemon is not running
ContainerCreationError   Container fails to start
PackageInstallError      Package installation fails (includes package names, manager type, stderr)
InstallError             Installer fails. Carries target (logical name), source ('url' | 'npm' | 'pypi' | 'github-release'), reason (stderr), url (when applicable)
MissingRuntimeError      npm()/pip() ran without ensureRuntime: true and the runtime is absent. Carries runtime and required (binary names)
VolumePathError          Invalid bind or volume path configuration
VolumeInspectError       External Docker volume does not exist or cannot be inspected
VolumeCreateError        Managed Docker volume cannot be created
VolumeRemoveError        Managed Docker volume cannot be removed during cleanup
DockerfileBuildError     Dockerfile build fails (includes stderr)
ComposeStartError        Docker Compose startup fails (includes compose file path, stderr)

import {
  createDockerSandbox,
  DockerNotAvailableError,
  PackageInstallError,
  pkg,
} from '@deepagents/context';

try {
  const sandbox = await createDockerSandbox({
    installers: [pkg(['nonexistent-package'])],
  });
} catch (error) {
  if (error instanceof DockerNotAvailableError) {
    console.error('Start Docker first');
  } else if (error instanceof PackageInstallError) {
    console.error(`Failed packages: ${error.packages.join(', ')}`);
    console.error(`Package manager: ${error.packageManager}`);
  }
}
