Sandbox
Execute real sandbox commands across Docker, WASM, virtual, and custom backends
The sandbox system lets AI agents execute real commands in isolated
environments. Choose between Docker containers for full binary execution, WASM
virtual machines for lightweight in-process execution, virtual file systems for
simulation, custom Sandbox implementations for specialized policy, or binary
bridges to combine approaches.
When to Use What
| Approach | Use Case | Requires Docker? |
|---|---|---|
| createDockerSandbox + createBashTool | Agent needs real binaries in isolated container | Yes |
| createDockerSandbox | Direct container control without AI agent wiring | Yes |
| createAgentOsSandbox | Lightweight WASM execution, no Docker needed | No |
| createVirtualSandbox | In-process bash with a virtual filesystem | No |
| createBinaryBridges | Bridge specific host binaries into just-bash virtual FS | No |
All AI-agent-facing surfaces compose the same way: pick a backend factory
(createDockerSandbox / createAgentOsSandbox / createVirtualSandbox), then
wrap it with createBashTool({ sandbox }) to get the bash tool, file tools,
skills upload, and file-event observation.
All backends expose the buffered command + file API (executeCommand,
readFile, writeFiles, dispose). Docker sandboxes additionally expose an
optional spawn(command, options?) for live streaming stdio. Feature-detect it
with if (!sandbox.spawn) { ... } when writing backend-agnostic code.
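The feature check above can be wrapped in a type guard so that backend-agnostic code narrows to a streaming-capable sandbox only when spawn is present. This is a minimal sketch: SandboxLike, StreamingSandbox, supportsStreaming, and executionPath are hypothetical names, and the return type of spawn is left as unknown because this page does not specify it.

```typescript
// Hypothetical minimal shapes for illustration; the real types exported by
// @deepagents/context may carry more members.
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

interface SandboxLike {
  executeCommand(command: string): Promise<CommandResult>;
  // Optional: per the text above, only some backends (Docker) provide it.
  spawn?: (command: string, options?: unknown) => unknown;
}

type StreamingSandbox = SandboxLike & {
  spawn: NonNullable<SandboxLike['spawn']>;
};

// Type guard: narrows to a sandbox that definitely has spawn().
function supportsStreaming(sandbox: SandboxLike): sandbox is StreamingSandbox {
  return typeof sandbox.spawn === 'function';
}

// Picks the API to use; every backend supports the buffered fallback.
function executionPath(sandbox: SandboxLike): 'spawn' | 'executeCommand' {
  return supportsStreaming(sandbox) ? 'spawn' : 'executeCommand';
}
```

Keeping the check in one guard means callers never touch sandbox.spawn without first proving it exists.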
Agent OS is a v0.1.0 preview. It requires the optional peer dependencies @rivet-dev/agent-os-core and @rivet-dev/agent-os-common.
Quick Start: Docker + Bash Tool
The fastest path for giving an AI agent a bash tool that runs in Docker:
import { groq } from '@ai-sdk/groq';
import { printer } from '@deepagents/agent';
import {
ContextEngine,
InMemoryContextStore,
agent,
createBashTool,
createDockerSandbox,
pkg,
role,
skills,
user,
} from '@deepagents/context';
const backend = await createDockerSandbox({
installers: [pkg(['curl', 'jq'])],
});
const sandbox = await createBashTool({
sandbox: backend,
skills: [
{ host: './skills', sandbox: '/workspace/skills' },
],
});
// `sandbox.skills` is populated from on-disk SKILL.md frontmatter, and the
// skill files are already uploaded into the container at `/workspace/skills`.
try {
const context = new ContextEngine({
chatId: 'demo',
userId: 'user-1',
store: new InMemoryContextStore(),
});
context.set(
role('You are a helpful assistant with bash access.'),
skills(sandbox),
user('Use the most relevant skill to help me.'),
);
const assistant = agent({
name: 'Assistant',
sandbox,
model: groq('gpt-oss-20b'),
context,
});
await printer.stdout(await assistant.stream({}));
} finally {
await sandbox.sandbox.dispose();
}
Quick Start: Docker Sandbox
For direct container control without AI agent wiring:
import { createDockerSandbox, pkg } from '@deepagents/context';
const sandbox = await createDockerSandbox({
image: 'alpine:latest',
installers: [pkg(['curl', 'jq'])],
volumes: [
{
type: 'bind',
hostPath: process.cwd(),
containerPath: '/workspace',
readOnly: true,
},
],
resources: { memory: '512m', cpus: 1 },
});
try {
const result = await sandbox.executeCommand('curl --version');
console.log(result.stdout);
await sandbox.writeFiles([
{ path: '/tmp/hello.txt', content: 'Hello from Docker!' },
]);
const content = await sandbox.readFile('/tmp/hello.txt');
console.log(content);
} finally {
await sandbox.dispose();
}
Or use useSandbox for automatic cleanup:
import { pkg, useSandbox } from '@deepagents/context';
const output = await useSandbox(
{ installers: [pkg(['curl'])] },
async (sandbox) => {
const result = await sandbox.executeCommand('curl --version');
return result.stdout;
},
);
Quick Start: Binary Bridges
Bridge specific host binaries into a just-bash virtual file system (no Docker needed):
import { Bash, ReadWriteFs } from 'just-bash';
import { createBashTool, createBinaryBridges } from '@deepagents/context';
const bridges = createBinaryBridges(
'node',
{ name: 'python', binaryPath: 'python3' },
{ name: 'git', allowedArgs: /^(status|log|diff|show)/ },
);
const { bash } = await createBashTool({
sandbox: new Bash({
fs: new ReadWriteFs({ root: process.cwd() }),
customCommands: bridges,
}),
});
Binary bridges resolve virtual paths to real host paths and use the host's PATH for binary resolution, while restricting access via optional allowedArgs regex.
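To see what an allowedArgs pattern admits, it can help to test it in isolation. The sketch below assumes the pattern is tested against the arguments joined by single spaces and that a missing pattern means no restriction. This page does not state the exact matching rule, and isAllowed is a hypothetical helper rather than part of the library.

```typescript
// Hypothetical helper: assumes `allowedArgs` is tested against the
// space-joined argument string. The real matching rule may differ.
function isAllowed(args: string[], allowedArgs?: RegExp): boolean {
  if (!allowedArgs) return true; // assumed: no pattern means no restriction
  return allowedArgs.test(args.join(' '));
}

const gitReadOnly = /^(status|log|diff|show)/;

isAllowed(['status'], gitReadOnly); // true
isAllowed(['push', 'origin', 'main'], gitReadOnly); // false
// Prefix match only: anything that starts with an allowed word passes.
isAllowed(['showcase'], gitReadOnly); // true

// A stricter variant requires the subcommand to end at whitespace or end of input.
const gitReadOnlyStrict = /^(status|log|diff|show)(\s|$)/;
isAllowed(['showcase'], gitReadOnlyStrict); // false
isAllowed(['log', '--oneline'], gitReadOnlyStrict); // true
```

Under that assumption, the pattern from the example above is a prefix match, so anchoring the end of the subcommand is worth considering when the allow-list is meant to be exact.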
Quick Start: Agent OS
WASM-based execution with no Docker dependency and ~6ms cold start:
import common from '@rivet-dev/agent-os-common';
import { createAgentOsSandbox } from '@deepagents/context';
const sandbox = await createAgentOsSandbox({
software: [common],
});
try {
const result = await sandbox.executeCommand('echo "Hello from WASM!"');
console.log(result.stdout);
await sandbox.writeFiles([
{ path: '/tmp/hello.txt', content: 'Written inside WASM' },
]);
const content = await sandbox.readFile('/tmp/hello.txt');
console.log(content);
} finally {
await sandbox.dispose();
}
Or use useAgentOsSandbox for automatic cleanup:
import common from '@rivet-dev/agent-os-common';
import { useAgentOsSandbox } from '@deepagents/context';
const output = await useAgentOsSandbox(
{ software: [common] },
async (sandbox) => {
const result = await sandbox.executeCommand('ls /');
return result.stdout;
},
);
Quick Start: Virtual Sandbox
Use the virtual sandbox when an agent needs the standard bash/read/write tool surface without Docker:
import { InMemoryFs } from 'just-bash';
import { createBashTool, createVirtualSandbox } from '@deepagents/context';
const sandbox = await createBashTool({
sandbox: await createVirtualSandbox({ fs: new InMemoryFs() }),
});
const result = await sandbox.sandbox.executeCommand('echo "hello"');
console.log(result.stdout); // hello
The virtual backend uses just-bash, so command execution and file IO stay
in-process while still exercising the same createBashTool path used by agents.
When you want command groups such as sql run, tool validate, or
marker remind inside a virtual sandbox, pass customCommands to
createVirtualSandbox() and build them with
Subcommand Builders.
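Conceptually, a command group such as sql run is one command that routes its first argument to a handler. The sketch below illustrates that idea only. It is not the Subcommand Builders API: commandGroup, Handler, and the sql handlers are hypothetical names, and exit code 2 for an unknown subcommand is an arbitrary choice.

```typescript
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

type Handler = (args: string[]) => CommandResult;

// Hypothetical helper: builds one command that dispatches on its first
// argument and reports the known subcommands when the lookup fails.
function commandGroup(name: string, handlers: Record<string, Handler>): Handler {
  return (args: string[]): CommandResult => {
    const [sub, ...rest] = args;
    const handler =
      sub !== undefined && Object.prototype.hasOwnProperty.call(handlers, sub)
        ? handlers[sub]
        : undefined;
    if (handler === undefined) {
      const known = Object.keys(handlers).join(', ');
      return {
        stdout: '',
        stderr: `${name}: unknown subcommand "${sub ?? ''}" (expected one of: ${known})\n`,
        exitCode: 2,
      };
    }
    return handler(rest);
  };
}

// Illustrative handlers; a real `sql run` would execute the query.
const sql = commandGroup('sql', {
  run: ([query = '']) => ({ stdout: `ran: ${query}\n`, stderr: '', exitCode: 0 }),
  validate: ([query = '']) => ({
    stdout: query.trim().length > 0 ? 'valid\n' : '',
    stderr: query.trim().length > 0 ? '' : 'sql validate: empty query\n',
    exitCode: query.trim().length > 0 ? 0 : 1,
  }),
});
```

Returning a CommandResult for the unknown-subcommand case, rather than throwing, keeps the failure visible to the model as ordinary stderr.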
Bash Tool Schema
createBashTool() from @deepagents/context returns a bash tool whose input schema requires two fields: command and reasoning. The LLM must provide a brief reason on every call. The wrapper enforces this at the Zod schema level and widens the upstream bash-tool package's { command } shape accordingly.
interface BashToolInput {
command: string;
reasoning: string;
}
type WrappedBashTool = Tool<BashToolInput, CommandResult>;
The reasoning field is stripped before the command runs, so it never influences execution. The AI SDK records it as part of the tool-call step input, so it remains visible in the response's content array for auditing. Calls missing reasoning fail schema validation before execute is invoked.
const sandbox = await createBashTool();
// ✅ Accepted by the tool schema
await sandbox.tools.bash.execute(
{ command: 'ls /workspace', reasoning: 'List files to find the entry point' },
{} as never,
);
// ❌ Rejected — AI SDK emits a tool-error with /reasoning/
await sandbox.tools.bash.execute(
{ command: 'ls /workspace' } as never,
{} as never,
);
Type-safe consumers can import WrappedBashTool and BashToolInput directly from @deepagents/context.
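The validate-then-strip behavior described in this section can be sketched without any dependencies. The block below uses a hand-written check as a stand-in for the Zod schema, so it is not the wrapper's actual code: parseBashInput and toSandboxCommand are hypothetical names, and rejecting empty strings is an assumption.

```typescript
interface BashToolInput {
  command: string;
  reasoning: string;
}

// Hand-written stand-in for the Zod schema. Rejecting empty strings is an
// assumption; the real schema may only require the fields to be strings.
function parseBashInput(input: unknown): BashToolInput {
  if (typeof input !== 'object' || input === null) {
    throw new TypeError('input: expected an object');
  }
  const { command, reasoning } = input as Record<string, unknown>;
  if (typeof command !== 'string' || command.length === 0) {
    throw new TypeError('command: expected a non-empty string');
  }
  if (typeof reasoning !== 'string' || reasoning.trim().length === 0) {
    throw new TypeError('reasoning: expected a non-empty string');
  }
  return { command, reasoning };
}

// Validate first, then drop `reasoning` so only the command reaches the sandbox.
function toSandboxCommand(input: unknown): string {
  const { command } = parseBashInput(input);
  return command;
}
```

Because validation runs before anything is executed, a call without reasoning fails early and the sandbox never sees the command.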
Meta Channel
Some bash handlers need to surface two different outputs at once:
- host-only metadata for the application runtime
- a short reminder the model should see on its next turn
useBashMeta() gives handlers both channels inside the createBashTool()
execution frame.
import { useBashMeta } from '@deepagents/context';
function markResult() {
const meta = useBashMeta();
meta?.setHidden({ formattedSql: 'SELECT 1' });
meta?.setReminder('Validate before executing the next query.');
return { stdout: 'ok\n', stderr: '', exitCode: 0 };
}
meta is preserved in the raw tool result for host-side consumers and stripped
from model-visible output. reminder stays visible to the model. If a command
runs outside the createBashTool() wrapper, useBashMeta() returns null.
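On the host side, the split between the two channels might look like the sketch below. It assumes the raw tool result carries optional meta and reminder properties next to the CommandResult fields. Those property names follow the prose above but are not confirmed here, and splitForModel is a hypothetical helper.

```typescript
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Assumed raw shape: `meta` and `reminder` as optional extra properties.
interface RawBashResult extends CommandResult {
  meta?: Record<string, unknown>;
  reminder?: string;
}

// Separates what the model sees from what only the host keeps.
function splitForModel(raw: RawBashResult): {
  modelVisible: CommandResult & { reminder?: string };
  hidden: Record<string, unknown> | undefined;
} {
  // Rest destructuring copies every property except `meta`.
  const { meta, ...modelVisible } = raw;
  return { modelVisible, hidden: meta };
}

const { modelVisible, hidden } = splitForModel({
  stdout: 'ok\n',
  stderr: '',
  exitCode: 0,
  meta: { formattedSql: 'SELECT 1' },
  reminder: 'Validate before executing the next query.',
});
```

Here modelVisible keeps stdout, stderr, exitCode, and reminder, while hidden holds the formattedSql value for the application runtime.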
Custom Errors: BashException
Subclass BashException when you want a command or transform hook to fail with
an explicit CommandResult shape.
import type { CommandResult } from 'bash-tool';
import { BashException } from '@deepagents/context';
class RateLimitError extends BashException {
constructor(private readonly retryAfterMs: number) {
super(`rate limited; retry in ${retryAfterMs}ms`);
}
format(): CommandResult {
return {
stdout: '',
stderr: `${this.message}\n`,
exitCode: 1,
};
}
}
createBashTool() catches BashException instances around sandbox execution
and returns format() to the caller. Other error types still propagate normally.
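The catch-and-format behavior can be sketched as a small wrapper. This is an illustration under stated assumptions, not the library's code: BashExceptionLike is a local stand-in for the exported BashException, reduced to the format() method this section mentions, and runGuarded is a hypothetical name.

```typescript
interface CommandResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Local stand-in for BashException, reduced to the one method used here.
abstract class BashExceptionLike extends Error {
  abstract format(): CommandResult;
}

class RateLimitError extends BashExceptionLike {
  readonly retryAfterMs: number;

  constructor(retryAfterMs: number) {
    super(`rate limited; retry in ${retryAfterMs}ms`);
    this.retryAfterMs = retryAfterMs;
  }

  format(): CommandResult {
    return { stdout: '', stderr: `${this.message}\n`, exitCode: 1 };
  }
}

// Hypothetical wrapper: known exceptions become a CommandResult,
// everything else is rethrown unchanged.
async function runGuarded(
  run: () => Promise<CommandResult>,
): Promise<CommandResult> {
  try {
    return await run();
  } catch (error) {
    if (error instanceof BashExceptionLike) return error.format();
    throw error;
  }
}
```

A handler that throws new RateLimitError(250) would then surface to the caller as a normal result with exit code 1, while an unrelated Error still rejects the promise.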
Debug Logging
Gate bash logging behind your own environment variable when you construct the tool:
DEBUG_BASH=1 node scripts/run-demo.ts
# [bash] sql run main "SELECT 1"
# [bash] exit 1
The environment variable only gates the logging. The tool does not read it on its own, so you wire it up yourself:
const debug = Boolean(process.env.DEBUG_BASH);
await createBashTool({
sandbox: bashInstance,
onBeforeBashCall: ({ command }) => {
if (debug) console.log(`[bash] ${command}`);
return { command };
},
onAfterBashCall: ({ result }) => {
if (debug && result.exitCode !== 0) {
console.log(`[bash] exit ${result.exitCode}`);
}
return { result };
},
});
Next Steps
- Docker Sandbox - Three strategies for container creation
- Agent OS Sandbox - WASM-based execution without Docker
- Subcommand Builders - Build just-bash command groups with dispatch and repair helpers
- Architecture: Sandbox - Strategy pattern and internals