Your preferred AI model is a deeply personal choice. Public benchmarks might declare the “best” model for a specific task, but they don’t account for your workflow or how you access information, which can change the calculation entirely.
Person A might find that using Gemini for a specific task is more productive than Claude because of the workflow they’ve adopted, while Person B might disagree and prefer Codex instead.
Databricks Coding Agents give you access to top frontier models, letting you choose the best model for your workflow.
Databricks AI Gateway
As you get more proficient with these agents, you realise that the bottleneck isn’t the agents, it’s you having to approve every single action.
Reviewing every edit kills your speed. But giving your agent access to your entire workspace and letting it run unchecked with --dangerously-skip-permissions or --yolo can lead to severe unintended consequences.
The solution is straightforward. Run your coding agent inside a container, provide the environment variables and mount only the directories it should touch.
The agent gets full autonomy inside the container, so it can read, write, and execute freely. But “freely” only applies to the environment variables you pass in and the directories you choose to mount. Your home directory, SSH keys, shell config, and every other project are invisible to it, limiting the blast radius of the agent’s mistakes.
+-------------------------------------+
|  Your Machine                       |
|                                     |
|  ~/.zshrc        <- untouchable     |
|  ~/.ssh/         <- untouchable     |
|  ~/Documents/    <- untouchable     |
|                                     |
|  ~/my-project/ ---- mounted --->+   |
|                                 |   |
|  +---------------------------+  |   |
|  | Docker Container          |  |   |
|  |                           |  |   |
|  | /workspace  <- only this  |<-+   |
|  |                           |      |
|  | Agent runs here with      |      |
|  | full permissions but      |      |
|  | can only see /workspace   |      |
|  +---------------------------+      |
+-------------------------------------+
While there are many ways to define containers, this article uses Dockerfiles and assumes that your coding agents are configured to use Databricks endpoints.
We start by defining a container for Claude Code, Gemini or Codex. I’ve published sandbox templates for each of these agents here.
Each template is intentionally minimal. Here’s the Claude one in its entirety:
# Minimal Claude Code sandbox
FROM node:20-bookworm
# Install Claude Code CLI
# Note: unpinned version resolves to latest at build time. Pin for reproducibility, e.g.:
# RUN npm install -g @anthropic-ai/claude-code@1.x
RUN npm install -g @anthropic-ai/claude-code
# --- Add project-specific dependencies below (e.g., Go, Terraform, Python, Rust) ---
# --- End project-specific dependencies ---
# Create non-root user with a fixed uid so --tmpfs uid=1001 in docker run
# always matches this user (Claude Code blocks --dangerously-skip-permissions as root)
RUN useradd -m -s /bin/bash -u 1001 agent \
&& mkdir -p /home/agent/.claude \
&& chown -R agent:agent /home/agent
# Set working directory
WORKDIR /workspace
# Switch to non-root user
USER agent
# Default command
ENTRYPOINT ["claude"]
The Codex and Gemini templates follow the same pattern. Just swap out the CLI package and the relevant configuration directory for your coding agent, then build the image:
# Claude
docker build -f Dockerfile.claude.template -t claude-code:latest .
# Codex
docker build -f Dockerfile.codex.template -t codex-cli:latest .
# Gemini
docker build -f Dockerfile.gemini.template -t gemini-cli:latest .
If you require additional tools, add them to the Dockerfile. On Apple Silicon, add --platform linux/arm64 for native performance.
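Rather than hard-coding the platform, the build can derive it from the host architecture. The sketch below is one way to do that. The helper name docker_platform is hypothetical, and the mapping only covers the two common architectures.

```shell
# Hypothetical helper: map a machine architecture (as reported by `uname -m`)
# to the matching Docker platform string.
docker_platform() {
  case "$1" in
    arm64|aarch64) echo "linux/arm64" ;;
    x86_64|amd64)  echo "linux/amd64" ;;
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Usage (not run here): build the Claude image for the host's architecture.
# docker build --platform "$(docker_platform "$(uname -m)")" \
#   -f Dockerfile.claude.template -t claude-code:latest .
```

Apple Silicon Macs report arm64 from uname -m, so the same build line works unchanged on Intel and ARM hosts.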
Assuming you have configured authentication following the Databricks Coding Agents setup instructions for AWS, Azure, or GCP, the config files need small adjustments to work inside a container. See the readme located with the sandbox templates for working examples.
After that, navigate to the project you want the agent to work on and run the container. The project directory gets mounted as /workspace, and each agent reads its Databricks credentials differently depending on what the CLI supports:
Codex:
docker run -it --rm \
--cap-drop=ALL \
--security-opt=no-new-privileges \
--read-only \
--tmpfs /tmp:rw,size=128m \
--tmpfs "/home/agent:rw,size=128m,uid=1001,gid=1001" \
--pids-limit=512 \
--memory=4g \
-e DATABRICKS_TOKEN \
-v "$(pwd)":/workspace \
-v ~/.codex:/home/agent/.codex \
codex-cli:latest
Gemini:
docker run -it --rm \
--cap-drop=ALL \
--security-opt=no-new-privileges \
--read-only \
--tmpfs /tmp:rw,size=128m \
--tmpfs "/home/agent:rw,size=128m,uid=1001,gid=1001" \
--pids-limit=512 \
--memory=4g \
-v "$(pwd)":/workspace \
--env-file ~/.gemini/.env \
gemini-cli:latest --yolo
Claude:
docker run -it --rm \
--cap-drop=ALL \
--security-opt=no-new-privileges \
--read-only \
--tmpfs /tmp:rw,size=128m \
--tmpfs "/home/agent:rw,size=128m,uid=1001,gid=1001" \
--pids-limit=512 \
--memory=4g \
-v "$(pwd)":/workspace \
-v ~/.claude/settings.json:/home/agent/.claude/settings.json:ro \
claude-code:latest --verbose --dangerously-skip-permissions
That’s it. The agent launches in interactive mode, sees only /workspace, and everything else on your machine is invisible.
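One detail worth guarding against in the Claude command: when the host path in a -v mount does not exist, Docker typically creates an empty directory at that path instead of failing, so a missing settings.json silently becomes a directory and authentication breaks in a confusing way. A small pre-flight check avoids that. This is only a sketch; require_file is a hypothetical helper, not part of the templates.

```shell
# Hypothetical helper: fail early if a file that will be bind-mounted is missing.
require_file() {
  if [ ! -f "$1" ]; then
    echo "missing file: $1 (create it before starting the sandbox)" >&2
    return 1
  fi
}

# Usage (not run here): check the settings file, then start the Claude container.
# require_file ~/.claude/settings.json || exit 1
```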
Here are a few notes on how credentials are passed for each coding agent:
Claude mounts settings.json as a single read-only file and only reads this file at startup.
Codex mounts the entire ~/.codex directory (not just the config file) because Codex needs to save session data alongside its config. Your config.toml should tell Codex to read the token from the DATABRICKS_TOKEN environment variable rather than running a shell command to fetch one, since the container does not have the Databricks CLI installed.
Gemini uses --env-file, which loads your .env file as environment variables. This is used instead of mounting the file directly because Gemini needs to write other files into its config directory. One thing to watch out for is that values in the .env file should not have quotes around them. For example, use GEMINI_API_KEY_AUTH_MECHANISM=bearer rather than GEMINI_API_KEY_AUTH_MECHANISM="bearer", because Docker reads the quotes literally, which breaks authentication.
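The quoting pitfall in the .env file is easy to miss, so it can be checked before the container starts. The following is a minimal sketch, assuming a hypothetical helper named check_env_quotes; it only looks for values that begin with a quote character and does not validate anything else about the file.

```shell
# Hypothetical helper: report KEY="value" or KEY='value' lines in an env file.
# Returns 0 if the file is clean, 1 if quoted values were found, 2 if unreadable.
check_env_quotes() {
  if [ ! -r "$1" ]; then
    echo "cannot read $1" >&2
    return 2
  fi
  # grep prints the offending lines (with line numbers) and succeeds on a match.
  if grep -nE "^[A-Za-z_][A-Za-z0-9_]*=[\"']" "$1"; then
    echo "quoted values found in $1; remove the quotes before using --env-file" >&2
    return 1
  fi
  return 0
}

# Usage (not run here):
# check_env_quotes ~/.gemini/.env || exit 1
```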
1. -v mounts only the directories you specify
-v "$(pwd)":/workspace
Use -v to mount directories into the container. Only the directories you explicitly mount are visible to the agent. Everything else on your machine does not exist from the agent’s perspective.
2. -e passes only the environment variables you specify
-e DATABRICKS_TOKEN
Use -e to pass environment variables into the container. The agent only has access to the variables you explicitly declare.
3. :ro makes mounts read-only
-v ~/.claude/settings.json:/home/agent/.claude/settings.json:ro
The :ro flag makes the mounted file read-only inside the container. The agent can read your credentials to authenticate, but cannot overwrite or corrupt them.
4. Non-root user
RUN useradd -m -s /bin/bash -u 1001 agent
USER agent
The agent process runs as a regular user, not root. Even inside the container, this prevents the agent from modifying system files or installing packages at runtime.
5. Root-owned workspace
WORKDIR /workspace
WORKDIR runs before USER agent, so /workspace is created as root. If you forget the -v mount, the agent gets a permission denied error instead of silently writing to a throwaway filesystem. This is a fail-closed design. The container does nothing useful unless you explicitly mount a directory.
6. --cap-drop=ALL removes all special OS permissions
--cap-drop=ALL
Containers get a set of Linux capabilities by default (things like changing file ownership or binding to low ports). The agent doesn’t need any of them. Dropping them all means even if something goes wrong inside the container, it can’t do anything privileged.
7. --security-opt=no-new-privileges blocks privilege escalation
--security-opt=no-new-privileges
Stops the agent from running a program that grants itself higher permissions. This closes the “exploit a setuid binary” loophole that attackers commonly use to escalate from a regular user to root.
8. --read-only with writable tmpfs (temporary file system) mounts
--read-only
--tmpfs /tmp:rw,size=128m
--tmpfs "/home/agent:rw,size=128m,uid=1001,gid=1001"
Makes the container’s own filesystem immutable. The agent can’t install packages or drop scripts into system directories. The CLI tools still need somewhere to write session state and temp files, so we give them two small writable locations: /tmp and the agent’s home directory. Both are capped at 128 MB and get thrown away when the container exits. Your mounted project folder is the only place where real work persists.
The uid=1001,gid=1001 part sets ownership of the tmpfs to the agent user inside the container. Without it, the tmpfs is owned by root and the agent can’t write to its own home directory. The 1001 value matches the uid we pinned in the Dockerfile with useradd -u 1001 agent.
9. --pids-limit=512 caps the number of processes
--pids-limit=512
Limits how many processes can run at once inside the container. If a command goes haywire and starts spawning processes in a loop, it hits this ceiling instead of eating all your host resources.
10. --memory=4g caps RAM usage
--memory=4g
Sets a hard cap on the memory the container can consume. If the agent spawns something that leaks memory, the container gets killed at 4 GB instead of consuming your whole machine’s RAM. Adjust the value based on your workload.
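The hardening flags in items 6 to 10 are identical for all three agents, so it can help to keep them in one place. The following is a minimal sketch, assuming a hypothetical helper named sandbox_flags; it only prints the flags described above and does not call Docker itself.

```shell
# Hypothetical helper: print the shared hardening flags, one per line.
sandbox_flags() {
  cat <<'EOF'
--cap-drop=ALL
--security-opt=no-new-privileges
--read-only
--tmpfs /tmp:rw,size=128m
--tmpfs /home/agent:rw,size=128m,uid=1001,gid=1001
--pids-limit=512
--memory=4g
EOF
}

# Usage (not run here). The substitution is left unquoted on purpose so that
# each flag is passed to docker as a separate argument:
# docker run -it --rm $(sandbox_flags) \
#   -v "$(pwd)":/workspace \
#   -v ~/.claude/settings.json:/home/agent/.claude/settings.json:ro \
#   claude-code:latest --verbose --dangerously-skip-permissions
```

Keeping the flags in one function means a change such as raising the memory cap is made once rather than in every command.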
The templates are starting points. If your project needs specific toolchains, copy a template into a subdirectory and add dependencies in the marked section:
# --- Add project-specific dependencies below (e.g., Go, Terraform, Python, Rust) ---
RUN apt-get update && apt-get install -y python3 python3-pip \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# --- End project-specific dependencies ---
You can wrap the container command in a shell function so running your sandboxed agent feels like calling a local CLI. I call mine claudex.
On macOS or Linux, add this to your ~/.zshrc (or ~/.bashrc):
claudex() {
  # "$(pwd)" is quoted so project paths that contain spaces still work
  docker run -it --rm \
    --cap-drop=ALL \
    --security-opt=no-new-privileges \
    --read-only \
    --tmpfs /tmp:rw,size=128m \
    --tmpfs "/home/agent:rw,size=128m,uid=1001,gid=1001" \
    --pids-limit=512 \
    --memory=4g \
    --workdir "$(pwd)" \
    -v "$(pwd)":"$(pwd)" \
    -v ~/.claude/settings.json:/home/agent/.claude/settings.json:ro \
    claude-code:latest --verbose --dangerously-skip-permissions "$@"
}
This mounts your current directory at the same absolute path inside the container (so the agent sees the same file paths you do on the host) and sets --workdir to match. Your Claude settings are mounted read-only for authentication. The agent is confined to the working directory and cannot access or modify anything else on your host machine.
After reloading your shell (for example, source ~/.zshrc), you can run claudex from any project directory to launch the agent inside its container.
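Because claudex mounts whatever directory you happen to be in, running it from your home directory would expose exactly what the sandbox is meant to hide. One optional guard, sketched here with a hypothetical helper named safe_to_mount, is to refuse obviously broad directories before calling docker run. It only rejects / and $HOME; other broad paths, such as a parent of your home directory, are not covered.

```shell
# Hypothetical helper: succeed only if the directory is neither / nor $HOME.
safe_to_mount() {
  local dir home
  # Resolve both paths physically so symlinks do not defeat the comparison.
  dir="$(CDPATH= cd -- "$1" 2>/dev/null && pwd -P)" || return 1
  home="$(CDPATH= cd -- "$HOME" 2>/dev/null && pwd -P)" || return 1
  case "$dir" in
    /|"$home")
      echo "refusing to mount $dir: too broad for a sandbox" >&2
      return 1
      ;;
  esac
  return 0
}

# Usage (not run here): add this as the first line of the wrapper function.
# safe_to_mount "$(pwd)" || return 1
```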
On Windows, you can achieve something similar with PowerShell functions or aliases, but the exact setup will differ from the macOS/Linux example above.
Containers limit filesystem access, not network access. The agent can still make API calls, download packages, or reach the internet. Docker does not offer fine-grained network controls out of the box. Adding --network=none to your docker run command blocks all traffic including the model API. Fine-grained network control requires additional tooling like a forward proxy, which is beyond the scope of this article.
This also doesn’t protect the mounted directory itself. If you mount $(pwd):/workspace, the agent can delete everything in that directory. Use git branches or worktrees so you can always revert.
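One way to make that habit routine is to start every agent session on a fresh branch with a clean working tree. This sketch assumes the project is already a git repository; the helper name agent_branch and the agent/ branch prefix are hypothetical.

```shell
# Hypothetical helper: create a new branch for the agent session so that
# anything the agent changes can be reviewed or discarded with ordinary git.
agent_branch() {
  # Refuse to start if there are uncommitted or untracked files, since git
  # could not restore those if the agent deleted them.
  if [ -n "$(git status --porcelain)" ]; then
    echo "commit or stash your changes before starting the agent" >&2
    return 1
  fi
  git checkout -b "agent/$(date +%Y%m%d-%H%M%S)"
}

# Usage (not run here): run from the project root, then launch the container.
# agent_branch || exit 1
```

After the session, git diff against your original branch shows everything the agent changed, and deleting the agent branch discards it.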
Note: Since the release of Claude’s model Mythos, there is growing discussion around the next wave of models being able to discover and exploit container breakout vulnerabilities, sometimes without the prompter even knowing it happened. Containers share the host kernel, and a sufficiently capable agent that finds a kernel exploit can escape the container entirely.
This is driving interest in using virtual machines instead, as they run their own kernel. So even if the agent compromises the guest operating system, it cannot reach the host.
For most development workflows today, containers remain a practical and meaningful improvement over running agents directly on your host. But if you are giving agents access to sensitive environments or running untrusted prompts, consider whether VM-level isolation is more appropriate for your threat model. The industry is moving in this direction, and it is worth watching how vendors respond as model capabilities continue to advance.
We have just gone through how to use each frontier model in a container sandbox. Each model’s credentials are passed into the container (via read-only mounts, env vars, or env files depending on the CLI) allowing you to connect to the models hosted on Databricks.
With a sandbox environment, you don’t have to choose between speed and safety. Run your coding agent in a container, mount only what it needs, and let it work with full permissions inside that boundary. The worst case goes from “agent deletes your home directory” to “agent makes a mess in one git branch.”
The Dockerfiles are available on GitHub.
Pick a template, build it, and start sandboxing.