Monday
Hi,
We are evaluating whether it is possible to host a browser‑based agentic application on Databricks.
Our application performs frontend UI automation using the browser-use Python library and also exposes FastAPI endpoints to drive a UI.
Application Overview-
Uses browser-use, which is built on Playwright
Requires OS‑level browser dependencies
Runs headless Chrome via the Chrome DevTools Protocol (CDP)
Uses random, short‑lived local ports for browser communication
Challenges Encountered-
1. Databricks Serverless / Databricks Apps
Not supported for this use case
No access to OS‑level dependencies or browser binaries required by Playwright
2. Legacy Compute with Init Scripts
Browsers (e.g., Chrome) can be installed via init scripts
However, browser-use fails to connect to headless Chrome
Possible causes include:
Restrictions on dynamic or ephemeral localhost ports
Networking limitations
Constraints imposed by the Databricks runtime or base image
3. Legacy Compute with Custom Docker Images
Ideally, we would like to use a custom Docker image (e.g., python:3.11-slim) with all required browser dependencies preinstalled
Databricks currently does not allow using non‑Databricks Runtime base images for compute
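One way to narrow down the connection failure described in item 2 is to check, on the cluster itself, whether ephemeral localhost ports work at all before involving Chrome. The snippet below is a minimal diagnostic sketch using only the Python standard library; the helper names `pick_free_port` and `port_is_reachable` are made up for illustration and are not part of browser-use or Playwright.

```python
import socket
from contextlib import closing


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused ephemeral port by binding to port 0."""
    with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def port_is_reachable(port: int, host: str = "127.0.0.1", timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    port = pick_free_port()
    print(f"OS handed out ephemeral port {port}")
    # Nothing is listening yet, so this is expected to report False.
    print("reachable before anything listens:", port_is_reachable(port))

    # Stand-in for the browser: a plain TCP listener on the same port.
    with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as server:
        server.bind(("127.0.0.1", port))
        server.listen(1)
        print("reachable with a local listener:", port_is_reachable(port))
```

If the listener check succeeds but a Chrome process started with `--remote-debugging-port=<port>` is still not reachable, that would point toward browser startup (missing libraries, sandbox flags) rather than a restriction on localhost ports. This is only a hint, not a definitive diagnosis.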
Question-
Is there any supported way on Databricks to:
Run custom Docker images that are not based on the Databricks Runtime, or
Use another Databricks‑supported service or pattern that would allow running browser‑based automation workloads (Playwright, headless Chrome, or CDP‑based tools)?
Any guidance and help would be greatly appreciated.
Tuesday
TLDR: Databricks Apps/serverless won’t support this pattern; classic compute with Databricks Container Services is your only real option on Databricks, and even that has trade‑offs. For serious browser automation, run it off‑platform and integrate with Databricks by API.
apt-get/yum/apk, no system‑level packages or browsers.
0.0.0.0:<DATABRICKS_APP_PORT>; the reverse proxy terminates TLS and forwards traffic.
Yes, but with constraints:
python:3.11-slim) as long as you:
Caveats:
Given how specialized browser automation is, the most robust pattern is:
Summary:
yesterday
Thanks @Lu_Wang_ENB_DBX for the detailed reply; it helps. As I mentioned, I tried classic compute with a custom image, then called the AI browser API on a custom port from a notebook and saved the output in Lakebase....
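For readers trying the same approach, the notebook-side call might look roughly like the sketch below. It assumes the browser agent is served over HTTP on localhost; the `/run` path, the `{"task": ...}` payload, and the function name `call_local_agent` are hypothetical placeholders, not the actual API used in this thread.

```python
import json
import urllib.request


def call_local_agent(port: int, task: str, path: str = "/run", timeout: float = 60.0) -> dict:
    """POST a task to an HTTP service on localhost and return the parsed JSON reply.

    The path and payload shape are assumptions for illustration only.
    """
    payload = json.dumps({"task": task}).encode("utf-8")
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}{path}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler keeps the localhost call from being routed
    # through any HTTP proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The returned dictionary could then be written to a table from the same notebook; how that is done depends on the storage target and is not shown here.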