Monday
Hi,
We are evaluating whether it is possible to host a browser-based agentic application on Databricks.
Our application performs frontend UI automation using the browser-use Python library and also exposes FastAPI endpoints to drive a UI.
Application Overview-
Uses browser-use, which is built on Playwright
Requires OS-level browser dependencies
Runs headless Chrome via the Chrome DevTools Protocol (CDP)
Uses random, short-lived local ports for browser communication
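For readers unfamiliar with this pattern, here is a minimal sketch of what "headless Chrome over CDP on a short-lived local port" looks like. It is illustrative only and is not how browser-use launches the browser internally. The helper names (`pick_free_port`, `chrome_command`, `probe_cdp`) and the Chrome binary path are hypothetical; the flags and the `/json/version` endpoint are standard Chrome/CDP features.

```python
import http.client
import json
import socket
import urllib.request
from typing import List, Optional


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port (binding to port 0)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        # Note: the port is released on close, so another process could
        # grab it before Chrome binds to it (a small race window).
        return s.getsockname()[1]


def chrome_command(binary: str, port: int) -> List[str]:
    """Build a headless Chrome command line exposing CDP on `port`."""
    return [
        binary,  # hypothetical path, e.g. wherever the init script installs Chrome
        "--headless=new",
        f"--remote-debugging-port={port}",
        "--no-sandbox",             # commonly needed when running as root in containers
        "--disable-dev-shm-usage",  # avoids failures when /dev/shm is small
        "about:blank",
    ]


def probe_cdp(port: int, host: str = "127.0.0.1", timeout: float = 2.0) -> Optional[dict]:
    """Return Chrome's /json/version payload, or None if CDP is unreachable."""
    url = f"http://{host}:{port}/json/version"
    # Use an opener with no proxies so a configured HTTP proxy is not
    # consulted for a localhost request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        return None
```

A caller would pass `chrome_command(...)` to `subprocess.Popen` and then poll `probe_cdp(port)` until it returns a dictionary. If it keeps returning `None` while the Chrome process is alive, the problem is likely in local networking rather than in browser-use itself.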
Challenges Encountered-
1. Databricks Serverless / Databricks Apps
Not supported for this use case
No access to OS-level dependencies or browser binaries required by Playwright
2. Legacy Compute with Init Scripts
Browsers (e.g., Chrome) can be installed via init scripts
However, browser-use fails to connect to headless Chrome
Possible causes include:
Restrictions on dynamic or ephemeral localhost ports
Networking limitations
Constraints imposed by the Databricks runtime or base image
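To separate the first possible cause from the others, a quick check along these lines could be run in a notebook cell on the cluster. This is a sketch using only the Python standard library; the function name `loopback_ephemeral_port_works` is hypothetical. It tests only whether a process can listen on and connect to an OS-assigned localhost port, not whether Chrome itself starts correctly.

```python
import socket


def loopback_ephemeral_port_works(host: str = "127.0.0.1", timeout: float = 2.0) -> bool:
    """Return True if a TCP round trip over an OS-assigned local port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.settimeout(timeout)
        try:
            server.bind((host, 0))  # port 0 asks the OS for an ephemeral port
            server.listen(1)
            port = server.getsockname()[1]
            with socket.create_connection((host, port), timeout=timeout) as client:
                conn, _ = server.accept()
                with conn:
                    conn.settimeout(timeout)
                    client.sendall(b"ping")
                    return conn.recv(4) == b"ping"
        except OSError:
            # Covers refused connections, timeouts, and bind failures.
            return False


if __name__ == "__main__":
    print("ephemeral localhost ports usable:", loopback_ephemeral_port_works())
```

If this returns `True` on the driver node, restrictions on ephemeral localhost ports are less likely to be the cause, and attention can shift to missing shared libraries, sandbox flags, or the base image.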
3. Legacy Compute with Custom Docker Images
Ideally, we would like to use a custom Docker image (e.g., python:3.11-slim) with all required browser dependencies preinstalled
Databricks currently does not allow using non-Databricks Runtime base images for compute
Question-
Is there any supported way on Databricks to:
Run custom Docker images that are not based on the Databricks Runtime, or
Use another Databricks-supported service or pattern that would allow running browser-based automation workloads (Playwright, headless Chrome, or CDP-based tools)?
Any guidance and help would be greatly appreciated.
yesterday
TLDR: Databricks Apps/serverless won't support this pattern; classic compute with Databricks Container Services is your only real option on Databricks, and even that has trade-offs. For serious browser automation, run it off-platform and integrate with Databricks by API.
apt-get/yum/apk, no system-level packages or browsers.
0.0.0.0:<DATABRICKS_APP_PORT>; the reverse proxy terminates TLS and forwards traffic.
Yes, but with constraints:
python:3.11-slim) as long as you:
Caveats:
Given how specialized browser automation is, the most robust pattern is:
Summary:
55m ago
Thanks @Lu_Wang_ENB_DBX for the detailed reply; it helps. As I mentioned, I tried classic compute with a custom image, then called the AI browser API on a custom port from a notebook and saved the output in Lakebase....