Data Engineering

Custom docker image fails to initialize

matanper
New Contributor III

I'm trying to use a custom Docker image for my job. This is my Dockerfile:

FROM databricksruntime/standard:12.2-LTS

COPY . .
RUN /databricks/python3/bin/pip install -U pip
RUN /databricks/python3/bin/pip install -r requirements.txt

USER root

My job uses an instance pool. First I went directly to the job -> compute -> advanced -> docker and entered my image, but it fails with the following:

 Unexpected user error while preparing the cluster for the job. Cause: INVALID_PARAMETER_VALUE: The target instance pool InstancePoolId(xxxxx) does not have docker images configured, thus not supporting cluster creation with docker image. Please update your cluster attribute or create a separate instance pool for docker image clusters.

So instead I tried to create a new all-purpose cluster with my custom image defined. When the cluster tries to initialize, it fails with the error:

23/07/25 13:40:48 ERROR DriverDaemon$: stderr:
/databricks/spark/scripts/setup_container_iptables_rules.sh: line 32: iptables: command not found
23/07/25 13:40:48 ERROR DriverDaemon$: XXX Fatal uncaught exception. Terminating driver.
org.apache.spark.api.python.PythonSecurityException: Failed to run: 'enable iptables restrictions for Python'

Any advice?

5 REPLIES

Debayan
Esteemed Contributor III

Hi,

Could you please check whether the requirements are fulfilled? https://docs.databricks.com/clusters/custom-containers.html#requirements

Also, could you please try through the API if it is getting deployed? 

Also, as per the error it looks like the iptables has been enabled in the image for python. (https://stackoverflow.com/questions/5891779/is-there-a-python-interface-to-iptables). Could you please confirm it? 

Please tag @Debayan in your next comment, which will notify me. Thanks!

matanper
New Contributor III

Hey @Debayan 

1. I have Databricks Container Services enabled
2. "Your machine must be running a recent Docker daemon" - I'm not sure I followed this one. Databricks is the one managing my machines on AWS.

"could you please try through the API if it is getting deployed?" - Do you mean trying the cluster initialization using the API? Can you give me a reference for this one?

"it looks like the iptables has been enabled in the image for python" - I tried both the standard and python images, and both seem to be missing iptables:

docker run databricksruntime/standard:12.2-LTS bash -c 'iptables'
> bash: iptables: command not found

docker run databricksruntime/python:12.2-LTS bash -c 'iptables'
> bash: iptables: command not found

matanper
New Contributor III

It looks like running "apt install iptables" might work; I will update. But it seems odd that the base image provided by Databricks doesn't contain all the executables it needs to run.
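
A minimal sketch of that change, assuming the base image is Debian/Ubuntu-based so that apt-get is available. The thread does not confirm that this resolves the error; `USER root` is moved to the top here only so that the package install runs with root privileges:

```dockerfile
FROM databricksruntime/standard:12.2-LTS

# Package installation needs root.
USER root

# Assumption: apt-get exists in the base image. Install iptables so that
# /databricks/spark/scripts/setup_container_iptables_rules.sh can find it.
RUN apt-get update \
    && apt-get install -y --no-install-recommends iptables \
    && rm -rf /var/lib/apt/lists/*

COPY . .
RUN /databricks/python3/bin/pip install -U pip
RUN /databricks/python3/bin/pip install -r requirements.txt
```

After rebuilding, the earlier `docker run <image> bash -c 'iptables'` check is a quick way to see whether the command is now found before pushing the image.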

 

Debayan
Esteemed Contributor III

Hi, I think disabling iptables will be better in this case. Could you please try the command below and confirm?

$ sudo iptables -S

matanper
New Contributor III

After installing `iptables`? Before that, it's still "command not found".
