07-25-2023 07:07 AM
I'm trying to use a custom Docker image for my job. This is my Dockerfile:
FROM databricksruntime/standard:12.2-LTS
COPY . .
RUN /databricks/python3/bin/pip install -U pip
RUN /databricks/python3/bin/pip install -r requirements.txt
USER root
My job uses an instance pool. First I went directly to the job -> compute -> advanced -> docker and entered my image, but it fails with the following:
Unexpected user error while preparing the cluster for the job. Cause: INVALID_PARAMETER_VALUE: The target instance pool InstancePoolId(xxxxx) does not have docker images configured, thus not supporting cluster creation with docker image. Please update your cluster attribute or create a separate instance pool for docker image clusters.
So instead I tried to create a new all-purpose cluster with my custom image defined, and when the cluster tries to initialize, it fails with this error:
23/07/25 13:40:48 ERROR DriverDaemon$: stderr:
/databricks/spark/scripts/setup_container_iptables_rules.sh: line 32: iptables: command not found
23/07/25 13:40:48 ERROR DriverDaemon$: XXX Fatal uncaught exception. Terminating driver.
org.apache.spark.api.python.PythonSecurityException: Failed to run: 'enable iptables restrictions for Python'
Any advice?
07-25-2023 12:15 PM
Hi,
Could you please check whether the requirements are fulfilled: https://docs.databricks.com/clusters/custom-containers.html#requirements
Also, could you please try creating the cluster through the API to see whether it gets deployed?
Also, per the error, it looks like iptables has been enabled in the image for Python (https://stackoverflow.com/questions/5891779/is-there-a-python-interface-to-iptables). Could you please confirm?
Please tag @Debayan with your next comment which will notify me. Thanks!
07-25-2023 12:49 PM - edited 07-25-2023 12:52 PM
Hey @Debayan
1. I have Databricks Container Services enabled
2. "Your machine must be running a recent Docker daemon" - I'm not sure I followed this one. Databricks is the one managing my machines on AWS.
"could you please try through the API if it is getting deployed?" - Do you mean trying the cluster initialization through the API? Can you give me a reference for this?
"it looks like the iptables has been enabled in the image for python" - I tried both the standard and python images, and both seem to be missing iptables:
docker run databricksruntime/standard:12.2-LTS bash -c 'iptables'
> bash: iptables: command not found
docker run databricksruntime/python:12.2-LTS bash -c 'iptables'
> bash: iptables: command not found
07-25-2023 12:59 PM
Looks like running "apt install iptables" might work; I will update. But it seems odd that the base image provided by Databricks doesn't contain all the executables it needs to run.
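For reference, a minimal sketch of how that could look in the Dockerfile above. This is untested and rests on a few assumptions: that the base image is Debian/Ubuntu-based so `apt-get` is available, that the `iptables` package provides the binary `setup_container_iptables_rules.sh` is looking for, and that `USER root` has to come before the package install.

```dockerfile
FROM databricksruntime/standard:12.2-LTS

# Assumption: installing packages requires root, so switch before apt-get.
USER root

# Assumption (untested): the iptables package supplies the binary that
# /databricks/spark/scripts/setup_container_iptables_rules.sh calls.
RUN apt-get update \
    && apt-get install -y --no-install-recommends iptables \
    && rm -rf /var/lib/apt/lists/*

# Unchanged from the original Dockerfile.
COPY . .
RUN /databricks/python3/bin/pip install -U pip
RUN /databricks/python3/bin/pip install -r requirements.txt
```

After rebuilding, the same check as before (`docker run <image> bash -c 'iptables'`) should no longer report "command not found" if the install worked.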
07-26-2023 08:30 AM
Hi, I think disabling iptables would be better in this case. Could you please try the command below and confirm?
$ sudo iptables -S
07-26-2023 09:02 AM
After installing `iptables`? Before that, it's still "command not found".