07-28-2023 07:50 AM - edited 07-28-2023 07:51 AM
Hi,
In our setup we use custom Docker images on our job clusters. These images are built in line with the Docker images found on the Databricks GitHub page. The custom image works fine on runtimes up to and including 13.0. On anything higher than 13.0, I get the following error when running, for example, `spark.catalog.listTables()`:
```
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/metadata/HiveException
```
Very strange; it almost seems as if something goes wrong when Databricks injects the runtime into the Docker image and the Java classpaths are not set correctly. I have looked at the changelog from runtime 13.0 to 13.1, but I cannot find why this would suddenly stop working. Has anyone been able to get Docker images to work on runtimes above 13.0?
The same happens both on an interactive cluster as well as a job cluster.
07-31-2023 12:07 AM
Hi, has any specific error surfaced?
Also, please check: Databricks Runtime for Machine Learning does not support Databricks Container Services.
Please tag @Debayan in your next comment, which will notify me. Thanks!
08-01-2023 12:44 AM
Hi @Debayan! Thanks for your reply.
If I attach the Docker image to an interactive cluster and run `spark.catalog.listTables("dbname")`, I get the following error:
```
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/AlreadyExistsException
```
If I run the same command with the Docker image attached to a job cluster, I get the error mentioned above:
```
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/metadata/HiveException
```
Both commands run fine if I attach the Docker image to a 13.0 cluster, but 13.1 and 13.2 clusters do not work.
08-04-2023 02:08 AM
Thank you for your answer. However, I was under the impression that when using Docker images with Databricks, Databricks injects the Spark and Hadoop packages into the image; this is how it seems to have always worked. The Docker images are simply base Ubuntu images, nothing else. In that case, something must be going wrong when the Hadoop and Spark packages are injected.
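For context, a minimal custom image along the lines of the databricks/containers examples looks roughly like the sketch below. Treat it as an illustration of the pattern, not the exact official Dockerfile; the package list is abbreviated and may differ from what the repo currently ships.

```dockerfile
# Minimal base image in the style of the databricks/containers examples.
# Note that Spark and Hadoop are NOT installed here: Databricks injects the
# runtime at cluster start, which is why a classpath error like
# NoClassDefFoundError points at the injection step rather than the image.
FROM ubuntu:20.04

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
      openjdk-8-jdk \
      bash \
      sudo \
      iproute2 \
      procps \
 && rm -rf /var/lib/apt/lists/*
```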
08-04-2023 08:10 AM
Also, this issue can be linked to the following github issue on the Databricks container images github page.
https://github.com/databricks/containers/issues/116
08-21-2023 07:22 AM
Interestingly enough I can use the exact same image on the 13.3LTS beta runtime without issues.