
Cannot use docker container with runtime 13.1 and above

DaanRademaker
New Contributor III

Hi,

In our setup we use custom Docker images on our job clusters. These images are set up in line with the Docker images found on the Databricks GitHub page. The custom image works fine on runtimes up to and including 13.0. On anything higher than 13.0, I get the following error, for example when running spark.catalog.listTables():

```
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/metadata/HiveException
```

Very strange; it almost seems like something goes wrong when Databricks injects the runtime into the Docker image and the Java classpaths are not set correctly. I have looked at the changelog from runtime 13.0 to 13.1, but I cannot find why this should suddenly stop working. Has anyone been able to get Docker images to work on runtimes above 13.0?

The same happens both on an interactive cluster as well as a job cluster.
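For reference, the call that triggers it is nothing special. A minimal sketch of what I'm running (on a cluster that uses the custom container image; on Databricks the `spark` session already exists, building it here just keeps the snippet self-contained):

```
from pyspark.sql import SparkSession

# On Databricks the session already exists; getOrCreate() simply reuses it.
spark = SparkSession.builder.getOrCreate()

# Works on DBR 13.0, fails on 13.1/13.2 with
# java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/metadata/HiveException
print(spark.catalog.listTables())
```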

 

1 ACCEPTED SOLUTION


DaanRademaker
New Contributor III

This issue can be tracked in the following GitHub issue on the Databricks container images repository:
https://github.com/databricks/containers/issues/116


6 REPLIES

Debayan
Databricks Employee

Hi, has any specific error surfaced?

Also, one thing to check: Databricks Runtime for Machine Learning does not support Databricks Container Services.

Please tag @Debayan with your next comment, which will notify me. Thanks!

DaanRademaker
New Contributor III

Hi @Debayan! Thanks for your reply.

If I attach the Docker image to an interactive cluster and run spark.catalog.listTables("dbname"), I get the following error:
```
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/AlreadyExistsException
```

If I run the same command but attach the Docker image to a job cluster, I get the error mentioned above:

```
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/metadata/HiveException
```

Both commands run fine if I attach the Docker image to a 13.0 cluster, but 13.1 and 13.2 clusters don't work.
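To check whether those Hive classes are even visible to the driver JVM inside the container, a rough diagnostic like the sketch below should work. It goes through PySpark's internal py4j gateway (spark.sparkContext._jvm), so treat it as a quick probe rather than a supported API:

```
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
jvm = spark.sparkContext._jvm  # py4j view into the driver JVM

# Class names taken from the NoClassDefFoundError messages above.
for cls in (
    "org.apache.hadoop.hive.ql.metadata.HiveException",
    "org.apache.hadoop.hive.metastore.api.AlreadyExistsException",
):
    try:
        jvm.java.lang.Class.forName(cls)
        print(f"{cls}: found on driver classpath")
    except Exception as exc:
        print(f"{cls}: NOT found ({type(exc).__name__})")
```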

DaanRademaker
New Contributor III

Thank you for your answer. However, I was under the impression that when using Docker images with Databricks, Databricks injects the Spark and Hadoop packages into the image itself. This is how it seems to have always worked. The Docker images are simply base Ubuntu images, nothing else, so something must be going wrong when these Hadoop and Spark packages are injected.

DaanRademaker
New Contributor III

This issue can be tracked in the following GitHub issue on the Databricks container images repository:
https://github.com/databricks/containers/issues/116

DaanRademaker
New Contributor III

Interestingly enough, I can use the exact same image on the 13.3 LTS Beta runtime without issues.

