Does Databricks Container Services (DCS) support GPU containers with Databricks Runtime 11.3 LTS and higher?
05-20-2023 03:23 PM
I have been trying to start a cluster using DCS with GPU containers (https://github.com/databricks/containers/tree/master/ubuntu/gpu), but was only successful with Databricks Runtime 10.4 LTS and lower.
With Databricks Runtime 11.3 LTS and higher, I got the error:
"Internal error message: Spark error: Driver down cause: driver state change"
Does DCS support GPU containers on 11.3 LTS and higher?
Labels: ContainerServices, DatabricksContainer
05-27-2024 03:33 PM
Hello @ppang !
Since you posted your question, the repository you shared has received an update, which includes the following warning:
"Using conda in DCS images is no longer supported starting Databricks Runtime 9.0. We highly recommend users to extend cuda-11.8 examples. We no longer support cuda-10.1 and cuda-11.0 compatibility with latest databricks runtime."
It's likely that the issue you encountered was related to a CUDA incompatibility.
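If it helps, here is a minimal sketch of a cluster spec that points at an image built from the cuda-11.8 examples. The field names (`cluster_name`, `spark_version`, `node_type_id`, `num_workers`, `docker_image.url`) follow the Clusters API request body; the image URL, cluster name, and node type below are placeholder assumptions, not values from the repository or from your setup, and I have not verified that this configuration resolves the error.

```python
import json


def build_cluster_spec(image_url, spark_version, node_type_id, num_workers=1):
    """Build a Clusters API request body that uses a custom container image."""
    return {
        "cluster_name": "dcs-gpu-test",          # hypothetical name
        "spark_version": spark_version,          # standard (non-ML) runtime string
        "node_type_id": node_type_id,            # a GPU instance type
        "num_workers": num_workers,
        "docker_image": {"url": image_url},      # image extended from a cuda-11.8 example
    }


spec = build_cluster_spec(
    image_url="my-registry.example.com/my-gpu-image:cuda-11.8",  # hypothetical image
    spark_version="11.3.x-scala2.12",
    node_type_id="g4dn.xlarge",                                   # assumed AWS GPU node type
)

# Print the JSON body that could be sent to the cluster creation endpoint.
print(json.dumps(spec, indent=2))
```

The sketch only builds and prints the payload; sending it (and any registry authentication) is left out on purpose, since those details depend on your workspace.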
Best regards,
Jéssica Santos

