
NVIDIA driver update

ravi-kolluri_in
New Contributor II

I want to update the CUDA driver for the NVIDIA Tesla T4 GPU on the cluster, using the following commands:

%sh
sudo apt-get --purge remove "*nvidia*"
sudo /usr/bin/nvidia-uninstall
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
 
The cell gets stuck without returning any result.
Please help.
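
One possible cause, offered as an assumption rather than a confirmed diagnosis: `apt-get --purge remove` is called without `-y`, so it may be waiting for a confirmation that a notebook cell cannot supply. The commands above also move `cuda-ubuntu2204.pin` without downloading it first. The sketch below is a non-interactive variant of the same steps. The `run` helper and the `DRY_RUN` switch are hypothetical names added for illustration, and the `nvidia-uninstall` step is left out because it may prompt as well. Whether replacing the driver is supported on a given Databricks Runtime is not something this sketch verifies.

```shell
#!/usr/bin/env bash
# Sketch only: non-interactive variant of the commands in the question.
# Assumption: the hang is caused by apt waiting for a confirmation prompt.
set -euo pipefail

# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

BASE=https://developer.download.nvidia.com/compute/cuda
DEB=cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb

# -y answers apt's confirmation prompt; DEBIAN_FRONTEND suppresses debconf dialogs.
run sudo env DEBIAN_FRONTEND=noninteractive apt-get -y --purge remove "*nvidia*"

# Download the pin file before moving it (this step is missing in the question).
run wget -q "$BASE/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin"
run sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600

# Install the local CUDA 11.8 repository and its signing key.
run wget -q "$BASE/11.8.0/local_installers/$DEB"
run sudo dpkg -i "$DEB"
run sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/

run sudo apt-get update
run sudo env DEBIAN_FRONTEND=noninteractive apt-get -y install cuda
```

With the default `DRY_RUN=1` the script only prints the commands it would run, which makes it easy to review in a `%sh` cell before executing it with `DRY_RUN=0`.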
1 REPLY

 

@Retired_mod 

The DBR version compatible with CUDA 11.3 is DBR 11.X. Unfortunately, pyspark.ml.torch.distributed works only with DBR 13.X.

So going back to DBR 11.X does not solve the problem.
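
When comparing runtimes, it can help to confirm which driver and CUDA toolkit versions a cluster actually exposes. A minimal sketch, assuming `nvidia-smi` and `nvcc` are on the `PATH` (they may not be on every image); `report_gpu_versions` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: print the GPU driver and CUDA toolkit versions visible on this node.
report_gpu_versions() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # Prints one line per GPU: model name and driver version.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
      || echo "nvidia-smi failed"
  else
    echo "nvidia-smi not found"
  fi

  if command -v nvcc >/dev/null 2>&1; then
    # The toolkit version appears on the line containing "release".
    nvcc --version | grep -i release || echo "nvcc version line not found"
  else
    echo "nvcc not found"
  fi
}

report_gpu_versions
```

Run in a `%sh` cell, this reports the versions on the driver node only; worker nodes would need to be checked separately.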
