Machine Learning

NVIDIA driver update

ravi-kolluri_in
New Contributor II

I want to update the CUDA driver for the NVIDIA Tesla T4 GPU on the cluster.

I am using the following commands:

%sh
sudo apt-get --purge remove "*nvidia*"
sudo /usr/bin/nvidia-uninstall
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
 
The cell hangs without returning any result.
Please help.
2 REPLIES

Kaniz
Community Manager

Hi @ravi-kolluri_in, please ensure you've selected the appropriate version for your machine and for the Databricks Runtime version being used.
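One way to act on this advice is to read the versions straight off the installer filename before running it. The sketch below is not an official tool: `parse_cuda_deb` is a hypothetical helper name, and it assumes the filename follows the pattern seen in the question (`cuda-repo-<distro>-<X>-<Y>-local_<cuda>-<driver>-1_<arch>.deb`). It only splits the string and does not touch the system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a CUDA local-installer filename into its parts.
# Assumes the naming pattern seen in the question:
#   cuda-repo-<distro>-<X>-<Y>-local_<cuda>-<driver>-1_<arch>.deb
parse_cuda_deb() {
  local name="${1##*/}"               # drop any directory or URL prefix
  name="${name%.deb}"                 # drop the .deb extension
  local prefix="${name%%_*}"          # e.g. cuda-repo-ubuntu2204-11-8-local
  local rest="${name#*_}"             # e.g. 11.8.0-520.61.05-1_amd64
  local arch="${rest##*_}"            # e.g. amd64
  local versions="${rest%_*}"         # e.g. 11.8.0-520.61.05-1
  local cuda="${versions%%-*}"        # e.g. 11.8.0
  local driver="${versions#*-}"       # e.g. 520.61.05-1
  driver="${driver%-*}"               # strip the package revision: 520.61.05
  local distro="${prefix#cuda-repo-}" # e.g. ubuntu2204-11-8-local
  distro="${distro%%-*}"              # e.g. ubuntu2204
  echo "distro=$distro cuda=$cuda driver=$driver arch=$arch"
}

parse_cuda_deb "cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb"
# prints: distro=ubuntu2204 cuda=11.8.0 driver=520.61.05 arch=amd64
```

The printed values can then be compared by hand with what the cluster reports, for example `lsb_release -rs` for the Ubuntu release and `dpkg --print-architecture` for the architecture, and with the CUDA version listed for the Databricks Runtime in use.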

 

@Kaniz 

The DBR version compatible with CUDA 11.3 is DBR 11.X. Unfortunately, pyspark.ml.torch.distributed works only with DBR 13.X.

So going back to DBR 11.X does not solve the problem.
