
NVIDIA driver update

ravi-kolluri_in
New Contributor II

I want to update the CUDA driver for the NVIDIA Tesla T4 GPU on my cluster.

I am using the following commands:

%sh
sudo apt-get --purge remove "*nvidia*"
sudo /usr/bin/nvidia-uninstall
sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
 
It gets stuck without returning any result.
Please help.
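
One possible cause of the hang (an assumption, not something confirmed in this thread) is that one of the steps waits for interactive confirmation, which a notebook `%sh` cell cannot supply: the first `apt-get` call has no `-y`, and the uninstaller may open a prompt. The commands above also move `cuda-ubuntu2204.pin` without downloading it first. A minimal sketch of a non-interactive variant is below. The `run` and `apt_quiet` helpers, the `DRY_RUN` variable and the `cuda_install_steps` function are hypothetical names introduced for illustration. The `.pin` download URL follows NVIDIA's published Ubuntu 22.04 install steps and is not in the original post, and `--silent` on `nvidia-uninstall` is assumed to be supported by the installed uninstaller.

```shell
#!/usr/bin/env bash
# Sketch only: a non-interactive variant of the commands in the post.
# Assumption: the hang comes from a step waiting for confirmation.
set -eu

# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper: apt-get with -y and a non-interactive frontend,
# passed through `env` so the variable survives sudo.
apt_quiet() {
  run sudo env DEBIAN_FRONTEND=noninteractive apt-get -y "$@"
}

# Hypothetical wrapper so the steps can be previewed before they are run.
cuda_install_steps() {
  base="https://developer.download.nvidia.com/compute/cuda"
  deb="cuda-repo-ubuntu2204-11-8-local_11.8.0-520.61.05-1_amd64.deb"

  apt_quiet --purge remove "*nvidia*"
  run sudo /usr/bin/nvidia-uninstall --silent   # --silent is an assumed flag
  # Not in the original post: fetch the pin file before moving it.
  run wget -q "$base/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin"
  run sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
  run wget -q "$base/11.8.0/local_installers/$deb"
  run sudo dpkg -i "$deb"
  run sudo cp /var/cuda-repo-ubuntu2204-11-8-local/cuda-*-keyring.gpg /usr/share/keyrings/
  apt_quiet update
  apt_quiet install cuda
}

cuda_install_steps
```

Run as-is, this only prints the commands so they can be reviewed. Setting `DRY_RUN=0` executes them. Whether the runtime allows replacing the driver at all is a separate question that this sketch does not address.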
1 REPLY

 

@Retired_mod 

The DBR version compatible with that CUDA version (11.3) is DBR 11.x. Unfortunately, pyspark.ml.torch.distributed works only with DBR 13.x.

So going back to DBR 11.x does not solve the problem.
