Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

Cannot set spark.plugins com.nvidia.spark.SQLPlugin config

CarlosAlberto
New Contributor

I'm trying to use the NVIDIA spark-rapids integration to train Spark ML models on GPU.

While trying to follow these instructions: https://docs.nvidia.com/spark-rapids/user-guide/23.12.1/getting-started/databricks.html

I could not execute the init.sh script:

#!/bin/bash
sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.12.1.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.1/rapids-4-spark_2.12-23.12.1.jar

I can't run it because, for governance reasons, my cluster doesn't have permission to access any external URL. I can only install libraries from my company's authorized Artifactory or via the "Libraries" section of the cluster.
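For illustration, what I would need is something like the script below, pointing at our internal mirror instead of Maven Central (the Artifactory hostname here is just a placeholder, not a real endpoint):

#!/bin/bash
# Hypothetical variant of the init script: fetch the RAPIDS jar from an
# internal Artifactory mirror instead of Maven Central.
# "artifactory.mycompany.example" is a placeholder hostname.
sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.12.1.jar \
  https://artifactory.mycompany.example/maven2/com/nvidia/rapids-4-spark_2.12/23.12.1/rapids-4-spark_2.12-23.12.1.jar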

I've tried a workaround: installing the RAPIDS suite (Maven coordinates com.nvidia:rapids-4-spark_2.12:23.12.1) via the "Libraries" section, "Maven" subsection, but it appears that the Spark configuration is applied before the libraries are installed, so it simply doesn't work.
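For reference, the relevant line of the cluster's Spark config (following the guide linked above; I've left out the other RAPIDS settings it recommends) is:

spark.plugins com.nvidia.spark.SQLPlugin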

The error I get while inspecting the event logs of the cluster is:

Internal error message: Spark Driver was down due to misconfiguration, please check your config. [details] SparkMisconfiguration: java.lang.ClassNotFoundException: com.nvidia.spark.SQLPlugin not found in com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader@2e7563f6

I'm using a Standard_NC6s_v3 instance with DBR 14.3.

Is there any workaround for this?
