
Cannot set spark.plugins com.nvidia.spark.SQLPlugin config

CarlosAlberto
New Contributor

I'm trying to use the Spark-NVIDIA (RAPIDS) integration in order to train Spark ML models on GPUs.

While trying to follow these instructions: https://docs.nvidia.com/spark-rapids/user-guide/23.12.1/getting-started/databricks.html

I could not execute the init.sh script:

#!/bin/bash
sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.12.1.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.1/rapids-4-spark_2.12-23.12.1.ja...

Because, for governance reasons, my cluster doesn't have permission to access any external URL. I can only install libraries from my company's authorized Artifactory or via the cluster's "Libraries" section.

I've tried a workaround by installing the RAPIDS suite via the "Libraries" section ("Maven" subsection), but it appears that the Spark configuration is applied before the libraries are installed, so it simply doesn't work.

The error I get while inspecting the event logs of the cluster is:

Internal error message: Spark Driver was down due to misconfiguration, please check your config. [details] SparkMisconfiguration: java.lang.ClassNotFoundException: com.nvidia.spark.SQLPlugin not found in com.databricks.backend.daemon.driver.ClassLoaders$LibraryClassLoader@2e7563f6

I'm using a Standard NC6s V3 with DBR 14.3

Is there any workaround to this?

1 ACCEPTED SOLUTION


mark_ott
Databricks Employee

The Spark driver starts, and reads spark.plugins, before the "Libraries" section installs Maven artifacts, so the plugin class isn't on the classpath at the required time, which causes the ClassNotFoundException.

Workaround Strategies

1. Internal Artifactory or DBFS Manual Upload

  • Upload the RAPIDS jar manually to your company-approved internal artifactory or internal storage (such as DBFS or Workspace Files).

  • Reference the internal path in both your cluster’s init script and Spark configuration, ensuring the RAPIDS jar is available before Spark initializes.

Typical steps:

  • Download the RAPIDS jar outside Databricks and upload it to DBFS via Databricks CLI or the web UI.

  • Modify your cluster’s init script to copy the jar from DBFS to the cluster’s /databricks/jars/ directory (no external wget required):  cp /dbfs/FileStore/your-path/rapids-4-spark_2.12-23.12.1.jar /databricks/jars/

If you can use workspace files, copy from workspace instead: cp /Workspace/your-path/rapids-4-spark_2.12-23.12.1.jar /databricks/jars/

  • Update the Spark config if needed so the jar is on the classpath (for example via spark.driver.extraClassPath and spark.executor.extraClassPath), and keep spark.plugins set to com.nvidia.spark.SQLPlugin; a minimal sketch follows.
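
For illustration, here is a minimal init-script sketch combining the steps above. It assumes the jar was uploaded to dbfs:/FileStore/jars/ (a hypothetical path; substitute whatever DBFS or workspace location your governance rules allow) and that spark.plugins is still set to com.nvidia.spark.SQLPlugin in the cluster's Spark config.

#!/bin/bash
# install-rapids-jar.sh -- cluster-scoped init script (sketch, no external downloads)
set -euo pipefail

# Hypothetical upload location; replace with the DBFS or workspace path you actually use.
JAR_SRC=/dbfs/FileStore/jars/rapids-4-spark_2.12-23.12.1.jar

# /databricks/jars/ is on the driver and executor classpath, so the plugin class
# can be resolved when the driver starts and reads spark.plugins.
cp "$JAR_SRC" /databricks/jars/

To get the jar onto DBFS without the cluster needing outbound access, download it from your internal Artifactory on a machine that can reach it, then push it with the Databricks CLI, for example: databricks fs cp rapids-4-spark_2.12-23.12.1.jar dbfs:/FileStore/jars/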

2. Custom Docker Image Approach

  • Build a custom Docker image containing the RAPIDS jar pre-installed, using your internal CI/CD tools.

  • Use this image as the base for your Databricks cluster via Databricks Container Services. This avoids any external download at runtime; a rough build sketch follows.
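
As a rough sketch under stated assumptions (the registry name, image names, and base image are placeholders; Databricks Container Services must be enabled on the workspace, and the base must be a GPU-capable, Databricks-compatible image approved by your company), the build could look like this:

#!/bin/bash
# Sketch: bake the RAPIDS jar into a custom image for Databricks Container Services.
# "registry.example.com" and the image names below are placeholders.
set -euo pipefail

cat > Dockerfile <<'EOF'
FROM registry.example.com/databricks-gpu-base:latest
# The jar is fetched from the internal Artifactory into the build context beforehand.
COPY rapids-4-spark_2.12-23.12.1.jar /databricks/jars/
EOF

docker build -t registry.example.com/databricks-rapids:23.12.1 .
docker push registry.example.com/databricks-rapids:23.12.1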

3. Library Installation Order Issue

  • The problem with the "Libraries" section is that JARs installed via Maven only become available after cluster initialization, so Spark plugins that must be loadable at driver launch will not be found.

  • There is currently no native Databricks mechanism to sequence plugin loading relative to library installation in the UI; this is a known limitation of Spark plugin workflows on Databricks.


