5 hours ago
Hi All,
We recently changed our Databricks workspace so that it is accessible only via a private network. After this change, we hit a lot of connectivity errors, for example from Power BI to Databricks and from Azure Data Factory to Databricks.
I was able to resolve those issues, but I am now stuck on another one that appears to have the same root cause.
The error I get is:
run failed with error message Library installation failed for library due to user error for whl: "dbfs:/Volumes/any.whl" Error messages: Library installation attempted on the driver node of cluster XXXX and failed. Please refer to the following error message to fix the library or contact Databricks support. Error code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error message: java.util.concurrent.ExecutionException: org.apache.spark.sql.AnalysisException: 403: Unauthorized access to workspace: 12345543234
I have run this job 200 times, and it failed with the above error in 13 of those runs (6.5%). So it is clearly an intermittent issue.
Has anyone faced a similar issue, or does anyone have an idea how to proceed?
4 hours ago
Hi @Uj337,
A couple of questions:
How are you installing the library? I assume through cluster libraries?
Are you the only user starting the cluster, or do other users start it as well?
What DBR version are you using? Also, is it a single user access mode or shared cluster?
Based on the error, it looks like the wheel package is downloaded from a Volume, right? Are any packages being downloaded from the internet?
4 hours ago
How are you installing the library? I assume through cluster libraries?
It's a job cluster created from an ADF pipeline. Within the ADF activity, we pass this wheel library as a DBFS URI.
Are you the only user starting the cluster, or do other users start it as well? Yes.
What DBR version are you using? 14.3 LTS. Also, is it a single user access mode or shared cluster? Single.
Based on the error, it looks like the wheel package is downloaded from a Volume, right? Yes.
Are any packages being downloaded from the internet? No.
2 hours ago
@Uj337 The relevant part is the "org.apache.spark.sql.AnalysisException: 403: Unauthorized access to workspace: 12345543234". Is there anything in the driver logs that correlates with this AnalysisException? Does it come with a full stack trace?
an hour ago
24/12/23 11:22:33 INFO DriverCorral: [Thread 161] AttachLibraries - candidate libraries: List(dbfs:/Volumes/external_location/any.whl)
24/12/23 11:22:33 INFO DriverCorral: [Thread 161] AttachLibraries - new libraries to install (including resolved dependencies): List(dbfs:/Volumes/external_location/any.whl)
24/12/23 11:22:33 INFO SharedDriverContext: [Thread 161] attachLibrariesToSpark dbfs:/Volumes/external_location/any.whl
24/12/23 11:22:33 INFO SharedDriverContext: Attaching Python lib: dbfs:/Volumes/external_location/any.whl to clusterwide nfs path
24/12/23 11:22:33 INFO DriverConf: Configured feature flag data source LaunchDarkly
24/12/23 11:22:33 WARN DriverConf: REGION environment variable is not defined. getConfForCurrentRegion will always return default value
24/12/23 11:22:33 INFO LibraryDownloadManager: Downloading a library that was not in the cache: dbfs:/Volumes/external_location/any.whl
24/12/23 11:22:33 INFO LibraryDownloadManager: Attempt 1: wait until library dbfs:/Volumes/external_location/any.whl is downloaded
24/12/23 11:22:33 INFO LibraryDownloadManager: Preparing to download library file from UC Volume path: dbfs:/Volumes/external_location/any.whl
24/12/23 11:22:34 INFO RDriverLocal: 9. RDriverLocal.b7abce53-a76c-4c5e-ba76-f0e5221260b6: R process started with RServe listening on port 1100.
24/12/23 11:22:34 INFO RDriverLocal: 10. RDriverLocal.b7abce53-a76c-4c5e-ba76-f0e5221260b6: starting interpreter to talk to R process ...
24/12/23 11:22:35 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect.
24/12/23 11:22:35 INFO ROutputStreamHandler: Successfully connected to stdout in the RShell.
24/12/23 11:22:35 INFO ROutputStreamHandler: Successfully connected to stderr in the RShell.
24/12/23 11:22:35 INFO RDriverLocal: 11. RDriverLocal.b7abce53-a76c-4c5e-ba76-f0e5221260b6: R interpreter is connected.
24/12/23 11:22:35 INFO RDriverWrapper: setupRepl:ReplId-4fde1-7f7fb-47ee1-1: finished to load
24/12/23 11:22:39 INFO LibraryDownloadManager: Attempt 2: wait until library dbfs:/Volumes/external_location/any.whl is downloaded
24/12/23 11:22:39 INFO LibraryDownloadManager: Preparing to download library file from UC Volume path: dbfs:/Volumes/external_location/any.whl
24/12/23 11:22:44 INFO LibraryDownloadManager: Attempt 3: wait until library dbfs:/Volumes/external_location/any.whl is downloaded
24/12/23 11:22:44 INFO LibraryDownloadManager: Preparing to download library file from UC Volume path: dbfs:/Volumes/external_location/any.whl
24/12/23 11:22:44 ERROR LibraryDownloadManager: Could not download dbfs:/Volumes/external_location/any.whl.
org.apache.spark.sql.AnalysisException: 403: Unauthorized access to workspace: 12345563455323
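To compare failed and successful runs, it can help to strip the driver log down to just the LibraryDownloadManager lines and see how many attempts were made and over what time span. Below is a minimal sketch of that, using a few lines copied from the log above as sample input. The function name `summarize_downloads` is hypothetical, and the timestamp format `yy/MM/dd HH:mm:ss` is an assumption based on how these lines look.

```python
import re
from datetime import datetime

# Sample lines copied from the driver log above. In practice, read the
# text from a downloaded driver log file instead.
SAMPLE_LOG = """\
24/12/23 11:22:33 INFO LibraryDownloadManager: Attempt 1: wait until library dbfs:/Volumes/external_location/any.whl is downloaded
24/12/23 11:22:34 INFO RDriverLocal: 9. RDriverLocal.b7abce53-a76c-4c5e-ba76-f0e5221260b6: R process started with RServe listening on port 1100.
24/12/23 11:22:39 INFO LibraryDownloadManager: Attempt 2: wait until library dbfs:/Volumes/external_location/any.whl is downloaded
24/12/23 11:22:44 INFO LibraryDownloadManager: Attempt 3: wait until library dbfs:/Volumes/external_location/any.whl is downloaded
24/12/23 11:22:44 ERROR LibraryDownloadManager: Could not download dbfs:/Volumes/external_location/any.whl.
"""

# Matches "<timestamp> <LEVEL> LibraryDownloadManager: <message>".
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) LibraryDownloadManager: (.*)$"
)
ATTEMPT_RE = re.compile(r"^Attempt (\d+):")


def summarize_downloads(log_text):
    """Return (attempts, errors) found in LibraryDownloadManager log lines.

    attempts is a list of (attempt_number, timestamp) tuples;
    errors is a list of (timestamp, message) tuples.
    """
    attempts = []
    errors = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines from other components
        # Assumption: timestamps are yy/MM/dd HH:mm:ss.
        timestamp = datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S")
        level, message = match.group(2), match.group(3)
        attempt = ATTEMPT_RE.match(message)
        if attempt:
            attempts.append((int(attempt.group(1)), timestamp))
        elif level == "ERROR":
            errors.append((timestamp, message))
    return attempts, errors


if __name__ == "__main__":
    attempts, errors = summarize_downloads(SAMPLE_LOG)
    span = (attempts[-1][1] - attempts[0][1]).total_seconds()
    print(f"{len(attempts)} attempts over {span:.0f}s, {len(errors)} error(s)")
```

Running the same summary on the driver log of a successful run would show whether those runs also need more than one attempt, which might help narrow down where the intermittent 403 comes from.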