Failing to install a library from dbfs mounted storage (adls2) with pass through credentials cluster

alonisser
Contributor

We've set up a premium workspace with a credential passthrough cluster. While passthrough does work and can access my ADLS Gen2 storage, I can't install a library on the cluster from that storage, and I keep getting:

"Library installation attempted on the driver node of cluster 0522-200212-mib0srv0 and failed. Please refer to the following error message to fix the library or contact Databricks support. Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error Message: com.google.common.util.concurrent.UncheckedExecutionException: com.databricks.backend.daemon.data.client.adl.AzureCredentialNotFoundException: Could not find ADLS Gen2 Token

"

How can I actually install a cluster-wide library on these credential passthrough clusters?

(The general ADLS mount uses those passthrough credentials to mount the data lake.)

This happens on both standard and high-concurrency clusters.
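For reference, the mount follows the usual documented passthrough pattern; this is only a sketch, and the container, storage account, and mount point below are placeholders:

# Sketch of an ADLS Gen2 mount with credential passthrough (documented pattern).
# Container, storage account, and mount point are placeholders.
configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class":
        spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName"),
}
dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/datalake",
    extra_configs=configs,
)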

14 REPLIES

Kaniz
Community Manager

Hi @Alon Nisser​, here is a similar issue on Stack Overflow. Please let us know if that helps.

alonisser
Contributor

Nope @Kaniz Fatma​, that's not actually my question. I know how to install a library on a cluster and do it quite a lot. The question is how to install a library stored on the data lake (as a wheel on a DBFS mount) on a credential passthrough cluster.

Kaniz
Community Manager

Hi @Alon Nisser​ , Thank you for the clarification. I might have misinterpreted the question.

alonisser
Contributor

For now I have a hack: copying the library from the data lake to the root DBFS and installing it from there. But I don't like it.
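For anyone reproducing the workaround, a minimal sketch, run from a notebook where the passthrough token is available; both paths are placeholders:

# Copy the wheel from the passthrough-mounted data lake to root DBFS,
# then point the cluster's library installation at the copied path.
src = "dbfs:/mnt/datalake/libs/mylib-1.0-py3-none-any.whl"  # ADLS mount (placeholder)
dst = "dbfs:/FileStore/libs/mylib-1.0-py3-none-any.whl"     # root DBFS (placeholder)
dbutils.fs.cp(src, dst)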

Kaniz
Community Manager

Hi @Alon Nisser​, I'm glad you have a workaround for the time being, and thank you for sharing it on our platform. Can you tell us why you're dissatisfied with it?

nancy_g
New Contributor III

Hi @Kaniz Fatma​, I am facing the same issue. I am trying to use a job cluster with credential passthrough enabled to deploy a job, but library installation fails with the same exception:

"Message: com.google.common.util.concurrent.UncheckedExecutionException: com.databricks.backend.daemon.data.client.adl.AzureCredentialNotFoundException: Could not find ADLS Gen2 Token"

Where should I add the token? Or am I missing something?

alonisser
Contributor

@Nancy Gupta​, as far as I can trace this issue, the token isn't set up yet while the cluster is starting; I assume it does work with passthrough credentials once the cluster is running normally?

My hack was to copy the library to the root DBFS (I created a new folder there) using another group, and installing from that location does work.
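If it helps to automate the second step, one way to attach the copied wheel cluster-wide is the Libraries API; this is just a sketch, and the workspace host, token, cluster ID, and wheel path are all placeholders:

import requests

# Sketch: attach the copied wheel cluster-wide via the Databricks Libraries API
# (POST /api/2.0/libraries/install). Host, token, cluster ID, and path are placeholders.
resp = requests.post(
    "https://<databricks-instance>/api/2.0/libraries/install",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={
        "cluster_id": "<cluster-id>",
        "libraries": [{"whl": "dbfs:/FileStore/libs/mylib-1.0-py3-none-any.whl"}],
    },
)
resp.raise_for_status()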

Kaniz
Community Manager

Hi @Nancy Gupta​ , Were you able to replicate the solution provided by @Alon Nisser​ ?

nancy_g
New Contributor III

@Kaniz Fatma​, yes, but that is just a workaround; it would be great if I could get a proper solution for this!

Also, within the job itself, any read from ADLS fails again with the same error.

nancy_g
New Contributor III

@Kaniz Fatma​, any solutions, please?

Kaniz
Community Manager

Hi @Nancy Gupta​,

By design, it is a limitation that the ADF linked service access token is not passed through to the notebook activity. You would need to use the credentials inside the notebook activity, or retrieve them from a Key Vault store.


Reference: ADLS using AD credentials passthrough – limitations.

Hope this helps. Do let us know if you have any further queries.
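For example, pulling the credentials from a Key Vault-backed secret scope inside the notebook activity could look like this; the scope and key names are hypothetical:

# Hypothetical scope and key names; the scope is backed by Azure Key Vault.
client_id = dbutils.secrets.get(scope="my-keyvault-scope", key="sp-client-id")
client_secret = dbutils.secrets.get(scope="my-keyvault-scope", key="sp-client-secret")
# These values would then feed OAuth settings such as the fs.azure.account.*
# configs discussed later in this thread.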

Kaniz
Community Manager

Hi @Nancy Gupta​, we haven't heard back from you on my last response, and I was checking to see whether you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Otherwise, we will follow up with more details and try to help.

User16764241763
Honored Contributor

Hello @Alon Nisser​ @Nancy Gupta​ 

Installing libraries using passthrough credentials is currently not supported.

You need the below configs on the cluster:

fs.azure.account...

https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/adls-gen2/azure-datalake-g...

We can file a feature request for this.
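For reference, the service principal (OAuth) account configs that the linked doc describes look roughly like this in the cluster's Spark config; the storage account, application ID, tenant ID, and secret scope/key are placeholders:

fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net <application-id>
fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<scope>/<key>}}
fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token

Note that these configure service principal access for that account rather than passthrough, which is the trade-off the next reply raises.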

alonisser
Contributor

Sorry, I can't figure this out. The link you've added is irrelevant for passthrough credentials; if we add those configs, the cluster won't be a passthrough cluster. Is there a way to add this just for a specific folder while keeping passthrough for the rest?
