Data Engineering

How to install custom Python libraries on a shared or multi-user cluster

PareDesa_10157
New Contributor II

Hello all,

I wanted to know how to install custom Python libraries, or load library files, on a multi-user or shared Databricks cluster.

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz
Community Manager

Hi @Paresh Desai, To install custom Python libraries or load library files on a multi-user or shared Databricks cluster, you can follow the steps below:

  1. Create a library: You can create a library by clicking on the "Libraries" tab in the left-hand panel of the Databricks workspace and selecting "Create Library." You can then choose whether to install a library from PyPI, upload a library in .whl or .egg format, or upload a library from a Maven repository.
  2. Attach the library to a cluster: Once it is created, you can attach it by selecting the "Clusters" tab in the left-hand panel, selecting the cluster you want to attach the library to, and clicking on the "Libraries" tab. You can click "Attach Library" and choose the library you want to attach.
  3. Load the library in your code: Once it is attached to the cluster, you can load it using the "import" statement. For example, if you installed the pandas library, you can load it in your code by adding "import pandas" at the beginning of your code.
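If you would rather script steps 1 and 2 than click through the UI, the sketch below builds the JSON body for a library-install request and then checks that a module can be imported (step 3). It is a minimal sketch under assumptions, not the only way to do this: the helper names, the cluster ID, and the wheel path are hypothetical, and the payload shape (a `cluster_id` plus a `libraries` list with `whl` or `pypi` entries, as accepted by the Libraries API `install` endpoint) should be checked against the Databricks REST API documentation for your workspace. No request is sent here.

```python
import importlib.util
import json


def build_install_payload(cluster_id, wheel_path=None, pypi_package=None):
    """Build the JSON body for a library-install request.

    Hypothetical helper: the field names follow the assumed shape of the
    Databricks Libraries API and should be verified against the docs.
    """
    libraries = []
    if wheel_path:
        # A custom library uploaded as a .whl file
        libraries.append({"whl": wheel_path})
    if pypi_package:
        # A package pulled from PyPI
        libraries.append({"pypi": {"package": pypi_package}})
    if not libraries:
        raise ValueError("provide wheel_path and/or pypi_package")
    return {"cluster_id": cluster_id, "libraries": libraries}


def is_importable(module_name):
    """Return True if a top-level module can be found on this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Placeholder cluster ID and wheel path, for illustration only
    payload = build_install_payload(
        "0123-456789-abcde",
        wheel_path="dbfs:/FileStore/wheels/my_lib-0.1.0-py3-none-any.whl",
    )
    print(json.dumps(payload, indent=2))
    # Step 3: confirm a module is visible before importing it in a notebook
    print(is_importable("json"))
```

Running `is_importable("pandas")` in a notebook cell after the library is attached is a quick way to confirm that the cluster actually picked it up.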


3 REPLIES

Hello,

I uploaded the file to DBFS. When I run the Python code as an admin it works fine, but when a user loads the file (a CSV or other data file), the user gets the error: user does not have permission to SELECT any file.

Then I opened a notebook on the cluster and ran GRANT SELECT ON ANY FILE TO 'user', which failed with:

Operation not allowed: GRANT(line 1, pos 0) == SQL == GRANT SELECT on ANY FILE to REDACTED_LOCAL_PART@domain.com' ^^^

What am I doing wrong? How can I solve it?

Thanks

Adding to my previous message: the cluster environment is shared. Is DBFS supported on a shared cluster?
