Permissions error on cluster requirements.txt installation

rtreves
Contributor

Hi Databricks Community,

I'm looking to resolve the following error:
Library installation attempted on the driver node of cluster {My cluster ID} and failed. Please refer to the following error message to fix the library or contact Databricks support. Error code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error message: org.apache.spark.SparkException: requirements.txt installation failed with output: ERROR: Could not open requirements file: [Errno 13] Permission denied: {Path to my requirements.txt file}
The error appears in the cluster "Libraries" tab when I start up a compute cluster managed by a cluster policy linked to my requirements.txt file. Note that the requirements file is in a GitHub-linked "Repos" folder (/Workspace/Repos/...). I'm able to read and write the requirements.txt file when I navigate to it directly. I've also determined that the error is not cluster- or cluster-policy-specific: it recurs when I use a different cluster policy (for either shared or single-user clusters) or start a different cluster attached to a policy linked to the file.

I've copied below the full cluster policy spec:
{
  "spark_version": {
    "type": "fixed",
    "value": "15.4.x-scala2.12"
  },
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "forbidden",
    "hidden": true
  },
  "node_type_id": {
    "type": "unlimited",
    "defaultValue": "i3.xlarge"
  },
  "num_workers": {
    "type": "forbidden",
    "hidden": true
  },
  "data_security_mode": {
    "type": "fixed",
    "value": "USER_ISOLATION",
    "hidden": true
  },
  "cluster_type": {
    "type": "fixed",
    "value": "all-purpose"
  },
  "driver_instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "autotermination_minutes": {
    "type": "fixed",
    "value": 60
  },
  "autoscale.min_workers": {
    "type": "unlimited",
    "defaultValue": 1
  },
  "autoscale.max_workers": {
    "type": "unlimited",
    "defaultValue": 5
  },
  "enable_elastic_disk": {
    "type": "fixed",
    "value": true,
    "hidden": true
  },
  "aws_attributes.availability": {
    "type": "fixed",
    "value": "SPOT_WITH_FALLBACK",
    "hidden": true
  },
  "aws_attributes.spot_bid_price_percent": {
    "type": "fixed",
    "value": 100,
    "hidden": true
  },
  "aws_attributes.first_on_demand": {
    "type": "range",
    "minValue": 1,
    "defaultValue": 1
  },
  "aws_attributes.instance_profile_arn": {
    "type": "fixed",
    "value": "arn:aws:iam::775333757806:instance-profile/databricks-workspace-stack-access-data-buckets"
  },
  "aws_attributes.zone_id": {
    "type": "unlimited",
    "defaultValue": "auto",
    "hidden": true
  }
}

Thank you in advance.
Tagging @ablee for visibility.
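
[Editor's sketch, not from the thread: the definition above controls cluster attributes only; the requirements.txt association mentioned in the post would live outside it, in the policy's libraries list. A minimal sketch of making that association through the REST API, assuming the workspace supports policy libraries and the "requirements" library type; the host, token, policy id, and policy name are placeholders, while the file path is the one that appears later in the thread's driver log.]

# Sketch only: attach a requirements file to a cluster policy via the
# Cluster Policies API. Assumes policy libraries and the "requirements"
# library type are available in this workspace; ids are placeholders.
import json
import requests

HOST = "https://<workspace-url>"    # placeholder
TOKEN = "<personal-access-token>"   # placeholder

# The policy JSON shown above, as a Python dict (abbreviated here)
definition = {"spark_version": {"type": "fixed", "value": "15.4.x-scala2.12"}}

resp = requests.post(
    f"{HOST}/api/2.0/policies/clusters/edit",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "policy_id": "<policy-id>",            # placeholder
        "name": "<policy-name>",               # placeholder
        "definition": json.dumps(definition),  # the API takes a JSON string
        "libraries": [
            # Path as it appears in the driver log later in the thread
            {"requirements": "/Workspace/Repos/BC/tributary-lab-utils/requirements.txt"}
        ],
    },
)
resp.raise_for_status()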


15 REPLIES

Alberto_Umana
Databricks Employee

Hi @rtreves,

The error you are encountering, "DRIVER_LIBRARY_INSTALLATION_FAILURE" with the message "ERROR: Could not open requirements file: [Errno 13] Permission denied," indicates that the driver node does not have the necessary permissions to access the requirements.txt file located in your GitHub-linked "Repos" folder.

Can you try installing the libraries via a notebook? 

%pip install -r /Workspace/Repos/path/to/requirements.txt
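
[Editor's sketch: a quick way to see which side of the permission boundary a given execution context sits on is to read the file directly. In a notebook this succeeds for the thread's author; the cluster-level installer hits PermissionError (errno 13). The path is the one from the driver log later in the thread.]

# Quick diagnostic (sketch): check whether requirements.txt is readable
# from the current execution context.
from pathlib import Path

req = Path("/Workspace/Repos/BC/tributary-lab-utils/requirements.txt")
try:
    print(req.read_text()[:200])  # show the first few lines if readable
except PermissionError as e:
    print(f"No read access in this context: {e}")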

rtreves
Contributor

@Alberto_Umana Thank you for the speedy response. I am indeed able to install the libraries using `%pip` in a notebook attached to the cluster in question.

Alberto_Umana
Databricks Employee

Understood! Have you enabled logging on the cluster? That would give us more details on the failure.

https://docs.databricks.com/en/compute/configure.html

rtreves
Contributor

@Alberto_Umana I have not enabled logging, no. However, I can see the "Event log" and "Driver log" tabs on the cluster page.

rtreves
Contributor

Hi @Alberto_Umana , do you have any further recommendations or updates?

Alberto_Umana
Databricks Employee

Hi @rtreves,

Could you please enable logging on the cluster and restart it, so that when the error happens you can check the logs for more details on why the library installation fails?

What type of cluster are you using: shared access mode or single user access mode?
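
[Editor's sketch: for completeness, this is roughly what enabling log delivery looks like through the Clusters API; the same cluster_log_conf block can also be set in the UI under Advanced options. Host, token, and cluster id are placeholders.]

# Sketch: persist driver/executor logs by setting cluster_log_conf via the
# Clusters API. Note that clusters/edit expects the full cluster spec, not
# a partial patch; only the relevant fields are shown here.
import requests

HOST = "https://<workspace-url>"    # placeholder
TOKEN = "<personal-access-token>"   # placeholder

resp = requests.post(
    f"{HOST}/api/2.0/clusters/edit",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "cluster_id": "<cluster-id>",          # placeholder
        "spark_version": "15.4.x-scala2.12",   # from the policy above
        "node_type_id": "i3.xlarge",           # from the policy above
        "cluster_log_conf": {
            "dbfs": {"destination": "dbfs:/cluster-logs"}  # example destination
        },
    },
)
resp.raise_for_status()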

rtreves
Contributor

@Alberto_Umana I enabled logging and restarted the cluster (which is shared access mode, though the issue recurs with single user access mode clusters). I've uploaded the log files to this Google Drive folder: https://drive.google.com/drive/folders/1EdEF6yYjqgdJuUgcAX22xYN7P2bLsTwf?usp=sharing

Alberto_Umana
Databricks Employee

Thanks, I will review them and get back to you if I can find the root cause of the issue.

Alberto_Umana
Databricks Employee

@rtreves - is there a chance you can raise a case with us? Do you have an active support plan?

Based on the error, the problem is definitely with permissions on the Git folder. Can you validate the "Share" permissions on your Git folder under /Workspace/Repos? From the driver logs you shared:

25/01/02 14:28:15 WARN LibraryState: [Thread 186] Failed to install library file:/Workspace/Repos/BC/tributary-lab-utils/requirements.txt
org.apache.spark.SparkException: requirements.txt installation failed with output: ERROR: Could not open requirements file: [Errno 13] Permission denied: '/Workspace/Repos/BC/tributary-lab-utils/requirements.txt'

rtreves
Contributor

@Alberto_Umana I don't believe we have an active support plan.

Attached below is a screenshot of the share permissions for the /Workspace/Repos folder. It is the same for the /Workspace/Repos/BC subfolder. I am not an admin.

[Screenshot: Share permissions on /Workspace/Repos]

Alberto_Umana
Databricks Employee
(Accepted Solution)

Got it. It does look like you don't have access. Can you add yourself (or ask your admin to add you) and test it?

You need to have the necessary permissions to install libraries on the cluster. This typically means you should have the "Can Manage" permission on the cluster.
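
[Editor's sketch: for anyone landing here later, granting that folder access can also be done programmatically, assuming the Git folder is a repo whose id is known (e.g. from GET /api/2.0/repos). PATCH adds to the existing ACL rather than replacing it; host, token, repo id, and user name are placeholders.]

# Sketch: grant a user CAN_MANAGE on a repo (Git folder) via the
# Permissions API.
import requests

HOST = "https://<workspace-url>"    # placeholder
TOKEN = "<personal-access-token>"   # placeholder
REPO_ID = "<repo-id>"               # placeholder

resp = requests.patch(
    f"{HOST}/api/2.0/permissions/repos/{REPO_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "access_control_list": [
            {"user_name": "<user@example.com>", "permission_level": "CAN_MANAGE"}
        ]
    },
)
resp.raise_for_status()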

rtreves
Contributor

@Alberto_Umana Ah, that makes sense. I gave myself Manage permissions on the folder and the issue was resolved. I think this is a workable solution for my use case. Thank you for your help!

Alberto_Umana
Databricks Employee

Glad it got solved! Let me know if you have any other questions.

rtreves
Contributor

@Alberto_Umana Apologies, but I realized the issue was only resolved when I tested on a single user access mode cluster; I'm still seeing the error on the shared access cluster (the same one that produced the error logs above). I have Manage permissions on the GitHub folder, as well as on the cluster itself. I have "Can use" permission on the cluster policy (I don't see an option to change that permission to any other value).
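
[Editor's sketch: since shared access mode is still failing despite the folder fix, one way to gather evidence before raising a case is to dump the ACLs the platform actually reports for both objects. Endpoints per the public Permissions API; host, token, and ids are placeholders.]

# Sketch: list the access control entries on the repo and on the cluster
# to confirm what each principal holds.
import requests

HOST = "https://<workspace-url>"    # placeholder
TOKEN = "<personal-access-token>"   # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

for obj_type, obj_id in [("repos", "<repo-id>"), ("clusters", "<cluster-id>")]:
    resp = requests.get(
        f"{HOST}/api/2.0/permissions/{obj_type}/{obj_id}", headers=HEADERS
    )
    resp.raise_for_status()
    print(obj_type, resp.json().get("access_control_list"))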
