2 weeks ago
Hi Databricks Community,
I'm looking to resolve the following error:
Library installation attempted on the driver node of cluster {My cluster ID} and failed. Please refer to the following error message to fix the library or contact Databricks support. Error code: DRIVER_LIBRARY_INSTALLATION_FAILURE. Error message: org.apache.spark.SparkException: requirements.txt installation failed with output: ERROR: Could not open requirements file: [Errno 13] Permission denied: {Path to my requirements.txt file}
The error appears in the cluster "Libraries" tab when I start up a compute cluster managed by a cluster policy linked to my requirements.txt file. Note that my requirements file is in a GitHub-linked "Repos" folder (/Workspace/Repos/...). I'm able to read and write the requirements.txt file when I navigate to it independently. I've also determined that the error is not cluster- or cluster-policy-specific: it recurs when I use a different cluster policy (for either shared or single-user clusters) or start a different cluster attached to a policy linked to the file.
I've copied below the full cluster policy spec:
{
  "spark_version": {
    "type": "fixed",
    "value": "15.4.x-scala2.12"
  },
  "spark_conf.spark.databricks.cluster.profile": {
    "type": "forbidden",
    "hidden": true
  },
  "node_type_id": {
    "type": "unlimited",
    "defaultValue": "i3.xlarge"
  },
  "num_workers": {
    "type": "forbidden",
    "hidden": true
  },
  "data_security_mode": {
    "type": "fixed",
    "value": "USER_ISOLATION",
    "hidden": true
  },
  "cluster_type": {
    "type": "fixed",
    "value": "all-purpose"
  },
  "driver_instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "instance_pool_id": {
    "type": "forbidden",
    "hidden": true
  },
  "autotermination_minutes": {
    "type": "fixed",
    "value": 60
  },
  "autoscale.min_workers": {
    "type": "unlimited",
    "defaultValue": 1
  },
  "autoscale.max_workers": {
    "type": "unlimited",
    "defaultValue": 5
  },
  "enable_elastic_disk": {
    "type": "fixed",
    "value": true,
    "hidden": true
  },
  "aws_attributes.availability": {
    "type": "fixed",
    "value": "SPOT_WITH_FALLBACK",
    "hidden": true
  },
  "aws_attributes.spot_bid_price_percent": {
    "type": "fixed",
    "value": 100,
    "hidden": true
  },
  "aws_attributes.first_on_demand": {
    "type": "range",
    "minValue": 1,
    "defaultValue": 1
  },
  "aws_attributes.instance_profile_arn": {
    "type": "fixed",
    "value": "arn:aws:iam::775333757806:instance-profile/databricks-workspace-stack-access-data-buckets"
  },
  "aws_attributes.zone_id": {
    "type": "unlimited",
    "defaultValue": "auto",
    "hidden": true
  }
}
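For reference, the requirements file is attached to the policy through the policy's libraries section (managed separately from the attribute spec above); as far as I understand, that part looks roughly like the sketch below, with my real workspace path in place of the placeholder:

"libraries": [
  {
    "requirements": "/Workspace/Repos/<folder>/requirements.txt"
  }
]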
Thank you in advance.
Tagging @ablee for visibility.
2 weeks ago
Hi @rtreves,
The error you are encountering, "DRIVER_LIBRARY_INSTALLATION_FAILURE" with the message "ERROR: Could not open requirements file: [Errno 13] Permission denied," indicates that the driver node does not have the necessary permissions to access the requirements.txt file located in your GitHub-linked "Repos" folder.
Can you try installing the libraries via a notebook?
%pip install -r /Workspace/Repos/path/to/requirements.txt
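If the notebook install works, another useful test is to install the same requirements.txt as a cluster-scoped library through the Libraries API, which (as far as I know) is the same mechanism the policy uses. A rough sketch, with the workspace URL, token, cluster ID, and path as placeholders:

curl -X POST https://<workspace-url>/api/2.0/libraries/install \
  -H "Authorization: Bearer <personal-access-token>" \
  -d '{
        "cluster_id": "<cluster-id>",
        "libraries": [
          { "requirements": "/Workspace/Repos/path/to/requirements.txt" }
        ]
      }'

If that call reproduces the Errno 13 error while %pip does not, it narrows the problem down to how cluster-scoped installs read the file.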
2 weeks ago
@Alberto_Umana Thank you for the speedy response. I am indeed able to install the libraries using `%pip` in a notebook attached to the cluster in question.
2 weeks ago
Understood! Have you enabled logging on the cluster? That would give us more details on the failure.
2 weeks ago
@Alberto_Umana I have not enabled logging, no. However, I can see the "Event log" and "Driver log" tabs on the cluster page.
Thursday
Hi @Alberto_Umana , do you have any further recommendations or updates?
Thursday
Hi @rtreves,
Could you please enable logging on the cluster and restart it, so that when the error happens we can check the logs for more detail on why the library installation fails?
What type of cluster are you using: shared access mode or single user mode?
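For reference, log delivery is configured on the cluster itself (Compute > your cluster > Edit > Advanced options > Logging); in the cluster spec it ends up looking something like this, with the destination path just an example:

"cluster_log_conf": {
  "dbfs": {
    "destination": "dbfs:/cluster-logs"
  }
}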
Thursday
@Alberto_Umana I enabled logging and restarted the cluster (which is shared access, though the issue recurs with single access mode clusters). I've uploaded the log files at this google drive folder: https://drive.google.com/drive/folders/1EdEF6yYjqgdJuUgcAX22xYN7P2bLsTwf?usp=sharing
Thursday
Thanks, I will review them and get back if I can find the root of the issue.
Thursday
@rtreves - is there a chance you can raise a case with us? Do you have an active support plan?
Based on the error, the problem is definitely with permissions on the Git folder. Can you validate the "Share" permissions on your Git folder under /Workspace/Repos?
25/01/02 14:28:15 WARN LibraryState: [Thread 186] Failed to install library file:/Workspace/Repos/BC/tributary-lab-utils/requirements.txt org.apache.spark.SparkException: requirements.txt installation failed with output: ERROR: Could not open requirements file: [Errno 13] Permission denied: '/Workspace/Repos/BC/tributary-lab-utils/requirements.txt'
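If it's easier than checking in the UI, you can also read the folder's permissions with the Permissions API; roughly like this, where the repo ID is a placeholder (you can look it up with the Repos API):

curl -X GET https://<workspace-url>/api/2.0/permissions/repos/<repo-id> \
  -H "Authorization: Bearer <personal-access-token>"

The response lists each principal and its permission level (CAN_READ, CAN_RUN, CAN_EDIT, or CAN_MANAGE).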
Thursday
@Alberto_Umana I don't believe we have an active support plan.
Attached below is a screenshot of the share permissions for the /Workspace/Repos folder. It is the same for the /Workspace/Repos/BC subfolder. I am not an admin.
Thursday
Got it. It does look like you don't have access. Can you add yourself (or ask your admin to add you) and test it?
You need to have the necessary permissions to install libraries on the cluster. This typically means you should have the "Can Manage" permission on the cluster.
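If the Share dialog doesn't let you change it yourself, a workspace admin can grant it through the same Permissions API, along these lines (user name and repo ID are placeholders):

curl -X PATCH https://<workspace-url>/api/2.0/permissions/repos/<repo-id> \
  -H "Authorization: Bearer <personal-access-token>" \
  -d '{
        "access_control_list": [
          { "user_name": "you@example.com", "permission_level": "CAN_MANAGE" }
        ]
      }'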
Thursday
@Alberto_Umana Ah, that makes sense. I gave myself Manage permissions on the folder and the issue was resolved. I think this is a workable solution for my use case. Thank you for your help!
Thursday
Glad it got solved! Let me know if you have any other questions.
Thursday
@Alberto_Umana Apologies, but I realized the issue was resolved when I tested on a single user cluster; I'm still seeing the error on a shared access cluster (the same one that produced the error logs above). I have Manage permissions on the GitHub folder, as well as on the cluster itself. I have "Can use" permission on the cluster policy (I don't see an option to change that permission to any other value).