Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Problem installing a Python wheel on an existing cluster

jeremy98
Contributor

Hi community,
I was running a workflow made up of several tasks, configured to run on an existing cluster, but the run failed with a library installation error:

 

run failed with error message Library installation failed for library due to user error for whl: "/Workspace/Users/<user-mail>/.bundle/data_pipelines/stg/files/dist/data_pipelines-0.0.1-py3-none-any.whl" Error messages: Library installation attempted on the driver node of cluster 1213-225649-fslmwxpk and failed. User tried to install a wheel, but pip could not build the wheel successfully. Please check your wheel package contents and dependencies.. Error code: WHEEL_BUILD_ERROR. Error message: org.apache.spark.SparkException: Process List(/bin/su, libraries, -c, bash /local_disk0/.ephemeral_nfs/cluster_libraries/python/python_start_clusterwide.sh /local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/pip install --upgrade /local_disk0/tmp/addedFilec83cde1956f94fb68ad0f2e34cd4b42013465877465959137763/rnc_data_pipelines-0.0.1-py3-none-any.whl --disable-pip-version-check) exited with code 1. ... *WARNING: message truncated. Skipped 3389 bytes of output**

 

How can I solve this error? And how can I see the full, untruncated message?


Walter_C
Databricks Employee

 

Ensure that the wheel package you are trying to install is correctly built and that all its dependencies are properly specified. You can do this by inspecting the setup.py or pyproject.toml file in your package.
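As a rough sketch of that inspection, the dependencies can also be read straight from the built wheel: a wheel is a zip archive, and its *.dist-info/METADATA file lists each declared dependency on a Requires-Dist line. The function name wheel_requirements is hypothetical, chosen only for this illustration.

```python
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_path):
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Locate the <name>-<version>.dist-info/METADATA entry inside the archive.
        metadata_name = next(
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        )
        metadata_text = wheel.read(metadata_name).decode("utf-8")
    # METADATA uses RFC 822-style headers, so the stdlib email parser can read it.
    headers = Parser().parsestr(metadata_text)
    return headers.get_all("Requires-Dist") or []
```

Calling wheel_requirements on the wheel produced by your build (for example the file under dist/) returns the requirement strings that pip will try to satisfy on the cluster, which is usually the quickest way to spot an unexpected pin.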

 

Conflicts between different versions of dependencies can cause installation failures. Make sure that the dependencies specified in your wheel file do not conflict with other libraries installed on the cluster. For example, if your package requires a specific version of numpy, ensure that this version is compatible with other installed packages.
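A minimal sketch of such a check, using only the Python standard library: it compares exact pins (name==version) against what is installed in the current environment. The helper names are hypothetical, and the sketch deliberately ignores version ranges, extras, and wildcard pins, so treat its result as a hint rather than a full dependency resolution.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name (lowercase, runs of -, _ and . become -)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_versions():
    """Map normalized distribution names to the versions installed here."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            versions[normalize(name)] = dist.version
    return versions


# Matches exact pins such as "numpy==1.26.4", optionally followed by a marker.
PIN_PATTERN = re.compile(
    r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([0-9][^\s;,]*)\s*(?:;.*)?$"
)


def pin_conflicts(requirements, installed=None):
    """Return (name, pinned, installed) for each exact pin that differs.

    Assumption: versions are compared as plain strings, so "1.0" and "1.0.0"
    are reported as different even though pip treats them as equal.
    """
    if installed is None:
        installed = installed_versions()
    conflicts = []
    for requirement in requirements:
        match = PIN_PATTERN.match(requirement)
        if not match or "*" in match.group(2):
            continue  # not an exact pin; skipped in this sketch
        name, pinned = normalize(match.group(1)), match.group(2)
        current = installed.get(name)
        if current is not None and current != pinned:
            conflicts.append((name, pinned, current))
    return conflicts
```

Run from a notebook attached to the target cluster, pin_conflicts should see the packages available to that notebook's Python environment, so passing it the requirement strings from your wheel shows which pins disagree with the installed versions.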

 

Hello,
I solved the problem by removing pandas and numpy from the dependencies. Keeping them is difficult because their versions need to match the ones already installed on the cluster.

But if I want to run in serverless mode (see https://github.com/databricks/cli/issues/1621), how do I need to specify the environment, my wheel, and so on? I don't understand.
