
DLT Pipeline unable to find custom Libraries/Wheel packages

Fz1
New Contributor III

We have a DLT pipeline that needs to import our custom libraries, which are packaged as wheel files.

We are on Azure Databricks (DBX) and we use Azure DevOps CI/CD to build the wheel packages and deploy them to our DBX environment.

 

At the top of our DLT notebook we install the wheel package as below:

%pip install /dbfs/Libraries/whls/{wheel_file_name}.whl

On execution of the pipeline we get the error below:

CalledProcessError: Command 'pip --disable-pip-version-check install /dbfs/Libraries/whls/{wheel_file_name}.whl' returned non-zero exit status 1.,None,Map(),Map(),List(),List(),Map())

And from the logs you can see that the file is not accessible:

Python interpreter will be restarted.
WARNING: Requirement '/dbfs/Libraries/whls/{wheel_file_name}.whl' looks like a filename, but the file does not exist
Processing /dbfs/Libraries/whls/{wheel_file_name}.whl
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/dbfs/Libraries/whls/{wheel_file_name}.whl'

This is despite the fact that the file does exist, as confirmed from the DBFS Explorer UI.

We tried listing the folders and files that are accessible from the DLT pipeline node and got the following:

Files in the ROOT Directory: ['mnt', 'tmp', 'local_disk0', 'dbfs', 'Volumes', 'Workspace', . . . . .]

Files in the ROOT/dbfs Directory: []

As you can see, /dbfs appears to be empty from the pipeline, with none of the folders and files we can see and access from the DBFS Explorer UI.
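
For reference, this is roughly how we produced the listing above, from a plain Python cell in the DLT notebook (a minimal sketch; the actual cell differs slightly):

import os

# What the pipeline's driver node sees at the filesystem root
print("Files in the ROOT Directory:", os.listdir("/"))

# What is visible under the DBFS FUSE mount from the pipeline
print("Files in the ROOT/dbfs Directory:", os.listdir("/dbfs"))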

 

Volumes and Workspace files are accessible from the pipeline, but:

- Uploading to Volumes fails with a generic "Error uploading" message and no further details, even when uploading manually from the UI (what we are attempting from the release pipeline is sketched after this list).

- Workspace files (under Workspace/shared...) are accessible from the pipeline, but our CI/CD pipeline cannot automatically push wheel files there, so we would have to upload them manually.
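
For context, the Volume upload we are attempting from the Azure DevOps release step looks roughly like the sketch below. It assumes the databricks-sdk Files API, with a hypothetical wheel name and a hypothetical catalog/schema/volume path; our real paths and authentication setup differ:

from databricks.sdk import WorkspaceClient

# Authentication comes from DATABRICKS_HOST / DATABRICKS_TOKEN set by the release step
w = WorkspaceClient()

# Hypothetical wheel built earlier in the pipeline
local_wheel = "dist/our_package-1.0.0-py3-none-any.whl"

# Upload the wheel into a Unity Catalog Volume (hypothetical path)
with open(local_wheel, "rb") as f:
    w.files.upload(
        "/Volumes/main/default/libs/our_package-1.0.0-py3-none-any.whl",
        f,
        overwrite=True,
    )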

 

Any idea how we can overcome this, so that we can upload the wheel files via Azure DevOps to the DBX environment and import them in our DLT pipelines?
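
What we are aiming for in the DLT notebook, once the wheel lands somewhere the pipeline can actually read, is essentially this (a sketch with a hypothetical Volume path and package name):

%pip install /Volumes/main/default/libs/our_package-1.0.0-py3-none-any.whl

import our_package  # hypothetical module provided by the wheel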

 

3 REPLIES

ColibriMike
New Contributor II

Exactly the same issue here - please say if you find a solution.

FWIW "Unrestricted Single User" clusters work fine - shared compute of any description appears to run into this issue.

ColibriMike
New Contributor II
"context_based_upload_for_execute": true

in projects.json allowed the code to run - but ended with

RuntimeError: Cannot start a remote Spark session because there is a regular Spark session already running.