05-08-2023 05:21 AM
Hello,
I am reaching out to the community to shed some light on a bug I have been encountering recently. The two setups are as follows:
SETUP-1 (WORKS):
- Python notebook in the workspace filesystem (i.e., Shared/folder/...)
- Custom Python wheel library (.whl) installed on the cluster
SETUP-2 (DOES NOT WORK):
- Python Notebook in Repos
- Same wheel and cluster as SETUP-1
Moreover, SETUP-2 is able to import some of the functions but not all of them. At first I thought it was an issue with the wheel generation, but SETUP-1 works just fine and is able to import everything.
This issue makes me think that there might be a difference in how Databricks manages the filesystem or other environment variables in Repos that I am not grasping.
Can someone tell me what the issue could be here, or point out any important differences when working with Databricks Repos?
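For reference, the imports in question look roughly like this (the package and function names below are hypothetical placeholders, not the actual wheel contents):

```python
# Hypothetical package from the custom .whl installed on the cluster.
from my_wheel_pkg.utils import helper_a   # imports fine in both setups
from my_wheel_pkg.extra import helper_b   # raises ImportError only in SETUP-2 (Repos)
```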
Accepted Solutions
09-15-2023 03:37 AM
The solution came with an update. As stated in Work with Python and R modules | Databricks on AWS: in Databricks Runtime 13.0 and above, directories added to the Python sys.path are automatically distributed to all executors in the cluster. In Databricks Runtime 12.2 LTS and below, libraries added to sys.path must be explicitly installed on the executors.
This seems to have solved our strange import problem from Databricks Repos.
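A minimal sketch of what now works on Databricks Runtime 13.0 and above without any extra installation step (the repo path and module name are hypothetical):

```python
import sys

# Hypothetical path to a Repos checkout containing Python modules.
repo_root = "/Workspace/Repos/some_user/some_repo"

# On DBR 13.0+, directories appended to sys.path are automatically
# distributed to all executors, so code running on workers (e.g. in
# UDFs) can import these modules too. On DBR 12.2 LTS and below, the
# modules would also have to be explicitly installed on the executors.
if repo_root not in sys.path:
    sys.path.append(repo_root)

import my_module  # hypothetical module at the root of the repo
```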
05-10-2023 01:30 AM
Hi,
How can I run non-Databricks notebook files in a repo?
For example, a .py file?
You can use any of the following:
- Bundle and deploy as a library on the cluster.
- Pip install the Git repository directly. This requires a credential in a secrets manager (see the sketch after this list).
- Use %run with inline code in a notebook.
- Use a custom container image. See Customize containers with Databricks Container Services.
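For instance, a pip install straight from a Git repository might look like the following (the repository URL, secret scope, and key are hypothetical placeholders):

```python
# Public repository: install directly from Git (hypothetical URL).
%pip install git+https://github.com/your-org/your-repo.git

# Private repository: first retrieve a token from a Databricks secret
# scope (hypothetical scope and key) rather than hardcoding credentials,
# then pass it in the install URL.
# token = dbutils.secrets.get(scope="my-scope", key="github-token")
```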
You can refer to: https://docs.databricks.com/repos/limits.html and https://docs.databricks.com/repos/limits.html#non-notebook-files-files-in-repos
Also, refer to the errors: https://docs.databricks.com/repos/errors-troubleshooting.html
Please let us know if this helps, and share any errors you encounter. Also, please tag @Debayan in your next comment so that I will get notified. Thank you!
05-19-2023 01:31 AM
Hi @Alvaro Moure
Hope all is well! Just wanted to check in to see if you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!