08-24-2022 02:09 AM
I know that import can find installed cluster libraries, workspace libraries etc. as well as how to create wheel libraries externally and upload them.
However, in a project I came across a feature where import was able to find a package folder in, if I remember correctly, a git repository. Essentially, I could import the Python library from the folder without installing it as, for example, a wheel library.
I have tried to find information about this feature in the documentation. However, it is a bit hard to find, partly because "import" has a lot of different meanings in the Databricks environment.
So, apart from installed libraries, where does the Python import statement in a notebook look for libraries?
09-04-2022 08:53 AM
I think this might help: https://docs.databricks.com/_static/notebooks/files-in-repos.html
It's a quick explanation of the Python path (sys.path) and how to alter it in Databricks.
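To make that concrete, here is a minimal sketch of inspecting the search path from a notebook cell and, only if needed, appending a folder to it. The helper name ensure_on_path and the commented-out repo path are hypothetical placeholders of mine, not something taken from the linked notebook.

```python
import sys
from pathlib import Path


def ensure_on_path(folder):
    """Append folder to sys.path if it is not already there.

    Returns True if the folder was added, False if it was already present.
    """
    resolved = str(Path(folder).resolve())
    if resolved in sys.path:
        return False
    sys.path.append(resolved)
    return True


# Print every location the import statement will search, in order.
for entry in sys.path:
    print(entry)

# Hypothetical usage; the path below is a placeholder, not a real location:
# ensure_on_path("/Workspace/Repos/<user>/<repo>/src")
```

If the repository root already shows up in the printed list, no change to the path should be needed for packages that sit directly under it.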
08-25-2022 01:23 AM
Hi @Jonas Mellin
Can you try running the command below from a notebook? It will list the libraries installed in the DBR.
For example:
%sh ls -l /databricks/python3/lib/python3.8/site-packages
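If you prefer not to hard-code the Python version in the path, a rough Python equivalent is sketched below. It uses the standard library's importlib.metadata (available from Python 3.8); the function name installed_packages is just my own illustrative choice.

```python
from importlib import metadata


def installed_packages():
    """Return sorted (name, version) pairs for distributions visible to this interpreter."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            pairs.add((name, dist.metadata.get("Version")))
    return sorted(pairs, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(f"{name}=={version}")
```

Note that this only reports installed distributions; a plain package folder that is importable because it sits on sys.path will not appear in this list.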
Also refer to the docs on notebook-scoped Python libraries: https://docs.databricks.com/libraries/notebooks-python-libraries.html
If you still need any clarification, please let me know and I will try to explain further 🙂
09-02-2022 12:53 AM
That is not really what I want. What I want is to have a folder in a git repository connected to Databricks, and for notebooks to be able to search this folder or the repository for packages developed in-house. I know how to deploy them as wheel libraries, both on clusters and in workspaces, as well as how to connect them to Jobs.
It is much more convenient to be able to work in a proper IDE, then check in the changes, check them out in Databricks and test them. By accident I came across this in a project where the package was a folder in the root of the repository. When I ran tests, import found the folder in the repository. Very convenient. The tests resided in the same repository.
However, I have not fully understood, or had time to figure out, how import could find this. There are numerous questions concerning this:
I do not want to perform any ugly hacks changing sys.path inside the notebooks or the packages.
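One way to investigate this without touching sys.path is to ask the import system where it would resolve a package from. The sketch below uses importlib.util.find_spec from the standard library; the helper name where_is is hypothetical, and json only stands in for the in-house package name.

```python
import importlib.util


def where_is(name):
    """Report where a top-level module or package would be imported from.

    Returns None if the import system cannot find it on the current sys.path.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return {
        "origin": spec.origin,  # file backing the module, e.g. .../json/__init__.py
        "search_locations": list(spec.submodule_search_locations or []),
    }


# "json" is only a stand-in; replace it with the in-house package name.
print(where_is("json"))
```

If the reported origin lies inside the repository folder, then the repository root (or the notebook's directory) is presumably already one of the sys.path entries, which would explain why the import works without any installation.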
09-02-2022 12:54 AM
I have been a bit busy, but I followed up the question with a reply. Thanks for the reminder.
09-20-2022 02:44 AM
Hi @Jonas Mellin
Hope all is well! Just wanted to check whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!