08-20-2023 11:50 PM
I'm deploying a new workspace to test the deployed notebooks. But when I try to import Python files as modules in the newly deployed workspace, I get an error saying "function not found".
Two points to note here:
1. If I append the absolute path of the Python file to sys.path, it runs fine. But when I append a relative path, it throws a "not found" error (the relative path does not resolve against the workspace's path correctly).
2. If the Python file and the notebook are in the same directory, it works fine.
I know I can use Files in Repos to fix this, but is it possible to do this using the workspace only?
NOTE:
In the deployed workspace, the cwd of the file I want to import is in this format: /home/spark-9851f...-....-....-....-.. (not aligned with the folder structure).
But when working with Repos, the cwd is correct and aligned with the folder structure.
09-06-2023 04:04 PM
Hi @Avin_Kohale ,
An error was encountered when importing Python files as modules in a newly deployed workspace
- Error message: "Function not found."
- Two points to note:
- Steps to resolve the issue using the workspace: use the __file__ attribute to get the absolute path of the notebook, then construct the relative path to the Python file from it.
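A minimal sketch of that suggestion, assuming the notebook's own directory is known as an absolute path. The helper name add_relative_dir_to_sys_path and the "../src" layout are hypothetical. __file__ is normally defined in a plain .py script but may not be defined in every notebook context, so the base directory is passed in explicitly instead of being read from the cwd (which the original post reports is not aligned with the folder structure):

```python
import os
import sys


def add_relative_dir_to_sys_path(base_dir, relative_dir):
    """Resolve relative_dir against base_dir and append the result to sys.path.

    base_dir must be an absolute path (for example the notebook's directory),
    so the result does not depend on the current working directory.
    """
    if not os.path.isabs(base_dir):
        raise ValueError("base_dir must be an absolute path")
    target = os.path.abspath(os.path.join(base_dir, relative_dir))
    if target not in sys.path:
        sys.path.append(target)
    return target


# Hypothetical usage where __file__ is defined:
#   here = os.path.dirname(os.path.abspath(__file__))
#   add_relative_dir_to_sys_path(here, "../src")
```

If __file__ is not available, the absolute notebook directory has to come from somewhere else, for example a job parameter.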
09-28-2023 09:06 AM
Hi @Kaniz_Fatma, I see your suggestion to append the necessary path to the sys.path. I'm curious if this is the recommendation for projects deployed via Databricks Asset Bundles. I want to maintain a project structure that looks something like this:
project/
├── app/
│   ├── nb1.py
│   └── nb2.py
└── src/
    ├── foo.py
    └── bar.py
I want to do the following import in nb1:
from src.foo import foo_func
If this were a Databricks Repo, that would work fine since I think Databricks repos add the root to sys.path. However, I'm deploying via Databricks Asset Bundles, which deploy to a workspace directory, not a repo. I'm curious if there are any better recommendations for Databricks Asset Bundles deployments, e.g. could it be deployed directly to a repo?
11-15-2023 08:59 PM
@TimW did you ever solve this? I haven't found a successful way to achieve what you've depicted (which we can easily do when using Repos).
11-16-2023 05:25 AM
Hi @JeremyFord, after some research I think that in my example appending the 'project' directory to sys.path is in fact the recommended way to do this. In studying for the Data Engineering Professional Exam, I came across this resource, which gives some pretty clear examples on how Databricks recommends importing .py modules from outside of your current working directory: https://github.com/databricks-academy/cli-demo/blob/published/notebooks/00_refactoring_to_relative_i....
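For the layout described earlier in the thread (notebooks in project/app, modules in project/src), appending the project directory could look roughly like this. It is only a sketch: the helper name add_project_root and the levels_up parameter are made up for illustration, and it assumes you already have the notebook's directory as an absolute path.

```python
import sys
from pathlib import Path


def add_project_root(notebook_dir, levels_up=1):
    """Append the directory `levels_up` levels above notebook_dir to sys.path.

    With notebook_dir = ".../project/app" and levels_up=1 this adds
    ".../project", which makes `from src.foo import foo_func` importable.
    """
    if levels_up < 1:
        raise ValueError("levels_up must be at least 1")
    root = str(Path(notebook_dir).resolve().parents[levels_up - 1])
    if root not in sys.path:
        sys.path.append(root)
    return root
```

After calling add_project_root with the absolute path of project/app, the import from src.foo in nb1 should resolve, as long as no other entry on sys.path provides a package named src first.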
11-17-2023 02:58 AM
@JeremyFord, I found this recommendation, which is probably the best way to handle this. They suggest passing the path of your bundle root as a parameter to your notebook and then appending it to sys.path. I haven't tried it myself, but it looks like a good approach.
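An untested sketch of what that could look like on the notebook side. The function name register_bundle_root and the parameter name "bundle_root" are assumptions, not something taken from the linked recommendation. In a Databricks notebook the value would typically be read with dbutils.widgets.get, and the job definition would need to pass the bundle's deployed path under that parameter name.

```python
import os
import sys


def register_bundle_root(bundle_root):
    """Validate the bundle root passed in as a parameter and append it to sys.path."""
    if not bundle_root:
        raise ValueError("bundle_root parameter is empty")
    root = os.path.normpath(bundle_root)
    if not os.path.isabs(root):
        raise ValueError("bundle_root must be an absolute path")
    if root not in sys.path:
        sys.path.append(root)
    return root


# Hypothetical usage inside a notebook ("bundle_root" is an assumed parameter name):
#   register_bundle_root(dbutils.widgets.get("bundle_root"))
#   from src.foo import foo_func
```

Failing early on an empty or relative value makes a misconfigured job parameter show up as a clear error instead of a later "not found" on import.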