We are using the "databricks_notebook" Terraform resource to deploy our notebooks into the "Workspace" as part of our CI/CD run, and our jobs run the notebooks from the Workspace. For development we clone the repo into "Repos". At the moment the only modularization of our code is done with %run statements, and we have a large "utils" folder in our repo, but I am investigating how to move to an import-based workflow.

In "Repos" this works fine: the root of the repo is automatically added to sys.path, so we can do things like "import notebooks.utils.stuff" from anywhere in the tree. But when the notebooks are deployed to the Workspace, only the current path is added to sys.path, not the root "/Workspace", so the import fails. I guess we could modify sys.path and add "/Workspace", but
1: That is ugly and error-prone, and
2: It makes it likely that we will end up importing the wrong version (from /Workspace) when we are developing on a branch in Repos.
Any tips? Have I maybe overlooked something smart?
Just in case it was not clear, here is our folder structure:
- Readme.md
- notebooks/utils/utilnotebook1.py (and many more)
- notebooks/gold/silver/problem1/notebook1.py (and silver + bronze)
- terraform/
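
To make the sys.path workaround concrete, here is a minimal sketch of a variant that avoids hardcoding "/Workspace". It walks up from a starting directory until it finds a folder that contains "notebooks", and prepends that folder to sys.path. The helper name add_project_root and the use of "notebooks" as the marker are my own assumptions for illustration, and it assumes the working directory of the notebook is somewhere inside the deployed tree, which I have not confirmed for every runtime.

```python
import sys
from pathlib import Path


def add_project_root(start=None, marker="notebooks"):
    """Prepend the nearest ancestor of `start` containing `marker` to sys.path.

    `add_project_root` and `marker` are hypothetical names for this sketch,
    not part of any Databricks API.
    """
    # Default to the current working directory when no start path is given.
    start = Path(start) if start is not None else Path.cwd()
    start = start.resolve()

    # Check the start directory itself, then each parent up to the filesystem root.
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            root = str(candidate)
            # Avoid adding the same entry twice when several notebooks call this.
            if root not in sys.path:
                sys.path.insert(0, root)
            return root

    raise FileNotFoundError(
        f"no ancestor of {start} contains a '{marker}' directory"
    )
```

The idea would be to call add_project_root() at the top of a notebook before "import notebooks.utils.stuff". Since the root is found relative to where the notebook lives, a branch checked out in Repos should resolve to its own root rather than to the deployed copy, but it is still sys.path manipulation, so concern 1 above remains.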