Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Import Python Modules with Git Folder Error

michal1228
New Contributor
Dear Databricks Community,
 
We encountered a bug in the behaviour of the import method described in the documentation: https://learn.microsoft.com/en-us/azure/databricks/files/workspace-modules#autoreload-for-python-mod....
 
A couple of months ago we migrated our pipelines from importing dependencies with the %run command in notebooks to importing Python (.py) modules by adding the Workspace root of the repo/directory to sys.path. This solution worked for a couple of months, until recently, when imports of modules in Git Folders started to fail.
 
We observed new behaviour for All-Purpose clusters in Git Folders:
 - The Workspace root path of the Git Folder is now added to sys.path by default.
 
This configuration, however, still works in the Workspace directory where we deploy our code (separate compute), and it also works for Git Folders on Serverless. We use All-Purpose clusters in Dedicated mode for both scheduled jobs and development.
 
We tested this on several source and compute combinations; the import fails only for a Git Folder on an All-Purpose cluster:
 
Source              | Compute             | Test
Git Folder          | Serverless          | Ok
Workspace Directory | Serverless          | Ok
Git Folder          | All-Purpose cluster | Fails
Workspace Directory | All-Purpose cluster | Ok
 
 
 
ModuleNotFoundError: No module named 'Libraries'
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
File <command-4896020584242793>, line 1
----> 1 from Libraries.configuration_module import get_global_configuration
      3 global_config = get_global_configuration()
      4 environment = global_config["environment_code"]

File /databricks/python_shell/dbruntime/autoreload/discoverability/hook.py:72, in AutoreloadDiscoverabilityHook.pre_run_cell.<locals>.patched_import(name, *args, **kwargs)
     66 if not self._should_hint and (
     67     (module := sys.modules.get(absolute_name)) is not None and
     68     (fname := get_allowed_file_name_or_none(module)) is not None and
     69     (mtime := os.stat(fname).st_mtime) > self.last_mtime_by_modname.get(
     70         absolute_name, float("inf")) and not self._should_hint):
     71     self._should_hint = True
---> 72 module = self._original_builtins_import(name, *args, **kwargs)
     73 if (fname := fname or get_allowed_file_name_or_none(module)) is not None:
     74     mtime = mtime or os.stat(fname).st_mtime

ModuleNotFoundError: No module named 'Libraries'
 
 
Repo structure:
 
--Libraries/
--NotebooksDirectory/tests/
 
 
Import statement used in a notebook located in NotebooksDirectory/tests/:
 
from Libraries.configuration_module import get_global_configuration
 
 
1. What's the recommended way to resolve this problem?
2. Have there been any changes to the Git Folder structure mapping implementation during the last two months?
3. Is there a supported method for importing Python workspace-file modules from notebooks located in a nested repo structure like ours?
 
 
Thanks for your help!
 
3 REPLIES

michal1228
New Contributor

We're using DBR version 16.4

saurabh18cs
Honored Contributor II

Hi, can you show what you added to sys.path as the repo root? Also, can you try once with an older DBR version?
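One way to gather the information requested above is a small diagnostic cell. The following is only a sketch: the helper names `describe_sys_path` and `can_resolve` are made up for illustration, and the package name `Libraries` is taken from the repo structure in the question. It lists every sys.path entry, marks those that contain a `Libraries` folder, and asks the import system whether it can currently locate the package.

```python
import importlib.util
import os
import sys


def describe_sys_path(package_name="Libraries"):
    """Return (entry, has_package) pairs for every sys.path entry."""
    report = []
    for entry in sys.path:
        # An empty string on sys.path stands for the current working directory.
        base = entry or os.getcwd()
        has_package = os.path.isdir(os.path.join(base, package_name))
        report.append((entry, has_package))
    return report


def can_resolve(package_name="Libraries"):
    """Return True if the import system can currently locate the package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    for entry, has_package in describe_sys_path():
        marker = "FOUND" if has_package else "     "
        print(f"{marker} {entry!r}")
    print("resolvable:", can_resolve())
```

Running the same cell from the Git Folder and from the Workspace directory on the same cluster should make any difference between the two sys.path lists visible.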

 
My notebooks are in:
/Workspace/Users/{{user_email}}/repo/NotebooksDirectory/tests
The record added to sys.path was:
/Workspace/Users/{{user_email}}/repo
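For reference, the sys.path approach described in this thread can be written without hard-coding the user path. This is a minimal sketch, not a confirmed fix for the All-Purpose cluster failure: `find_repo_root` and `ensure_on_sys_path` are hypothetical helpers, and the sketch assumes the notebook's working directory is the notebook's own folder (here NotebooksDirectory/tests), so that walking upward reaches the folder containing `Libraries`.

```python
import sys
from pathlib import Path


def find_repo_root(start_dir, marker="Libraries"):
    """Walk upward from start_dir to the first directory containing `marker`."""
    start = Path(start_dir).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    raise FileNotFoundError(
        f"No parent of {start} contains a '{marker}' directory"
    )


def ensure_on_sys_path(directory):
    """Prepend directory to sys.path unless it is already listed."""
    directory = str(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory
```

In a notebook this would be called as `ensure_on_sys_path(find_repo_root(Path.cwd()))` before the `from Libraries.configuration_module import get_global_configuration` line. Checking for an existing entry first avoids adding a duplicate when the platform has already put the repo root on sys.path.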
