Hello,
I get an error on a Databricks workspace (production) when I try to import a local module into an Azure Databricks notebook. I am working with Databricks Repos.
The error message is: "ModuleNotFoundError: No module named 'test.dataset_test'"
When I run the same test on a different Databricks workspace (development), the problem does not occur.
The first difference is in PYTHONPATH: on production, the /WSFS_NOTEBOOK_DIR entry is missing.
development : /databricks/spark/python:/databricks/spark/python/lib/py4j-0.10.9.1-src.zip:/databricks/jars/spark--driver--driver-spark_3.2_2.12_deploy.jar:/databricks/spark/python:/databricks/jars/spark--maven-trees--ml--10.x--graphframes--org.graphframes--graphframes_2.12--org.graphframes__graphframes_2.12__0.8.2-db1-spark3.2.jar:/databricks/python_shell:/WSFS_NOTEBOOK_DIR
production : /databricks/spark/python:/databricks/spark/python/lib/py4j-0.10.9.1-src.zip:/databricks/jars/spark--driver--driver-spark_3.2_2.12_deploy.jar:/databricks/spark/python:/databricks/jars/spark--maven-trees--ml--10.x--graphframes--org.graphframes--graphframes_2.12--org.graphframes__graphframes_2.12__0.8.2-db1-spark3.2.jar:/databricks/python_shell
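For reference, here is a minimal sketch of how the two values can be compared entry by entry. The helper names (pythonpath_entries, missing_entries) are my own and not part of Databricks, and the two sample strings are only the tails of the values above, shortened for readability.

```python
import os


def pythonpath_entries(value):
    """Split a PYTHONPATH-style string into its non-empty entries."""
    return [entry for entry in value.split(":") if entry]


def missing_entries(reference, other):
    """Return entries found in `reference` but not in `other`, in order."""
    other_entries = set(pythonpath_entries(other))
    result = []
    for entry in pythonpath_entries(reference):
        if entry not in other_entries and entry not in result:
            result.append(entry)
    return result


# Tails of the two PYTHONPATH values quoted above (shortened).
dev_tail = "/databricks/python_shell:/WSFS_NOTEBOOK_DIR"
prod_tail = "/databricks/python_shell"

print(missing_entries(dev_tail, prod_tail))  # ['/WSFS_NOTEBOOK_DIR']

# In a notebook cell, the live value can be read from the environment.
print(pythonpath_entries(os.environ.get("PYTHONPATH", "")))
```

Running the last line in a cell on each workspace and pasting the full strings into missing_entries should show exactly which entries differ.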
The second difference is in sys.path.
On the production workspace, the notebook seems to execute under the driver directory.
On the development workspace, the notebook executes under the repo path, so the import works.
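To check this from a notebook cell, something like the sketch below can be used. It prints the working directory and whether it is on sys.path, and it includes a possible stopgap that adds the repo root by hand. The helper ensure_on_sys_path is my own name, and the repo path in the comment is a hypothetical placeholder, not my real path. This would only be a workaround and would not explain why the two workspaces behave differently.

```python
import os
import sys


def ensure_on_sys_path(path):
    """Put `path` at the front of sys.path if it is not already listed.

    Returns True if the path was added, False if it was already present.
    """
    if path in sys.path:
        return False
    # Insert at the front so a local package named `test` is found
    # before the standard library's own `test` package.
    sys.path.insert(0, path)
    return True


# Diagnostics: where is the notebook running, and is that on sys.path?
cwd = os.getcwd()
print("cwd:", cwd)
print("cwd on sys.path:", cwd in sys.path)

# Possible stopgap (hypothetical path, replace with the real repo root):
# ensure_on_sys_path("/Workspace/Repos/<user>/<repo>")
```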
I have attached the content of sys.path for both environments.
Do you have any idea why PYTHONPATH and sys.path are not automatically updated on my production workspace, even though everything is correct on my development workspace? Both environments use the same Databricks runtime.
Regards,
Nath