Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Trying to use Python source file as module in databricks Notebook

adhi_databricks
New Contributor III

Hi everyone,

I’m currently working on a project in Databricks (version 13.3 LTS) and could use some help with importing external Python files as modules into my notebook. I’m aiming to organize my code better and reuse functions across different notebooks.

Could someone please provide detailed steps or best practices for this? Are there any specific configurations I should be aware of, or recommended file structures? 

Here’s the structure I’m working with:

 

 
test/
├── Code/
│   └── notebook
└── utils/
    └── utilities.py

I need to import functions from utilities.py to use them in my notebook. However, I’m encountering a "module not found" error when I try to do this.

Thanks for your assistance!

13 REPLIES

filipniziol
Contributor III

Hi @adhi_databricks ,
to accomplish what you need:
1. Make sure that utilities.py is a file and not a notebook. If you created it as a notebook, you will not be able to import it.

[Screenshot: filipniziol_0-1728142720237.png]

2. Append path to utils and import utilities (not utils.utilities):
[Screenshot: filipniziol_1-1728143067624.png]
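As a rough sketch of step 2, assuming the layout from the question (the notebook runs from test/Code, with utils/ as a sibling folder). The helper name `add_to_sys_path` is made up for illustration, and the import itself is left commented out because it only works where utilities.py actually exists:

```python
import os
import sys


def add_to_sys_path(directory: str) -> str:
    """Append a directory to sys.path (once) and return its absolute form."""
    absolute = os.path.abspath(directory)
    if absolute not in sys.path:
        sys.path.append(absolute)
    return absolute


# Assumed layout: the notebook runs from test/Code, so utils/ is one level up.
utils_dir = add_to_sys_path(os.path.join(os.getcwd(), "..", "utils"))
print(utils_dir)

# Import the file name itself (utilities), not the folder-qualified utils.utilities:
# import utilities
```

Because the folder itself is on sys.path, Python sees utilities.py as a top-level module, which is why the import is `utilities` rather than `utils.utilities`.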

 

 

Hi @filipniziol,

It works with the Serverless cluster. However, I'm using DBR 13.3 LTS, and it's not working there. Is there a specific version that is compatible with this use case?

Hi @adhi_databricks ,
Starting with Databricks Runtime 14.0, the default current working directory has changed:
https://docs.databricks.com/en/files/workspace-modules.html

[Screenshot: filipniziol_0-1728147908843.png]

To solve it:
1. Use Serverless or any runtime version 14+, such as 14.3 LTS or 15.4 LTS.
2. On version 13.3, set the path according to the old behavior. One solution is simply to use the absolute path to utils:

[Screenshot: filipniziol_1-1728148157745.png]
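A minimal sketch of the absolute-path variant. The value of `UTILS_DIR` is an assumption (a copy of the test/ folder under /Workspace/Shared); replace it with the real location of the folder that holds utilities.py:

```python
import sys

# Hypothetical absolute location of the folder that holds utilities.py.
UTILS_DIR = "/Workspace/Shared/test/utils"

# An absolute path does not depend on the notebook's current working directory,
# so it behaves the same before and after the DBR 14.0 change.
if UTILS_DIR not in sys.path:
    sys.path.append(UTILS_DIR)

# import utilities  # uncomment once UTILS_DIR matches your workspace
```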

One more thing: when retesting with a different runtime or path, make sure to use Clear state in the notebook first:

[Screenshot: filipniziol_0-1728148759729.png]

 

Hey @filipniziol,

Here’s the path to the notebook:
/Workspace/Repos/xyz@email.com/de/usecase/main/data_pipelines/notebooks/main_notebook

And here’s the path to the utils Python file:
/Workspace/Repos/xyz@email.com/de/usecase/main/data_pipelines/utils.py

I’m on version 13.3 of the DBR, and using the absolute path in the Repos folder didn’t work for me. However, it did work in the Workspace/Shared folder, as you mentioned.

Could you help me with this? Thanks!

filipniziol
Contributor III

Hi @adhi_databricks ,

Here is the article on current working directory for different versions:
https://docs.databricks.com/en/files/cwd-dbr-14.html

The solution is to change the current working directory to the directory your notebook is in and then use relative paths (in your case it should be "../"):

 

import os

os.chdir("/tmp")

 

Here is the code tested using DBR 13.3 inside /Workspace/Repos:

[Screenshot: filipniziol_0-1728226157543.png]
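The approach can be sketched as follows. To keep the example runnable anywhere, it builds a temporary copy of the layout from the thread (data_pipelines/utils.py next to data_pipelines/notebooks/), and the `greet` function is invented for the demo. In a real workspace, `notebook_dir` would be the notebooks folder under /Workspace/Repos instead:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway copy of the layout so the sketch runs anywhere:
#   <root>/data_pipelines/utils.py
#   <root>/data_pipelines/notebooks/   (where the notebook lives)
root = tempfile.mkdtemp()
notebook_dir = os.path.join(root, "data_pipelines", "notebooks")
os.makedirs(notebook_dir)
with open(os.path.join(root, "data_pipelines", "utils.py"), "w") as handle:
    handle.write("def greet():\n    return 'hello from utils'\n")

# Step 1: make the notebook's folder the current working directory.
os.chdir(notebook_dir)

# Step 2: resolve "../" against that directory and put it first on sys.path.
parent_dir = os.path.abspath("..")
sys.path.insert(0, parent_dir)

# Step 3: drop any cached module with the same name, then import.
sys.modules.pop("utils", None)
importlib.invalidate_caches()
import utils

message = utils.greet()
print(message)
```

Resolving "../" with `os.path.abspath` right after `os.chdir` stores a fixed absolute entry in sys.path, so later directory changes do not affect the import.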

 

Hey @filipniziol ,

According to the documentation from the link you shared, when code runs in a path under Workspace/Repos, the current working directory depends on your admin configuration and the cluster's DBR version. Specifically, for workspaces with enableWorkspaceFilesystem set to dbr11.0+, on DBR versions 11.0 and higher the CWD is the directory containing the notebook or script being executed.

I'm using os.getcwd to get the CWD, which reflects where the script is running. However, I'm having trouble with sys.path.append, both with and without the os.chdir command. Any insights?
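One way to narrow this down is to print what Python actually sees when it resolves the import. This is only a diagnostic sketch; `describe_import` is a hypothetical helper, and "utils" is the module name from this thread:

```python
import importlib.util
import os
import sys


def describe_import(module_name: str) -> dict:
    """Collect the facts Python uses when resolving `import module_name`."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ValueError:
        # Raised when a module of that name is already loaded without a spec.
        spec = None
    return {
        "cwd": os.getcwd(),
        "already_loaded": module_name in sys.modules,
        # None means no sys.path entry contains the module.
        "resolved_file": spec.origin if spec is not None else None,
        "search_path": list(sys.path),
    }


report = describe_import("utils")
for key, value in report.items():
    print(f"{key}: {value}")
```

If `resolved_file` points somewhere other than your own utils.py, a different module with the same name is being picked up first. If it is `None`, the folder containing utils.py is not on `search_path`.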

Hi @adhi_databricks ,

In your case the file is called utilities, not utils. You need to import using the name of the file, so import utilities.

EDIT:
I just realized that it was utilities at first and utils later.
Could you share your paths and the name of the .py file once again?
Also, is your "main_notebook" a notebook, or is it a folder where the notebook is located?

Hey @filipniziol, the name of the file is utils.py, and main_notebook is a notebook.

Here’s the path to the notebook:
/Workspace/Repos/xyz@email.com/de/usecase/main/data_pipelines/notebooks/main_notebook

And here’s the path to the utils Python file:
/Workspace/Repos/xyz@email.com/de/usecase/main/data_pipelines/utils.py


Hmm, it looks good. Could you clear the state and cell outputs and try once again?

I’ve already tried that, but it didn’t work.

filipniziol
Contributor III

Hi @adhi_databricks ,

1. Double check the directories Python is using to look for modules:

import sys 

# ... add the path ...

print(sys.path)

2. Double check utils.py is a file and not a notebook

3. Try setting the current working directory manually:

import os
os.chdir('/Workspace/Repos/xyz@email.com/de/usecase/main/data_pipelines/notebooks')

4. Always clear the state of the notebook before testing

5. Prioritize your custom path by using sys.path.insert(0, path) instead of sys.path.append()
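Checks 2 and 5 can be combined in one small helper, sketched below. `prioritize_module_dir` is a hypothetical name, and the directory is the one quoted in this thread, so adjust it to your own workspace:

```python
import os
import sys


def prioritize_module_dir(directory: str, filename: str) -> bool:
    """Put `directory` first on sys.path if it really contains `filename`."""
    candidate = os.path.join(directory, filename)
    if not os.path.isfile(candidate):
        return False
    # Remove any existing entry so the directory ends up at position 0.
    while directory in sys.path:
        sys.path.remove(directory)
    sys.path.insert(0, directory)
    return True


# Path taken from the thread; adjust it to your own workspace.
module_dir = "/Workspace/Repos/xyz@email.com/de/usecase/main/data_pipelines"
found = prioritize_module_dir(module_dir, "utils.py")
print("utils.py found:", found)
print(sys.path)
```

A `False` result means the file is not visible at that path under that name, which is worth ruling out before looking at sys.path ordering.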

Hey @filipniziol, since the DBR version is 13.3 and enableWorkspaceFilesystem is enabled, the CWD is already set to

 

/Workspace/Repos/xyz@email.com/de/usecase/main/data_pipelines/notebooks

 

I still prioritized the path using sys.path.insert(0, path), but I'm still facing the "module not found" error 😐

 

filipniziol
Contributor III

Hi @adhi_databricks ,

I am out of ideas in this case. Is utils.py a valid Python file with no errors in it?
Could you test with some simple code like the example below?
I am starting to think there is something wrong with the file (although you mentioned it works in the /Shared folder).

[Screenshot: filipniziol_0-1728233763162.png]
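A simple test of this kind might look like the sketch below. The module name `import_check` and its `ping` function are invented for the example, and a temporary folder stands in for the folder that holds utils.py. If a trivial module imports cleanly from the same location, the problem is more likely in the contents of utils.py than in the path setup:

```python
import importlib
import os
import sys
import tempfile

# Write a deliberately trivial module into a temporary folder that stands in
# for the folder containing utils.py.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "import_check.py"), "w") as handle:
    handle.write("def ping():\n    return 'pong'\n")

# Make the folder importable and refresh the import system's caches.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import import_check

result = import_check.ping()
print(result)
```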

 
