Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Importing a Python module

tariq
New Contributor III

I'm not sure how a simple thing like importing a module in Python can be so broken in such a product. At first, I was able to make it work using the following:

import sys

# Put the repo's src directory on the module search path
sys.path.append("/Workspace/Repos/Github Repo/sparkling-to-databricks/src")

from utils.some_util import *

I was able to use the imported function. But after I restarted the cluster this no longer worked, even though the path still appears in sys.path.

I also tried the following:

# Tried shipping the single file to the cluster via SparkContext instead
spark.sparkContext.addPyFile("/Workspace/Repos/Github Repo/my-repo/src/utils/some_util.py")

This did not work either. Can someone please tell me what I'm doing wrong here and suggest a solution? Thanks.

4 REPLIES

Anonymous
Not applicable

The docs should help: https://docs.databricks.com/libraries/index.html

Keep in mind that paths on the cluster refer to the distributed filesystem (DBFS), while paths in Python are local to the driver. If you run %fs ls / you'll get a different result than if you run %sh ls /.
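
You can see the same contrast from plain Python (a minimal sketch, assuming you're in a Databricks notebook, where dbutils and display are predefined):

import os

# The DBFS root on the distributed filesystem (same listing as %fs ls /)
display(dbutils.fs.ls("/"))

# The driver node's local filesystem root (same listing as %sh ls /)
print(os.listdir("/"))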

tariq
New Contributor III

So there's no other way than creating a library?

KrishZ
Contributor

I too wonder the same thing. How can importing a Python module be so difficult, and not even documented? lol

No need for libraries.

Here's what worked for me:

Step 1: Upload the module: open a notebook >> File >> Upload Data >> drag and drop your module.

Step 2: Click Next.

Step 3: Copy the Databricks path for your module (this path is displayed in the pop-up you see just after clicking Next).

For me, if my module is named test_module, the path looks like:

  • dbfs:/FileStore/shared_uploads/krishz@company.com/test_module.py

Step 4: Append the above to sys.path (albeit with two changes):

  • Change 1: Change dbfs:/ to /dbfs/
  • Change 2: remove your module name from the path 🙂

Now my path to append looks like:

  • /dbfs/FileStore/shared_uploads/krishz@company.com

Step 5: Append the path:

import sys

# /dbfs/... is the driver-local mount of dbfs:/...
sys.path.append("/dbfs/FileStore/shared_uploads/krishz@company.com")

Step 6: Now you can import it simply:

import test_module

After this, you can write code as if you had imported test_module in a regular Jupyter notebook - no need to worry about any Databricks intricacies.
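
For example, if test_module.py defines a function some_function (a hypothetical name standing in for whatever your module actually defines), you can call it as usual:

result = test_module.some_function()  # some_function is a placeholder name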

Let me know if any step is unclear. Honestly, I don't know why you were recommended to use libraries for such a simple request.

tariq
New Contributor III

Thanks for the reply. The file I have is part of a repo. Is there a way to import dependencies from within a repo? The repo structure looks something like the below:

[image: repo directory structure]

I need to import the some_util module in my_notebook.
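
Concretely, I'd want something like this at the top of my_notebook to keep working after a cluster restart (a sketch; the paths come from my first post, and the exact layout is in the screenshot above):

import os
import sys

# Assuming the modules live under <repo-root>/src, as in my first post
repo_root = "/Workspace/Repos/Github Repo/sparkling-to-databricks"
sys.path.append(os.path.join(repo_root, "src"))

from utils.some_util import *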
