How to import a function to another notebook?

tramtran
Contributor

Could you please provide guidance on the correct way to dynamically import a Python module from a user-specific path in Databricks Repos? Any advice on resolving the ModuleNotFoundError would be greatly appreciated.

udf_check_table_exists notebook:

from pyspark.sql.utils import AnalysisException
import os
import sys

def table_exists(table_name):
    try:
        spark.table(table_name)
        return True
    except AnalysisException:
        return False
 
In another notebook, the following import fails with ModuleNotFoundError: No module named 'udf_check_table_exists':
import os
import sys
sys.path.append(os.path.abspath('/Workspace/Repos/[my repo]/dw/common_helper'))

from udf_check_table_exists import table_exists
 

Thank you for your assistance.

Hkesharwani
Contributor II

Hi,
There are two ways to import functions from another notebook:

  1. %run ../notebook path : This command runs the entire notebook, so its functions and all of its variables become available in the calling notebook. [Ideally, use this only when the other notebook contains nothing but function definitions.]
  2. The second method is for Repos: in Repos you can easily import static .py files, in the form:
    from <folder name> import <function>
    Refer to this blog post for a more detailed answer: https://www.databricks.com/blog/2021/10/07/databricks-repos-is-now-generally-available.html
Harshit Kesharwani
Data engineer at Rsystema

tramtran
Contributor

Thanks for your reply, @Hkesharwani.

DLT doesn't support the %run magic command, which is why I'm trying the import approach.

My layout looks like this:

[Screenshot of the folder layout: tramtran_0-1718612237964.png]

The other notebook:

import os
import sys
sys.path.append(os.path.abspath('/Workspace/Repos/[my repo]/dw/common_helper'))

 

from udf_check_table_exists import table_exists
 
But it always raises a ModuleNotFoundError.
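One way to narrow down a ModuleNotFoundError like this is to ask Python's import machinery directly whether it can find a module of that name in the directory added to sys.path. The helper below is only a diagnostic sketch; describe_module_lookup is a hypothetical name, not something Databricks provides, and the path is copied from the post above with its placeholder left as-is.

```python
import importlib.machinery
import os


def describe_module_lookup(module_name, directory):
    """Report whether `module_name` can be found in `directory` (hypothetical helper)."""
    if not os.path.isdir(directory):
        return f"{directory!r} is not a directory visible to this Python process"

    # PathFinder.find_spec searches only the given directories and
    # returns None when no importable module of that name is found.
    spec = importlib.machinery.PathFinder.find_spec(module_name, [directory])
    if spec is None:
        entries = sorted(os.listdir(directory))
        return f"no importable {module_name!r} in {directory!r}; entries: {entries}"
    return f"{module_name!r} resolves to {spec.origin}"


# Path as written in the post; "[my repo]" is the poster's placeholder.
print(describe_module_lookup(
    "udf_check_table_exists",
    "/Workspace/Repos/[my repo]/dw/common_helper",
))
```

If the directory is found but the module is not, the printed directory entries show what Python actually sees there, which helps tell a wrong path apart from a missing .py source file.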

 

jacovangelder
Databricks MVP

Hey @tramtran,

I haven't tried this with Repos, but it works with a Workspace path in the following way (we do it like this ourselves, so I validated it for you). Since Repos also live inside /Workspace, I'd expect it to work the same way.

Let's use the default /Workspace/Shared folder for this example.

1. Add the .py file with your table_exists function to the /Workspace/Shared folder. Let's call the file function_file.py for this example.
2. Create an __init__.py file in the /Workspace/Shared directory as well, so Databricks knows this is a package.
3. In the notebook where you want to import the function from the .py file, add this:

 

import sys
sys.path.append("/Workspace/Shared")

 

then import the function like this:

 

from function_file import table_exists
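The steps above can be tried outside Databricks as a plain-Python sketch. Everything here is an assumption for illustration: a temporary directory stands in for /Workspace/Shared, and the function body is a stub, because a real table check needs a Spark session.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A temporary directory stands in for /Workspace/Shared (assumption for this demo).
shared_dir = Path(tempfile.mkdtemp())

# Step 1: a plain .py source file holding the function.
# The body is a stub; the real function would query Spark.
(shared_dir / "function_file.py").write_text(
    "def table_exists(table_name):\n"
    "    return table_name == 'known_table'\n"
)

# Step 2: the __init__.py file described in the steps above.
(shared_dir / "__init__.py").touch()

# Step 3: put the directory on the module search path, then import.
if str(shared_dir) not in sys.path:
    sys.path.append(str(shared_dir))
importlib.invalidate_caches()  # make sure the freshly written file is picked up

from function_file import table_exists

print(table_exists("known_table"))
```

The import succeeds here because function_file.py is an ordinary source file in a directory listed on sys.path.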

 

Hope this helps, good luck! 

tramtran
Contributor

Thanks @jacovangelder,

I followed your suggestion, but a different error appeared:

OSError: [Errno 95] Operation not supported: '/Workspace/Shared/function_file.py'.

Have you faced this issue before?

I haven't. Did you follow the steps exactly as outlined? It looks like you might not be allowed to read from the Shared location, but I'm not 100% sure.

tramtran
Contributor

It works for me now 🙂 after following this solution:

[Errno 95] Operation not supported · Issue #113823 · MicrosoftDocs/azure-docs · GitHub

By creating a "file" instead of a "notebook" and moving the code from the notebook into the file, I was able to use the "import" statement.

My import code:

import sys
sys.path.append("/Workspace/Shared/")

from test_import_func import table_exists

print(table_exists('mytable',spark))
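The thread does not show the contents of test_import_func.py. Since the call above passes spark as a second argument, the file presumably looks something like the sketch below; this is an assumption, not the poster's actual code. A module imported from a .py file has its own namespace, so it cannot rely on the notebook's spark variable, and taking the session as a parameter is one way around that.

```python
# test_import_func.py (assumed contents; not shown in the thread)


def table_exists(table_name, spark):
    """Return True if `table_name` can be resolved by the given SparkSession."""
    # Imported inside the function so the module itself can be imported
    # even where PySpark is not installed (a choice made for this sketch).
    from pyspark.sql.utils import AnalysisException

    try:
        spark.table(table_name)
        return True
    except AnalysisException:
        return False
```

Passing the session explicitly also makes the function easy to exercise with a stand-in object in tests.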

 


tramtran
Contributor

Thank you all again