Re: code execution from Databricks folder

Kartikb · ‎10-21-2024

We are able to run a notebook that references Python code using import statements from a Databricks repo
with the source code checked out. However, we encounter a ModuleNotFoundError when executing the same code from a folder.

Error: ModuleNotFoundError: No module named ...

Is there a roadmap to enable code execution from Databricks folders and subfolders?

KartikB

saurabh18cs · ‎10-21-2024

Hi Kartik,

Can you elaborate on your question? I may not have understood it properly, but if your source code is checked out to Databricks Repos and you want to execute it for testing purposes, have you tried adding your repo directory to sys.path?

This can go in the top cell of your notebook, before any imports.

import os, sys

# Get the current working directory
path = os.getcwd()
print("Current Directory", path)

# Resolve the parent directory, assumed here to be the repo root
repo_dir = os.path.abspath(os.path.join(path, os.pardir))

# Make modules under the repo root importable
sys.path.append(repo_dir)
print(repo_dir)


filipniziol · ‎10-21-2024

Hi @Kartikb ,

This feature is already available.

As @saurabh18cs mentioned, you probably have not added the location of your Python code to sys.path.

Also note that your Python code must be a file, not a notebook. You can tell which one you have by checking the object icon or object type.
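
If it helps, here is a minimal sketch of how you could check whether a module is actually resolvable before importing it. The helper names (add_to_sys_path, is_importable) and the demo_helpers module are made up for illustration, and a temporary directory stands in for your workspace folder, so treat it as an assumption-laden example rather than anything Databricks-specific. If the check still returns False after the folder is added, the folder path is probably wrong, or the target may be a notebook rather than a plain .py file.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def add_to_sys_path(directory):
    """Append directory to sys.path once and return its absolute path."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory


def is_importable(module_name):
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical stand-in for a workspace folder: a temp dir with one .py file.
code_dir = tempfile.mkdtemp()
with open(os.path.join(code_dir, "demo_helpers.py"), "w") as f:
    f.write("def greet(name):\n    return f'hello {name}'\n")

before = is_importable("demo_helpers")   # folder not yet on sys.path
add_to_sys_path(code_dir)
importlib.invalidate_caches()            # make sure the new file is picked up
after = is_importable("demo_helpers")    # folder now on sys.path

import demo_helpers

print(before, after, demo_helpers.greet("Kartik"))
```

In a real notebook you would replace code_dir with the folder that contains your code and "demo_helpers" with the name of your own module.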

Panda · ‎10-21-2024

@Kartikb The best approach is to package your code and install it as a module or wheel. This makes your code accessible across notebooks and distributed jobs without manual path adjustments.

Kartikb · ‎10-21-2024

The following worked, as you suggested.

import os, sys

project_path = os.path.abspath("/Workspace/<folder-name-1>/<folder-name-2>/<top-level-code-folder>")

if project_path not in sys.path:
    sys.path.append(project_path)

KartikB