Data Engineering

workspace notebook path not recognized by dbutils.notebook.run() when running from a workflow/job

siva_pusarla
Visitor

result = dbutils.notebook.run(
    "/Workspace/YourFolder/NotebookA",
    timeout_seconds=600,
    arguments={"param1": "value1"},
)
print(result)

I was able to execute the above code manually from a notebook.

But when I run the same notebook as a job, it fails stating that the path was not found.

The complete code is checked in to a repo, and the job reads the source from the repo.

I'm still unclear why this fails, since I specified the absolute path of the workspace notebook NotebookA.

3 REPLIES

K_Anudeep
Databricks Employee

Hello @siva_pusarla ,

As per the official docs, the "absolute path" forms for running/importing notebooks look like:

  • /Users/<user>@<org>/folder/notebook (workspace notebooks)
  • /Repos/... or /Workspace/Repos/... for repos (these two forms are equivalent)

So /Workspace/YourFolder/NotebookA is usually not a valid notebook object path, unless that exact path actually exists as a notebook in your workspace.
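
For example (the paths below are placeholders; substitute paths that actually exist in your workspace):

# Workspace (user-folder) notebook:
result = dbutils.notebook.run(
    "/Users/<user>@<org>/YourFolder/NotebookA",
    timeout_seconds=600,
    arguments={"param1": "value1"},
)

# Repo notebook (/Repos/... and /Workspace/Repos/... resolve to the same object):
result = dbutils.notebook.run(
    "/Repos/<user>@<org>/<repo-name>/NotebookA",
    timeout_seconds=600,
)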

Also, when the job is configured to run from a remote Git repository, the notebook being executed is an ephemeral checkout, and Databricks explicitly calls out that path behaviour differs between Git folders and workspace folders.

Please refer to this doc: https://kb.databricks.com/libraries/paths-behave-differently-on-git-folders-and-workspace-folders

So in the job run, Databricks tries to resolve /Workspace/YourFolder/NotebookA as a notebook path, can't find it, and hence throws the pathNotFoundError.
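
To confirm what the job actually sees, a quick diagnostic sketch is to print the running notebook's own path from inside the job (this uses the standard notebook context object):

# Path of the currently running notebook, as resolved by the job
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
print(ctx.notebookPath().get())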

For best practices on notebook workflows, refer to this doc: https://docs.databricks.com/aws/en/notebooks/notebook-workflows


Anudeep

siva_pusarla

Hi Anudeep,

Thanks for your response.

NotebookA is not part of the repo code; it is a constant file in the workspace.

I would like to keep some files local to the workspace (for env setup), irrespective of the repos.

| git_folder (GIT)
| -- module
| ---- app.py

...
| Workspace_folder
| -- Common_Utils
| ---- env_setup.py

env_setup is local to each workspace (dev, test, and prod), hence it cannot be checked in to the repo.

With the above setup, I want to run dbutils.notebook.run("/Workspace/Common_Utils/env_setup") from app.py while executing the app through a workflow/job.

com.databricks.WorkflowException: com.databricks.NotebookExecutionException: FAILED: Unable to access the notebook "Workspace/Common_Utils/env_setup". Either it does not exist, or the identity used to run this job, xxx, lacks the required permissions.

But both the notebook and the permissions on it exist, and everything works fine when run outside the job.

Poorva21
New Contributor II

@siva_pusarla , 

Try converting env_setup into repo-based code and controlling its behavior via the environment.

Instead of a workspace notebook, use a Python module in the repo and drive environment differences using:

Job parameters

Branches (dev / test / prod)

Secrets (workspace-specific)

Example repo structure:

repo/
├── common_utils/
│   └── env_setup.py
└── app.py
Example: env_setup.py

from databricks.sdk.runtime import dbutils

def load_env(env):
    # Per-environment config; the secret scope/key names here are examples
    if env == "dev":
        return {
            "catalog": "dev_catalog",
            "password": dbutils.secrets.get("app-secrets", "db-password"),
        }
    if env == "test":
        return {
            "catalog": "test_catalog",
            "password": dbutils.secrets.get("app-secrets", "db-password"),
        }
    if env == "prod":
        return {
            "catalog": "prod_catalog",
            "password": dbutils.secrets.get("app-secrets", "db-password"),
        }
    raise ValueError(f"Unknown env: {env}")

app.py

from common_utils.env_setup import load_env
from databricks.sdk.runtime import dbutils

# "env" arrives as a job parameter (surfaced as a widget when run as a job)
env = dbutils.widgets.get("env")
config = load_env(env)

print(f"Running in {env}, catalog = {config['catalog']}")