cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Orchestrate run of a folder

cmilligan
Contributor II

I'm needing to run the contents of a folder, which can change over time. Is there a way to set up a notebook that can orchestrate running all notebooks in a folder? My though was if I could retrieve a list of the notebooks I could create a loop to run them

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

List all notebooks by making API call and then run them by using dbutils.notebook.run:

import requests
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().get("browserHostName").get()
host_token = ctx.apiToken().get()
 
notebook_folder = '/Users/hubert.dudek@databrickster.com'
 
response = requests.get(
f'https://{host_name}/api/2.0/workspace/list',
headers={'Authorization': f'Bearer {host_token}'},
json={'path': notebook_folder}
).json()
 
for notebook in response['objects']:
    if notebook['object_type'] == 'NOTEBOOK':
        dbutils.notebook.run(notebook['path'], 1800)

View solution in original post

3 REPLIES 3

weldermartins
Honored Contributor

hello, could you detail what the file type would be? To run several files you could enter the *(asterisk)

Hubert-Dudek
Esteemed Contributor III

List all notebooks by making API call and then run them by using dbutils.notebook.run:

import requests
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().get("browserHostName").get()
host_token = ctx.apiToken().get()
 
notebook_folder = '/Users/hubert.dudek@databrickster.com'
 
response = requests.get(
f'https://{host_name}/api/2.0/workspace/list',
headers={'Authorization': f'Bearer {host_token}'},
json={'path': notebook_folder}
).json()
 
for notebook in response['objects']:
    if notebook['object_type'] == 'NOTEBOOK':
        dbutils.notebook.run(notebook['path'], 1800)

So that seemed to work when running the notebook manually. Unfortunately when I set it as a job to run it throws an error. Also will this run them sequentially? I need them to run one at a time in order. Screenshot_20221103_021555

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group