Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Orchestrate run of a folder

cmilligan
Contributor II

I need to run the contents of a folder, which can change over time. Is there a way to set up a notebook that orchestrates running all the notebooks in a folder? My thought was that if I could retrieve a list of the notebooks, I could loop over them and run each one.

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

List all the notebooks in the folder with an API call, then run each one with dbutils.notebook.run:

import requests

# Read the workspace host name and an API token from the notebook context
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().get("browserHostName").get()
host_token = ctx.apiToken().get()

notebook_folder = '/Users/hubert.dudek@databrickster.com'

# List the objects in the folder through the Workspace API
response = requests.get(
    f'https://{host_name}/api/2.0/workspace/list',
    headers={'Authorization': f'Bearer {host_token}'},
    params={'path': notebook_folder}
).json()

# Run each notebook in turn, with a timeout of 1800 seconds per notebook
for notebook in response.get('objects', []):
    if notebook['object_type'] == 'NOTEBOOK':
        dbutils.notebook.run(notebook['path'], 1800)
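
The snippet above runs notebooks in whatever order the API returns them. If order matters, it may help to filter and sort the paths explicitly first. A minimal sketch, assuming the response has the same shape as above (an `objects` list whose entries carry `path` and `object_type`); the helper name `notebook_paths` and the sample data are made up for illustration:

```python
def notebook_paths(response):
    """Return the notebook paths from a workspace/list response, sorted by path."""
    # Assumption: the 'objects' key may be missing when the folder is empty
    objects = response.get("objects", [])
    return sorted(
        obj["path"] for obj in objects if obj.get("object_type") == "NOTEBOOK"
    )


# Hypothetical sample response, for illustration only
sample = {
    "objects": [
        {"path": "/Shared/etl/02_transform", "object_type": "NOTEBOOK"},
        {"path": "/Shared/etl/archive", "object_type": "DIRECTORY"},
        {"path": "/Shared/etl/01_extract", "object_type": "NOTEBOOK"},
    ]
}
print(notebook_paths(sample))
```

Sorting by path means a naming convention such as numeric prefixes (01_, 02_) would control the run order; that convention is an assumption here, not something the original answer requires.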

3 REPLIES

weldermartins
Honored Contributor

Hello, could you clarify what type of files these are? To run several files, you could use a wildcard (*).

cmilligan
Contributor II

That seemed to work when I ran the notebook manually. Unfortunately, when I set it up as a job, it throws an error (screenshot: Screenshot_20221103_021555). Also, will this run the notebooks sequentially? I need them to run one at a time, in order.
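
On the sequencing question: dbutils.notebook.run blocks until the called notebook finishes, so a plain for loop runs one notebook at a time. A rough sketch of a runner that makes the order explicit and stops at the first failure. The names `run_in_order`, `run`, and `fake_run` are hypothetical; in a Databricks notebook you would pass dbutils.notebook.run as `run`:

```python
def run_in_order(paths, run, timeout_seconds=1800):
    """Run each notebook path one at a time, in sorted order.

    `run` is any callable taking (path, timeout_seconds). In Databricks this
    would be dbutils.notebook.run. An exception raised by `run` stops the loop.
    """
    results = {}
    for path in sorted(paths):
        # The next notebook starts only after this call has returned
        results[path] = run(path, timeout_seconds)
    return results


# Stand-in for dbutils.notebook.run so the sketch can run outside Databricks
calls = []


def fake_run(path, timeout_seconds):
    calls.append(path)
    return "ok"


results = run_in_order(
    ["/Shared/etl/02_transform", "/Shared/etl/01_extract"], fake_run
)
print(calls)
```

Passing the runner in as a parameter keeps the ordering logic testable without a cluster. This does not address the job error itself, since the screenshot with the error message is not part of the thread text.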
