cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Orchestrate run of a folder

cmilligan
Contributor II

I'm needing to run the contents of a folder, which can change over time. Is there a way to set up a notebook that can orchestrate running all notebooks in a folder? My though was if I could retrieve a list of the notebooks I could create a loop to run them

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

List all notebooks by making API call and then run them by using dbutils.notebook.run:

import requests
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().get("browserHostName").get()
host_token = ctx.apiToken().get()
 
notebook_folder = '/Users/hubert.dudek@databrickster.com'
 
response = requests.get(
f'https://{host_name}/api/2.0/workspace/list',
headers={'Authorization': f'Bearer {host_token}'},
json={'path': notebook_folder}
).json()
 
for notebook in response['objects']:
    if notebook['object_type'] == 'NOTEBOOK':
        dbutils.notebook.run(notebook['path'], 1800)

View solution in original post

3 REPLIES 3

weldermartins
Honored Contributor

hello, could you detail what the file type would be? To run several files you could enter the *(asterisk)

Hubert-Dudek
Esteemed Contributor III

List all notebooks by making API call and then run them by using dbutils.notebook.run:

import requests
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
host_name = ctx.tags().get("browserHostName").get()
host_token = ctx.apiToken().get()
 
notebook_folder = '/Users/hubert.dudek@databrickster.com'
 
response = requests.get(
f'https://{host_name}/api/2.0/workspace/list',
headers={'Authorization': f'Bearer {host_token}'},
json={'path': notebook_folder}
).json()
 
for notebook in response['objects']:
    if notebook['object_type'] == 'NOTEBOOK':
        dbutils.notebook.run(notebook['path'], 1800)

So that seemed to work when running the notebook manually. Unfortunately when I set it as a job to run it throws an error. Also will this run them sequentially? I need them to run one at a time in order. Screenshot_20221103_021555

Join 100K+ Data Experts: Register Now & Grow with Us!

Excited to expand your horizons with us? Click here to Register and begin your journey to success!

Already a member? Login and join your local regional user group! If there isn’t one near you, fill out this form and we’ll create one for you to join!