How to call a few child notebooks from master notebook parallelly?

andrew0117
Contributor

I'm planning to use dbutils.notebook.run() to call all the child notebooks from a master notebook, but they execute sequentially. How can I run them in parallel?

5 REPLIES

Anonymous
Not applicable

Hi @andrew li​ 

Great to meet you, and thanks for your question! 

Let's see if your peers in the community have an answer to your question first; otherwise, bricksters will get back to you soon.

Thanks

UmaMahesh1
Honored Contributor III

Hi @andrew li​ 

You can do this in Scala or Python using threads and futures.

You can download and import the notebook archive from the link below; it contains the function for running notebooks in parallel.

https://docs.databricks.com/notebooks/notebook-workflows.html#run-multiple-notebooks-concurrently

After that, set the number of notebooks to run in parallel using the numNotebooksInParallel variable in the parallel-notebooks notebook.

Once done, call the parallelNotebooks function to run your notebooks in parallel. For examples, refer to the Concurrent Notebooks notebook in the downloaded archive.

Be careful not to crash your driver by running too many notebooks in parallel.
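If it helps, here is a minimal Python sketch of that threads-and-futures pattern. This is not the exact code from the archive; the notebook paths, timeout, and pool size below are placeholders you would replace with your own.

```python
# Minimal sketch: run child notebooks in parallel from a master notebook.
# Assumes it runs inside a Databricks notebook where dbutils is predefined.
from concurrent.futures import ThreadPoolExecutor

num_notebooks_in_parallel = 4  # plays the role of numNotebooksInParallel; keep it modest

def run_notebook(path, timeout_seconds=3600, arguments=None):
    # dbutils.notebook.run() blocks until the child notebook finishes,
    # so each call gets its own thread on the driver.
    return dbutils.notebook.run(path, timeout_seconds, arguments or {})

child_notebooks = ["/Workspace/child_1", "/Workspace/child_2"]  # hypothetical paths

with ThreadPoolExecutor(max_workers=num_notebooks_in_parallel) as executor:
    futures = [executor.submit(run_notebook, nb) for nb in child_notebooks]
    results = [f.result() for f in futures]  # re-raises if a child notebook failed
```

Capping max_workers means only that many child notebooks run at once, which is what keeps the driver from being overwhelmed.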

Hope this helps. Cheers.

Thank you very much!

So, by using threads, all the jobs running the child notebooks share the same cluster that the master notebook is running on?

UmaMahesh1
Honored Contributor III

Hi @andrew li​ 

Yes, they run on the same cluster that the master notebook is running on.

Specifically, we are multithreading the Spark driver with Futures to enable parallel job submission.
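As a quick hypothetical illustration (assuming the spark session that Databricks predefines in a notebook), every thread started from the master notebook lives in the same Spark application, which is why child notebooks launched from those threads share its cluster:

```python
# Hypothetical check: threads launched from the master notebook run on the same
# driver and share one Spark application, so child notebooks started from them
# use the same cluster. Assumes the predefined `spark` session in a Databricks notebook.
import threading

def show_app_id():
    # Every thread prints the same application ID: one driver, one cluster.
    print(threading.current_thread().name, spark.sparkContext.applicationId)

threads = [threading.Thread(target=show_app_id) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```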

You can read up on threads and futures for a deeper understanding.

Hope this helps. Do mark the above as the best answer if it helped.

Cheers.

Kaniz
Community Manager

Hi @andrew li​, we haven't heard from you since the last response from @Uma Maheswara Rao Desula​, and I was checking back to see if his suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it may help others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
