
Is there a way to run notebooks concurrently in same session?

darshan
New Contributor III

I tried using:

dbutils.notebook.run(notebook.path, notebook.timeout, notebook.parameters)

but it takes about 20 seconds to start a new session. %run uses the same session, but I cannot figure out how to use it to run notebooks concurrently.

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

dbutils.notebook.run starts a new job, which is why it takes this long, but you can start several runs concurrently using a ThreadPool or another async library. With a more powerful server the startup could probably be 10 seconds rather than 20.
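As a rough illustration of the thread-pool idea, here is a minimal sketch using Python's standard concurrent.futures module. The helper name run_notebooks_concurrently and the notebook paths are hypothetical. The runner is passed in as a parameter so the sketch also works outside Databricks; inside a notebook you would presumably pass dbutils.notebook.run, which takes a path, a timeout in seconds, and an arguments map. Each call still starts its own job, so the startup delay is overlapped across notebooks rather than removed.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_notebooks_concurrently(run_notebook, notebooks, max_workers=4):
    """Run several notebooks in parallel threads and collect the outcomes.

    run_notebook: callable(path, timeout_seconds, arguments) -> str.
                  Assumption: on Databricks this would be dbutils.notebook.run.
    notebooks:    list of (path, timeout_seconds, arguments) tuples.
    Returns a dict mapping path -> ("ok", value) or ("error", message).
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every notebook up front so the startup delays overlap.
        futures = {
            pool.submit(run_notebook, path, timeout, args): path
            for path, timeout, args in notebooks
        }
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = ("ok", future.result())
            except Exception as exc:  # one failure should not hide the others
                results[path] = ("error", str(exc))
    return results


# Hypothetical usage inside a Databricks notebook:
# results = run_notebooks_concurrently(
#     dbutils.notebook.run,
#     [("/Shared/load_a", 600, {"day": "2024-01-01"}),
#      ("/Shared/load_b", 600, {"day": "2024-01-01"})],
# )
```

Collecting errors per notebook instead of raising on the first failure is a design choice for this sketch; re-raise if one failed notebook should stop the whole batch.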

%run executes the other notebook as if its code were part of the calling notebook (like an include in some languages).

Depending on the use case, you may want to plan a different architecture, since this approach is meant for non-time-critical projects (analytics). If a 20-second delay is too much for the project, streaming and Delta Live Tables should be used instead.


13 REPLIES

craig_lukasik
Databricks Employee

Another approach is to use the Jobs API and leverage a notebook_task with an existing cluster (existing_cluster_id).
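A minimal sketch of what such a request could look like, assuming the Jobs API 2.1 one-time run endpoint (/api/2.1/jobs/runs/submit) and a personal access token. The helper names, the run name, and the placeholder values are hypothetical. Reusing an already running cluster avoids cluster startup, though the job itself still has some scheduling overhead.

```python
import json
import urllib.request


def build_submit_payload(notebook_path, cluster_id, parameters=None):
    """Build a one-time run request with a single notebook_task."""
    return {
        "run_name": "adhoc-notebook-run",  # hypothetical name
        "tasks": [
            {
                "task_key": "main",
                # Run on an already running cluster instead of a new one.
                "existing_cluster_id": cluster_id,
                "notebook_task": {
                    "notebook_path": notebook_path,
                    "base_parameters": parameters or {},
                },
            }
        ],
    }


def submit_run(host, token, payload):
    """POST the payload to the workspace; returns the parsed JSON response.

    Assumption: host looks like "https://<workspace-url>" and token is a
    personal access token with permission to submit runs.
    """
    request = urllib.request.Request(
        url=f"{host}/api/2.1/jobs/runs/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_submit_payload(
    "/Shared/example_notebook", "<existing-cluster-id>", {"day": "2024-01-01"}
)
print(json.dumps(payload, indent=2))
```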

Prabakar
Databricks Employee

You can use a Jobs workflow. It lets you run notebooks in parallel as well as sequentially.
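To make the parallel and sequential parts concrete, here is a sketch of multi-task job settings expressed as a Python dictionary. It assumes the Jobs API task format, where a task lists its prerequisites under depends_on and tasks without dependencies on each other can run at the same time. The task keys, notebook paths, and job name are made up for illustration.

```python
def notebook_task(task_key, notebook_path, cluster_id, depends_on=()):
    """Describe one notebook task; depends_on lists prerequisite task keys."""
    task = {
        "task_key": task_key,
        "existing_cluster_id": cluster_id,
        "notebook_task": {"notebook_path": notebook_path},
    }
    if depends_on:
        task["depends_on"] = [{"task_key": key} for key in depends_on]
    return task


CLUSTER_ID = "<existing-cluster-id>"  # placeholder, not a real cluster

job_settings = {
    "name": "parallel-then-merge",  # hypothetical job name
    "tasks": [
        # No dependencies between these two, so they can run in parallel.
        notebook_task("load_a", "/Shared/load_a", CLUSTER_ID),
        notebook_task("load_b", "/Shared/load_b", CLUSTER_ID),
        # Runs only after both loaders finish (the sequential step).
        notebook_task("merge", "/Shared/merge", CLUSTER_ID,
                      depends_on=("load_a", "load_b")),
    ],
}
```

The same structure can be built in the Jobs UI by adding tasks and setting their dependencies.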

darshan
New Contributor III

Is there a way to send data from one notebook to other using workflow?

bb_huey99
New Contributor II

This is a great question. I've been struggling with opening multiple browser sessions to open more than one notebook at a time.

sbo
New Contributor III

dbutils.notebook.run will run multiple notebooks in the same session!

darshan
New Contributor III

But it creates new jobs, which slows the process down by 20 seconds.

darshan
New Contributor III

Hello, yes, the responses were helpful, so I tried a workflow, but I am stuck on sending data across notebooks within the workflow.

I am also trying Delta Live Tables. What I would like to know is: is there a threshold on how many records Delta Live Tables can hold at a time?

darshan
New Contributor III

Hello, one thing I am not able to understand is how to create jobs when developing a low-latency application where even 5 seconds cannot be tolerated, because each job takes 20 seconds to initialize.

zulazuardi
New Contributor II

You can use Python's concurrent.futures to execute the dbutils.notebook.run calls concurrently.

gfree76
New Contributor III

There is a Databricks sample notebook that shows how to use Scala futures to run notebooks in parallel. We use this extensively, so I can promise it works 🙂 Look for the section "Run Notebooks Concurrently": https://docs.databricks.com/notebooks/notebook-workflows.html

darshan
New Contributor III

This does help, but can you help me understand why the job run time is longer than the actual run time of the notebook? And is there a way to reduce it?

rudesingh56
New Contributor II

I've been struggling with opening multiple browser sessions to open more than one notebook at a time.
