06-27-2022 06:29 AM
06-27-2022 07:10 AM
dbutils.notebook.run starts a new job, which is why it takes this long. You can, however, start multiple runs concurrently using a ThreadPool or other async libraries. With a better server, the startup could probably be 10 seconds rather than 20.
%run executes another notebook inline, as if its code were part of the calling notebook (like an include in some languages).
Depending on the use case, you may want to plan a different architecture, since this approach is better suited to non-time-critical projects (analytics). If a 20-second delay is too much for your project, use streaming and Delta Live Tables instead.
06-27-2022 12:47 PM
Another approach is to use the Jobs API and leverage a notebook_task with an existing cluster (existing_cluster_id).
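A minimal sketch of what that request could look like, assuming the Jobs API 2.1 `runs/submit` endpoint and a personal access token. The helper names (`build_run_payload`, `submit_run`), the run name, and the task key are hypothetical, and the host, token, cluster ID, and notebook path are placeholders you would supply yourself:

```python
import json
import urllib.request


def build_run_payload(notebook_path, cluster_id, params=None):
    """Build a one-time run request that reuses an already running cluster."""
    return {
        "run_name": "notebook-on-existing-cluster",  # hypothetical name
        "tasks": [
            {
                "task_key": "run_notebook",  # hypothetical key
                # Reusing a running cluster avoids waiting for a new one to start.
                "existing_cluster_id": cluster_id,
                "notebook_task": {
                    "notebook_path": notebook_path,
                    "base_parameters": params or {},
                },
            }
        ],
    }


def submit_run(host, token, payload):
    """POST the payload to the workspace and return the run_id from the response."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/submit",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["run_id"]
```

Whether this removes most of the startup delay depends on the cluster already being up. The job scheduling overhead itself may still apply.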
06-28-2022 01:22 AM
You can use a Jobs workflow. It lets you run notebooks in parallel as well as sequentially.
06-29-2022 12:24 AM
Is there a way to send data from one notebook to another using a workflow?
06-28-2022 10:53 AM
This is a great question. I've been struggling with opening multiple browser sessions to open more than one notebook at a time.
06-28-2022 11:33 AM
dbutils.notebook.run will run multiple notebooks in the same session!
06-29-2022 12:23 AM
But it creates new jobs, which slows the process down by about 20 seconds.
06-29-2022 12:22 AM
Hello, yes, the responses were helpful. I tried a workflow but got stuck on sending data across notebooks.
I am also trying Delta Live Tables. What I would like to know is: is there a threshold on how many records Delta Live Tables can hold at a time?
06-29-2022 04:28 AM
Hello, one thing I am not able to understand is how to create jobs when developing a low-latency application where even 5 seconds cannot be tolerated, because each job takes 20 seconds to initialize.
06-29-2022 03:20 PM
You can use Python's concurrent.futures to run several dbutils.notebook.run calls concurrently.
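As a rough sketch of that idea, the helper below fans notebook runs out over a thread pool. The function name `run_notebooks_concurrently` and the example paths and arguments are hypothetical. The runner is passed in as a parameter so the helper can be tried outside Databricks; inside a notebook you would pass `dbutils.notebook.run`, which takes a path, a timeout in seconds, and an arguments dict.

```python
from concurrent.futures import ThreadPoolExecutor


def run_notebooks_concurrently(run_notebook, notebooks, timeout_seconds=600, max_workers=4):
    """Run each (path, arguments) pair in its own thread and collect the results.

    run_notebook is expected to behave like dbutils.notebook.run:
    run_notebook(path, timeout_seconds, arguments) -> str
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            path: pool.submit(run_notebook, path, timeout_seconds, args)
            for path, args in notebooks
        }
        # result() blocks until that run finishes and re-raises any failure.
        return {path: future.result() for path, future in futures.items()}


# Inside a Databricks notebook (placeholder paths and arguments):
# results = run_notebooks_concurrently(
#     dbutils.notebook.run,
#     [("./child_a", {"date": "2022-06-27"}), ("./child_b", {"date": "2022-06-27"})],
# )
```

Each run still pays the job startup cost, but the runs overlap, so the total wait is closer to that of the slowest notebook than to the sum of all of them. The returned strings are whatever each child notebook passes to `dbutils.notebook.exit`, which is one way to hand small results back to the caller.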
06-29-2022 03:37 PM
There is a Databricks sample notebook that shows how to use Scala futures to run notebooks in parallel. We use this extensively, so I can promise it works. Look for the section "Run Notebooks Concurrently": https://docs.databricks.com/notebooks/notebook-workflows.html
06-30-2022 12:01 AM
This does help, but can you help me understand why the job run time is longer than the actual run time of the notebook? And is there a way to reduce it?
07-04-2022 06:49 AM
I've been struggling with opening multiple browser sessions to open more than one notebook at a time.