02-28-2022 04:05 AM
Hello,
I want to run some notebooks from notebook "A".
Regardless of the contents of the called notebook, it runs for a long time (20 seconds). This overhead is constant and I do not know why it takes so long.
I tried running a simple notebook with one input parameter that only prints it; it takes the same 20 seconds.
I use this method:
notebook_result = dbutils.notebook.run("notebook_name", 60, {"key1": "value1", "key2": "value2"})
The notebooks are in the same folder and run on the same cluster (a really good cluster).
Could someone explain why it takes so long and how I can speed it up?
Best regards,
Łukasz
02-28-2022 08:59 AM
I guess the creation of the Spark session accounts for the 20 seconds.
02-28-2022 09:50 AM
I believe that dbutils.notebook.run creates a new session, so there is a little more overhead. If you do not want to create a new session, you can use
%run <NOTEBOOK PATH>
This will execute the notebook inline with the same session as the parent notebook. Note that this shares the session so if you define variables or functions in the child notebook they will be available in the parent notebook.
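For example, a minimal sketch of the shared session (the ./child path and the variable name are illustrative assumptions, not from this thread):
# In the child notebook (./child):
greeting = "hello from the child notebook"
# In the parent notebook, %run goes in its own cell:
%run ./child
# In a later parent cell, the child's variable is already defined:
print(greeting)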
Also, if you are trying to orchestrate notebooks, you should use the task orchestration available in the Databricks Jobs UI.
03-01-2022 12:41 AM
Hello Ryan,
Thank you for the response.
Now I understand.
However, is there any way to pass inputs to and get outputs from the notebook using this method?
Best regards,
Łukasz
03-04-2022 09:01 AM
I do not believe you can get outputs from dbutils.notebook.exit. But you could potentially drop a file locally with the values and read it in the other notebook, or save them as variables and access those variables.
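A hedged sketch of the file-based approach (the DBFS path and the result keys are made up for illustration): the child writes a small JSON file before exiting, and the parent reads it back after dbutils.notebook.run returns.
import json
# Child notebook: persist results to DBFS via the local FUSE mount, then exit
results = {"rows_processed": 42}
with open("/dbfs/tmp/child_results.json", "w") as f:
    json.dump(results, f)
dbutils.notebook.exit("done")
# Parent notebook: run the child, then read the file it wrote
dbutils.notebook.run("child_notebook", 60)
with open("/dbfs/tmp/child_results.json") as f:
    child_results = json.load(f)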
03-01-2022 12:33 AM
You can also just use Files in Repos and import the needed library/class into your notebook.
If you run 2 notebooks in parallel, it is good to reserve resources for each of them using the scheduler pool option:
spark.sparkContext.setLocalProperty("spark.scheduler.pool", "notebook1")
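A hedged sketch of running two notebooks in parallel, each in its own pool (run_in_pool, the notebook names, and the parameters are illustrative assumptions): the scheduler pool is a thread-local property, so it is set inside each worker thread.
from concurrent.futures import ThreadPoolExecutor
# Hypothetical helper: assign a scheduler pool to this thread, then run the notebook
def run_in_pool(notebook_path, pool_name, params):
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", pool_name)
    return dbutils.notebook.run(notebook_path, 60, params)
with ThreadPoolExecutor(max_workers=2) as executor:
    f1 = executor.submit(run_in_pool, "notebook1", "notebook1", {"key": "a"})
    f2 = executor.submit(run_in_pool, "notebook2", "notebook2", {"key": "b"})
    results = [f1.result(), f2.result()]
Note that this parallelizes the runs but does not remove the per-run startup overhead discussed above.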
03-04-2022 05:48 AM
Hello Hubert,
Thank you for the response.
I am not sure if it works for me.
I run the same notebook several times in a loop, something like this:
spark.sparkContext.setLocalProperty("spark.scheduler.pool", "My_Notebook")
for row in data:
    notebook_results = dbutils.notebook.run("My_Notebook", 60, {"data": row})
And yet the startup time for each notebook run is still several seconds.
Could you tell me what is wrong with this solution?
Best regards,
Łukasz
03-09-2022 01:10 AM
Okay, I was not able to use the same session for both notebooks (parent and child).
So my solution is to use %run ./notebook_name.
I put all the code into functions, and now I can call them.
Example:
# Child notebook
def do_something(param1, param2):
    # some code ...
    return result_value

# Parent notebook
# some code ...
%run ./children_notebook
# some code ...
function_result = do_something(value_1, value_2)
# some code ...
Thanks to everyone for the answers