Long startup time when running another notebook

LukaszJ
Contributor III

Hello,

I want to run some notebooks from notebook "A".

Regardless of the contents of the notebook, it runs for a long time (about 20 seconds). This is a constant value and I do not know why it takes so long.

I tried running a simple notebook with one input parameter that only prints it - it takes the same 20 seconds.

I use this method:

notebook_result = dbutils.notebook.run("notebook_name", 60, {"key1": "value1", "key2": "value2"})

The notebooks are in the same folder and run on the same cluster (a really good cluster).

Could someone explain to me why it takes so long and how I can speed it up?

Best regards,

ลukasz

1 ACCEPTED SOLUTION

LukaszJ
Contributor III

Okay, I am not able to use the same session for both notebooks (parent and child).

So my solution is to use %run ./notebook_name.

I put all the code into functions and now I can use them.

Example:

# Child notebook
def do_something(param1, param2):
    # some code ...
    return result_value

# Parent notebook

# some code ...

# %run must be in a cell of its own; it brings the child notebook's
# functions and variables into the parent notebook's session.
%run ./children_notebook

# some code ...

function_result = do_something(value_1, value_2)

# some code ...

Thanks to everyone for the answers


7 REPLIES

MartinB
Contributor III

I guess the creation of the Spark session accounts for the 20 seconds.

Ryan_Chynoweth
Honored Contributor III

I believe that dbutils.notebook.run creates a new session, so there is a little more overhead. If you do not want to create a new session, you can use:

%run <NOTEBOOK PATH>

This will execute the notebook inline with the same session as the parent notebook. Note that this shares the session, so if you define variables or functions in the child notebook, they will be available in the parent notebook.
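As a minimal sketch of that sharing (the notebook name and path below are placeholders, not taken from this thread), a variable or temp view created in the child notebook is visible in the parent right after %run:

# Child notebook (hypothetical path: ./child_notebook)
shared_value = 42
spark.range(10).createOrReplaceTempView("child_view")

# Parent notebook - %run needs a cell of its own
%run ./child_notebook

# Both the variable and the temp view now exist in the parent's session.
print(shared_value)                      # -> 42
display(spark.table("child_view"))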

Also, if you are trying to orchestrate notebooks, you should use the task orchestration available in the Databricks Jobs UI.
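For reference, the same multi-task job can also be defined programmatically. This is only a rough sketch against the Jobs 2.1 REST API; the workspace URL, token, cluster id, notebook paths and parameters are all placeholders:

import requests

# Two notebook tasks, where child_b waits for child_a (placeholder values throughout).
job_spec = {
    "name": "parent-child-pipeline",
    "tasks": [
        {
            "task_key": "child_a",
            "existing_cluster_id": "<CLUSTER-ID>",
            "notebook_task": {
                "notebook_path": "/Repos/demo/child_a",
                "base_parameters": {"key1": "value1"},
            },
        },
        {
            "task_key": "child_b",
            "depends_on": [{"task_key": "child_a"}],
            "existing_cluster_id": "<CLUSTER-ID>",
            "notebook_task": {"notebook_path": "/Repos/demo/child_b"},
        },
    ],
}

resp = requests.post(
    "https://<WORKSPACE-URL>/api/2.1/jobs/create",
    headers={"Authorization": "Bearer <PERSONAL-ACCESS-TOKEN>"},
    json=job_spec,
)
print(resp.json())  # expected to contain the new job_id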

Hello Ryan,

Thank you for the response.

Now I understand.

However, is there any way to pass inputs to and get outputs from the notebook using this method?

Best regards,

ลukasz

Ryan_Chynoweth
Honored Contributor III

I do not believe you can get outputs via dbutils.notebook.exit with this method. But you could potentially drop a file locally with the values and read it in the other notebook, or save them as variables and access those variables.
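A minimal sketch of that file-based hand-off, assuming a cluster where the /dbfs FUSE mount is available (the path and values are placeholders):

# In the notebook that produces the values
import json

results = {"row_count": 123, "status": "ok"}
with open("/dbfs/tmp/child_results.json", "w") as f:
    json.dump(results, f)

# In the notebook that consumes the values
import json

with open("/dbfs/tmp/child_results.json") as f:
    results = json.load(f)
print(results["row_count"])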

Hubert-Dudek
Esteemed Contributor III

You can also just use Files in Repos and import the needed library/class into your notebook.

If you run 2 notebooks in parallel, it is good to reserve resources for each of them using the scheduler pool option:

spark.sparkContext.setLocalProperty("spark.scheduler.pool", "notebook1")
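A sketch of how the two ideas can be combined in a Databricks notebook (notebook name, timeout and parameters are placeholders): running the child notebooks from a thread pool lets the fixed per-run startup cost overlap instead of adding up, and each worker thread can set its own scheduler pool.

from concurrent.futures import ThreadPoolExecutor

def run_child(row):
    # Spark local properties are per-thread, so assign the pool inside the worker.
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", f"pool_{row}")
    return dbutils.notebook.run("My_Notebook", 60, {"data": row})

data = ["a", "b", "c"]
with ThreadPoolExecutor(max_workers=4) as pool:
    notebook_results = list(pool.map(run_child, data))

print(notebook_results)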

Hello Hubert,

Thank you for the response.

I am not sure if it works for me.

I run the same notebook a few times in a loop. Something like this:

spark.sparkContext.setLocalProperty("spark.scheduler.pool", "My_Notebook")
 
for row in data:
    notebook_results = dbutils.notebook.run("My_Notebook", 60, {"data": row})

And yet the time to start each notebook run is still several seconds.

Could you tell me what is wrong with this solution?

Best regards,

ลukasz

