<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Long time turning on another notebook in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26831#M18841</link>
    <description>&lt;P&gt;Hello Hubert,&lt;/P&gt;&lt;P&gt;Thank you for the response.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am not sure this works for me.&lt;/P&gt;&lt;P&gt;I run the same notebook a few times in a loop, something like this:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.sparkContext.setLocalProperty("spark.scheduler.pool", "My_Notebook")
&amp;nbsp;
for row in data:
    notebook_results = dbutils.notebook.run("My_Notebook", 60, {"data": row})&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;And yet each notebook still takes several seconds to start.&lt;/P&gt;&lt;P&gt;Could you tell me what is wrong with this solution?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Łukasz&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 04 Mar 2022 13:48:43 GMT</pubDate>
    <dc:creator>LukaszJ</dc:creator>
    <dc:date>2022-03-04T13:48:43Z</dc:date>
    <item>
      <title>Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26826#M18836</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I want to run some notebooks from notebook "A".&lt;/P&gt;
&lt;P&gt;Regardless of what the called notebook contains, it takes a long time to run (20 seconds). This is a constant value, and I do not know why it takes so long.&lt;/P&gt;
&lt;P&gt;I tried running a simple notebook that takes one input parameter and only prints it; it takes the same 20 seconds.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I use this method:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;notebook_result = dbutils.notebook.run("notebook_name", 60, {"key1": "value1", "key2": "value2"})&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The notebooks are in the same folder and run on the same cluster (a really good cluster).&lt;/P&gt;
&lt;P&gt;Could someone explain why it takes so long and how I can speed it up?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;Łukasz&lt;/P&gt;</description>
      <pubDate>Fri, 21 Mar 2025 13:17:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26826#M18836</guid>
      <dc:creator>LukaszJ</dc:creator>
      <dc:date>2025-03-21T13:17:32Z</dc:date>
    </item>
    <item>
      <title>Re: Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26827#M18837</link>
      <description>&lt;P&gt;I guess the creation of the &lt;B&gt;&lt;U&gt;Spark session&lt;/U&gt;&lt;/B&gt; takes those 20 seconds.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 16:59:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26827#M18837</guid>
      <dc:creator>MartinB</dc:creator>
      <dc:date>2022-02-28T16:59:35Z</dc:date>
    </item>
    <item>
      <title>Re: Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26828#M18838</link>
      <description>&lt;P&gt;I believe that dbutils.notebook.run creates a new session, so there is a little more overhead. If you do not want to create a new session, you can use&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%run &amp;lt;NOTEBOOK PATH&amp;gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This will execute the notebook inline, in the same session as the parent notebook. Note that because the session is shared, any variables or functions you define in the child notebook will be available in the parent notebook.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, if you are trying to orchestrate notebooks, you should use the task orchestration available in the Databricks Jobs UI.&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 17:50:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26828#M18838</guid>
      <dc:creator>Ryan_Chynoweth</dc:creator>
      <dc:date>2022-02-28T17:50:46Z</dc:date>
    </item>
    <item>
      <title>Re: Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26829#M18839</link>
      <description>&lt;P&gt;You can also use files in Repos and import the library or class you need into your notebook.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you run two notebooks in parallel, it is good to reserve resources for each of them using the scheduler pool option:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.sparkContext.setLocalProperty("spark.scheduler.pool", "notebook1")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2022 08:33:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26829#M18839</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-03-01T08:33:31Z</dc:date>
    </item>
    <item>
      <title>Re: Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26830#M18840</link>
      <description>&lt;P&gt;Hello Ryan,&lt;/P&gt;&lt;P&gt;Thank you for the response.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Now I understand.&lt;/P&gt;&lt;P&gt;However, is there any way to pass inputs to the notebook and get outputs from it using this method?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Łukasz&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2022 08:41:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26830#M18840</guid>
      <dc:creator>LukaszJ</dc:creator>
      <dc:date>2022-03-01T08:41:45Z</dc:date>
    </item>
    <item>
      <title>Re: Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26831#M18841</link>
      <description>&lt;P&gt;Hello Hubert,&lt;/P&gt;&lt;P&gt;Thank you for the response.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I am not sure this works for me.&lt;/P&gt;&lt;P&gt;I run the same notebook a few times in a loop, something like this:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;spark.sparkContext.setLocalProperty("spark.scheduler.pool", "My_Notebook")
&amp;nbsp;
for row in data:
    notebook_results = dbutils.notebook.run("My_Notebook", 60, {"data": row})&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;And yet each notebook still takes several seconds to start.&lt;/P&gt;&lt;P&gt;Could you tell me what is wrong with this solution?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Łukasz&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Mar 2022 13:48:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26831#M18841</guid>
      <dc:creator>LukaszJ</dc:creator>
      <dc:date>2022-03-04T13:48:43Z</dc:date>
    </item>
    <item>
      <title>Re: Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26832#M18842</link>
      <description>&lt;P&gt;I do not believe you can get outputs from dbutils.notebook.exit when the notebook is called with %run. But you could potentially write the values to a local file and read it in the other notebook, or save them as variables and access those variables.&lt;/P&gt;</description>
      <pubDate>Fri, 04 Mar 2022 17:01:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26832#M18842</guid>
      <dc:creator>Ryan_Chynoweth</dc:creator>
      <dc:date>2022-03-04T17:01:58Z</dc:date>
    </item>
    <item>
      <title>Re: Long time turning on another notebook</title>
      <link>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26833#M18843</link>
      <description>&lt;P&gt;Okay, I am not able to share the same session between the two notebooks (parent and child).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So my solution is to use &lt;B&gt;%run ./notebook_name&lt;/B&gt;.&lt;/P&gt;&lt;P&gt;I put all the code into functions and now I can use them.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;# Child notebook
def do_something(param1, param2):
    # some code ...
    return result_value&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;CODE&gt;# Parent notebook
&amp;nbsp;
# some code ...
&amp;nbsp;
# Note: in Databricks, the %run line below has to be placed in a cell of its own
%run ./children_notebook
&amp;nbsp;
# some code ...
&amp;nbsp;
function_result = do_something(value_1, value_2)
&amp;nbsp;
# some code ...&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thanks to everyone for the answers&lt;/P&gt;</description>
      <pubDate>Wed, 09 Mar 2022 09:10:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/long-time-turning-on-another-notebook/m-p/26833#M18843</guid>
      <dc:creator>LukaszJ</dc:creator>
      <dc:date>2022-03-09T09:10:08Z</dc:date>
    </item>
  </channel>
</rss>

