<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Jobs overhead why ? in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/jobs-overhead-why/m-p/114175#M9274</link>
    <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/156082"&gt;@Krthk&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;If you want to orchestrate a notebook, the easiest way is to go to &lt;SPAN class=""&gt;&lt;STRONG&gt;File &amp;gt; Schedule&lt;/STRONG&gt;&lt;/SPAN&gt; directly from the notebook. I recommend using &lt;SPAN class=""&gt;&lt;STRONG&gt;cron syntax&lt;/STRONG&gt;&lt;/SPAN&gt; to define when it should run, and attaching it to either a &lt;SPAN class=""&gt;&lt;STRONG&gt;predefined cluster&lt;/STRONG&gt;&lt;/SPAN&gt; or a &lt;SPAN class=""&gt;&lt;STRONG&gt;new job cluster&lt;/STRONG&gt;&lt;/SPAN&gt;.&lt;/P&gt;&lt;P class=""&gt;Keep in mind that with a &lt;SPAN class=""&gt;&lt;STRONG&gt;new job cluster&lt;/STRONG&gt;&lt;/SPAN&gt;, every run has to wait for the cluster to &lt;SPAN class=""&gt;&lt;STRONG&gt;spin up&lt;/STRONG&gt;&lt;/SPAN&gt; and &lt;SPAN class=""&gt;&lt;STRONG&gt;install dependencies&lt;/STRONG&gt;&lt;/SPAN&gt; before it can &lt;SPAN class=""&gt;&lt;STRONG&gt;execute the code&lt;/STRONG&gt;&lt;/SPAN&gt;, so the total run time includes that startup overhead on top of the code itself. If you configure the cluster with the &lt;SPAN class=""&gt;&lt;STRONG&gt;same specs&lt;/STRONG&gt;&lt;/SPAN&gt; (instance type, number of workers, etc.) as the one you used during development, the actual code execution time should be similar.&lt;/P&gt;&lt;P class=""&gt;If you’re using a &lt;SPAN class=""&gt;&lt;STRONG&gt;pre-existing interactive cluster&lt;/STRONG&gt;&lt;/SPAN&gt;, you’ll only need to wait for it to &lt;SPAN class=""&gt;&lt;STRONG&gt;wake up&lt;/STRONG&gt;&lt;/SPAN&gt; (if it’s currently stopped). 
After the job finishes, open the &lt;SPAN class=""&gt;&lt;STRONG&gt;“View details”&lt;/STRONG&gt;&lt;/SPAN&gt; section in the right panel of the job run and check the &lt;SPAN class=""&gt;&lt;STRONG&gt;Event log&lt;/STRONG&gt;&lt;/SPAN&gt; to see how much time was spent in each phase: cluster creation, init script execution, and actual job execution.&lt;BR /&gt;&lt;BR /&gt;Hope this helps &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;Isi&lt;/P&gt;</description>
    <pubDate>Tue, 01 Apr 2025 13:10:34 GMT</pubDate>
    <dc:creator>Isi</dc:creator>
    <dc:date>2025-04-01T13:10:34Z</dc:date>
    <item>
      <title>Jobs overhead why ?</title>
      <link>https://community.databricks.com/t5/get-started-discussions/jobs-overhead-why/m-p/114144#M9273</link>
      <description>&lt;P&gt;Hi, I have a Python notebook that I want to run automatically. One way I found was to attach it to a job/task and trigger it through the API from my local machine. However, this seems to add significant overhead: even if my code is a single line that should take milliseconds, the run takes around a minute. I’m fairly new to the platform. Can someone explain why this happens, and if this isn’t the right approach, what my options are?&lt;/P&gt;</description>
      <pubDate>Tue, 01 Apr 2025 06:21:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/jobs-overhead-why/m-p/114144#M9273</guid>
      <dc:creator>Krthk</dc:creator>
      <dc:date>2025-04-01T06:21:45Z</dc:date>
    </item>
    <item>
      <title>Re: Jobs overhead why ?</title>
      <link>https://community.databricks.com/t5/get-started-discussions/jobs-overhead-why/m-p/114175#M9274</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/156082"&gt;@Krthk&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;If you want to orchestrate a notebook, the easiest way is to go to &lt;SPAN class=""&gt;&lt;STRONG&gt;File &amp;gt; Schedule&lt;/STRONG&gt;&lt;/SPAN&gt; directly from the notebook. I recommend using &lt;SPAN class=""&gt;&lt;STRONG&gt;cron syntax&lt;/STRONG&gt;&lt;/SPAN&gt; to define when it should run, and attaching it to either a &lt;SPAN class=""&gt;&lt;STRONG&gt;predefined cluster&lt;/STRONG&gt;&lt;/SPAN&gt; or a &lt;SPAN class=""&gt;&lt;STRONG&gt;new job cluster&lt;/STRONG&gt;&lt;/SPAN&gt;.&lt;/P&gt;&lt;P class=""&gt;Keep in mind that with a &lt;SPAN class=""&gt;&lt;STRONG&gt;new job cluster&lt;/STRONG&gt;&lt;/SPAN&gt;, every run has to wait for the cluster to &lt;SPAN class=""&gt;&lt;STRONG&gt;spin up&lt;/STRONG&gt;&lt;/SPAN&gt; and &lt;SPAN class=""&gt;&lt;STRONG&gt;install dependencies&lt;/STRONG&gt;&lt;/SPAN&gt; before it can &lt;SPAN class=""&gt;&lt;STRONG&gt;execute the code&lt;/STRONG&gt;&lt;/SPAN&gt;, so the total run time includes that startup overhead on top of the code itself. If you configure the cluster with the &lt;SPAN class=""&gt;&lt;STRONG&gt;same specs&lt;/STRONG&gt;&lt;/SPAN&gt; (instance type, number of workers, etc.) as the one you used during development, the actual code execution time should be similar.&lt;/P&gt;&lt;P class=""&gt;If you’re using a &lt;SPAN class=""&gt;&lt;STRONG&gt;pre-existing interactive cluster&lt;/STRONG&gt;&lt;/SPAN&gt;, you’ll only need to wait for it to &lt;SPAN class=""&gt;&lt;STRONG&gt;wake up&lt;/STRONG&gt;&lt;/SPAN&gt; (if it’s currently stopped). 
After the job finishes, open the &lt;SPAN class=""&gt;&lt;STRONG&gt;“View details”&lt;/STRONG&gt;&lt;/SPAN&gt; section in the right panel of the job run and check the &lt;SPAN class=""&gt;&lt;STRONG&gt;Event log&lt;/STRONG&gt;&lt;/SPAN&gt; to see how much time was spent in each phase: cluster creation, init script execution, and actual job execution.&lt;BR /&gt;&lt;BR /&gt;Hope this helps &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;Isi&lt;/P&gt;</description>
      <pubDate>Tue, 01 Apr 2025 13:10:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/jobs-overhead-why/m-p/114175#M9274</guid>
      <dc:creator>Isi</dc:creator>
      <dc:date>2025-04-01T13:10:34Z</dc:date>
    </item>
  </channel>
</rss>

