<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to speed up `dbx launch --from-assets` in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-speed-up-dbx-launch-from-assets/m-p/9687#M5014</link>
    <description>&lt;P&gt;Hi, I have no solution; I actually just registered to open a very similar ticket when I saw yours.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;According to my experiments, getting an &lt;B&gt;already running&lt;/B&gt; VM from the pool (time between the events CREATING and INIT_SCRIPTS_STARTED) can take anything between 5 seconds and 5 minutes. It's unclear why.&lt;/P&gt;&lt;P&gt;It's actually faster not to use instance pools, according to my experiments.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Additionally, it would be great if at least the global init scripts were applied to the VMs when they are started in the pool (that way one could do some time-consuming init steps independently of the job start).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Since you received no answer in a relatively long time, I guess one has to accept that it's just simply slow..... :'(&lt;/P&gt;</description>
    <pubDate>Tue, 21 Feb 2023 16:05:07 GMT</pubDate>
    <dc:creator>tonkol</dc:creator>
    <dc:date>2023-02-21T16:05:07Z</dc:date>
    <item>
      <title>How to speed up `dbx launch --from-assets`</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-speed-up-dbx-launch-from-assets/m-p/9684#M5011</link>
      <description>&lt;P&gt;Hiya,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I'm trying to follow the testing workflow of&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;$ dbx deploy test --assets-only&lt;/P&gt;&lt;P&gt;$ dbx launch test --from-assets --trace --include-output stdout&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But I find the turnaround time is quite long, even with an instance pool.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The `deployment.yaml` looks like&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;environments:&lt;/P&gt;&lt;P&gt;&amp;nbsp;default:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;workflows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;- name: "test"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;tasks:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;- task_key: "main"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;new_cluster:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;spark_version: "11.3.x-scala2.12"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;num_workers: 1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;instance_pool_id: "instance-pool://****"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;init_scripts:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;- dbfs:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;destination: dbfs://****/init.sh&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;python_wheel_task:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;package_name: "foo"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;entry_point: "bar"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;parameters: [&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;"--conf-file", "file:fuse://conf/tasks/bar.yaml",&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;]&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The `init.sh` simply does a `pip install`.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Some excerpts from the logs, to show how long things take&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;[dbx][2023-02-09 10:41:57.487] Launching workflow&lt;/P&gt;&lt;P&gt;[dbx][2023-02-09 10:42:01.389] Run URL: &lt;A href="https://dbc-" target="test_blank"&gt;https://dbc-&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;[dbx][2023-02-09 10:47:37.229] Finished tracing run with id ....&lt;/P&gt;&lt;P&gt;```&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So it takes about 5 minutes to set up a new environment!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Does anyone have ideas on how to speed things up?&lt;/P&gt;</description>
      <pubDate>Thu, 09 Feb 2023 09:51:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-speed-up-dbx-launch-from-assets/m-p/9684#M5011</guid>
      <dc:creator>agagrins</dc:creator>
      <dc:date>2023-02-09T09:51:32Z</dc:date>
    </item>
    <item>
      <title>Re: How to speed up `dbx launch --from-assets`</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-speed-up-dbx-launch-from-assets/m-p/9685#M5012</link>
      <description>&lt;P&gt;Oh no, do we get chat bots in here now?&lt;/P&gt;</description>
      <pubDate>Fri, 10 Feb 2023 06:54:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-speed-up-dbx-launch-from-assets/m-p/9685#M5012</guid>
      <dc:creator>agagrins</dc:creator>
      <dc:date>2023-02-10T06:54:36Z</dc:date>
    </item>
    <item>
      <title>Re: How to speed up `dbx launch --from-assets`</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-speed-up-dbx-launch-from-assets/m-p/9687#M5014</link>
      <description>&lt;P&gt;Hi, I have no solution; I actually just registered to open a very similar ticket when I saw yours.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;According to my experiments, getting an &lt;B&gt;already running&lt;/B&gt; VM from the pool (time between the events CREATING and INIT_SCRIPTS_STARTED) can take anything between 5 seconds and 5 minutes. It's unclear why.&lt;/P&gt;&lt;P&gt;It's actually faster not to use instance pools, according to my experiments.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Additionally, it would be great if at least the global init scripts were applied to the VMs when they are started in the pool (that way one could do some time-consuming init steps independently of the job start).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Since you received no answer in a relatively long time, I guess one has to accept that it's just simply slow..... :'(&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 16:05:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-speed-up-dbx-launch-from-assets/m-p/9687#M5014</guid>
      <dc:creator>tonkol</dc:creator>
      <dc:date>2023-02-21T16:05:07Z</dc:date>
    </item>
  </channel>
</rss>

