<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Runtime increases exponentially from 11.3 to 13.3 in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/runtime-increases-exponentially-from-11-3-to-13-3/m-p/101171#M40572</link>
    <description>&lt;DIV class="du-bois-light-typography css-ooisui"&gt;Hello! The increase in runtime when upgrading from DBR 11.3 (Spark 3.3.0) to DBR 13.3 (Spark 3.4.1) may be due to changes in the underlying R runtime or package versions. A new Databricks Runtime can ship with a different R version and different system libraries, so R packages that were installed and zipped under DBR 11.3 may no longer match the environment they run in, which can affect their performance.&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;To determine whether the issue is related to the R packages, reinstall them on a DBR 13.3 cluster, create a new zip file from that installation, and use that zip file in your job. That way the packages, including compiled ones such as terra, are built in the same environment they will run in.&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&lt;SPAN&gt;If the job is still slow after that, other factors are likely contributing to the increase in runtime. In that case, profile your code to find which steps take the most time.&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&lt;SPAN&gt;You can also try tuning the Spark configuration for your job, for example by increasing the number of partitions or adjusting the memory settings.&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&lt;SPAN&gt;I hope this helps!&lt;/SPAN&gt;&lt;/DIV&gt;</description>
    <pubDate>Fri, 06 Dec 2024 06:42:17 GMT</pubDate>
    <dc:creator>Sidhant07</dc:creator>
    <dc:date>2024-12-06T06:42:17Z</dc:date>
    <item>
      <title>Runtime increases exponentially from 11.3 to 13.3</title>
      <link>https://community.databricks.com/t5/data-engineering/runtime-increases-exponentially-from-11-3-to-13-3/m-p/89061#M37674</link>
      <description>&lt;P&gt;Hello. I am using R on Databricks with the approach below.&lt;/P&gt;&lt;P&gt;My cluster configuration:&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Single node: i3.2xlarge · On-demand · DBR: 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12) · us-east-1a&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;I install all R packages (including the geospatial package terra) in my notebook and zip the installed packages so that I don't have to reinstall them on every run.&lt;/P&gt;&lt;P&gt;I deploy a job which does the following:&lt;/P&gt;&lt;P&gt;1. Get the zipped R packages and unzip them&lt;/P&gt;&lt;P&gt;2. Load the libraries&lt;/P&gt;&lt;P&gt;3. Do stuff&lt;/P&gt;&lt;P&gt;The job takes an hour to complete.&lt;BR /&gt;&lt;BR /&gt;However, when I update the cluster to the configuration below, my run times increase exponentially.&lt;BR /&gt;&lt;SPAN&gt;Single node: i3.2xlarge · On-demand · DBR: 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12) · us-east-1a&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;I am not a Spark expert, but why does changing from 11.3 to 13.3 increase the run time? Would the ideal solution be to create the zipped packages again using 13.3 instead of 11.3?&lt;/P&gt;</description>
      <pubDate>Sat, 07 Sep 2024 22:26:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/runtime-increases-exponentially-from-11-3-to-13-3/m-p/89061#M37674</guid>
      <dc:creator>simple89</dc:creator>
      <dc:date>2024-09-07T22:26:08Z</dc:date>
    </item>
    <item>
      <title>Re: Runtime increases exponentially from 11.3 to 13.3</title>
      <link>https://community.databricks.com/t5/data-engineering/runtime-increases-exponentially-from-11-3-to-13-3/m-p/101171#M40572</link>
      <description>&lt;DIV class="du-bois-light-typography css-ooisui"&gt;Hello! The increase in runtime when upgrading from DBR 11.3 (Spark 3.3.0) to DBR 13.3 (Spark 3.4.1) may be due to changes in the underlying R runtime or package versions. A new Databricks Runtime can ship with a different R version and different system libraries, so R packages that were installed and zipped under DBR 11.3 may no longer match the environment they run in, which can affect their performance.&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;To determine whether the issue is related to the R packages, reinstall them on a DBR 13.3 cluster, create a new zip file from that installation, and use that zip file in your job. That way the packages, including compiled ones such as terra, are built in the same environment they will run in.&lt;/DIV&gt;
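&lt;DIV class="du-bois-light-typography css-ooisui"&gt;One way to check this before rebuilding anything is to look at the Built: field that R records in the DESCRIPTION file of each installed package. The Python sketch below (it could be run in a notebook cell) reads a zipped library and lists the packages that were built under a different R major.minor version than the one on the cluster. The function names are hypothetical, and the sketch assumes each package sits in the zip as some/path/pkgname/DESCRIPTION. A differing Built version is only a hint that a rebuild is worthwhile, not proof that it causes the slowdown.&lt;/DIV&gt;
&lt;PRE&gt;
```python
import re
import zipfile

# Matches the "Built:" field that R writes into the DESCRIPTION file of an
# installed package, e.g. "Built: R 4.1.3; x86_64-pc-linux-gnu; ...; unix".
BUILT_RE = re.compile(r"^Built:\s*R\s+(\d+\.\d+\.\d+)", re.MULTILINE)


def built_r_versions(zip_source):
    """Map each package in a zipped R library to the R version it was built under."""
    versions = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Assumption: packages are stored as some/path/pkgname/DESCRIPTION.
            if not name.endswith("/DESCRIPTION"):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            match = BUILT_RE.search(text)
            if match:
                package = name.split("/")[-2]
                # Keep the first DESCRIPTION seen for a given package name.
                versions.setdefault(package, match.group(1))
    return versions


def stale_packages(versions, runtime_r_version):
    """Return packages whose major.minor build version differs from the runtime's."""
    target = runtime_r_version.split(".")[:2]
    return sorted(
        package
        for package, built in versions.items()
        if built.split(".")[:2] != target
    )
```
&lt;/PRE&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;For example, stale_packages(built_r_versions("/dbfs/path/to/r_packages.zip"), "4.2.2") would list the packages worth reinstalling. Both the path and the version string here are placeholders; R.version.string on the cluster reports the actual R version.&lt;/DIV&gt;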
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&lt;SPAN&gt;If the job is still slow after that, other factors are likely contributing to the increase in runtime. In that case, profile your code to find which steps take the most time.&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&lt;SPAN&gt;You can also try tuning the Spark configuration for your job, for example by increasing the number of partitions or adjusting the memory settings.&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="du-bois-light-typography css-ooisui"&gt;&lt;SPAN&gt;I hope this helps!&lt;/SPAN&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 06 Dec 2024 06:42:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/runtime-increases-exponentially-from-11-3-to-13-3/m-p/101171#M40572</guid>
      <dc:creator>Sidhant07</dc:creator>
      <dc:date>2024-12-06T06:42:17Z</dc:date>
    </item>
  </channel>
</rss>

