<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to enable AQE in foreachbatch mode in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101559#M40723</link>
    <description>&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN&gt;I process daily data incrementally, from checkpoint to checkpoint, using foreachBatch in a streaming query.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;PRE&gt;(df.writeStream.format("delta")
        .option("checkpointLocation", "dbfs/loc")
        .foreachBatch(transform_and_upsert)
        .outputMode("update")
        .trigger(availableNow=True)
        .start())&lt;/PRE&gt;&lt;P&gt;Because the data is skewed, I want to enable AQE and turn on the skew join optimization:&lt;/P&gt;&lt;PRE&gt;spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
spark.conf.set("spark.sql.adaptive.forceOptimizeSkewedJoin", "true")&lt;/PRE&gt;&lt;P&gt;However, when I checked the settings in the Spark UI, the value was still false: spark.sql.adaptive.enabled = false.&lt;/P&gt;&lt;P&gt;I am using Databricks DBR 14.3.x-photon-scala2.12 with Photon enabled.&lt;/P&gt;&lt;P&gt;According to this blog post, &lt;A href="https://www.databricks.com/blog/adaptive-query-execution-structured-streaming" target="_blank" rel="nofollow noopener noreferrer"&gt;https://www.databricks.com/blog/adaptive-query-execution-structured-streaming&lt;/A&gt;, AQE is supported in streaming foreachBatch queries starting from DBR 13.2.&lt;/P&gt;&lt;P&gt;Here are the settings shown in the DataFrame properties tab:&lt;/P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="mjedy78_0-1733819593344.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/13415iD3D0D0968B325D9A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="mjedy78_0-1733819593344.png" alt="mjedy78_0-1733819593344.png" /&gt;&lt;/span&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Tue, 10 Dec 2024 08:33:34 GMT</pubDate>
    <dc:creator>mjedy78</dc:creator>
    <dc:date>2024-12-10T08:33:34Z</dc:date>
    <item>
      <title>How to enable AQE in foreachbatch mode</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101559#M40723</link>
      <description>&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN&gt;I process daily data incrementally, from checkpoint to checkpoint, using foreachBatch in a streaming query.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;PRE&gt;(df.writeStream.format("delta")
        .option("checkpointLocation", "dbfs/loc")
        .foreachBatch(transform_and_upsert)
        .outputMode("update")
        .trigger(availableNow=True)
        .start())&lt;/PRE&gt;&lt;P&gt;Because the data is skewed, I want to enable AQE and turn on the skew join optimization:&lt;/P&gt;&lt;PRE&gt;spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")
spark.conf.set("spark.sql.adaptive.forceOptimizeSkewedJoin", "true")&lt;/PRE&gt;&lt;P&gt;However, when I checked the settings in the Spark UI, the value was still false: spark.sql.adaptive.enabled = false.&lt;/P&gt;&lt;P&gt;I am using Databricks DBR 14.3.x-photon-scala2.12 with Photon enabled.&lt;/P&gt;&lt;P&gt;According to this blog post, &lt;A href="https://www.databricks.com/blog/adaptive-query-execution-structured-streaming" target="_blank" rel="nofollow noopener noreferrer"&gt;https://www.databricks.com/blog/adaptive-query-execution-structured-streaming&lt;/A&gt;, AQE is supported in streaming foreachBatch queries starting from DBR 13.2.&lt;/P&gt;&lt;P&gt;Here are the settings shown in the DataFrame properties tab:&lt;/P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="mjedy78_0-1733819593344.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/13415iD3D0D0968B325D9A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="mjedy78_0-1733819593344.png" alt="mjedy78_0-1733819593344.png" /&gt;&lt;/span&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 10 Dec 2024 08:33:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101559#M40723</guid>
      <dc:creator>mjedy78</dc:creator>
      <dc:date>2024-12-10T08:33:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to enable AQE in foreachbatch mode</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101587#M40735</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/135709"&gt;@mjedy78&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Did you set the config at the cluster level or the notebook level?&lt;BR /&gt;Can you try setting these configs in the cluster properties and check whether that helps?&lt;/P&gt;</description>
      <pubDate>Tue, 10 Dec 2024 12:21:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101587#M40735</guid>
      <dc:creator>MuthuLakshmi</dc:creator>
      <dc:date>2024-12-10T12:21:54Z</dc:date>
    </item>
    <item>
      <title>Re: How to enable AQE in foreachbatch mode</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101597#M40739</link>
      <description>&lt;P&gt;I have tried both.&lt;BR /&gt;I am running this as a job. First I set the configs at the notebook level with spark.conf.set.&lt;BR /&gt;Then I also added the configs to the job cluster:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="mjedy78_1-1733835664120.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/13425i0C922D32CB176CFC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="mjedy78_1-1733835664120.png" alt="mjedy78_1-1733835664120.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Dec 2024 13:01:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101597#M40739</guid>
      <dc:creator>mjedy78</dc:creator>
      <dc:date>2024-12-10T13:01:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to enable AQE in foreachbatch mode</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101646#M40758</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/89478"&gt;@MuthuLakshmi&lt;/a&gt;&amp;nbsp;any idea?&lt;/P&gt;</description>
      <pubDate>Tue, 10 Dec 2024 18:03:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-enable-aqe-in-foreachbatch-mode/m-p/101646#M40758</guid>
      <dc:creator>mjedy78</dc:creator>
      <dc:date>2024-12-10T18:03:18Z</dc:date>
    </item>
  </channel>
</rss>

