<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: BUG - withColumns in pyspark doesn't handle empty dictionary in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136944#M50668</link>
    <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/60098"&gt;@K_Anudeep&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried using Environment Version 3, 2, and 1 but still got the same error. Attached is a screenshot with version 3.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Dhruv22_0-1761912353654.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/21219i66AB81E8B1C1F311/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Dhruv22_0-1761912353654.png" alt="Dhruv22_0-1761912353654.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 31 Oct 2025 12:07:00 GMT</pubDate>
    <dc:creator>Dhruv-22</dc:creator>
    <dc:date>2025-10-31T12:07:00Z</dc:date>
    <item>
      <title>BUG - withColumns in pyspark doesn't handle empty dictionary</title>
      <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136875#M50659</link>
      <description>&lt;P&gt;Today my notebook failed while reading a delta load, and I want to report a bug. The &lt;EM&gt;withColumns&lt;/EM&gt; method does not tolerate an empty dictionary and raises the following error in PySpark.&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;from collections import namedtuple
from pyspark.sql.functions import col

# extract_coordinates_udf and df are defined elsewhere (not shown in this post)
flat_tuple = namedtuple("flat_tuple", ["old_col", "new_col", "logic"])

# flat_tuple(old_col, new_col, logic)
flat_tuples = [
    flat_tuple("Coordinates", "Coordinates", extract_coordinates_udf(col("Coordinates")["coordinates"])),
    flat_tuple("CreatedById", "CreatedById", col("CreatedById")["$oid"]),
    flat_tuple("CreationDate", "CreationDate", col("CreationDate")["$date"]["$numberLong"]),
    flat_tuple("Names", "Names", col("Names")[0]["LanguageValue"]),
    flat_tuple("Location", "LocationCoordinates", extract_coordinates_udf(col("Location")["coordinates"])),
    flat_tuple("Location", "LocationType", col("Location")["type"]),
    flat_tuple("_id", "sectorId", col("_id")["$oid"]),
]

# Keep only the entries whose source column exists in df; this dict can end up empty
final_flat_cols = {tup.new_col: tup.logic for tup in flat_tuples if tup.old_col in df.columns}
df = df.withColumns(final_flat_cols)

# Output:
# AssertionError:  [Trace ID: 00-68d8e7cacb471da60efe65d0ef17703d-a3b270f251715df4-00]
&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;Normal PySpark handles this case, and I don't want to write a special if-else clause that checks the DataFrame's columns before calling &lt;EM&gt;withColumns&lt;/EM&gt;. It would be great if this were handled internally.&lt;/P&gt;&lt;P&gt;Currently, I'm using the following to work around it:&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;# An empty list is fine here: select('*') alone returns df unchanged.
# Note: unlike withColumns, select('*', ...) appends columns instead of replacing them,
# so entries whose new_col matches an existing column (e.g. "Coordinates") produce duplicate names.
flat_col_lst = [tup.logic.alias(tup.new_col) for tup in flat_tuples if tup.old_col in df.columns]
df = df.select('*', *flat_col_lst)&lt;/PRE&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 31 Oct 2025 05:58:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136875#M50659</guid>
      <dc:creator>Dhruv-22</dc:creator>
      <dc:date>2025-10-31T05:58:44Z</dc:date>
    </item>
    <item>
      <title>Re: BUG - withColumns in pyspark doesn't handle empty dictionary</title>
      <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136923#M50665</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/99515"&gt;@Dhruv-22&lt;/a&gt;&amp;nbsp;,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have tested this internally, and it appears to be a bug in the new &lt;STRONG&gt;Serverless environment version 4&lt;/STRONG&gt;.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="K_Anudeep_0-1761907767534.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/21213i0F91FFBBFFD7E010/image-size/medium?v=v2&amp;amp;px=400" role="button" title="K_Anudeep_0-1761907767534.png" alt="K_Anudeep_0-1761907767534.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;As a workaround, you can switch the &lt;STRONG&gt;version to 3&lt;/STRONG&gt; as shown below and re-run the code above; it should work.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="K_Anudeep_1-1761907968909.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/21214i0F79514EF8F36A37/image-size/medium?v=v2&amp;amp;px=400" role="button" title="K_Anudeep_1-1761907968909.png" alt="K_Anudeep_1-1761907968909.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 31 Oct 2025 10:54:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136923#M50665</guid>
      <dc:creator>K_Anudeep</dc:creator>
      <dc:date>2025-10-31T10:54:10Z</dc:date>
    </item>
    <item>
      <title>Re: BUG - withColumns in pyspark doesn't handle empty dictionary</title>
      <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136944#M50668</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/60098"&gt;@K_Anudeep&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried using Environment Version 3, 2, and 1 but still got the same error. Attached is a screenshot with version 3.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Dhruv22_0-1761912353654.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/21219i66AB81E8B1C1F311/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Dhruv22_0-1761912353654.png" alt="Dhruv22_0-1761912353654.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 31 Oct 2025 12:07:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136944#M50668</guid>
      <dc:creator>Dhruv-22</dc:creator>
      <dc:date>2025-10-31T12:07:00Z</dc:date>
    </item>
    <item>
      <title>Re: BUG - withColumns in pyspark doesn't handle empty dictionary</title>
      <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136946#M50669</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/99515"&gt;@Dhruv-22&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Did you apply the version and then create a new session (or clear the existing one) before running it? It should work on environment version 3, as shown in my repro.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Oct 2025 12:15:48 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136946#M50669</guid>
      <dc:creator>K_Anudeep</dc:creator>
      <dc:date>2025-10-31T12:15:48Z</dc:date>
    </item>
    <item>
      <title>Re: BUG - withColumns in pyspark doesn't handle empty dictionary</title>
      <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136967#M50673</link>
      <description>&lt;P&gt;Yeah, I created a new session. I tried it 3-4 times.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Oct 2025 12:33:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/136967#M50673</guid>
      <dc:creator>Dhruv-22</dc:creator>
      <dc:date>2025-10-31T12:33:44Z</dc:date>
    </item>
    <item>
      <title>Re: BUG - withColumns in pyspark doesn't handle empty dictionary</title>
      <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/137013#M50678</link>
      <description>&lt;P&gt;Sure! Let me try once again and get back to you.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Oct 2025 14:48:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/137013#M50678</guid>
      <dc:creator>K_Anudeep</dc:creator>
      <dc:date>2025-10-31T14:48:37Z</dc:date>
    </item>
    <item>
      <title>Re: BUG - withColumns in pyspark doesn't handle empty dictionary</title>
      <link>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/137246#M50715</link>
      <description>&lt;P&gt;Hey &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/60098"&gt;@K_Anudeep&lt;/a&gt;, did you get anything?&lt;/P&gt;</description>
      <pubDate>Sat, 01 Nov 2025 13:18:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/bug-withcolumns-in-pyspark-doesn-t-handle-empty-dictionary/m-p/137246#M50715</guid>
      <dc:creator>Dhruv-22</dc:creator>
      <dc:date>2025-11-01T13:18:18Z</dc:date>
    </item>
  </channel>
</rss>

