<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: concat_ws() throws AnalysisException when too many columns are supplied in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/concat-ws-throws-analysisexception-when-too-many-columns-are/m-p/25860#M18055</link>
    <description>&lt;UL&gt;&lt;LI&gt;At least one of the column names may contain an unusual character, such as whitespace,&lt;/LI&gt;&lt;LI&gt;or at least one column has an incompatible type (for example, StructType).&lt;/LI&gt;&lt;LI&gt;You can also split your code into two or more steps: first build the list of columns as a variable, then create the concatenated column, then a new column with the SHA of that column. This is easier to debug, and it is also efficient in Spark, which uses lazy evaluation, logical/physical plans, and adaptive query execution.&lt;/LI&gt;&lt;/UL&gt;</description>
    <pubDate>Fri, 11 Mar 2022 12:15:44 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-03-11T12:15:44Z</dc:date>
    <item>
      <title>concat_ws() throws AnalysisException when too many columns are supplied</title>
      <link>https://community.databricks.com/t5/data-engineering/concat-ws-throws-analysisexception-when-too-many-columns-are/m-p/25859#M18054</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm using concat_ws in Scala to calculate a checksum for the dataframe, i.e.:&lt;/P&gt;&lt;P&gt;df.withColumn("CHECKSUM", sha2(functions.concat_ws("", dataframe.columns.map(col): _*), 512))&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have one example here with just 24 columns that already throws the following exception: org.apache.spark.sql.AnalysisException: cannot resolve 'concat_ws('', &amp;lt;list of the columns)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any ideas what's happening? I assume the list gets too long (character-wise), but I have no idea how to make this work.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 11 Mar 2022 11:47:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/concat-ws-throws-analysisexception-when-too-many-columns-are/m-p/25859#M18054</guid>
      <dc:creator>gzenz</dc:creator>
      <dc:date>2022-03-11T11:47:03Z</dc:date>
    </item>
    <item>
      <title>Re: concat_ws() throws AnalysisException when too many columns are supplied</title>
      <link>https://community.databricks.com/t5/data-engineering/concat-ws-throws-analysisexception-when-too-many-columns-are/m-p/25860#M18055</link>
      <description>&lt;UL&gt;&lt;LI&gt;At least one of the column names may contain an unusual character, such as whitespace,&lt;/LI&gt;&lt;LI&gt;or at least one column has an incompatible type (for example, StructType).&lt;/LI&gt;&lt;LI&gt;You can also split your code into two or more steps: first build the list of columns as a variable, then create the concatenated column, then a new column with the SHA of that column. This is easier to debug, and it is also efficient in Spark, which uses lazy evaluation, logical/physical plans, and adaptive query execution.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Fri, 11 Mar 2022 12:15:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/concat-ws-throws-analysisexception-when-too-many-columns-are/m-p/25860#M18055</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-03-11T12:15:44Z</dc:date>
    </item>
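    <!--
    Editor's note: a minimal Scala sketch of the three-step approach described in the reply above (build the column list as a variable, then the concatenated column, then the hash). The DataFrame name df, the column names, and the explicit cast to string are illustrative assumptions, not part of the original thread; the cast addresses the incompatible-type case (e.g. StructType) the reply mentions.

    ```scala
    import org.apache.spark.sql.functions.{col, concat_ws, sha2}

    // Step 1: build the column list as a variable so it can be inspected on its own.
    val cols = df.columns.map(col)

    // Step 2: cast every column to string, so columns of incompatible types
    // (e.g. StructType) no longer make concat_ws fail to resolve.
    val asStrings = cols.map(_.cast("string"))

    // Step 3: concatenate first, then hash, as two separate columns;
    // each intermediate stage can now be debugged independently.
    val withConcat = df.withColumn("CONCATENATED", concat_ws("", asStrings: _*))
    val withChecksum = withConcat.withColumn("CHECKSUM", sha2(col("CONCATENATED"), 512))
    ```

    Because Spark evaluates lazily, splitting the logic into named intermediate values costs nothing at runtime; the optimizer still produces a single plan.
    -->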
  </channel>
</rss>

