<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can we assign a default value in select columns in Spark SQL when the column is not present? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13321#M8022</link>
    <description>&lt;P&gt;As it requires some manipulation, it is easier to handle this as a DataFrame in Python, where you can check whether the nested field exists:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;"C" in df.schema["col2"].dataType.fieldNames()&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;and apply the logic accordingly.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 06 Jan 2023 22:28:22 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2023-01-06T22:28:22Z</dc:date>
    <item>
      <title>Can we assign a default value in select columns in Spark SQL when the column is not present?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13317#M8018</link>
      <description>&lt;P&gt;I'm reading an Avro file and loading it into a table. The Avro data is nested.&lt;/P&gt;&lt;P&gt;From this table I'm trying to extract the necessary elements using Spark SQL, using the explode function where there is array data. The challenge is that the element to be extracted might not be present in the Avro data. In that case the select statement should return a default null value instead of throwing an error.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Jan 2023 12:10:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13317#M8018</guid>
      <dc:creator>Manojkumar</dc:creator>
      <dc:date>2023-01-06T12:10:50Z</dc:date>
    </item>
    <item>
      <title>Re: Can we assign a default value in select columns in Spark SQL when the column is not present?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13318#M8019</link>
      <description>&lt;P&gt;Usually, in such cases, I create an empty (template) table with all necessary columns and then append data to it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;So read Avro, explode and then append to the template table.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Jan 2023 12:14:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13318#M8019</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2023-01-06T12:14:13Z</dc:date>
    </item>
    <item>
      <title>Re: Can we assign a default value in select columns in Spark SQL when the column is not present?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13319#M8020</link>
      <description>&lt;P&gt;Hi @manoj kumar&lt;/P&gt;&lt;P&gt;The easiest way would be to use unmanaged Delta tables and, while loading data into the path of that table, set mergeSchema to true. This handles the schema differences: if a column is not present, it is filled with null, and if a new column appears, all previous records get null for it.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Jan 2023 12:54:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13319#M8020</guid>
      <dc:creator>UmaMahesh1</dc:creator>
      <dc:date>2023-01-06T12:54:58Z</dc:date>
    </item>
    <item>
      <title>Re: Can we assign a default value in select columns in Spark SQL when the column is not present?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13320#M8021</link>
      <description>&lt;P&gt;Hi Hubert, thank you for the quick reply.&lt;/P&gt;&lt;P&gt;Before I can append the data, the extraction from the nested data itself fails when the element to be derived is missing.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;| Col1 | col2 |&lt;/P&gt;&lt;P&gt;-------------&lt;/P&gt;&lt;P&gt;| Hello | { A:1, B:2, C: [AA:11, BB: 22]} |&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My SQL looks like this:&lt;/P&gt;&lt;P&gt;select col2.b, explode(col2.c) from tab;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In the above case, if the C element is missing, the select should not fail but return null.&lt;/P&gt;&lt;P&gt;Kindly help.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Jan 2023 14:01:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13320#M8021</guid>
      <dc:creator>Manojkumar</dc:creator>
      <dc:date>2023-01-06T14:01:59Z</dc:date>
    </item>
    <item>
      <title>Re: Can we assign a default value in select columns in Spark SQL when the column is not present?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13321#M8022</link>
      <description>&lt;P&gt;As it requires some manipulation, it is easier to handle this as a DataFrame in Python, where you can check whether the nested field exists:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;"C" in df.schema["col2"].dataType.fieldNames()&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;and apply the logic accordingly.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
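      <!--
      A minimal sketch of how the schema check in this reply could drive the
      select from the question, kept in a comment so the feed structure is
      unchanged. It is not taken from the thread. The helper names
      has_nested_field and build_select and the example_schema value are
      hypothetical. The schema is assumed to be in the dict form that PySpark
      returns from df.schema.jsonValue(), so the sketch runs without a Spark
      session. It uses explode_outer in place of the explode from the
      question so that rows with a null or empty array are kept, and the
      STRING type of the fallback null is an assumption.

```python
from typing import Any, Dict


def has_nested_field(schema: Dict[str, Any], path: str) -> bool:
    """Return True if the dotted path (e.g. "col2.C") exists in a schema dict.

    The comparison is case-sensitive, unlike Spark's default column resolution.
    """
    node: Any = schema
    for name in path.split("."):
        # Step through array wrappers so struct fields inside arrays are reachable.
        while isinstance(node, dict) and node.get("type") == "array":
            node = node["elementType"]
        if not (isinstance(node, dict) and node.get("type") == "struct"):
            return False
        matches = [f for f in node["fields"] if f["name"] == name]
        if not matches:
            return False
        node = matches[0]["type"]
    return True


def build_select(schema: Dict[str, Any], table: str = "tab") -> str:
    """Build the select, falling back to a typed NULL when col2.C is absent."""
    if has_nested_field(schema, "col2.C"):
        c_expr = "explode_outer(col2.C)"
    else:
        c_expr = "CAST(NULL AS STRING)"  # assumed fallback type
    return f"SELECT col2.B AS b, {c_expr} AS c FROM {table}"


# Hypothetical schema for a file in which col2 has A and B but no C.
example_schema = {
    "type": "struct",
    "fields": [
        {"name": "Col1", "type": "string", "nullable": True, "metadata": {}},
        {
            "name": "col2",
            "type": {
                "type": "struct",
                "fields": [
                    {"name": "A", "type": "integer", "nullable": True, "metadata": {}},
                    {"name": "B", "type": "integer", "nullable": True, "metadata": {}},
                ],
            },
            "nullable": True,
            "metadata": {},
        },
    ],
}

print(build_select(example_schema))
```

      With a live DataFrame, the resulting string could be passed to
      spark.sql(build_select(df.schema.jsonValue())), assuming the table is
      registered under the name used in the query.
      -->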
      <pubDate>Fri, 06 Jan 2023 22:28:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-we-assigee-default-value-in-select-columns-in-spark-sql-when/m-p/13321#M8022</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2023-01-06T22:28:22Z</dc:date>
    </item>
  </channel>
</rss>

