<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delta Table created on s3 has all null values in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delta-table-created-on-s3-has-all-null-values/m-p/23146#M15946</link>
    <description>&lt;P&gt;@John Constantine,&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Try loading the location as a DataFrame (spark.read.format("delta").load(path)) and check what actually comes back.&lt;/LI&gt;&lt;LI&gt;It may be easier to mount the S3 location as a folder so you can confirm that all the data is there (use dbutils or %fs to check) and that the connection is working correctly.&lt;/LI&gt;&lt;LI&gt;Also try REFRESH [TABLE] table_name.&lt;/LI&gt;&lt;LI&gt;Please share more code; it is not clear what exactly is being loaded. For example, the Delta folder itself should be loaded, not a particular file inside it.&lt;/LI&gt;&lt;LI&gt;The Delta folder holds the table's parts/versions written as Parquet files. You can load them separately to debug whether the data is OK.&lt;/LI&gt;&lt;/UL&gt;</description>
    <pubDate>Mon, 11 Apr 2022 20:05:13 GMT</pubDate>
    <dc:creator>Hubert-Dudek</dc:creator>
    <dc:date>2022-04-11T20:05:13Z</dc:date>
    <item>
      <title>Delta Table created on s3 has all null values</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-created-on-s3-has-all-null-values/m-p/23145#M15945</link>
      <description>&lt;P&gt;I have data in a Spark DataFrame and I write it to an S3 location. It has some complex data types, such as structs. When I create the table on top of the S3 location using&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;CREATE TABLE IF NOT EXISTS table_name
USING DELTA
LOCATION 's3://.../...';&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;the table contains only null values, and I am not sure what is going wrong.&lt;/P&gt;</description>
      <pubDate>Mon, 11 Apr 2022 19:54:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-created-on-s3-has-all-null-values/m-p/23145#M15945</guid>
      <dc:creator>Constantine</dc:creator>
      <dc:date>2022-04-11T19:54:25Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Table created on s3 has all null values</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-created-on-s3-has-all-null-values/m-p/23146#M15946</link>
      <description>&lt;P&gt;@John Constantine,&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Try loading the location as a DataFrame (spark.read.format("delta").load(path)) and check what actually comes back.&lt;/LI&gt;&lt;LI&gt;It may be easier to mount the S3 location as a folder so you can confirm that all the data is there (use dbutils or %fs to check) and that the connection is working correctly.&lt;/LI&gt;&lt;LI&gt;Also try REFRESH [TABLE] table_name.&lt;/LI&gt;&lt;LI&gt;Please share more code; it is not clear what exactly is being loaded. For example, the Delta folder itself should be loaded, not a particular file inside it.&lt;/LI&gt;&lt;LI&gt;The Delta folder holds the table's parts/versions written as Parquet files. You can load them separately to debug whether the data is OK.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Mon, 11 Apr 2022 20:05:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-created-on-s3-has-all-null-values/m-p/23146#M15946</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-04-11T20:05:13Z</dc:date>
    </item>
  </channel>
</rss>

