<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Hive Serde table vs Delta table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33192#M24266</link>
    <description>&lt;P&gt;It refers to where the actual data is stored: in your Databricks account (managed; Databricks handles the data) or in external storage (data lake, S3, etc.), where you define how the data is stored.&lt;/P&gt;</description>
    <pubDate>Wed, 15 Dec 2021 14:08:13 GMT</pubDate>
    <dc:creator>-werners-</dc:creator>
    <dc:date>2021-12-15T14:08:13Z</dc:date>
    <item>
      <title>Hive Serde table vs Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33189#M24263</link>
      <description>&lt;P&gt;This might be a stupid question. Does a Hive Serde table have the same features (e.g. transactions) as a Delta table?&lt;/P&gt;&lt;P&gt;I tried to find this in the Databricks documentation but could not find a clear answer.&lt;/P&gt;&lt;P&gt;I create the Hive Serde table using this SQL statement:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;CREATE EXTERNAL TABLE mydb.mytable (col1 string, col2 boolean)
ROW FORMAT serde 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://path/to/table/';&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;and I create the Delta table with:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;CREATE TABLE mydb.mytable
  (col1 string, col2 boolean)
  USING DELTA
  LOCATION 's3://path/to/table'&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 15 Dec 2021 12:47:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33189#M24263</guid>
      <dc:creator>herry</dc:creator>
      <dc:date>2021-12-15T12:47:59Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Serde table vs Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33190#M24264</link>
      <description>&lt;P&gt;SerDe stands for serializer/deserializer; in this case it just reads and writes the Parquet format. Delta is based on Parquet (a snapshot in Delta is just regular Parquet files) but additionally keeps commits etc. in separate files. So you saved the files as plain Parquet, but by using CREATE TABLE ... USING DELTA you converted the table to an unmanaged Delta table.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Dec 2021 13:01:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33190#M24264</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2021-12-15T13:01:17Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Serde table vs Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33191#M24265</link>
      <description>&lt;P&gt;What does "delta unmanaged table" mean compared to "delta managed table"?&lt;/P&gt;</description>
      <pubDate>Wed, 15 Dec 2021 13:21:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33191#M24265</guid>
      <dc:creator>herry</dc:creator>
      <dc:date>2021-12-15T13:21:01Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Serde table vs Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33192#M24266</link>
      <description>&lt;P&gt;It refers to where the actual data is stored: in your Databricks account (managed; Databricks handles the data) or in external storage (data lake, S3, etc.), where you define how the data is stored.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Dec 2021 14:08:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33192#M24266</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-12-15T14:08:13Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Serde table vs Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33193#M24267</link>
      <description>&lt;P&gt;How about the ACID transactions (commits) and Z-Ordering features? Are they available in the Hive Serde table?&lt;/P&gt;</description>
      <pubDate>Wed, 15 Dec 2021 14:28:49 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33193#M24267</guid>
      <dc:creator>herry</dc:creator>
      <dc:date>2021-12-15T14:28:49Z</dc:date>
    </item>
    <item>
      <title>Re: Hive Serde table vs Delta table</title>
      <link>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33194#M24268</link>
      <description>&lt;P&gt;AFAIK Hive SerDe is just a serializer and deserializer (it writes and reads data to/from storage).&lt;/P&gt;&lt;P&gt;Hive uses a SerDe (and a FileFormat) to read and write table rows. So it is not an actual file format like Parquet, ORC or Delta Lake (which I consider a separate file format even though it is Parquet on steroids).&lt;/P&gt;&lt;P&gt;So the comparison with Delta Lake is kind of awkward.&lt;/P&gt;&lt;P&gt;A better comparison would be Delta Lake vs Iceberg or Hudi.&lt;/P&gt;&lt;P&gt;&lt;A href="https://databricks.com/session_na20/a-thorough-comparison-of-delta-lake-iceberg-and-hudi" alt="https://databricks.com/session_na20/a-thorough-comparison-of-delta-lake-iceberg-and-hudi" target="_blank"&gt;https://databricks.com/session_na20/a-thorough-comparison-of-delta-lake-iceberg-and-hudi&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Dec 2021 09:35:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/hive-serde-table-vs-delta-table/m-p/33194#M24268</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-12-16T09:35:13Z</dc:date>
    </item>
  </channel>
</rss>

