<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Is it a bug in DEEP CLONE? in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/is-it-a-bug-in-deep-clone/m-p/43771#M948</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm trying to modify a Delta table using the following approach:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Shallow clone of the table (source_table)&lt;/LI&gt;&lt;LI&gt;Modification of the clone (clonned_table)&lt;/LI&gt;&lt;LI&gt;Deep clone of the modified table back to the source table&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;The source Delta table has&amp;nbsp;26 752 rows. The current Delta table version is 123.&lt;/P&gt;&lt;P&gt;This is my code:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;CREATE TABLE clonned_table&amp;nbsp;SHALLOW CLONE&amp;nbsp;source_table;&lt;/P&gt;&lt;P&gt;ALTER TABLE clonned_table&amp;nbsp;ALTER COLUMN&amp;nbsp;column_1&amp;nbsp;COMMENT 'a comment';&lt;BR /&gt;ALTER TABLE clonned_table&amp;nbsp;SET TBLPROPERTIES ('schema_version' = '8');&lt;/P&gt;&lt;P&gt;REPLACE TABLE source_table&amp;nbsp;DEEP CLONE clonned_table;&lt;BR /&gt;I ran the last command 13 times.&lt;/P&gt;&lt;P&gt;After that, I checked the number of rows before and after every replace.&lt;/P&gt;&lt;P&gt;Before:&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 123 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;After:&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 124 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 125 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 126 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 127 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 128 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) 
FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 129 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 130 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 131 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 132 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 133 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 134 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 135 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 136 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tested it on Databricks Runtime 12.2 LTS, but ran the last 3 executions on 13.3 LTS.&lt;/P&gt;&lt;P&gt;It looks like&amp;nbsp;DEEP CLONE sometimes doesn't work properly.&lt;/P&gt;&lt;P&gt;Most interestingly, one time DEEP CLONE copied just a few records.&lt;/P&gt;&lt;P&gt;In the Delta log I can see the file attached to each version.&lt;/P&gt;&lt;P&gt;Sometimes it is just an&amp;nbsp;"add" command,&amp;nbsp;but sometimes it is a&amp;nbsp;"remove" and an "add" of the same file, and there is no correlation&amp;nbsp;between these two types of log entry and the result.&lt;/P&gt;</description>
    <pubDate>Wed, 06 Sep 2023 10:14:08 GMT</pubDate>
    <dc:creator>norbitek</dc:creator>
    <dc:date>2023-09-06T10:14:08Z</dc:date>
    <item>
      <title>Is it a bug in DEEP CLONE?</title>
      <link>https://community.databricks.com/t5/get-started-discussions/is-it-a-bug-in-deep-clone/m-p/43771#M948</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm trying to modify a Delta table using the following approach:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Shallow clone of the table (source_table)&lt;/LI&gt;&lt;LI&gt;Modification of the clone (clonned_table)&lt;/LI&gt;&lt;LI&gt;Deep clone of the modified table back to the source table&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;The source Delta table has&amp;nbsp;26 752 rows. The current Delta table version is 123.&lt;/P&gt;&lt;P&gt;This is my code:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;CREATE TABLE clonned_table&amp;nbsp;SHALLOW CLONE&amp;nbsp;source_table;&lt;/P&gt;&lt;P&gt;ALTER TABLE clonned_table&amp;nbsp;ALTER COLUMN&amp;nbsp;column_1&amp;nbsp;COMMENT 'a comment';&lt;BR /&gt;ALTER TABLE clonned_table&amp;nbsp;SET TBLPROPERTIES ('schema_version' = '8');&lt;/P&gt;&lt;P&gt;REPLACE TABLE source_table&amp;nbsp;DEEP CLONE clonned_table;&lt;BR /&gt;I ran the last command 13 times.&lt;/P&gt;&lt;P&gt;After that, I checked the number of rows before and after every replace.&lt;/P&gt;&lt;P&gt;Before:&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 123 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;After:&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 124 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 125 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 126 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 127 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 128 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) 
FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 129 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 130 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 131 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 132 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 133 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 134 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 135 -&amp;gt; &lt;STRONG&gt;result&amp;nbsp;0 rows - !!!&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;SELECT COUNT(*) FROM&amp;nbsp;source_table&amp;nbsp;VERSION AS OF 136 -&amp;gt; result&amp;nbsp;26 752 rows - expected value&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tested it on Databricks Runtime 12.2 LTS, but ran the last 3 executions on 13.3 LTS.&lt;/P&gt;&lt;P&gt;It looks like&amp;nbsp;DEEP CLONE sometimes doesn't work properly.&lt;/P&gt;&lt;P&gt;Most interestingly, one time DEEP CLONE copied just a few records.&lt;/P&gt;&lt;P&gt;In the Delta log I can see the file attached to each version.&lt;/P&gt;&lt;P&gt;Sometimes it is just an&amp;nbsp;"add" command,&amp;nbsp;but sometimes it is a&amp;nbsp;"remove" and an "add" of the same file, and there is no correlation&amp;nbsp;between these two types of log entry and the result.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Sep 2023 10:14:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/is-it-a-bug-in-deep-clone/m-p/43771#M948</guid>
      <dc:creator>norbitek</dc:creator>
      <dc:date>2023-09-06T10:14:08Z</dc:date>
    </item>
    <item>
      <title>Re: Is it a bug in DEEP CLONE?</title>
      <link>https://community.databricks.com/t5/get-started-discussions/is-it-a-bug-in-deep-clone/m-p/43821#M957</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;,&amp;nbsp;thanks for the response.&lt;/P&gt;&lt;P&gt;I do not modify data in the source or cloned table between executions of the deep clone operation.&lt;/P&gt;&lt;P&gt;I modify only metadata before deep cloning.&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;The Parquet file is not deleted from storage at all, and no new files are created either.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;It looks like this is a metadata operation.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Every time, 3 new files are generated&amp;nbsp;in the Delta log: json, crc and checkpoint.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;As I mentioned, I executed the same command each time, but the Delta log differs.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;For example:&lt;/SPAN&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;for version 128 that returns 26 752 rows&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN&gt;"remove":{"path":"part-00000-72d1d631-a961-479a-a7eb-1580e8665d78-c000.snappy.parquet"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"add":{"path":"part-00000-72d1d631-a961-479a-a7eb-1580e8665d78-c000.snappy.parquet"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;and all detailed information (including statistics)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;for version 135 that returns 0 rows&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN&gt;"remove":{"path":"part-00000-72d1d631-a961-479a-a7eb-1580e8665d78-c000.snappy.parquet"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"add":{"path":"part-00000-72d1d631-a961-479a-a7eb-1580e8665d78-c000.snappy.parquet"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;and all detailed information (including statistics)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;for version 136 that returns 26 752 
rows&amp;nbsp;&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN&gt;"add":{"path":"part-00000-72d1d631-a961-479a-a7eb-1580e8665d78-c000.snappy.parquet"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;and all detailed information (including statistics)&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 06 Sep 2023 14:42:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/is-it-a-bug-in-deep-clone/m-p/43821#M957</guid>
      <dc:creator>norbitek</dc:creator>
      <dc:date>2023-09-06T14:42:24Z</dc:date>
    </item>
  </channel>
</rss>

