<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to Update Identity Column for a Databricks Table in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/117025#M45415</link>
    <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;I have a databricks table with the below DDL:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE  TABLE default.Test (
  ID BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
  StopFromDateTime TIMESTAMP,
  StopToDateTime TIMESTAMP,
  User STRING)
USING delta
TBLPROPERTIES (
  'delta.autoOptimize.autoCompact' = 'true',
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.columnMapping.mode' = 'name',
  'delta.minReaderVersion' = '2',
  'delta.minWriterVersion' = '6')&lt;/LI-CODE&gt;&lt;P&gt;I have inserted 3 rows of data into the table and the table with the data looks like below:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ID   StartDateTime   StopDateTime  User
1    2020-01-01      2020-01-02    Josh
2    2021-01-01      2021-01-02    Alex
3    2022-01-01      2022-01-02    Ross&lt;/LI-CODE&gt;&lt;P&gt;When i look at the history of the table it has 4 versions, 1(Create),2,3&amp;amp;4(Insert). I had to delete the records for ID 2&amp;amp;3 so instead of deleting it i did a restore of the table to version 2 which has the first insert so my table now has the data like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ID   StartDateTime   StopDateTime  User
1    2020-01-01      2020-01-02    Josh&lt;/LI-CODE&gt;&lt;P&gt;I have inserted another record into the table now for and after inserting the new record my table looks like this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ID   StartDateTime   StopDateTime  User
1    2020-01-01      2020-01-02    Josh
4    2023-01-01      2023-01-02    Steve&lt;/LI-CODE&gt;&lt;P&gt;I was hoping the new ID would be 2 as i restored it to a version before ID 2&amp;amp;3 are inserted however its getting 4.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can someone provide any inputs on how can i reset the identity on this table?&lt;/P&gt;</description>
    <pubDate>Tue, 29 Apr 2025 18:00:31 GMT</pubDate>
    <dc:creator>LearnDB1234</dc:creator>
    <dc:date>2025-04-29T18:00:31Z</dc:date>
    <item>
      <title>How to Update Identity Column for a Databricks Table</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/117025#M45415</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;I have a databricks table with the below DDL:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;CREATE  TABLE default.Test (
  ID BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
  StopFromDateTime TIMESTAMP,
  StopToDateTime TIMESTAMP,
  User STRING)
USING delta
TBLPROPERTIES (
  'delta.autoOptimize.autoCompact' = 'true',
  'delta.autoOptimize.optimizeWrite' = 'true',
  'delta.columnMapping.mode' = 'name',
  'delta.minReaderVersion' = '2',
  'delta.minWriterVersion' = '6')&lt;/LI-CODE&gt;&lt;P&gt;I have inserted 3 rows of data into the table and the table with the data looks like below:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ID   StartDateTime   StopDateTime  User
1    2020-01-01      2020-01-02    Josh
2    2021-01-01      2021-01-02    Alex
3    2022-01-01      2022-01-02    Ross&lt;/LI-CODE&gt;&lt;P&gt;When i look at the history of the table it has 4 versions, 1(Create),2,3&amp;amp;4(Insert). I had to delete the records for ID 2&amp;amp;3 so instead of deleting it i did a restore of the table to version 2 which has the first insert so my table now has the data like this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ID   StartDateTime   StopDateTime  User
1    2020-01-01      2020-01-02    Josh&lt;/LI-CODE&gt;&lt;P&gt;I have inserted another record into the table now for and after inserting the new record my table looks like this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;ID   StartDateTime   StopDateTime  User
1    2020-01-01      2020-01-02    Josh
4    2023-01-01      2023-01-02    Steve&lt;/LI-CODE&gt;&lt;P&gt;I was hoping the new ID would be 2 as i restored it to a version before ID 2&amp;amp;3 are inserted however its getting 4.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can someone provide any inputs on how can i reset the identity on this table?&lt;/P&gt;</description>
      <pubDate>Tue, 29 Apr 2025 18:00:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/117025#M45415</guid>
      <dc:creator>LearnDB1234</dc:creator>
      <dc:date>2025-04-29T18:00:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to Update Identity Column for a Databricks Table</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/117152#M45438</link>
      <description>&lt;P class="my-0"&gt;When you restored your Delta table to an earlier version (version 2), you observed that the next inserted row received an ID of 4, not 2. This is because Delta Lake’s identity column maintains a&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;EM&gt;high watermark&lt;/EM&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;of the last value used, even across restores. The identity generator is designed to guarantee uniqueness and avoid assigning duplicate values, so it will not reissue previously used identity values-even if the underlying data is reverted to a prior state&lt;SPAN class="whitespace-nowrap"&gt;.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="my-0"&gt;By default,&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;there is no built-in command to reset the identity column’s counter&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;in a Delta table. The identity column will always continue from the highest value it has ever assigned, regardless of table restores, deletes, or truncations&lt;SPAN class="whitespace-nowrap"&gt;.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="my-0"&gt;This behavior is intentional to prevent the risk of duplicate identity values, which could otherwise occur if the counter were allowed to rewind&lt;SPAN class="whitespace-nowrap"&gt;.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="my-0"&gt;The only supported and safe way to reset the identity column is to&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;recreate the table&lt;/STRONG&gt;.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 30 Apr 2025 12:36:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/117152#M45438</guid>
      <dc:creator>Walter_C</dc:creator>
      <dc:date>2025-04-30T12:36:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to Update Identity Column for a Databricks Table</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/117156#M45440</link>
      <description>&lt;P&gt;Hi Walter,&lt;/P&gt;&lt;P&gt;Thanks for the response, so when i drop and re-create the table i will be losing the data correct? So how can i move the data over if i have to drop and re-create? Or if i restore it to the version number which had 3 records and then delete 2&amp;amp;3 it would still start at 4 right? So any suggestions on how to drop and re-create the table and not lose the data?&lt;/P&gt;</description>
      <pubDate>Wed, 30 Apr 2025 13:07:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/117156#M45440</guid>
      <dc:creator>LearnDB1234</dc:creator>
      <dc:date>2025-04-30T13:07:54Z</dc:date>
    </item>
    <item>
      <title>Re: How to Update Identity Column for a Databricks Table</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/118535#M45653</link>
      <description>&lt;P&gt;If you recreate the table using&amp;nbsp;&lt;SPAN&gt;BIGINT&lt;/SPAN&gt; &lt;SPAN&gt;GENERATED&lt;/SPAN&gt; &lt;SPAN&gt;BY&lt;/SPAN&gt; &lt;SPAN&gt;DEFAULT&amp;nbsp; instead of&amp;nbsp;BIGINT GENERATED&amp;nbsp;ALWAYS you can manipulate the column values.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"When using the clause&amp;nbsp;GENERATED BY DEFAULT AS IDENTITY, insert operations can specify values for the identity column."&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;A href="https://docs.databricks.com/aws/en/delta/generated-columns#use-identity-columns-in-delta-lake" target="_blank" rel="noopener"&gt;https://docs.databricks.com/aws/en/delta/generated-columns#use-identity-columns-in-delta-lake&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 08 May 2025 14:53:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-update-identity-column-for-a-databricks-table/m-p/118535#M45653</guid>
      <dc:creator>pdiamond</dc:creator>
      <dc:date>2025-05-08T14:53:29Z</dc:date>
    </item>
  </channel>
</rss>

