<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Moving data to a delta table keeping the old surrogate ids intact in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/moving-data-to-a-delta-table-keeping-the-old-surrogate-ids/m-p/24261#M16861</link>
    <description>&lt;P&gt;Hello experts! We have a table in our current system that we need to move it (one-off) to a delta in Databricks keeping its Ids (surrogate keys) intact. We think to of the following steps: &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1. create a new delta table with a "BIGINT GENERATED BY DEFAULT AS IDENTITY" column for the ID&lt;/P&gt;&lt;P&gt;2. move the current data to the new delta table as "GENERATED BY DEFAULT" allows identity insertion to be overridden&lt;/P&gt;&lt;P&gt;3.  "ALTER COLUMN ID SYNC IDENTITY" so that new rows that will be inserted from now on will take the next automatically assigned identity value which will start from n (maximum Id in the table) +1&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, the problem with this is that, after the initial load, we cannot alter the surrogate key column to "GENERATED ALWAYS" so that we don't allow overwrites.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We would appreciate your feedback. Thank you in advance&lt;/P&gt;</description>
    <pubDate>Wed, 02 Nov 2022 13:13:52 GMT</pubDate>
    <dc:creator>elgeo</dc:creator>
    <dc:date>2022-11-02T13:13:52Z</dc:date>
    <item>
      <title>Moving data to a delta table keeping the old surrogate ids intact</title>
      <link>https://community.databricks.com/t5/data-engineering/moving-data-to-a-delta-table-keeping-the-old-surrogate-ids/m-p/24261#M16861</link>
      <description>&lt;P&gt;Hello experts! We have a table in our current system that we need to move it (one-off) to a delta in Databricks keeping its Ids (surrogate keys) intact. We think to of the following steps: &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1. create a new delta table with a "BIGINT GENERATED BY DEFAULT AS IDENTITY" column for the ID&lt;/P&gt;&lt;P&gt;2. move the current data to the new delta table as "GENERATED BY DEFAULT" allows identity insertion to be overridden&lt;/P&gt;&lt;P&gt;3.  "ALTER COLUMN ID SYNC IDENTITY" so that new rows that will be inserted from now on will take the next automatically assigned identity value which will start from n (maximum Id in the table) +1&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, the problem with this is that, after the initial load, we cannot alter the surrogate key column to "GENERATED ALWAYS" so that we don't allow overwrites.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We would appreciate your feedback. Thank you in advance&lt;/P&gt;</description>
      <pubDate>Wed, 02 Nov 2022 13:13:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/moving-data-to-a-delta-table-keeping-the-old-surrogate-ids/m-p/24261#M16861</guid>
      <dc:creator>elgeo</dc:creator>
      <dc:date>2022-11-02T13:13:52Z</dc:date>
    </item>
    <item>
      <title>Re: Moving data to a delta table keeping the old surrogate ids intact</title>
      <link>https://community.databricks.com/t5/data-engineering/moving-data-to-a-delta-table-keeping-the-old-surrogate-ids/m-p/24262#M16862</link>
      <description>&lt;P&gt;same here, I submitted an idea in the azure databricks portal&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://feedback.azure.com/d365community/idea/d403303c-6761-ed11-a81b-000d3ae5ae95" target="test_blank"&gt;https://feedback.azure.com/d365community/idea/d403303c-6761-ed11-a81b-000d3ae5ae95&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;SET IDENTITY_INSERT ON&lt;/P&gt;&lt;P&gt;when a column is defined as&amp;nbsp;GENERATED ALWAYS, we often need to reload data with exact same key (commonly a key is referenced by other tables)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;similar to SET IDENTITY_INSERT ON in SQL SERVER, it will be very helpful to do the same in delta table.&lt;/P&gt;</description>
      <pubDate>Fri, 11 Nov 2022 02:21:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/moving-data-to-a-delta-table-keeping-the-old-surrogate-ids/m-p/24262#M16862</guid>
      <dc:creator>lizou</dc:creator>
      <dc:date>2022-11-11T02:21:44Z</dc:date>
    </item>
  </channel>
</rss>

