Moving data to a delta table keeping the old surrogate ids intact
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
11-02-2022 06:13 AM
Hello experts! We have a table in our current system that we need to move it (one-off) to a delta in Databricks keeping its Ids (surrogate keys) intact. We think to of the following steps:
1. create a new delta table with a "BIGINT GENERATED BY DEFAULT AS IDENTITY" column for the ID
2. move the current data to the new delta table as "GENERATED BY DEFAULT" allows identity insertion to be overridden
3. "ALTER COLUMN ID SYNC IDENTITY" so that new rows that will be inserted from now on will take the next automatically assigned identity value which will start from n (maximum Id in the table) +1
However, the problem with this is that, after the initial load, we cannot alter the surrogate key column to "GENERATED ALWAYS" so that we don't allow overwrites.
We would appreciate your feedback. Thank you in advance
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
11-10-2022 06:21 PM
same here, I submitted an idea in the azure databricks portal
https://feedback.azure.com/d365community/idea/d403303c-6761-ed11-a81b-000d3ae5ae95
SET IDENTITY_INSERT ON
when a column is defined as GENERATED ALWAYS, we often need to reload data with exact same key (commonly a key is referenced by other tables)
similar to SET IDENTITY_INSERT ON in SQL SERVER, it will be very helpful to do the same in delta table.
![](/skins/images/97567C72181EBE789E1F0FD869E4C89B/responsive_peak/images/icon_anonymous_message.png)
![](/skins/images/97567C72181EBE789E1F0FD869E4C89B/responsive_peak/images/icon_anonymous_message.png)