Here's my use case: I'm migrating from an old DWH into Databricks. When moving dimension tables into Databricks, I'd like the old SKs (surrogate keys) to be preserved, while creating the SK column as an IDENTITY column, so new dimension values get a new SK that is unique across the old SKs coming from the old DWH.
So, if I have a table d_something, with 2 columns (sk, bk) containing one row:
sk = 12, bk = ABC
I'll copy this into a new Databricks Delta table, and when I insert a new row:
INSERT INTO d_something (bk)
VALUES ('DEF')
a new SK should be generated, so:
sk = 12, bk = ABC
sk = 13, bk = DEF
(doesn't have to be sequential, just unique).
By this: https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-alter-table.html
Based on that, I imagine it should be possible to create the table, populate it manually with the old SKs, and then alter the SK column into an IDENTITY column (using SYNC IDENTITY).
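For reference, the statement I'd expect to run after backfilling the old SKs, going by that ALTER TABLE page, is something like the following (table and column names from my example; I haven't confirmed whether this works when the column wasn't created as an identity column in the first place):

ALTER TABLE d_something ALTER COLUMN sk SYNC IDENTITY;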
So far I managed to create a fresh table with IDENTITY column, such as:
CREATE TABLE sk_get_test_1 (
sk BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
bk STRING
)
But manually populating the SK column returns an error saying that IDENTITY columns cannot be manually populated.
Can I create it as a regular column, populate it with the old SKs, and then alter it into an IDENTITY column?
Any other ideas here?
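One direction I'm considering, based on the CREATE TABLE docs (not yet tested end to end): declaring the column as GENERATED BY DEFAULT AS IDENTITY instead of GENERATED ALWAYS. As far as I can tell that variant accepts explicit values, so the old SKs could be loaded directly and the counter re-synced afterwards:

-- Identity column that also accepts explicit values (GENERATED BY DEFAULT)
CREATE TABLE d_something (
  sk BIGINT GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1),
  bk STRING
);

-- Backfill the old surrogate keys explicitly
INSERT INTO d_something (sk, bk) VALUES (12, 'ABC');

-- Move the identity high-water mark past the backfilled values
ALTER TABLE d_something ALTER COLUMN sk SYNC IDENTITY;

-- New rows should then get fresh, unique SKs
INSERT INTO d_something (bk) VALUES ('DEF');

Would this be the recommended pattern, or is there a catch with SYNC IDENTITY here?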
Thanks!!