Identity column definition lost using save as table

lizou · ‎05-08-2022

I found an issue:

For a table with an identity column defined, when a table column is renamed using the method shown below (rewriting the table with overwriteSchema), the identity definition is removed.

That means a table with an identity column needs extra attention: after such a rewrite, check whether the identity column definition is still there and what the current seed value is.

Code example:

spark.read.table(...) \
    .withColumnRenamed("dateOfBirth", "birthDate") \
    .write \
    .format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable(...)

https://docs.databricks.com/delta/delta-batch.html#explicit-schema-update

Example of a table with an identity column:

CREATE TABLE table_with_identity_col (
    RowKey BIGINT NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1) COMMENT 'identity',
    UserName STRING
) USING DELTA;
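
One way to check whether the identity definition survived a rewrite is to look at the table's DDL text. The following is only a rough sketch, not an official API: has_identity_column is a hypothetical helper, and it assumes that SHOW CREATE TABLE prints each column on its own line and includes the GENERATED ... AS IDENTITY clause when the column still has it. It only checks for the presence of the clause, not the current seed value.

```python
import re

# Matches both identity forms: GENERATED ALWAYS / GENERATED BY DEFAULT AS IDENTITY.
IDENTITY_CLAUSE = re.compile(
    r"GENERATED\s+(ALWAYS|BY\s+DEFAULT)\s+AS\s+IDENTITY", re.IGNORECASE
)


def has_identity_column(ddl: str, column: str) -> bool:
    """Return True if `column` is declared as an identity column in `ddl`.

    Hypothetical helper. Assumes one column definition per line, with the
    column name (optionally backtick-quoted) as the first token on the line.
    """
    for line in ddl.splitlines():
        tokens = line.strip().split(None, 1)
        if not tokens:
            continue  # skip blank lines
        name = tokens[0].strip("`")
        if name.lower() == column.lower():
            return bool(IDENTITY_CLAUSE.search(line))
    return False  # column not found in the DDL text


# Possible usage in a notebook after the overwrite (assumption: the DDL is
# the first column of the single row returned by SHOW CREATE TABLE):
#
#   ddl = spark.sql("SHOW CREATE TABLE table_with_identity_col").first()[0]
#   if not has_identity_column(ddl, "RowKey"):
#       raise RuntimeError("identity definition on RowKey was lost")
```

Running a check like this right after the saveAsTable call would surface the problem before new rows are inserted without generated keys.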
