05-08-2022 08:41 AM
I found an issue with this method.
If the table has an identity column defined and a column is renamed using this method, the identity definition is removed.
So a table with an identity column needs extra attention after a rename like this: check that the identity column is still defined, and check its current seed value.
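Code example:
# Renames the column by overwriting the table with a new schema.
# This rewrite is the step after which the identity definition is gone.
spark.read.table(...) \
    .withColumnRenamed("dateOfBirth", "birthDate") \
    .write \
    .format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable(...)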
https://docs.databricks.com/delta/delta-batch.html#explicit-schema-update
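A possible alternative to the overwrite above is a metadata-only rename with ALTER TABLE ... RENAME COLUMN, which needs Delta column mapping enabled on the table. This is only a sketch under assumptions: `my_table` is a placeholder name, `rename_column_statements` is a hypothetical helper, and I have not confirmed that the identity definition survives this path on every runtime, so check the table afterwards. Older runtimes may also require raising `delta.minReaderVersion` and `delta.minWriterVersion` before column mapping can be turned on; those are left out because the right values depend on the table's current protocol.

```python
from typing import List


def rename_column_statements(table: str, old: str, new: str) -> List[str]:
    """Build the SQL for a metadata-only column rename on a Delta table.

    Hypothetical helper: it only builds strings and does not touch Spark.
    """
    return [
        # Column mapping must be on before a Delta column can be renamed in place.
        f"ALTER TABLE {table} SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name')",
        # The rename itself; no data files are rewritten.
        f"ALTER TABLE {table} RENAME COLUMN {old} TO {new}",
    ]


# 'my_table' is a placeholder, not a name from the original post.
statements = rename_column_statements("my_table", "dateOfBirth", "birthDate")
for stmt in statements:
    print(stmt)

# In a notebook with an active SparkSession you would run them like this:
# for stmt in statements:
#     spark.sql(stmt)
```

Enabling column mapping is a one-way protocol change for the table, so try it on a copy first.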
Example of a table with an identity column:
CREATE TABLE table_with_identity_col (
  RowKey BIGINT NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1) COMMENT 'identity',
  UserName STRING
) USING DELTA;
) USING DELTA;
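To check whether the identity column is still there after a schema change, one option is to look at the table's DDL for the identity clause. The sketch below assumes that SHOW CREATE TABLE prints the GENERATED ... AS IDENTITY clause for identity columns; `identity_columns` is a hypothetical helper, and `DDL_AFTER` is made-up text showing what a table that lost its identity definition might look like.

```python
import re
from typing import List

# Matches a column line such as:
#   RowKey bigint not null GENERATED BY DEFAULT AS IDENTITY (...)
_IDENTITY_RE = re.compile(
    r"^[ \t]*`?(\w+)`?[ \t]+\w+.*?\bGENERATED\s+(?:ALWAYS|BY\s+DEFAULT)\s+AS\s+IDENTITY\b",
    re.IGNORECASE | re.MULTILINE,
)


def identity_columns(ddl: str) -> List[str]:
    """Return the names of columns whose DDL line carries an identity clause."""
    return _IDENTITY_RE.findall(ddl)


# DDL from the example table above.
DDL_BEFORE = """CREATE TABLE table_with_identity_col (
  RowKey bigint not null GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1) comment 'identity',
  UserName string
) USING DELTA;"""

# Hypothetical DDL after the identity definition was dropped.
DDL_AFTER = """CREATE TABLE table_with_identity_col (
  RowKey bigint,
  UserName string
) USING DELTA;"""

print(identity_columns(DDL_BEFORE))  # ['RowKey']
print(identity_columns(DDL_AFTER))   # []

# With an active SparkSession the DDL could be fetched like this:
# ddl = spark.sql("SHOW CREATE TABLE table_with_identity_col").first()[0]
# assert identity_columns(ddl), "identity definition is missing"
```

This only tells you whether the definition is present. It does not tell you the current seed value, which still has to be checked separately.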