Data Engineering

Change schema when writing to the Delta format

Biber
New Contributor III

Is it possible to reapply a schema to Delta files? For example, our history has a field stored as a string, but from some point on we need to replace the string with a struct.

In my case, the merge option and the overwriteSchema option don't work.

Hubert-Dudek
Esteemed Contributor III

You can change a column's type or name, or drop a column, by rewriting the table. To do this, use the overwriteSchema option. But please back up your table first.

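# Apply the column change (cast, rename, or drop) to the DataFrame before .write;
# overwriteSchema then lets the overwrite replace the table's stored schema.
spark.read.table("table_name") \
.write \
.format("delta") \
.mode("overwrite") \
.option("overwriteSchema", "true") \
.saveAsTable("table_name")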
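For the string-to-struct case in the question, the rewrite could look roughly like the sketch below. It assumes the string column holds JSON text, which the thread does not confirm, and the helper names, the column name, and the struct definition are all hypothetical. It builds selectExpr() strings that parse the target column with from_json and then overwrites the table with overwriteSchema set.

```python
def struct_select_exprs(columns, target, struct_ddl):
    """Build selectExpr() strings that parse `target` from JSON text into a struct.

    Assumption: `target` contains JSON strings matching `struct_ddl`,
    e.g. "a INT, b STRING". Every other column is passed through unchanged.
    """
    return [
        f"from_json(`{c}`, '{struct_ddl}') AS `{c}`" if c == target else f"`{c}`"
        for c in columns
    ]


def rewrite_string_as_struct(spark, table_name, target, struct_ddl):
    """Rewrite a Delta table in place so that `target` becomes a struct column.

    `spark` is an existing SparkSession. Back up the table before calling this.
    """
    df = spark.read.table(table_name)
    converted = df.selectExpr(*struct_select_exprs(df.columns, target, struct_ddl))
    (
        converted.write.format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")  # allow the column type to change
        .saveAsTable(table_name)
    )
```

A call such as rewrite_string_as_struct(spark, "table_name", "payload", "a INT, b STRING") would rewrite every row, so it is worth trying on a copy of the table first.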

Hi @Mike Biber,

Just a friendly follow-up. Did Hubert's response help you? Let us know if you still need help.

elgeo
Valued Contributor II

Hi @Hubert Dudek. Do you know if this also works for identity columns? Is there another way to do this? The code below returns a ParseException. Thank you.

tt = spark.read.table("table_name") \
.withColumn("ID",col("ID").cast("BIGINT GENERATED ALWAYS")) \
.write \
.format("delta") \
.mode("overwrite") \
.option("overwriteSchema", "true") \
.saveAsTable("table_name")
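The ParseException is expected here: cast() takes a data type name, and "BIGINT GENERATED ALWAYS" is not one. An identity column is declared in the table definition rather than produced by a cast. One possible workaround, sketched below, is to create a new table with the identity column in its DDL and copy the other columns across. The helper name, table names, and column map are hypothetical, the existing ID values are not preserved, and the GENERATED ALWAYS AS IDENTITY syntax should be checked against the Databricks documentation for your runtime.

```python
def identity_table_statements(source, target, columns):
    """Return SQL that copies `source` into `target` with a generated identity ID.

    `columns` maps every non-ID column name to its SQL type (hypothetical input).
    """
    col_defs = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    col_list = ", ".join(columns)
    create = (
        f"CREATE TABLE {target} "
        f"(ID BIGINT GENERATED ALWAYS AS IDENTITY, {col_defs}) USING DELTA"
    )
    # ID is left out of the INSERT so that the table generates the values itself.
    copy = f"INSERT INTO {target} ({col_list}) SELECT {col_list} FROM {source}"
    return [create, copy]
```

Each returned statement could then be passed to spark.sql() in order, for example: for stmt in identity_table_statements("table_name", "table_name_new", {"name": "STRING"}): spark.sql(stmt).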

Biber
New Contributor III

Hi guys!

Definitely, thank you for your support.

Hubert-Dudek
Esteemed Contributor III

Please select my answer as the best one.

Since recent updates, it is also possible to rename columns with ALTER TABLE, but only when column mapping is enabled. See https://docs.databricks.com/delta/delta-column-mapping.html
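As a rough sketch of that approach, the helper below builds the two SQL statements involved: one that enables column mapping on the table and one that renames the column. The helper name is hypothetical, and the protocol versions shown are the ones I recall from the linked page, so please confirm them there before running anything, since changing a table's protocol version affects which readers and writers can use it.

```python
def rename_column_statements(table, old_name, new_name):
    """Return SQL that enables column mapping on `table`, then renames a column."""
    enable_mapping = (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        "'delta.minReaderVersion' = '2', "
        "'delta.minWriterVersion' = '5', "
        "'delta.columnMapping.mode' = 'name')"
    )
    # The rename is a metadata change; it only works once mapping is enabled.
    rename = f"ALTER TABLE {table} RENAME COLUMN {old_name} TO {new_name}"
    return [enable_mapping, rename]
```

The statements can be run one by one with spark.sql(), for example: for stmt in rename_column_statements("table_name", "old_col", "new_col"): spark.sql(stmt).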