Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Change schema when writing to the Delta format

Biber
New Contributor III

Is it possible to reapply a schema to existing Delta files? For example, our history stores a field as a string, but from some point on we need to replace the string with a struct.

In my case, the mergeSchema and overwriteSchema options don't work.

2 ACCEPTED SOLUTIONS


Hubert-Dudek
Esteemed Contributor III

You can change a column’s type or name, or drop a column, by rewriting the table. To do this, use the overwriteSchema option. But please back up your table first.

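# Apply the schema change (for example a cast or a rename) to the DataFrame before .write;
# without one, this rewrites the table with its existing schema.
spark.read.table("table_name") \
    .write \
    .format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("table_name")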
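For the string-to-struct case from the question, the rewrite above could look roughly like the following. This is only a sketch under assumptions: the helper names, the example column `payload`, and the struct field `raw` are hypothetical and not from the thread, and `spark` is assumed to be an existing SparkSession (as in a Databricks notebook). It wraps the old string value in a one-field struct with the Spark SQL `named_struct` function and then overwrites the table together with its schema.

```python
def wrap_in_struct_expr(column: str, field: str) -> str:
    """Build a Spark SQL expression that wraps a column in a one-field struct.

    Both names are illustrative; the result keeps the original column name.
    """
    return f"named_struct('{field}', `{column}`) AS `{column}`"


def rewrite_column_as_struct(spark, table_name: str, column: str, field: str) -> None:
    """Rewrite the whole Delta table so that `column` becomes a struct.

    Assumes `spark` is an active SparkSession and `table_name` is a Delta table.
    Back up the table first: this replaces both the data and the schema.
    """
    df = spark.read.table(table_name)
    # Keep every other column unchanged; replace only the target column.
    exprs = [
        wrap_in_struct_expr(c, field) if c == column else f"`{c}`"
        for c in df.columns
    ]
    (
        df.selectExpr(*exprs)
        .write.format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")
        .saveAsTable(table_name)
    )
```

A call such as `rewrite_column_as_struct(spark, "table_name", "payload", "raw")` would then rewrite all existing rows, so the history and new data share the struct type.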


Hubert-Dudek
Esteemed Contributor III

Please select my answer as the best one.

Since recent updates, it is also possible to rename columns with ALTER TABLE, but only when column mapping is enabled on the table. See https://docs.databricks.com/delta/delta-column-mapping.html
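As a rough illustration of the ALTER route, the sketch below builds the two SQL statements involved: one that enables column mapping through table properties and one that renames the column. The helper names and the example table and column names are hypothetical, `spark` is assumed to be an existing SparkSession, and the reader/writer protocol versions shown are the ones I recall from the linked documentation, so check them against the current docs. Enabling column mapping upgrades the table protocol, so older readers may no longer be able to read the table.

```python
from typing import List


def column_mapping_rename_statements(table_name: str, old_name: str, new_name: str) -> List[str]:
    """Return the SQL statements to enable column mapping and rename a column.

    The property values are assumptions based on the Delta column mapping docs.
    """
    return [
        f"ALTER TABLE {table_name} SET TBLPROPERTIES ("
        "'delta.columnMapping.mode' = 'name', "
        "'delta.minReaderVersion' = '2', "
        "'delta.minWriterVersion' = '5')",
        f"ALTER TABLE {table_name} RENAME COLUMN {old_name} TO {new_name}",
    ]


def rename_column(spark, table_name: str, old_name: str, new_name: str) -> None:
    """Run the statements in order on an active SparkSession (assumed)."""
    for statement in column_mapping_rename_statements(table_name, old_name, new_name):
        spark.sql(statement)
```

Unlike the overwriteSchema approach, this changes only table metadata and does not rewrite the data files.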


5 REPLIES


Hi @Mike Biber,

Just a friendly follow-up. Did Hubert's response help you? Let us know if you still need help.

elgeo
Valued Contributor II

Hi @Hubert Dudek. Do you know if this also works for identity columns? Is there another way to do this? The code below returns a ParseException. Thank you.

tt = spark.read.table("table_name") \
    .withColumn("ID",col("ID").cast("BIGINT GENERATED ALWAYS")) \
    .write \
    .format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("table_name")

Biber
New Contributor III

Hi guys!

Definitely, thank you for your support.

