Databricks Community

hari · 10-14-2022

Until now we didn't need to set partitions on our Delta tables: we had few performance concerns, and Delta Lake's out-of-the-box optimization worked well for us. We now need to set a specific partition column on some tables so that concurrent Delta merges can run against separate partitions.

We are using unmanaged tables, with the data stored in S3.

What is the best way to add/update partition columns on an existing delta table?

I tried `ALTER TABLE log ADD PARTITION(date = DATE'2021-09-10');`, but it didn't work. In any case, it would only cover a single value of `date`, not all of them.

I also tried rewriting the table and setting the partition column with:

(
    df.write.format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .partitionBy("<Col Name>")
    .saveAsTable("<Table Name>")
)

But I don't see the partition column when I check the table with `DESCRIBE TABLE`, so I'm not sure this is the proper way to approach it.
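For reference, here is a minimal sketch of the rewrite followed by a check of the result. It assumes an active `SparkSession` with Delta Lake available, and relies on `DESCRIBE DETAIL`, which returns a `partitionColumns` field for Delta tables. The helper names `partition_columns` and `repartition_table` are made up for illustration, and I have not confirmed how this behaves for an unmanaged table on S3.

```python
def partition_columns(spark, table_name):
    """Return the partition columns that Delta reports for `table_name`."""
    # DESCRIBE DETAIL yields a single row; `partitionColumns` is an array of names.
    detail = spark.sql(f"DESCRIBE DETAIL {table_name}").collect()[0]
    return list(detail["partitionColumns"])


def repartition_table(spark, table_name, partition_col):
    """Rewrite an existing Delta table in place, partitioned by `partition_col`.

    Hypothetical helper: it mirrors the overwrite attempt above and then
    reports the partition columns Delta records after the write.
    """
    df = spark.read.table(table_name)
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("overwriteSchema", "true")  # allow the table layout to change
        .partitionBy(partition_col)
        .saveAsTable(table_name)
    )
    return partition_columns(spark, table_name)
```

With the table and column from the `ALTER TABLE` attempt, the call would look like `repartition_table(spark, "log", "date")`. If the rewrite took effect, the returned list should contain `date`.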

Another option is to recreate the tables, since I see that partition columns can be set when creating a table, but I'd rather not do this except as a last resort.
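For completeness, the recreate route could be sketched as a `CREATE TABLE ... AS SELECT` with a `PARTITIONED BY` clause. The helper `recreate_partitioned_sql`, the new table name `log_partitioned`, and the S3 path below are all placeholders for illustration, not our real names. The sketch only builds the statement, which would then be passed to `spark.sql(...)`.

```python
def recreate_partitioned_sql(source_table, new_table, partition_col, location):
    """Build a CTAS statement that copies `source_table` into a new,
    partitioned, unmanaged Delta table stored at `location`."""
    return (
        f"CREATE TABLE {new_table} "
        f"USING DELTA "
        f"PARTITIONED BY ({partition_col}) "
        f"LOCATION '{location}' "
        f"AS SELECT * FROM {source_table}"
    )


# Placeholder names; substitute the real table, column, and bucket path.
statement = recreate_partitioned_sql(
    source_table="log",
    new_table="log_partitioned",
    partition_col="date",
    location="s3://example-bucket/tables/log_partitioned",
)
print(statement)
```

This leaves the original table untouched until the copy is verified, at the cost of storing the data twice while both tables exist.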