cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

TABLE_OR_VIEW_ALREADY_EXISTS when using saveAsTable with append mode

Mikkel
New Contributor III

 

I ran into an unexpected error while writing a DataFrame to a managed table

 
df.write \
  .mode("append") \
  .option("overwriteSchema", "true") \
  .option("mergeSchema", "true") \
  .saveAsTable(bsr_sink_table_fqn)

Instead of appending, Spark attempted to create the table again and failed with.



```[TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `customer__castan__production`.`insights`.`used_bsr_insights`
because it already exists.
Choose a different name, drop the existing object, add the IF NOT EXISTS clause, ...
SQLSTATE: 42P07```

The table was actually created but empty. 

Why this is confusing:

  • The .mode("append") call should normally append to the table if it exists, not trigger a CREATE TABLE.

  • The presence of .option("overwriteSchema", "true") and .option("mergeSchema", "true") may be causing Databricks to attempt a DDL operation on the existing table, which conflicts with append mode.

    Questions for the community:

    1. Is this expected behavior when combining append with overwriteSchema/mergeSchema?

    2. Could concurrent writers to the same target table trigger this error even in append mode?

    3. Is there a recommended way to safely append with schema evolution without hitting this TABLE_OR_VIEW_ALREADY_EXISTS error?

       

2 ACCEPTED SOLUTIONS

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

Hi @Mikkel ,

I guess this is expected behaviour. Check following excerpt from Jacek Laskowski's Delta Lake internals book.
It seems that setting overwriteSchema to true will wipe out the old schema and let you create a completely new table. So it doesn't make sense to use this option together with append - it should be used with overwrite mode.

szymon_dybczak_0-1757415544626.png

And yes, if multiple concurrent writers will try to write to the same target table then I guess you will encounter this error because each of them will basically try to recreate the table.

View solution in original post

WiliamRosa
New Contributor III

Hi @Mikkel,

I was testing this myself before posting and ran into exactly what @szymon_dybczak  described. overwriteSchema basically tries to recreate the table even when you’re using append. If all you need is schema evolution, stick with mergeSchema — it only supports additive changes like adding new columns. For anything else (renames, type changes, etc.), you’ll need to run an ALTER TABLE first and then append.
Tks for sharing @szymon_dybczak 

Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa

View solution in original post

3 REPLIES 3

szymon_dybczak
Esteemed Contributor III

Hi @Mikkel ,

I guess this is expected behaviour. Check following excerpt from Jacek Laskowski's Delta Lake internals book.
It seems that setting overwriteSchema to true will wipe out the old schema and let you create a completely new table. So it doesn't make sense to use this option together with append - it should be used with overwrite mode.

szymon_dybczak_0-1757415544626.png

And yes, if multiple concurrent writers will try to write to the same target table then I guess you will encounter this error because each of them will basically try to recreate the table.

WiliamRosa
New Contributor III

Hi @Mikkel,

I was testing this myself before posting and ran into exactly what @szymon_dybczak  described. overwriteSchema basically tries to recreate the table even when you’re using append. If all you need is schema evolution, stick with mergeSchema — it only supports additive changes like adding new columns. For anything else (renames, type changes, etc.), you’ll need to run an ALTER TABLE first and then append.
Tks for sharing @szymon_dybczak 

Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa

Mikkel
New Contributor III

 

Thanks @szymon_dybczak and @WiliamRosa   for the reply. I will probably keep mergeSchema then.  If the options are incompatible or might cause issues I wish there was a warning of sorts. 🙂 

 

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now