01-08-2022 09:31 PM
Hi,
I'd like to know whether multiple executors can append to the same Hive table using saveAsTable or insertInto in Spark SQL. Will that cause any data corruption? What configuration do I need to enable concurrent writes to the same Hive table?
What about the same question for Delta Lake?
01-10-2022 01:21 AM
A Hive table will not handle this well, as the underlying data is in Parquet format, which is not ACID compliant.
Delta Lake, however, is:
https://docs.delta.io/0.5.0/concurrency-control.html
As you can see there, concurrent inserts do not conflict with each other.
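The reason concurrent inserts don't conflict can be sketched in plain Python. This is a simplified model of Delta-style optimistic concurrency control, not Delta Lake's actual implementation (which records commits as JSON files in `_delta_log/`); the class and method names here are illustrative only. A blind append only adds new data files, so the commit just needs to be atomic, and every concurrent append succeeds at its own table version:

```python
import threading

# Simplified model of Delta-style optimistic concurrency control.
# Real Delta Lake records each commit as a JSON file in _delta_log/;
# here a plain list stands in for the transaction log.
class DeltaLogModel:
    def __init__(self):
        self._log = []                  # ordered list of committed actions
        self._lock = threading.Lock()   # stands in for the atomic commit step

    def commit_append(self, files):
        # Blind appends only add new files, so they never conflict
        # with other appends; the commit itself just has to be atomic.
        with self._lock:
            version = len(self._log)
            self._log.append(("add", files))
            return version

log = DeltaLogModel()
results = []

def writer(name):
    v = log.commit_append([f"{name}-part-0.parquet"])
    results.append((name, v))

threads = [threading.Thread(target=writer, args=(f"job{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All four appends committed at distinct versions; none were rejected.
print(sorted(v for _, v in results))  # -> [0, 1, 2, 3]
```

Operations that rewrite existing files (updates, deletes, compaction) would additionally have to re-check the log for conflicting commits before committing, which is where the conflict table in the linked docs comes in.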
01-13-2022 05:00 PM
Hi,
Thanks for your answer.
I found that Delta Lake on S3 carries the following warning in the AWS documentation:
"Warning
Concurrent writes to the same Delta table from multiple Spark drivers can lead to data loss."
For a single driver with multiple executors, will concurrent writes to the same table be an issue as well?
01-14-2022 12:03 AM
No, because that is how Spark works.
The driver decides which worker writes what, and it always knows what is going on.
That is also why multiple drivers (that is, multiple separate Spark programs) can conflict: the drivers do not know what the others are doing.
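A rough model of why a single driver is safe, in plain Python: the driver assigns each task its own output file, so within one job no two writers ever touch the same path. The part-file naming mimics Spark's convention, but this is an illustration of the coordination idea, not Spark's actual commit protocol:

```python
import os
import tempfile
import threading

# One "driver" (this script) assigns each "executor task" (a thread)
# a distinct output file, so concurrent writers never collide.
out_dir = tempfile.mkdtemp()

def task(task_id, rows):
    # Each task writes only the file the driver assigned to it,
    # named after Spark's part-file convention.
    path = os.path.join(out_dir, f"part-{task_id:05d}.txt")
    with open(path, "w") as f:
        f.writelines(f"{r}\n" for r in rows)

threads = [
    threading.Thread(target=task, args=(i, range(i * 10, i * 10 + 10)))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Four distinct part files, no overlap -> no corruption within one job.
print(sorted(os.listdir(out_dir)))
```

Two independent drivers have no such coordinator: each would pick file names and commit independently, which is exactly the multi-driver-on-S3 situation the AWS warning is about.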