Hi @mehalrathod
This sort of performance regression in Databricks (especially for overwrite) is usually caused by one or more of the following:
Common Causes of Overwrite Slowness
1. Delta Table History or File Explosion
- If the target table is a Delta table, check if the number of files/versions has grown significantly.
- Over time, Delta tables accumulate many small files, especially if OPTIMIZE and VACUUM haven't been run regularly.
- An overwrite may trigger file listing, conflict resolution, or transaction log processing across all of that accumulated state.
Check:
DESCRIBE HISTORY your_table_name;
Check the number of versions, and look for a high file count with:
len(spark.read.format("delta").load("/mnt/path/to/table").inputFiles())
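If the table is registered by name, DESCRIBE DETAIL surfaces the same information without scanning the data; a minimal sketch, assuming the table is called your_table_name:
# Sketch: inspect file count and total size from the Delta metadata (table name is a placeholder)
spark.sql("DESCRIBE DETAIL your_table_name").select("numFiles", "sizeInBytes").show()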
2. Partition Overwrite Behavior Change
- If you are overwriting a partitioned Delta table and the overwrite mode changed from `dynamic` to `static`, every partition gets rewritten, not just the affected ones.
Confirm mode:
spark.conf.get("spark.sql.sources.partitionOverwriteMode")
It should return 'dynamic' for best performance.
Set it explicitly:
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
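On newer runtimes, Delta also accepts this as a per-write option, which avoids changing the session config; a minimal sketch, assuming df is the DataFrame being written and your_table_name is the partitioned target:
# Sketch: request dynamic partition overwrite for this write only (names are placeholders)
df.write.format("delta").mode("overwrite").option("partitionOverwriteMode", "dynamic").saveAsTable("your_table_name")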
3. Compaction or OPTIMIZE Running Concurrently
- Check if any OPTIMIZE or ZORDER operations are running on the table in parallel (scheduled or manual).
- Under Delta's optimistic concurrency, these operations can conflict with your write and make the overwrite crawl or fail; the table history (sketched below) shows whether any ran recently.
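A quick way to spot recent maintenance commits in the table history; a minimal sketch, assuming the table is called your_table_name:
# Sketch: list recent OPTIMIZE / VACUUM commits from the Delta history (table name is a placeholder)
from pyspark.sql import functions as F
history = spark.sql("DESCRIBE HISTORY your_table_name")
history.filter(F.col("operation").isin("OPTIMIZE", "VACUUM START", "VACUUM END")).select("timestamp", "operation", "operationParameters").show(truncate=False)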
4. Concurrency or Lock Contention
Delta Lake uses optimistic concurrency control: if another process is continuously modifying the table, your overwrite may be stuck retrying or waiting for conflicts to resolve. Look in the driver logs for conflict errors such as:
ConcurrentAppendException
ConcurrentTransactionException
Also check the _delta_log folder size and metadata load time.
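A rough way to gauge how heavy the transaction log has become; a minimal sketch, assuming the table lives at the /mnt path used above:
# Sketch: count JSON commit files sitting in _delta_log (path is a placeholder)
log_entries = dbutils.fs.ls("/mnt/path/to/table/_delta_log")
json_commits = [f for f in log_entries if f.name.endswith(".json")]
print(f"{len(json_commits)} JSON commits in _delta_log")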
5. Table Metadata or Schema Drift
Large schema evolution, column reordering, or misalignment between the DataFrame schema and the table schema can cause Spark to do heavy metadata planning and validation, which adds time.
Check whether the DataFrame schema has changed subtly (e.g., column types, order, nullability); a quick comparison is sketched below.
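A minimal sketch of that comparison, assuming df is the DataFrame being written and your_table_name is the target:
# Sketch: flag fields that differ between the incoming DataFrame and the target table (names are placeholders)
incoming = df.schema
target = spark.table("your_table_name").schema
if incoming != target:
    print("Schema mismatch between DataFrame and target table:")
    for field in incoming.fields:
        if field not in target.fields:
            print(f"  differs from or missing in target: {field}")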
Quick Diagnostic Tips
1. Check the Spark UI and Logs: Look at the Stage Details and Job DAG during the overwrite. Often the issue is in a shuffle or a file-level metadata operation.
2. Reproduce the Write on a Smaller Subset:
- Try .limit(10000) and overwrite to the same table; does it still take long?
3. Try Write to a Temp Table:
- Write the same data to a new path/table; is performance OK there? (a combined sketch of tips 2 and 3 follows this list)
- If yes, the issue is with the target Delta table, not the data or compute.
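A minimal sketch combining tips 2 and 3, assuming df is the DataFrame being written; the scratch path is a placeholder:
# Sketch: overwrite a small sample to a throwaway Delta path and time it
import time
start = time.time()
df.limit(10000).write.format("delta").mode("overwrite").save("/tmp/overwrite_diagnostic")
print(f"Sample overwrite took {time.time() - start:.1f}s")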
Workarounds & Fixes:
- Run VACUUM and OPTIMIZE on the table periodically (see the sketch after this list).
- Repartition the DF before write to avoid file explosion:
df.repartition(200).write.mode("overwrite").format("delta").saveAsTable("...")
- Try replacing `.saveAsTable()` with direct path write if using external tables.
- Upgrade the Runtime if you're on an older DBR version; file I/O and Delta writers are optimized in newer runtimes.
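A minimal maintenance sketch for the first item, assuming the table is called your_table_name (adjust the VACUUM retention to your recovery needs):
# Sketch: periodic compaction and cleanup, run from a notebook or scheduled job (table name is a placeholder)
spark.sql("OPTIMIZE your_table_name")   # compact small files
spark.sql("VACUUM your_table_name")     # remove unreferenced files past the default 7-day retention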
LR