Performance Issues with Writing Large DataFrames to Managed Tables in Databricks (3.5B+ Rows)

kanikvijay9
Contributor

Hi Community,

I'm working on a large-scale data processing job in Databricks and facing performance and stability issues during the write operations. Here's a detailed breakdown of my use case and environment:

Use Case Overview:

Primary Data Frames:

  1. First Data Frame: Created by reading from 4 external tables (each ~7B rows, 7–15 columns). Final DF size: 3.5B+ rows, mostly strings, nulls, and Booleans.
  2. Second Data Frame: Created from a managed table in Databricks catalog. Size: 3.8B+ rows, 10–15 columns.
  3. Processing Steps:

    • Writing both primary Data Frames to temporary managed tables.
    • Performing multi-key joins using Spark SQL.
    • Final join with another external table (~5B rows, 297.8GiB across 3444 files).
    • Final output is written to a managed Delta table using Liquid Clustering.

Cluster Configuration:

  • Databricks Runtime: 16.4 LTS (includes Apache Spark 3.5.2, Scala 2.12)
  • Driver: Standard_F32s_v4 (160 GB RAM, 32 Cores)
  • Worker: Standard_E32d_v4 (160 GB RAM, 32 Cores)
  • Min Workers: 2 | Max Workers: 18 | Current Workers: 6
  • Unity Catalog Enabled

Spark Configurations:

  • Modified Successfully:
kanikvijay9_0-1755015948307.png

 

  •  Attempted but Failed (Memory Error):
kanikvijay9_1-1755015978065.png

 

  • Error Message:
    • com.databricks.backend.cluster.IllegalSparkContainerMemoryException: INVALID_PARAMETER_VALUE: Specified heap memory (102400 MB) and off heap memory (92475 MB) is above the maximum executor memory (123300 MB) allowed for node type Standard_E20d_v4.

 

Write Logic:

kanikvijay9_0-1755016092143.png

 

What I’ve Tried:

  1. Caching/Persisting: Attempted to cache/persist intermediate Data Frames, but due to the massive size, it was not feasible and led to memory pressure.
  2. Partitioning During Write: Used partition By on multiple columns to optimize write performance, but still facing executor failures and long write times.

Issues Faced:

  1. Executor Failures: While writing the first primary Data Frame to temporary managed tables, 60%+ executors fail.
  2. Write Time: Writing intermediate Data Frames (each taking 8–10 hours) is extremely slow and unstable.
  3. Final Join: Joining with the 5B-row external table is compounding the performance issues.

Looking for Help On:

  • Best practices for writing large Data Frames efficiently.
  • Optimizing memory and executor configurations for large-scale joins and writes.
  • Suggestions for improving write performance with Delta + Liquid Clustering.
  • Any known limitations or tuning tips for Databricks Runtime 10.4 LTS with Spark 3.2.2.

Any insights, suggestions, or shared experiences would be greatly appreciated!

Thanks in advance 
Kanik Vijay