
Merge Delta tables with more than 200 million rows

Mohammad_Younus
New Contributor

Hi everyone,

I'm trying to merge two Delta tables, each containing more than 200 million rows. The tables are properly optimized, but the job takes a very long time to execute and records huge memory spills (1 TB-3 TB), and it is still running. I'm working with 5 executor nodes on the Standard_DS5_V2 configuration. Can someone help me optimize this?

[Attached screenshot: Mohammad_Younus_0-1698373999153.png]

Any help would be greatly appreciated.


Kaniz
Community Manager

Hi @Mohammad_Younus,

When dealing with Delta tables of over 200 million rows each, optimizing the merge operation becomes crucial to limit memory spills and reduce execution time.

Here are some effective strategies to enhance the efficiency of your merge operation; an illustrative code sketch for each of points 1-5 follows the list:

  1. Utilize the MERGE Operation: Delta Lake's MERGE is purpose-built for upserting large numbers of rows and is generally more efficient than manually joining the tables and rewriting the target.

  2. Shuffle Optimization: Fine-tune shuffle settings to minimize memory usage during the merge operation. Adjust parameters like spark.sql.shuffle.partitions and spark.sql.autoBroadcastJoinThreshold for optimal memory utilization.

  3. Table Partitioning: Consider partitioning your Delta tables by a low-cardinality column (for example, a date) that appears in the merge predicate. Partition pruning then limits how much data must be read and shuffled during the merge. Avoid partitioning by high-cardinality columns, which creates many small partitions and files.

  4. Z-Ordering: Implement Z-ordering on the merge key (typically a high-cardinality column). Z-ordering co-locates rows with similar key values in the same files, so Delta's data skipping can avoid reading files that cannot match, reducing the data loaded into memory for the join.

  5. Batch Size Reduction: If memory-intensive joins are causing issues, reduce the batch size for the merging operation. This breaks down the merge into smaller, more manageable batches, lowering the memory footprint and preventing memory spills.

  6. Cluster Configuration: If spills persist after the above, consider scaling up the cluster. Standard_DS5_v2 workers have 56 GB of RAM each; with 1-3 TB of recorded spill, adding workers or moving to memory-optimized instance types can substantially reduce disk spill.
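
For point 1, here is a minimal sketch using the Python Delta Lake API, assuming a Databricks notebook where `spark` is predefined. The paths and the `id` merge key are placeholders for illustration:

```python
from delta.tables import DeltaTable

# Hypothetical paths and merge key; replace with your own.
target = DeltaTable.forPath(spark, "/mnt/delta/target")
updates = spark.read.format("delta").load("/mnt/delta/updates")

(target.alias("t")
    .merge(updates.alias("s"), "t.id = s.id")  # join on the merge key
    .whenMatchedUpdateAll()                    # update rows that already exist
    .whenNotMatchedInsertAll()                 # insert rows that are new
    .execute())
```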
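
For point 2, shuffle settings can be tuned per job. The values below are starting points to experiment with, not universal recommendations:

```python
# More shuffle partitions means smaller partitions per task, so less spill per task.
spark.conf.set("spark.sql.shuffle.partitions", "1200")

# Neither side of a 200M-row merge fits in a broadcast; disabling auto-broadcast
# avoids accidental memory pressure from a mis-estimated plan.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "-1")

# Adaptive query execution can coalesce or split shuffle partitions at runtime.
spark.conf.set("spark.sql.adaptive.enabled", "true")
```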
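
For point 3, a sketch assuming the data has a low-cardinality `event_date` column that can also be added to the merge predicate so untouched partitions are pruned:

```python
# One-time rewrite of the target table, partitioned by a hypothetical event_date column.
(spark.read.format("delta").load("/mnt/delta/target")
    .write.format("delta")
    .partitionBy("event_date")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .save("/mnt/delta/target_partitioned"))

# Including the partition column in the merge condition enables partition pruning.
condition = "t.id = s.id AND t.event_date = s.event_date"
```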
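
For point 4, Z-ordering is applied with the OPTIMIZE command, here via Spark SQL on the hypothetical `id` merge key:

```python
# Cluster the target's data files by the merge key so data skipping can
# prune files during the merge's join phase.
spark.sql("OPTIMIZE delta.`/mnt/delta/target` ZORDER BY (id)")
```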
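
For point 5, one simple batching scheme is to split the source by the partition column and merge one slice at a time; the `event_date` column and paths are again illustrative:

```python
from delta.tables import DeltaTable

target = DeltaTable.forPath(spark, "/mnt/delta/target")
source = spark.read.format("delta").load("/mnt/delta/updates")

# Merge one event_date slice at a time to cap the working set of each merge.
dates = [r["event_date"] for r in source.select("event_date").distinct().collect()]
for d in dates:
    batch = source.where(source.event_date == d)
    (target.alias("t")
        .merge(batch.alias("s"), f"t.id = s.id AND t.event_date = '{d}'")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())
```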

By applying these optimization techniques, you can execute merge operations on large Delta tables with minimal memory spills and faster completion times. These strategies are essential for managing substantial datasets effectively.
