Certifications
Join dynamic discussions on Databricks certifications within the Community. Exchange insights, tips, and experiences to help prepare for certification exams and validate your expertise in data engineering, analytics, and machine learning.

temporary tables or dataframes

Phani1
Valued Contributor II

Hi Team,

We have to generate over 70 intermediate tables. Should we use temporary tables or dataframes, or should we create delta tables and truncate and reload? Having too many temporary tables could lead to memory problems. In this situation, what is the most effective approach when one intermediate table relies on another?

Regards,

Janga

2 REPLIES

Walter_C
Databricks Employee

Using temporary tables or dataframes can be a good approach when the data is only needed for the duration of a single session. However, as you mentioned, having too many temporary tables could lead to memory problems.

On the other hand, Delta tables could be a better option when you need to persist the data across multiple sessions or jobs. Delta tables also provide ACID transactions, scalable metadata handling, and unified streaming and batch data processing. However, creating Delta tables, then truncating and reloading them, could be more time-consuming and resource-intensive.
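Whichever storage option is chosen, the build order matters when one intermediate table relies on another. Below is a minimal sketch of one way to derive that order with Python's standard-library `graphlib` (Python 3.9+). The table names and the `dependencies` map are hypothetical, made up for illustration. The loop only prints each name where the actual build step (registering a temporary view or reloading a Delta table) would go.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each table maps to the set of tables it reads from.
dependencies = {
    "stg_orders": set(),
    "stg_customers": set(),
    "int_orders_enriched": {"stg_orders", "stg_customers"},
    "int_daily_revenue": {"int_orders_enriched"},
}


def build_order(deps):
    """Return table names ordered so that every table comes after its inputs."""
    # TopologicalSorter treats each value as the set of predecessors of its key
    # and raises graphlib.CycleError if the tables depend on each other in a loop.
    return list(TopologicalSorter(deps).static_order())


order = build_order(dependencies)
for name in order:
    # Placeholder for the real build step of this table.
    print(name)
```

Keeping the dependencies in one map like this means the same ordering can be reused whether a given step ends up as a temporary view or as a persisted Delta table.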

In terms of memory management, Databricks' Spark deployment has a specific memory layout with distinct memory zones for storage, execution, and user heap. Spark attempts to dynamically grow and shrink these regions based on usage and certain limits. For large-memory instances, Databricks enables off-heap memory and sets the size of the off-heap zone to 75% of the usable container memory for the instance, leaving the remaining 25% for heap memory.
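As a worked example of the 75% / 25% split described above, the short sketch below does the arithmetic for a single instance. The 64 GB figure is an assumed value for illustration only, not a real instance specification, and the function name is hypothetical.

```python
def split_container_memory(usable_gb, off_heap_fraction=0.75):
    """Split usable container memory into (off_heap_gb, heap_gb)."""
    off_heap_gb = usable_gb * off_heap_fraction
    heap_gb = usable_gb - off_heap_gb
    return off_heap_gb, heap_gb


# Assumed: 64 GB of usable container memory (illustrative figure only).
off_heap_gb, heap_gb = split_container_memory(64.0)
print(f"off-heap: {off_heap_gb} GB, heap: {heap_gb} GB")
```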

jose_gonzalez
Databricks Employee

Hi @Phani1, were you able to review @Walter_C's response? Do you still need help, or can you mark it as the accepted solution?
