Data Engineering

I don't have Upsert/Merge use cases. Should I use Delta or can I use Parquet?

User16869510359
Esteemed Contributor
 
2 REPLIES

Ryan_Chynoweth
Honored Contributor III

I would recommend using Delta. Delta stores data as Parquet files, so you still get a lot of the benefits of Parquet with Delta. Even though you don't need to merge data, I would assume you will still want to take advantage of Delta's update/delete functionality. Plus, Delta offers better optimization techniques to ensure that your data can be queried efficiently (file pruning, Z-ordering, etc.).
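For anyone who wants to try the Z-ordering mentioned above, here is a minimal sketch. It assumes an existing SparkSession with Delta Lake enabled (on Databricks that is the built-in `spark`); the helper names, the path, and the column names are hypothetical and only there for illustration.

```python
def optimize_zorder_sql(table_path, zorder_columns):
    """Build an OPTIMIZE ... ZORDER BY statement for a path-based Delta table."""
    if not zorder_columns:
        raise ValueError("need at least one column to Z-order by")
    columns = ", ".join(zorder_columns)
    return f"OPTIMIZE delta.`{table_path}` ZORDER BY ({columns})"


def optimize_table(spark, table_path, zorder_columns):
    """Run the statement; `spark` is assumed to be a Delta-enabled SparkSession."""
    return spark.sql(optimize_zorder_sql(table_path, zorder_columns))
```

Calling something like `optimize_table(spark, "/mnt/data/events", ["event_date", "user_id"])` would compact the table's files and co-locate rows by those columns, which is what lets the engine skip files when queries filter on them.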

The conversion between Delta and Parquet is easy, so you can always test out both and see which you prefer.

Anand_Ladda
Honored Contributor II

Delta has significant value beyond the DML/ACID capabilities. The data organization strategies that @Ryan Chynoweth mentions also offer an advantage even for read-only use cases that query and join the data. Delta also supports in-place conversion from Parquet. See this for details: https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-convert-to-delta.html
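As a rough illustration of the in-place conversion, the sketch below builds the `CONVERT TO DELTA` statement described at the link above and runs it through Spark. It assumes a Delta-enabled SparkSession; the helper names, the path, and the partition column are hypothetical. The partition schema is only needed when the Parquet directory is partitioned.

```python
def convert_to_delta_sql(parquet_path, partition_schema=None):
    """Build a CONVERT TO DELTA statement for an existing Parquet directory."""
    statement = f"CONVERT TO DELTA parquet.`{parquet_path}`"
    if partition_schema:
        # e.g. "event_date DATE" for a directory partitioned by event_date
        statement += f" PARTITIONED BY ({partition_schema})"
    return statement


def convert_in_place(spark, parquet_path, partition_schema=None):
    """Run the conversion; `spark` is assumed to be a Delta-enabled SparkSession."""
    return spark.sql(convert_to_delta_sql(parquet_path, partition_schema))
```

For example, `convert_in_place(spark, "/mnt/data/events", "event_date DATE")` would add a Delta transaction log next to the existing Parquet files rather than rewriting the data.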
