Delta vs. Parquet
06-14-2021 01:21 PM
I'm curious about the benefits of using the Delta file format vs. Parquet. Is there any downside to using Delta?
Labels: Delta
06-17-2021 04:17 PM
Not really. You get upsides like ACID transactions, time travel, and upserts/merges/deletes. There is some cost to that: Delta stores the data as Parquet files plus a transaction log, so it tends to write and manage many smaller Parquet files and has to replay the log to reconstruct the current or a past state of the data. Running VACUUM on the data set periodically takes time too. So you may incur a little runtime overhead for these reasons; then again, Delta offers features like Z-ordering and data skipping with Spark that can make reads faster than with plain Parquet.
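To make the "many Parquet files plus re-reading" point more concrete, here is a rough toy model in plain Python. It is not the Delta Lake API: the `Commit` class, the `live_files` and `vacuum` helpers, and the file names are all made up for illustration. It only sketches the general idea that each commit records which Parquet files were added or removed, and that a snapshot is rebuilt by replaying those commits.

```python
# Toy model of a Delta-style transaction log (hypothetical names, not the real Delta Lake API).
# Each commit lists the Parquet files it adds to or removes from the table.
from dataclasses import dataclass, field


@dataclass
class Commit:
    add: set = field(default_factory=set)
    remove: set = field(default_factory=set)


def live_files(log, version=None):
    """Replay commits 0..version and return the files that make up that snapshot."""
    if version is None:
        version = len(log) - 1
    files = set()
    for commit in log[: version + 1]:
        files -= commit.remove
        files |= commit.add
    return files


def vacuum(log, files_on_disk):
    """Keep only the files the latest snapshot still references.

    Simplification: a real VACUUM also honors a retention period.
    """
    return files_on_disk & live_files(log)


log = [
    Commit(add={"part-0.parquet", "part-1.parquet"}),          # v0: initial write
    Commit(add={"part-2.parquet"}),                             # v1: append
    Commit(add={"part-3.parquet"}, remove={"part-1.parquet"}),  # v2: update rewrites part-1
]
on_disk = {"part-0.parquet", "part-1.parquet", "part-2.parquet", "part-3.parquet"}

current = live_files(log)              # snapshot at the latest version
as_of_v1 = live_files(log, version=1)  # "time travel" to version 1
after_vacuum = vacuum(log, on_disk)    # part-1.parquet is dropped
```

In this sketch the update in v2 does not modify `part-1.parquet` in place; it writes a new file and marks the old one as removed. That is why old versions stay readable until the stale files are vacuumed, and also why files accumulate and need periodic cleanup.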

