What config do we use to set row groups for Delta tables on Databricks?
04-09-2024 11:49 PM
I have tried multiple ways to set the row group size for Delta tables in a Databricks notebook, but none of them work, whereas I am able to set it properly using Spark.
I tried:
1. Setting the block size on the Hadoop configuration:
val blockSize = 1024 * 1024 * 60 // 60 MB in bytes
spark.sparkContext.hadoopConfiguration.setInt("dfs.blocksize", blockSize)
spark.sparkContext.hadoopConfiguration.setInt("parquet.block.size", blockSize)
2. Passing the block size as a write option:
df.repartition(1).write.option("parquet.block.size", blockSize).format("delta").mode("overwrite").save("<path>")
The same configs work fine when writing plain Parquet.
df size = 600 MB
block size = 60 MB
Expected NumRowGroups = 10 (600 MB / 60 MB)
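For reference, the expected count is just a ceiling division of the data size by the block size. A minimal sketch of that arithmetic follows. The object name `RowGroupEstimate` and the helper `expectedRowGroups` are made up for illustration. It assumes the 600 MB figure is the size that counts toward row groups, which may not hold exactly once encoding and compression are applied.

```scala
// Rough estimate only: treats the DataFrame size as the size written into row groups.
// RowGroupEstimate and expectedRowGroups are hypothetical names, not part of Spark or Delta.
object RowGroupEstimate {
  // Ceiling division: how many blocks of blockSizeBytes are needed to hold totalBytes.
  def expectedRowGroups(totalBytes: Long, blockSizeBytes: Long): Long = {
    require(blockSizeBytes > 0, "block size must be positive")
    (totalBytes + blockSizeBytes - 1) / blockSizeBytes
  }

  def main(args: Array[String]): Unit = {
    val dfSize = 600L * 1024 * 1024    // 600 MB, from the post
    val blockSize = 60L * 1024 * 1024  // 60 MB, same value as blockSize above
    println(expectedRowGroups(dfSize, blockSize))
  }
}
```

With the numbers above this gives 10, which is the row group count I see with plain Parquet but not with Delta.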