07-15-2015 11:45 AM
Spark, by default, uses gzip to store parquet files. I would like to change the compression algorithm from gzip to snappy or lz4.
07-15-2015 11:46 AM
You can set the Spark SQL property spark.sql.parquet.compression.codec.
In SQL:
%sql set spark.sql.parquet.compression.codec=snappy
You can also set it on the sqlContext directly:
sqlContext.setConf("spark.sql.parquet.compression.codec.", "snappy")
05-06-2016 11:06 PM
Note that the above has a slight typo: there is a trailing dot in the property name. It should be:
sqlContext.setConf("spark.sql.parquet.compression.codec", "snappy")
Unfortunately, it appears that lz4 isn't supported as a parquet compression codec. I'm not sure why, as lz4 is supported for io.codec.
07-28-2016 02:01 PM
What are the options if I don't want any compression when writing my dataframe to HDFS in parquet format?
06-09-2017 09:26 AM
sqlContext.setConf("spark.sql.parquet.compression.codec", "uncompressed")
07-28-2016 03:34 PM
@karthik.thati - Try this
df.write.option("compression","none").mode("overwrite").save("testoutput.parquet")
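If you want the codec to apply to a single write rather than the whole session, the one-liner above can be wrapped roughly like this. It's only a sketch: write_parquet is a made-up helper name, not a Spark API, and it assumes a Spark version where the "compression" write option is available for parquet. As far as I know, the per-write option takes precedence over spark.sql.parquet.compression.codec.

```python
def write_parquet(df, path, codec="none"):
    """Write `df` to `path` as parquet, using `codec` for this write only.

    Hypothetical helper. `df` is expected to be a Spark DataFrame
    (anything exposing the DataFrameWriter chain via `df.write`).
    """
    (
        df.write
        .format("parquet")             # be explicit instead of relying on the default source
        .option("compression", codec)  # e.g. "none", "snappy", "gzip"
        .mode("overwrite")             # replaces any existing output at `path`
        .save(path)
    )
    return path
```

Usage would be write_parquet(df, "testoutput.parquet") for no compression, or write_parquet(df, "testoutput.parquet", "snappy") to compress just that output.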
06-09-2017 09:44 AM
For no compression, use:
sqlContext.setConf("spark.sql.parquet.compression.codec", "uncompressed")
The second argument can be one of four values: uncompressed, snappy, gzip, lzo
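To catch a mistyped codec name before it reaches Spark, a small wrapper like the one below might help. This is a sketch under assumptions: set_parquet_codec and ALLOWED_CODECS are hypothetical names, and the allowed set only mirrors the four values listed in this post, so your Spark version may accept others.

```python
# Hypothetical helper; not part of the Spark API.
PARQUET_CODEC_KEY = "spark.sql.parquet.compression.codec"

# Only the four values mentioned in this thread (assumption:
# other Spark releases may accept a different set).
ALLOWED_CODECS = {"uncompressed", "snappy", "gzip", "lzo"}


def set_parquet_codec(sql_context, codec):
    """Validate `codec`, then set it on an object exposing setConf(key, value)."""
    normalized = codec.strip().lower()
    if normalized not in ALLOWED_CODECS:
        raise ValueError(
            "unsupported parquet codec %r; expected one of %s"
            % (codec, sorted(ALLOWED_CODECS))
        )
    sql_context.setConf(PARQUET_CODEC_KEY, normalized)
    return normalized
```

With a real context this would be called as set_parquet_codec(sqlContext, "uncompressed"). Passing something like "lz4" raises a ValueError here instead of failing later at write time.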
12-31-2017 06:31 AM
@prakash573: I believe Spark uses Snappy compression for parquet files by default. I'm going by the book "Learning Spark", Chapter 9, page 182, Table 9-3.
Please correct me if this is wrong.
Thank You
Venkat Anampudi
01-16-2020 02:47 AM
Starting with Spark 2.0.0, "snappy" is the default parquet compression codec. Before that version, the default was "gzip".
10-01-2019 02:10 AM
spark.sql("set spark.sql.parquet.compression.codec=gzip");