Reading snappy.parquet

Hritik_Moon — Mon, 13 Oct 2025 09:58:08 GMT

I stored a dataframe as Delta in the catalog. It created multiple folders with snappy.parquet files. Is there a way to read these snappy.parquet files?

They read fine with pandas, but Spark gives the error "incompatible format".

Re: Reading snappy.parquet

Khaja_Zaffer — Mon, 13 Oct 2025 10:17:16 GMT

Hello good day @Hritik_Moon

That "incompatible format" error is expected when you try to read the folder as plain Parquet. The folder contains a _delta_log directory, created by the Delta format (which follows ACID principles), so Spark raises an AnalysisException instead of reading it as Parquet.

The recommended approach is to read it in Delta format only.

Otherwise, the alternative is to copy those .snappy.parquet files (or a single file) into a separate folder and read them from there.
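A rough sketch of that copy step, in case it helps. It assumes the table folder is reachable as a local filesystem path, and `copy_parquet_files` is just a name made up for this example. One caveat: a Delta folder can still hold data files that later commits removed (until they are vacuumed), so reading the copied files as plain Parquet may return rows that the Delta table no longer contains.

```python
import shutil
from pathlib import Path


def copy_parquet_files(table_dir, dest_dir):
    """Copy data files out of a Delta table folder, skipping _delta_log.

    Sub-folders (for example partition folders) are preserved.
    Returns the copied destination paths, sorted.
    """
    src = Path(table_dir)
    dest = Path(dest_dir)
    copied = []
    for f in sorted(src.rglob("*.parquet")):
        rel = f.relative_to(src)
        # Skip the transaction log; its checkpoint files are Parquet too.
        if "_delta_log" in rel.parts:
            continue
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(f, target)
        copied.append(target)
    return sorted(copied)


# Afterwards the copy can be read as plain Parquet, e.g.:
# df = spark.read.parquet("path/to/dest_dir")
```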

Let me share a medium article I found for this issue:
https://medium.com/%40ishanpradhan/how-to-read-a-snappy-parquet-file-in-databricks-696538cd0efc

Thank you.
I am waiting for solutions from other contributors as well. They can share their approach.

Re: Reading snappy.parquet

Prajapathy_NKR — Fri, 17 Oct 2025 04:24:05 GMT

@Hritik_Moon

Try to read the file as delta.

path/delta_file_name/
- parquet files
- _delta_log/

Since you are using Spark, use this: spark.read.format("delta").load("path/delta_file_name")
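As a small sketch, the same call wrapped in a helper. `read_delta_table` is a hypothetical name, and `spark` is assumed to be an existing SparkSession (Databricks notebooks already define one).

```python
def read_delta_table(spark, table_path):
    """Load a Delta table folder through the Delta reader.

    spark: an existing SparkSession.
    table_path: the table's root folder (the one containing _delta_log).
    """
    return spark.read.format("delta").load(table_path)


# Usage in a notebook:
# df = read_delta_table(spark, "path/delta_file_name")
# df.show()
```

Point the path at the table's root folder, not at an individual .snappy.parquet file inside it.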

Delta internally stores the data as Parquet, and the _delta_log folder contains the transaction metadata. You don't need to touch these files unless you are experimenting. 🙂
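If you do want to experiment, here is a minimal sketch of how the log ties to the Parquet files. It replays the JSON commit files in _delta_log and collects the data files the latest version still references. Assumptions: the folder is readable as a local path, all JSON commits from version 0 are still present (checkpoint files are ignored), and features such as deletion vectors are not handled. `live_data_files` is a made-up name for this example.

```python
import json
from pathlib import Path


def live_data_files(table_dir):
    """Replay the JSON commits in _delta_log and return the data file
    paths that the latest table version still references."""
    log_dir = Path(table_dir) / "_delta_log"
    live = set()
    # Commit file names are zero-padded versions, so name order is version order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)  # one action object per line
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return sorted(live)
```

Any Parquet file in the folder that is not in this list is left over from an earlier version, which is why reading every file as plain Parquet can give different results from reading the table as Delta.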

For more info, please go through this: https://docs.databricks.com/aws/en/delta/tutorial

Hope this solves your issue.