Understanding Photon Row Group Skipping
01-24-2025 01:09 AM
Hey guys!
I am using Photon to do a simple point query on a Liquid Clustered table with the purpose of understanding the statistics.
I see that a significant number of files have been pruned (`files pruned`: 1104, `files read`:files read).
However I am not sure I understand what is happening at the row group level. Here are some statistics from Spark UI:
What does "row groups skipped via lazy materialization" mean? Are the rows actually read or not? There is clearly filtering happening at the row or row group level, but I don't understand how this works in this simple case.
Thoughts?
01-30-2025 12:01 AM
Hi @tomvogel01,
"row groups skipped via lazy materialization" counts row groups that were not physically read into memory during query execution. Photon can filter at the row group level: if a row group contains no rows that satisfy the query conditions, it is skipped entirely.

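To make the idea concrete, here is a minimal toy sketch in plain Python. It is not Photon's implementation, and the names (`RowGroup`, `point_query`, the counter keys) are hypothetical. It assumes the usual meaning of lazy (late) materialization in columnar engines: the filter column is evaluated first, and the remaining columns of a row group are only decoded if at least one row matches. Whether Photon's metric maps exactly onto the second counter below is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    # Toy stand-in for a Parquet row group: column name -> list of values.
    columns: dict


def point_query(row_groups, filter_col, target, project_cols):
    """Return matching rows and counters, touching projected columns lazily."""
    stats = {"skipped_by_min_max": 0, "skipped_after_filter": 0, "materialized": 0}
    rows = []
    for rg in row_groups:
        key_values = rg.columns[filter_col]

        # Step 1: min/max check. A real reader takes these from footer
        # metadata; here they are computed on the fly for simplicity.
        if not (min(key_values) <= target <= max(key_values)):
            stats["skipped_by_min_max"] += 1
            continue

        # Step 2: evaluate the predicate on the filter column only.
        matches = [i for i, v in enumerate(key_values) if v == target]
        if not matches:
            # No row qualifies, so the other columns are never decoded.
            stats["skipped_after_filter"] += 1
            continue

        # Step 3: only now read the remaining columns, and only matching rows.
        stats["materialized"] += 1
        for i in matches:
            rows.append({c: rg.columns[c][i] for c in project_cols})
    return rows, stats


row_groups = [
    RowGroup({"id": [1, 2, 3], "payload": ["a", "b", "c"]}),   # range excludes 7
    RowGroup({"id": [5, 9, 12], "payload": ["d", "e", "f"]}),  # range includes 7, no match
    RowGroup({"id": [6, 7, 8], "payload": ["g", "h", "i"]}),   # contains 7
]
rows, stats = point_query(row_groups, "id", 7, ["id", "payload"])
print(rows)
print(stats)
```

In this toy model the second row group is the interesting one: its `id` range covers the target, so min/max statistics cannot rule it out, but once the `id` column has been checked there is nothing to fetch from `payload`. So the filter column is read, while the rest of the row group is not.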
