
Understanding Photon Row Group Skipping

tomvogel01
New Contributor II

Hey guys!

I am using Photon to run a simple point query on a Liquid Clustered table so that I can understand the statistics it reports.

I see that a significant number of files have been pruned (`files pruned`: 1104, `files read`:files read).

However, I am not sure I understand what is happening at the row group level. Here are some statistics from the Spark UI:

Screenshot 2025-01-24 at 10.07.05.png

What does "row groups skipped via lazy materialization" mean? Are the rows actually read or not? There is clearly filtering happening at the row or row group level, but I don't understand how this works in this simple case.

Thoughts?

1 REPLY

Sidhant07
Databricks Employee

Hi @tomvogel01,

"row groups skipped via lazy materialization" counts row groups that Photon did not fully read into memory during query execution. Photon can apply the filter at the row group level, so if a row group contains no rows that satisfy the query conditions, it is skipped instead of being materialized. As the term is generally used for columnar readers, lazy materialization means the filter columns are evaluated first and the remaining columns are decoded only for row groups that have at least one matching row.

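To make the idea concrete, here is a toy Python sketch of how lazy (late) materialization generally works in a columnar reader. It is not Photon's implementation: the `RowGroup` class, the `point_query` function, and the counter names are all hypothetical, and it assumes the metric counts row groups whose filter column was decoded but whose other columns were never materialized because no row matched.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group with two columns."""
    ids: list      # column used in the WHERE clause
    payload: list  # other column the query selects

    @property
    def min_id(self):
        return min(self.ids)

    @property
    def max_id(self):
        return max(self.ids)


def point_query(row_groups, target):
    """Return rows where id == target, plus counters for each skipping stage."""
    stats = {
        "skipped_by_min_max": 0,
        "skipped_by_lazy_materialization": 0,
        "payload_decoded": 0,
    }
    result = []
    for rg in row_groups:
        # Stage 1: min/max statistics rule the row group out; nothing is decoded.
        if target < rg.min_id or target > rg.max_id:
            stats["skipped_by_min_max"] += 1
            continue
        # Stage 2: decode only the filter column and evaluate the predicate.
        matches = [i for i, value in enumerate(rg.ids) if value == target]
        if not matches:
            # No row passed, so the payload column is never materialized.
            stats["skipped_by_lazy_materialization"] += 1
            continue
        # Stage 3: at least one row matched, so decode the remaining column.
        stats["payload_decoded"] += 1
        result.extend((rg.ids[i], rg.payload[i]) for i in matches)
    return result, stats


row_groups = [
    RowGroup(ids=[1, 2, 3], payload=["a", "b", "c"]),  # range 1..3 excludes 5
    RowGroup(ids=[4, 6, 8], payload=["d", "e", "f"]),  # range 4..8 includes 5, no match
    RowGroup(ids=[5, 7, 9], payload=["g", "h", "i"]),  # contains 5
]
rows, stats = point_query(row_groups, target=5)
print(rows)
print(stats)
```

In this sketch the second row group is the interesting one: its min/max range includes the target, so statistics alone cannot exclude it, but after checking the filter column there is nothing to materialize. Under the assumption above, that is the case the metric would be counting.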