06-04-2025 04:34 AM - edited 06-04-2025 04:36 AM
Hi @korasino
- Liquid Clustering (LC) tightens file-level min/max stats on a UUIDv4 column, but Photon already handles dynamic pruning and data skipping using bloom-style filters and table stats, so LC adds little to no benefit for point lookups (WHERE uuid = ...) or joins.
- Because UUIDv4 values are random, clustering on them spreads rows evenly across files with no regard to their other attributes. That can hurt clustering on more useful columns such as timestamps, reducing performance for time-based queries.
- Photon also handles join filtering efficiently, so LC on UUIDv4 doesn’t help reduce shuffle or I/O further in join-heavy workloads.
- Instead, LC is best used on naturally ordered columns like event timestamps or UUIDv7, where it can meaningfully improve query performance. For UUIDv4, relying on Photon alone is typically the better approach.
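
To illustrate why ordering matters, here is a toy Python sketch of file-level min/max skipping. It is not Databricks or Delta code, and it does not model Photon: the "files" are just fixed-size chunks of rows, `uuid7_like` is a simplified stand-in for UUIDv7 (timestamp in the high bits, version/variant bits omitted), and all names and sizes are assumptions made for the example.

```python
import random
import uuid

ROWS = 10_000
ROWS_PER_FILE = 500
BASE_MS = 1_700_000_000_000  # arbitrary start time in epoch milliseconds


def uuid7_like(ts_ms, rng):
    # Simplified stand-in for UUIDv7: 48-bit ms timestamp in the high bits,
    # 80 random bits below (version/variant bits omitted for brevity).
    return (ts_ms << 80) | rng.getrandbits(80)


def chunk(rows):
    # Split rows into fixed-size "files" in the order given.
    return [rows[i:i + ROWS_PER_FILE] for i in range(0, len(rows), ROWS_PER_FILE)]


def files_to_scan(files, column, lo, hi):
    # A file can be skipped only if its [min, max] for `column` misses [lo, hi].
    count = 0
    for f in files:
        values = [r[column] for r in f]
        if min(values) <= hi and max(values) >= lo:
            count += 1
    return count


rng = random.Random(42)
rows = [
    {
        "ts": BASE_MS + i,  # one event per millisecond, in arrival order
        "v4": uuid.UUID(int=rng.getrandbits(128), version=4).int,
        "v7": uuid7_like(BASE_MS + i, rng),
    }
    for i in range(ROWS)
]

by_arrival = chunk(rows)                            # files written in time order
by_v4 = chunk(sorted(rows, key=lambda r: r["v4"]))  # files "clustered" on UUIDv4

# Point lookup on the row holding the median UUIDv4 value.
probe = sorted(rows, key=lambda r: r["v4"])[ROWS // 2]
v4_lookup_arrival = files_to_scan(by_arrival, "v4", probe["v4"], probe["v4"])
v7_lookup_arrival = files_to_scan(by_arrival, "v7", probe["v7"], probe["v7"])

# Time-range query covering 500 ms of events.
lo, hi = BASE_MS + 1000, BASE_MS + 1499
ts_range_arrival = files_to_scan(by_arrival, "ts", lo, hi)
ts_range_by_v4 = files_to_scan(by_v4, "ts", lo, hi)

print("files total:", len(by_arrival))
print("UUIDv4 lookup, time-ordered files:", v4_lookup_arrival)
print("UUIDv7-like lookup, time-ordered files:", v7_lookup_arrival)
print("time range, time-ordered files:", ts_range_arrival)
print("time range, files sorted by UUIDv4:", ts_range_by_v4)
```

In this model the time-ordered key and the timestamp range each touch a single file when data is laid out in time order, while the random UUIDv4 lookup cannot skip much of anything. Sorting the same rows by UUIDv4 scatters timestamps across files, so the narrow time-range query ends up overlapping many of them. Real tables will behave differently in the details, but the direction of the effect is the same one described above.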