Implementing liquid clustering for DataFrames directly
12-19-2023 02:01 PM
Hi! I have a question: is it possible to implement liquid clustering for DataFrames saved directly to Delta files (df.write.format("delta").save("path")), as opposed to the conventional approach involving table creation?
Labels: Delta Lake
1 REPLY
02-05-2024 08:10 PM
Hi,
Hopefully this question is about testing, and any production data would be persisted to a table, but here is one example:
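# Write the DataFrame straight to a Delta path; no table is created.
# save() returns None, so the result is not assigned to a variable.
(
    spark.range(10)
    .write
    .format("delta")
    .mode("append")
    .save("file:/tmp/data")
)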
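-- Enable liquid clustering on the path-based table, keyed on id
ALTER TABLE delta.`file:/tmp/data` CLUSTER BY (id);
-- Inspect the table details to confirm the clustering columns
DESC DETAIL delta.`file:/tmp/data`;
-- Cluster the data that has already been written
OPTIMIZE delta.`file:/tmp/data`;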
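If you would rather drive the same three statements from a notebook or job in Python, something like the following might work. This is only a rough sketch: the helper names `liquid_clustering_statements` and `apply_liquid_clustering` are made up for illustration, and it assumes `spark` is an existing SparkSession on a runtime that supports liquid clustering for path-based Delta tables.

```python
def liquid_clustering_statements(path, cluster_cols):
    """Build the SQL statements for a path-based Delta table.

    Hypothetical helper: it only formats strings and does not touch Spark.
    """
    target = f"delta.`{path}`"
    cols = ", ".join(cluster_cols)
    return [
        f"ALTER TABLE {target} CLUSTER BY ({cols})",  # set clustering keys
        f"DESC DETAIL {target}",                      # inspect table details
        f"OPTIMIZE {target}",                         # cluster existing data
    ]


def apply_liquid_clustering(spark, path, cluster_cols):
    """Run the statements in order and return the DESC DETAIL result.

    Assumes `spark` is a SparkSession with Delta Lake support.
    """
    results = [
        spark.sql(stmt)
        for stmt in liquid_clustering_statements(path, cluster_cols)
    ]
    return results[1]
```

After the write, you could call `apply_liquid_clustering(spark, "file:/tmp/data", ["id"])` and display the returned DataFrame to check the table details.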
Thanks.

