Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Unable to apply liquid clustering to a materialized view

sebih
New Contributor II

Hi everyone,

I am trying to create a materialized view with liquid clustering using the code below. However, I noticed that its query performance is slower than that of a streaming table with the same data, structure, and liquid clustering. When I check the materialized view's metadata, liquid clustering does not appear to be applied (see the related screenshot). When I created the table as a streaming table, I could see that liquid clustering was applied successfully.

Thanks in advance.

[Screenshot of the materialized view's metadata: sebih_0-1768820672926.png]

@DP.materialized_view(
    name="final_table",
    cluster_by=["date"],
    cluster_by_auto=True,
    table_properties={
        "delta.autoOptimize.autoCompact": "auto",
        "delta.autoOptimize.optimizeWrite": "true"
    }
)
def final_table():
    return (
        spark.read.table("my_table_1")
        .unionByName(spark.read.table("my_table_2").drop("id"), allowMissingColumns=True)
    )
1 REPLY

szymon_dybczak
Esteemed Contributor III

Hi @sebih ,

Automatic liquid clustering might not select keys, for example when the table is too small to benefit from liquid clustering.

You can enable automatic liquid clustering on any Unity Catalog managed table, regardless of data and query characteristics, but the heuristics then decide whether it's cost-beneficial to select clustering keys. Enabling it therefore does not guarantee that keys are chosen.

https://docs.databricks.com/aws/en/delta/clustering#how-automatic-liquid-clustering-works
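
To confirm whether any clustering keys were actually selected, one option is to inspect the table details from a notebook. The snippet below is only a rough sketch: the helper names `clustering_columns` and `check_clustering` are made up for illustration, and it assumes that `DESCRIBE DETAIL` is supported for your materialized view and returns a `clusteringColumns` field, as it does for Delta tables.

```python
def clustering_columns(detail):
    """Return the clustering columns listed in a DESCRIBE DETAIL row.

    `detail` is the row converted to a dict (for example via Row.asDict()).
    A missing or null `clusteringColumns` field is treated as "no keys".
    """
    return list(detail.get("clusteringColumns") or [])


def check_clustering(spark, table_name):
    """Run DESCRIBE DETAIL on `table_name` and return its clustering columns.

    Assumption: DESCRIBE DETAIL works for this object and exposes a
    `clusteringColumns` field; adjust if your workspace behaves differently.
    """
    row = spark.sql(f"DESCRIBE DETAIL {table_name}").first()
    return clustering_columns(row.asDict())
```

In a notebook you could call `check_clustering(spark, "final_table")`. An empty list would suggest that no clustering keys are currently set on the materialized view.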