Hello everyone,
I'm exploring ways to perform clustering on a feature store table that I've created using the FeatureEngineeringClient in Databricks, and I'm particularly interested in applying liquid clustering to one of the columns.
Here’s the scenario:
I created a feature store table using the following code:
from databricks.feature_engineering import FeatureEngineeringClient, FeatureLookup
# Initialize the FeatureEngineeringClient
fe = FeatureEngineeringClient()
# Define the feature store table with primary key and schema
fe.create_table(
    name=table_name,
    primary_keys=["wine_id"],
    schema=features_df.schema,
    description="wine features"
)
# Write data to the feature store table
fe.write_table(
    name=table_name,
    df=features_df,
    mode="merge"
)
Now that I have the feature store table in place with various features, I'd like to apply liquid clustering to one of the columns (or multiple columns).
My Question:
How can I implement liquid clustering on this feature store table in Python? I know that I can enable liquid clustering on an existing unpartitioned Delta table using the following syntax:
ALTER TABLE <table_name>
CLUSTER BY (<clustering_columns>)
but that is a SQL statement, and I'm looking for a way to do the same thing from Python.
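To make the question concrete, here is a minimal sketch of the only workaround I can think of: building that same statement as a string in Python and sending it through `spark.sql`. The helper names `build_cluster_by_statement` and `apply_liquid_clustering` are my own (hypothetical, not part of any Databricks API), and I'm assuming that a feature table behaves like any other Delta table for `ALTER TABLE ... CLUSTER BY`, which I have not verified.

```python
def build_cluster_by_statement(table_name, clustering_columns):
    """Return an ALTER TABLE ... CLUSTER BY statement as a string.

    Hypothetical helper: it only formats SQL and does not touch the table.
    """
    if not clustering_columns:
        raise ValueError("at least one clustering column is required")
    # Backtick-quote each column name so unusual characters do not break the SQL.
    columns = ", ".join(f"`{column}`" for column in clustering_columns)
    return f"ALTER TABLE {table_name} CLUSTER BY ({columns})"


def apply_liquid_clustering(spark, table_name, clustering_columns):
    """Run the statement through an existing SparkSession.

    Assumption: `spark` is the session a Databricks notebook already provides.
    """
    statement = build_cluster_by_statement(table_name, clustering_columns)
    spark.sql(statement)
    return statement
```

In a notebook this would be called as `apply_liquid_clustering(spark, table_name, ["wine_id"])`, with `wine_id` used only because it is the one column name shown above. It still runs SQL under the hood, so I'd be interested to hear whether there is a more direct Python option.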
Any help or code examples on this would be greatly appreciated!
Thank you!