Liquid Clustering on a Feature Store Table Created with FeatureEngineeringClient

Direo — Wed, 04 Sep 2024 09:56:04 GMT

Hello everyone,

I'm exploring ways to perform clustering on a feature store table that I've created using the FeatureEngineeringClient in Databricks, and I'm particularly interested in applying liquid clustering to one of the columns.

Here’s the scenario:

I created a feature store table using the following code:

from databricks.feature_engineering import FeatureEngineeringClient, FeatureLookup

# Initialize the FeatureEngineeringClient
fe = FeatureEngineeringClient()

# Define the feature store table with primary key and schema
fe.create_table(
    name=table_name,
    primary_keys=["wine_id"],
    schema=features_df.schema,
    description="wine features"
)

# Write data to the feature store table
fe.write_table(
    name=table_name,
    df=features_df,
    mode="merge"
)

Now that I have the feature store table in place with various features, I'd like to apply liquid clustering to one of the columns (or multiple columns).

My Question:

How can I implement liquid clustering on this feature store table in Python? I know that I can enable liquid clustering on an existing unpartitioned Delta table using the following syntax:

ALTER TABLE <table_name>
CLUSTER BY (<clustering_columns>)

but that requires SQL.

Any help or code examples on this would be greatly appreciated!

Thank you!

Re: Liquid Clustering on a Feature Store Table Created with FeatureEngineeringClient

Sidhant07 — Mon, 09 Dec 2024 08:58:49 GMT

Hi,

# Set the table name and clustering columns
table_name = "feature_store_table"
clustering_columns = ["column1", "column2"]

# Build the SQL command
sql_command = f"ALTER TABLE {table_name} CLUSTER BY ({', '.join(clustering_columns)})"

# Execute the SQL command
spark.sql(sql_command)
