Databricks Community

haseeb2001 · 02-16-2024

Hi,

I am using a Spark pipeline with the stages VectorAssembler, StandardScaler, StringIndexer, VectorAssembler, and GBTClassifier. I then log this pipeline with the Feature Store log_model function as follows:

fe = FeatureStoreClient()  # I have also tried this with FeatureEngineeringClient

After defining lookups and creating a training_set, I am logging this model using:

fe.log_model(model=model_pipeline, artifact_path="test_model", flavor=mlflow.spark, training_set=training_set, registered_model_name="registery_name")

After logging this model, I use the fe.score_batch function to get results on my test data, but I get the following error:

haseeb2001 · 02-19-2024

Hi @Retired_mod , thanks for your response.

The issue occurs during fe.score_batch. I also logged this pipeline using MLflow only and tested it for inference, and it worked fine. The issue appears only when I use Feature Store batch scoring.

I noticed that score_batch used python_function as the backend flavor, even though I registered my model with the spark flavor. Any thoughts on this?
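One way to narrow this down is to check which flavors were actually recorded for the registered model. MLflow's Spark flavor normally writes a python_function entry next to spark, so seeing python_function at scoring time does not by itself mean the spark flavor was lost. The sketch below is only an illustration under stated assumptions: the helper names summarize_flavors and inspect_logged_model are hypothetical, the version number in the example URI is assumed, and it relies on mlflow.models.get_model_info being available in the installed MLflow version.

```python
from typing import Any, Dict, List, Optional, Tuple


def summarize_flavors(
    flavors: Dict[str, Dict[str, Any]]
) -> Tuple[List[str], Optional[str]]:
    """Return the sorted flavor names and the pyfunc loader module, if present.

    `flavors` is the mapping stored in a model's MLmodel file, for example
    {"spark": {...}, "python_function": {"loader_module": "..."}}.
    """
    names = sorted(flavors)
    loader = flavors.get("python_function", {}).get("loader_module")
    return names, loader


def inspect_logged_model(model_uri: str) -> Tuple[List[str], Optional[str]]:
    """Hypothetical helper: read the flavors of a logged or registered model."""
    # Imported lazily so summarize_flavors can be used without MLflow installed.
    import mlflow

    info = mlflow.models.get_model_info(model_uri)
    return summarize_flavors(info.flavors)


# Example call (the version "1" is an assumption, not taken from the post):
# names, loader = inspect_logged_model("models:/registery_name/1")
# print(names, loader)
```

If the loader module reported for python_function is not the one expected, that is a hint about which wrapper score_batch is loading, and it may be worth comparing against the same check on the model logged with MLflow only.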


Feature Store with Spark Pipeline
