Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Model flavour using feature store model training log_model()

Edna
New Contributor

Hi, I have successfully registered my model using the feature engineering client with the following code:

with mlflow.start_run():
    # Calculate the ratio of negative class samples to positive class samples
    ratio = (len(y_train) - y_train.sum()) / y_train.sum()

    # Fit model
    xgb_model = xgb.XGBClassifier(scale_pos_weight=ratio)
    xgb_model.fit(X_train, y_train)

    fe.log_model(
      model=xgb_model,
      artifact_path=MODEL_NAME,
      flavor=mlflow.sklearn,
      training_set=training_set,
      registered_model_name=MODEL_NAME
    )
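For reference, the scale_pos_weight value computed above is the number of negative samples divided by the number of positive samples. A minimal sketch of that arithmetic, assuming binary 0/1 labels; the label values and the helper name negative_to_positive_ratio are made up for illustration and are not part of the original code:

```python
# Hypothetical labels for illustration only: 8 negatives (0) and 2 positives (1).
y_train = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]


def negative_to_positive_ratio(labels):
    """Return (#negatives / #positives) for binary 0/1 labels."""
    positives = sum(labels)
    if positives == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    negatives = len(labels) - positives
    return negatives / positives


ratio = negative_to_positive_ratio(y_train)
print(ratio)  # 4.0
```

With 8 negatives and 2 positives the ratio is 4.0, so each positive sample would be weighted four times as heavily as a negative one.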

There are two questions:

1. Why is the model still shown as pyfunc in the model registry when the flavor I specified was mlflow.sklearn?

2. Can I use the following code for prediction:

model = mlflow.sklearn.load_model(model_version_uri)

# Predict with model
prob_pred = model.predict_proba(df)[:, 1]

or must I use score_batch()? I need the predictions to be probabilities rather than 1/0 labels.

Thanks!

#model_flavor #feature_store #score_batch #xgboost #sklearn

1 REPLY

Kumaran
Valued Contributor III

Hello @Edna 

Thank you for contacting Databricks community support.

MLflow allows you to save models using different "flavors," which are essentially different ways of serializing and deserializing models. When you specify flavor=mlflow.sklearn, you're telling MLflow to save the model using the scikit-learn flavor.

However, when the model is logged, MLflow also records a generic pyfunc (python_function) flavor alongside the scikit-learn flavor, and the model registry displays the model under that generic flavor. pyfunc can be used to load and serve models in a variety of environments, regardless of the flavor used to save the model.

So even though you specified flavor=mlflow.sklearn, the model will still be shown as pyfunc in the model registry. This is expected behavior and allows the model to be easily deployed in a variety of environments.

If you want to use the model through the scikit-learn flavor specifically, you can do so by choosing that flavor when you load the model from the registry. For example:

import mlflow
import xgboost as xgb

# Load the model using the scikit-learn flavor
model = mlflow.sklearn.load_model(f"models:/{MODEL_NAME}/1")

# Use the model to make predictions (returns 0/1 class labels)
predictions = model.predict(X_test)

# For probabilities instead of labels, take the positive-class column
prob_pred = model.predict_proba(X_test)[:, 1]

In this example, mlflow.sklearn.load_model() is used to load the model using the scikit-learn flavor, even though the model is registered as a pyfunc in the model registry.