Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to get probability score for each prediction from mlflow

Zoumana
New Contributor II

I trained my model and was able to get batch predictions from it, as shown below. I would also like to get the probability score for each prediction. Any idea how to do that?

Thank you!

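import mlflow
import pandas as pd

# path_to_model is a placeholder for the URI of the logged model.
logged_model = path_to_model

# Load model as a PyFuncModel.
loaded_model = mlflow.pyfunc.load_model(logged_model)

# Predict on a Pandas DataFrame (data is a placeholder for the input records).
loaded_model.predict(pd.DataFrame(data))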

1 ACCEPTED SOLUTION

SyedGhouri
New Contributor III

Hi @Kaniz Fatma

The error said: 'PyFuncModel' object has no attribute 'predict_proba'.

As shown above, I was loading the model with

loaded_model = mlflow.pyfunc.load_model(logged_model)

and got the error.

After going through the MLflow documentation, I changed it to

loaded_model = mlflow.sklearn.load_model(logged_model)

and it is working fine.

It's all good now. Thanks for your time.

Syed

5 REPLIES 5

Zoumana
New Contributor II

Hi Kaniz,

Great to meet you too!

Thank you for replying to my question.

Best!

SyedGhouri
New Contributor III

Hi @Kaniz Fatma​ 

Sorry for hijacking the post.

My question is: when I load a registered model from MLflow, only the .predict method is available, not .predict_proba.

Is there a straightforward way to get the probabilities?

Thanks

Syed

SyedGhouri
New Contributor III

Hi @Kaniz Fatma​ 

I do not see the option to select "Best Answer" but feel free to do anything that you think can help this community.

Thanks

Syed

OndrejHavlicek
New Contributor III

Now you can log the model using this parameter:

mlflow.sklearn.log_model(
    ...,  # the usual params
    pyfunc_predict_fn="predict_proba"
)

With this parameter, the pyfunc model calls predict_proba when it is used for inference (e.g. when loading it using mlflow.pyfunc.spark_udf() ). In my case it apparently returned only the probabilities for the first class.
