Machine Learning

Shap Values for predictions from registered model

Evan_MCK
Contributor

I have saved a model in the model registry using MLflow. How can I compute SHAP values for this model once I have generated predictions in batch mode?

SHAP's TreeExplainer does not support the MLflow pyfunc model type. When I call mlflow.shap.log_explanation(model.predict, data), I get the error "Provided model function fails when applied to the provided data set", which seems to indicate that required columns are missing. However, all the required columns are included, and model.predict(data) generates predictions without any problem.

1 ACCEPTED SOLUTION

Thanks for your help. I was able to figure it out from the documentation, but some adjustments were needed. The model, which came from Databricks AutoML, was really a scikit-learn pipeline. I have no y values because this is prediction data, not a test set. I needed to load the model with the MLflow sklearn flavor instead of pyfunc:

import mlflow.sklearn
model = mlflow.sklearn.load_model(model_uri)

For the SHAP tree explainer (shown in the documentation), I needed to pass only the tree model inside the pipeline to the explainer and run the other steps of the pipeline manually, like this:

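import shap

# Explain only the final tree-based step of the pipeline
explainer = shap.TreeExplainer(model["regressor"])

# Apply the preprocessing steps by hand so the explainer sees the features the regressor expects
observations = model["column_selector"].transform(prediction_data)
observations = model["standardizer"].transform(observations)

shap_values = explainer.shap_values(observations)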

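Another option is to skip the tree explainer and explain the whole pipeline through its predict function; see: https://towardsdatascience.com/using-shap-values-to-explain-how-your-machine-learning-model-works-73...

# Model-agnostic explainer on the full pipeline, using prediction_data as background data
explainer = shap.Explainer(model.predict, prediction_data)

shap_values = explainer(prediction_data, max_evals=2000)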


4 REPLIES

Debayan
Esteemed Contributor III

Hi, could you please check whether the relevant section of the documentation below helps?

https://www.databricks.com/blog/2019/06/17/detecting-bias-with-shap.html

sean_owen
Honored Contributor II

Yes, TreeExplainer only works on the tree-based model itself. That's fine, and it's the way to go if you literally only want to explain the model, not the pipeline. If you want to explain anything else, like a Pipeline or a custom pyfunc model, you need a model-agnostic explainer in SHAP such as KernelExplainer (I think the generic entry point is just called Explainer now, yes). It's much slower but can operate on anything.
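To illustrate why the model-agnostic route is slower, here is a minimal sketch of exact Shapley values computed by brute force over feature subsets. This is not how SHAP implements its explainers (they use approximations and sampling), and it simplifies "absent" features to the background mean; the function name exact_shapley and the toy linear model are made up for illustration. The point is only that the prediction function gets called once per feature coalition, which grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial

import numpy as np


def exact_shapley(predict, x, background):
    """Exact Shapley values for one row x; absent features take the background mean."""
    n = len(x)
    baseline = background.mean(axis=0)

    def value(subset):
        # Features in `subset` keep their real value; the rest use the baseline.
        row = baseline.copy()
        idx = np.array(subset, dtype=int)
        row[idx] = x[idx]
        return predict(row.reshape(1, -1))[0]

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


# Toy black-box model (assumption: a plain linear function, so the result is easy to check).
weights = np.array([2.0, -1.0, 0.5])


def linear_predict(rows):
    return rows @ weights + 3.0


background = np.array([[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]])
x = np.array([3.0, 1.0, 0.0])
phi = exact_shapley(linear_predict, x, background)
print(phi)  # approximately [4.0, 1.0, -1.5], i.e. weights * (x - background mean)
```

With 3 features this takes a handful of predict calls; with dozens of features the exact enumeration is infeasible, which is why the general-purpose explainers sample and why a budget such as max_evals matters.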
