Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
SHAP values for predictions from a registered model

Evan_MCK
Contributor

I have saved a model in the model registry using MLflow. How can I compute SHAP values for this model once I have generated predictions in batch mode?

SHAP's TreeExplainer does not support the MLflow pyfunc model type. When I call mlflow.shap.log_explanation(model.predict, data), I get the error "Provided model function fails when applied to the provided data set", which seems to indicate that required columns are missing. All the required columns are included, and model.predict(data) on the same data generates predictions without any problem.

Accepted Solutions

Thanks for your help. I was able to figure it out from the documentation, but adjustments were needed. The model, which came from Databricks AutoML, was really a scikit-learn pipeline. I have no y values because this is prediction data, not a test set. I needed to load the model with the MLflow sklearn flavor instead of pyfunc:

model = mlflow.sklearn.load_model(model_uri)

For the SHAP tree explainer (shown in the documentation), I needed to pass only the tree model step of the pipeline to the explainer and run the other pipeline steps manually on the data, like this:

explainer = shap.TreeExplainer(model['regressor'])

observations = model["column_selector"].transform(prediction_data)

observations = model["standardizer"].transform(observations)

shap_values = explainer.shap_values(observations)

Another option is to skip the tree explainer and explain the whole pipeline through its predict function; see https://towardsdatascience.com/using-shap-values-to-explain-how-your-machine-learning-model-works-73...

explainer = shap.Explainer(model.predict, prediction_data)

shap_values = explainer(prediction_data, max_evals = 2000)

Replies

Debayan
Esteemed Contributor III

Hi, could you please check whether the documentation below helps:

https://www.databricks.com/blog/2019/06/17/detecting-bias-with-shap.html

sean_owen
Honored Contributor II

Yes, TreeExplainer only works on the tree-based model itself. That's fine, and it's the way to go if you literally only want to explain the model, not the pipeline. If you want to explain anything else, like a Pipeline or a custom pyfunc model, you need a model-agnostic explainer such as KernelExplainer in SHAP (I think the generic shap.Explainer interface covers this case now). It's much slower but can operate on anything.
