Shapley Value

Before introducing SHAP (SHapley Additive exPlanations), let's take a glance at the Shapley value, which is a solution concept in cooperative game theory.

https://miro.medium.com/max/490/1*i1hBNV_nM-qQoOFOvNz8sg.png

Let's take a development team as an example. Our goal is to deliver a deep learning model that requires 100 lines of code, and we have 3 data scientists (L, M, N). All 3 of them must work together in order to deliver the project. Given that:

https://miro.medium.com/max/221/1*ArZVijQ-I_nNpn-0pukSpg.png
https://miro.medium.com/max/949/1*DLL5sCQKeVXboAYIvdgwUw.png

We have 3 players, so the total number of coalition orders is 3!, which is 6. The above tables show each member's contribution under the different coalition orders.

https://miro.medium.com/max/492/1*uGjQRe9U0ebC5HxYXAzg3A.png

According to the Shapley value formula, we have the above tables. Although the standalone potential of M is 6 times greater than that of N (30 vs 5), M should get 41.7% of the reward while N should get 24.17%.
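To make the averaging concrete, here is a minimal sketch of the permutation-based Shapley computation. The characteristic function `v` below is hypothetical (the real per-coalition numbers live in the tables above); the point is only the mechanics: average each member's marginal contribution over all 3! = 6 join orders.

```python
from itertools import permutations

# Hypothetical characteristic function: lines of code each coalition can
# deliver together. These numbers are illustrative only; the actual
# figures are in the tables above.
v = {
    (): 0,
    ('L',): 20, ('M',): 30, ('N',): 5,
    ('L', 'M'): 60, ('L', 'N'): 40, ('M', 'N'): 50,
    ('L', 'M', 'N'): 100,
}

def value(coalition):
    return v[tuple(sorted(coalition))]

def shapley(players):
    """Average each player's marginal contribution over all join orders."""
    orders = list(permutations(players))
    phi = {p: 0.0 for p in players}
    for order in orders:
        coalition = []
        for p in order:
            # Marginal contribution of p when joining this coalition
            phi[p] += value(coalition + [p]) - value(coalition)
            coalition.append(p)
    return {p: phi[p] / len(orders) for p in players}

print(shapley(['L', 'M', 'N']))
```

Note the efficiency property: the Shapley values always sum to the value of the grand coalition (here, the full 100 lines of code).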

SHapley Additive exPlanations (SHAP)

The idea is to use game theory to interpret the target model. Every feature is a “contributor”, the prediction task is the “game”, and the “reward” is the actual prediction minus the result from the explanation model.

In SHAP, a feature importance value is assigned to each feature, much like the contribution mentioned above. Let's take an automobile loan (car loan) as an example. We have the features “New Driver”, “Has Children”, “4 Door” and “Age”.

https://miro.medium.com/max/409/1*fdqZ1XivRBZzuuvyv8yvhw.png

Theoretically, the number of feature combinations is 2^n, where n is the number of features. If we want to know the Shapley value of “Age”, we predict every one of these combinations with and without the “Age” feature. The paper mentions some optimizations to reduce this cost.

By using the Shapley formula, SHAP computes all of the above scenarios and returns the average contribution of each feature. In other words, it is an average over many situations, not simply the difference in prediction when the feature is missing.
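As a sketch of this subset enumeration, the snippet below computes the Shapley value of “Age” by weighting its marginal contribution over all 2^3 subsets of the remaining features. The model `f` here is a made-up additive toy (in practice SHAP approximates a “missing” feature by averaging over background data), so every number in it is illustrative only.

```python
from itertools import combinations
from math import factorial

features = ['New Driver', 'Has Children', '4 Door', 'Age']

def f(subset):
    """Hypothetical model output when only `subset` of features is known."""
    base = 0.5                      # output with no features known
    effect = {'New Driver': -0.2, 'Has Children': 0.1,
              '4 Door': 0.05, 'Age': 0.15}
    return base + sum(effect[s] for s in subset)

def shapley_feature(target):
    """Exact Shapley value of one feature via the subset-weighted formula."""
    n = len(features)
    others = [x for x in features if x != target]
    phi = 0.0
    for size in range(len(others) + 1):
        for S in combinations(others, size):
            # Shapley weight: |S|! * (n - |S| - 1)! / n!
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            phi += weight * (f(S + (target,)) - f(S))
    return phi

print(shapley_feature('Age'))
```

Because the toy model is purely additive, the Shapley value of “Age” comes out as exactly its individual effect; with feature interactions the subsets would contribute different marginals and the weighted average would matter.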

Use Case

SHAP provides multiple explainers for different kinds of models.

TreeExplainer: Supports XGBoost, LightGBM, CatBoost and scikit-learn models via Tree SHAP.

DeepExplainer (Deep SHAP): Supports TensorFlow and Keras models by combining DeepLIFT and Shapley values.

GradientExplainer: Supports TensorFlow and Keras models.

KernelExplainer (Kernel SHAP): Applies to any model by combining LIME and Shapley values.

The following sample code shows how we can use DeepExplainer and KernelExplainer to explain a text classification problem.

DeepExplainer

# Build the explainer from the trained Keras model, using a small
# sample of the training data as the background distribution
explainer = shap.DeepExplainer(pipeline.model, encoded_x_train[:10])
shap_values = explainer.shap_values(encoded_x_test[:1])

# Map encoded tokens back to words so the plot is readable
x_test_words = prepare_explanation_words(pipeline, encoded_x_test)

y_pred = pipeline.predict(x_test[:1])
print('Actual Category: %s, Predict Category: %s' % (y_test[0], y_pred[0]))

shap.force_plot(explainer.expected_value[0], shap_values[0][0], x_test_words[0])

KernelExplainer

# KernelExplainer is model-agnostic: it only needs a prediction function
# and a background data set
kernel_explainer = shap.KernelExplainer(pipeline.model.predict, encoded_x_train[:10])
kernel_shap_values = kernel_explainer.shap_values(encoded_x_test[:1])

# Map encoded tokens back to words so the plot is readable
x_test_words = prepare_explanation_words(pipeline, encoded_x_test)

y_pred = pipeline.predict(x_test[:1])
print('Actual Category: %s, Predict Category: %s' % (y_test[0], y_pred[0]))

shap.force_plot(kernel_explainer.expected_value[0], kernel_shap_values[0][0], x_test_words[0])

Takeaway

To access all of the code, you can visit my GitHub repo.

When you read Christoph's blog, you can refer to the above code for the Shapley Value Explanations part.

The Shapley value is the average contribution of a feature across different prediction scenarios. In other words, it is not simply the difference when that feature is missing.

SHAP includes multiple algorithms. You can check the paper for more detail on the LIME, DeepLIFT and Shapley value calculations.

It is possible that DeepExplainer and KernelExplainer produce different results.

For more information on SHAP, visit their official website here.