Recipes for Model Interpretability
Linear Regression
Assumptions
- Target $y$ and features $X$ have a linear relationship (otherwise a linear model is misspecified)
- No strong multicollinearity among the features in $X$ (otherwise the coefficient estimates are unstable and hard to interpret individually)
- Residuals are normally distributed
- Residuals are homoscedastic: their variance is constant across the range of fitted values (a quick diagnostic sketch follows this list)
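These assumptions are worth checking before trusting the coefficients. Below is a minimal diagnostic sketch (synthetic data, illustrative column names) using statsmodels: a Q-Q plot for residual normality, the Breusch-Pagan test for homoscedasticity, and variance inflation factors (VIF) for multicollinearity.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.outliers_influence import variance_inflation_factor
# illustrative data: swap in your own feature matrix and target
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x1", "x2", "x3"])
y = 2.0 * X["x1"] - 1.0 * X["x2"] + rng.normal(scale=0.5, size=200)
# fit OLS with an intercept
X_const = sm.add_constant(X)
fit = sm.OLS(y, X_const).fit()
# normality of residuals: points should hug the 45-degree line
sm.qqplot(fit.resid, line="45")
# homoscedasticity: a small Breusch-Pagan p-value suggests heteroscedasticity
_, bp_pvalue, _, _ = het_breuschpagan(fit.resid, X_const)
print("Breusch-Pagan p-value:", bp_pvalue)
# multicollinearity: VIF well above ~10 is a red flag
vif = [variance_inflation_factor(X_const.values, i) for i in range(1, X_const.shape[1])]
print(pd.Series(vif, index=X.columns))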
Interpretation
- Product of coefficient and feature value $c_1x_1$ is the marginal contribution of $x_1$ in terms of $y$
- Sum the products (plus the intercept) to get the prediction for the target: $y = \sum_{i=0}^{n} c_i x_i$, with $x_0 = 1$ so that $c_0$ is the intercept
- Interaction terms $x_i x_j$ allow two features to have a non-additive effect on the target; a short worked example follows this list
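As a worked sketch of this recipe (synthetic data, illustrative feature names), the snippet below fits scikit-learn's LinearRegression with an interaction column and shows that the per-feature contributions $c_i x_i$ plus the intercept reproduce the prediction for a single row.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
# illustrative data: two features plus their interaction term
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(500, 2)), columns=["x1", "x2"])
X["x1*x2"] = X["x1"] * X["x2"]
y = 3.0 + 2.0 * X["x1"] - 1.5 * X["x2"] + 0.5 * X["x1*x2"] + rng.normal(scale=0.1, size=500)
model = LinearRegression().fit(X, y)
# marginal contribution of each feature for one observation: c_i * x_i
row = X.iloc[0]
contributions = pd.Series(model.coef_, index=X.columns) * row
print(contributions)
# intercept plus the summed contributions equals the model's prediction
print(model.intercept_ + contributions.sum())
print(model.predict(X.iloc[[0]])[0])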
SHAP
import xgboost
import shap
shap.initjs()
# train an XGBoost model on the California housing data
# (the Boston housing dataset was removed from recent shap releases)
X, y = shap.datasets.california()
model = xgboost.XGBRegressor().fit(X, y)
# explain the model's predictions using SHAP
# (same syntax works for LightGBM, CatBoost, scikit-learn, transformers, Spark, etc.)
explainer = shap.Explainer(model)
shap_values = explainer(X)
# visualize the first prediction's explanation
shap.plots.waterfall(shap_values[0])
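For a global view rather than a single prediction, the same shap_values object feeds SHAP's standard summary plots:
# global view: per-feature distribution of SHAP values, and mean |SHAP| importance
shap.plots.beeswarm(shap_values)
shap.plots.bar(shap_values)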
- Maybe better to use DALEX, which includes a SHAP implementation. It is better maintained at this point than the original SHAP from slundberg (over 1k issues open)
- Uses game theory (Shapley values) to attribute a model's prediction to its individual features
- Seems to be used extensively now, but there is some debate about its soundness: