{"id":21108,"date":"2019-02-13T11:10:18","date_gmt":"2019-02-13T10:10:18","guid":{"rendered":"https:\/\/www.inovex.de\/blog\/?p=15078"},"modified":"2025-03-19T07:30:05","modified_gmt":"2025-03-19T06:30:05","slug":"machine-learning-interpretability","status":"publish","type":"post","link":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/","title":{"rendered":"Machine Learning Interpretability: Do You Know What Your Model Is Doing?"},"content":{"rendered":"<p>Machine learning has great potential to improve data products and business processes. It is used to propose products and <a href=\"https:\/\/open.blogs.nytimes.com\/2015\/08\/11\/building-the-next-new-york-times-recommendation-engine\/\">news articles<\/a> that we might be interested in as well as to steer autonomous vehicles and to <a href=\"https:\/\/en.wikipedia.org\/wiki\/AlphaGo\">challenge human experts in non-trivial games<\/a>. Although machine learning models perform extraordinarily well in solving those tasks, we need to be aware of the latent risks that arise through inadvertently <a href=\"https:\/\/www.technologyreview.com\/s\/608986\/forget-killer-robotsbias-is-the-real-ai-danger\/\">encoding bias<\/a>, responsible for <a href=\"http:\/\/proceedings.mlr.press\/v81\/buolamwini18a\/buolamwini18a.pdf\">discriminating against individuals<\/a> and <a href=\"https:\/\/hrdag.org\/usa\/\">strengthening preconceptions<\/a>, or mistakenly <a href=\"https:\/\/www.inovex.de\/blog\/causal-inference-and-propensity-score-methods\">taking random correlation for causation<\/a>. In her book <a href=\"https:\/\/weaponsofmathdestructionbook.com\">&#8220;Weapons of Math Destruction&#8221;<\/a>, Cathy O&#8217;Neil even went so far as to say that improvident use of algorithms can perpetuate inequality and threaten democracy. 
<a href=\"https:\/\/en.wikipedia.org\/wiki\/Filter_bubble\">Filter bubbles<\/a>, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Tay_(bot)\">racist chat bots<\/a>, and <a href=\"https:\/\/www.cs.cmu.edu\/~sbhagava\/papers\/face-rec-ccs16.pdf\">foolable face detection<\/a> are prominent examples of harmful outcomes of learning algorithms. With great power comes great responsibility: wise words that every practitioner should keep in mind.<\/p>\n<p>With the adoption of GDPR, there are now EU-wide regulations concerning automated individual decision-making and profiling (Art. 22, also termed &#8220;right to explanation&#8221;), obliging companies to give individuals information about processing, to introduce ways for them to request intervention and even to carry out regular checks to make sure that the systems are working <a href=\"https:\/\/arxiv.org\/abs\/1606.08813\">as intended<\/a>. Recent research in computational ethics proposes raising awareness of optimization criteria like <a href=\"https:\/\/nickbostrom.com\/ethics\/artificial-intelligence.pdf\">fairness<\/a>, <a href=\"https:\/\/doi.org\/10.1007\/978-3-642-32378-2_8\">safety<\/a>\u00a0(<a href=\"https:\/\/arxiv.org\/abs\/1606.06565\">2<\/a>) and <a href=\"https:\/\/arxiv.org\/abs\/1602.04938\">transparency<\/a> when developing machine learning models. However, unlike usual performance metrics, these constraints are much harder, if not impossible, to quantify. 
Nevertheless, designing decision systems that are understandable not only for their creators, but also for their customers and users, is key to achieving trust and acceptance in the mainstream.<!--more--><\/p>\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_83 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\"><p class=\"ez-toc-title\" style=\"cursor:inherit\"><\/p>\n<\/div><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#The-Problem-with-Blackboxes\" >The Problem with Blackboxes<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Interpretability-and-Explanation-Techniques\" >Interpretability and Explanation Techniques<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Explaining-a-Blackbox-Model\" >Explaining a Blackbox Model<\/a><ul class='ez-toc-list-level-3' ><li class='ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Data-Import-and-Analysis\" >Data Import and Analysis<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Data-Modelling\" >Data Modelling<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-3'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Interpretation\" >Interpretation<\/a><ul class='ez-toc-list-level-4' ><li class='ez-toc-heading-level-4'><a class=\"ez-toc-link 
ez-toc-heading-7\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Feature-Importance\" >Feature Importance<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Feature-Effects\" >Feature Effects<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Interaction-Strength\" >Interaction Strength<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-4'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Global-Surrogate\" >Global Surrogate<\/a><\/li><\/ul><\/li><\/ul><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Conclusion\" >Conclusion<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-12\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#Read-on\" >Read on<\/a><\/li><\/ul><\/nav><\/div>\n<h2><span class=\"ez-toc-section\" id=\"The-Problem-with-Blackboxes\"><\/span>The Problem with Blackboxes<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>In recent years, the ever-falling cost of memory has allowed companies to collect huge amounts of data, while the evolution of distributed systems has enabled data\u00a0processing at large scale. Consequently, with the (re-)emergence of deep learning and the maturity of dedicated machine learning frameworks, we are now able to tackle complex non-linear problems within high-dimensional domains like NLP or computer vision. The underlying models are usually quite complex in terms of structure, containing a huge number of parameters to optimize across several interaction layers. 
In the following, these types of models are referred to as blackboxes, since humans are not able to explain their behaviour by simply looking at their internals.<\/p>\n<p>The problem with blackboxes is the lack of trust caused by their opaque nature. A decision system should be doing the right thing in the right way, but we are usually not able to guarantee that a certain prediction was derived the way it should have been. Consequently, it is hard to predict the model&#8217;s future behaviour and to fix it in a targeted way in case of failure. The <a href=\"https:\/\/arxiv.org\/abs\/1602.04938\">Dog-or-Wolf-classifier<\/a>, described by Ribeiro et al., which turned out to be nothing but a snow detector on steroids, is an illustrative example of a model that, despite its predictive power, is not aligned with its problem domain. Furthermore, even seemingly robust classifiers are fooled by adversarial examples, which are synthetic instances like generated images that are <a href=\"https:\/\/arxiv.org\/abs\/1412.6572\">optical illusions to the algorithm<\/a>.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Interpretability-and-Explanation-Techniques\"><\/span>Interpretability and Explanation Techniques<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Interpretability in the context of machine learning describes the process of revealing causes of predictions and explaining a derived decision in a way that is understandable to humans. The ability to understand the causes that lead to a certain prediction enables data scientists to ensure that a model is consistent with the domain knowledge of an expert. 
An intuitive definition of interpretability in the context of machine learning is provided by Been Kim and Finale Doshi-Velez in &#8220;Towards A Rigorous Science of Interpretable Machine Learning&#8221;, where they describe it as the ability to explain or to present [a model&#8217;s decision process] in understandable terms <a href=\"https:\/\/arxiv.org\/abs\/1702.08608\">to a human<\/a>. This usually means at least identifying the most relevant features and their kind of influence (e.g. linear, monotone, etc.) on the model&#8217;s predictions. In the context of machine learning, we can think of explanations as vehicles that facilitate interpretability.<\/p>\n<p>Predictive models can roughly be divided into <a href=\"https:\/\/christophm.github.io\/interpretable-ml-book\">intrinsically interpretable models and non-interpretable &#8220;blackbox&#8221; models<\/a>. Intrinsically interpretable models are known to be easy for humans to understand. An example of an interpretable model is a decision tree, since it exhibits an intuitive, rule-based decision process. In contrast to that, neural networks can be classified as blackboxes due to their complex internal structure, which is tremendously harder for humans to grasp. Having said that, the distinction between interpretable and non-interpretable models is not always clear-cut. Interpretable models are usually simple and to some degree self-describing, but often show only moderate predictive performance. On the other hand, blackbox models tend to achieve better accuracy at the cost of\u00a0comprehensibility. 
This is the reason why a trade-off between accuracy and interpretability has to be made in many cases.<\/p>\n<p>There are basically two practices to approach interpretability:<\/p>\n<ol>\n<li style=\"font-weight: 400;\">Use intrinsically interpretable models<\/li>\n<li style=\"font-weight: 400;\">Apply post-hoc interpretability techniques to an existing model<\/li>\n<\/ol>\n<p>The first approach is restricted to certain types of rather simplistic, mostly linear or rule-based models or to models that meet specific constraints regarding sparsity or monotonicity. It is widely adopted in the industry, but it can lead data scientists to apply over-simplistic models to complex tasks. Furthermore, even models that are known to be easy for humans to understand can become vastly complex. Think about a linear model that relies on heavily engineered features or a deep and widely branched decision tree. To read more about criticism and misconceptions about interpretability, see <a href=\"https:\/\/arxiv.org\/abs\/1606.03490\">Zachary C. Lipton&#8217;s comprehensive recap<\/a> about &#8220;The Mythos of Model Interpretability&#8221;.<\/p>\n<p>The second approach is more flexible since it is applicable to any model. Post-hoc techniques attempt to disaggregate a model&#8217;s predictions in order to identify the main drivers of its decision process. This is usually achieved by varying the input and evaluating changes in the output. The downside of post-hoc techniques is the increased complexity of the prototyping workflow. Furthermore, most interpretability techniques are approximate, hence providing potentially unstable estimates of explanations.<\/p>\n<p>Post-hoc interpretability techniques can be further divided into techniques with a global scope and techniques with a local scope. Global techniques attempt to explain the entire model behaviour. In contrast to that, local techniques explain single predictions or the model&#8217;s behaviour within a closed region. 
The following table summarizes several post-hoc techniques along with their respective categories:<\/p>\n<table style=\"height: 120px; border-style: none; margin: 15px; border-color: #000000;\" cellpadding=\"5px\">\n<tbody>\n<tr style=\"height: 24px;\">\n<td style=\"width: 102px; height: 24px; padding: 5px; border-style: solid; border-color: #c4c4c4;\"><\/td>\n<td style=\"width: 546px; height: 24px; background-color: #bed8ed; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Global<\/td>\n<td style=\"width: 378px; height: 24px; background-color: #bed8ed; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Local<\/td>\n<\/tr>\n<tr style=\"height: 48px;\">\n<td style=\"width: 102px; height: 48px; background-color: #bed8ed; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Model-specific<\/td>\n<td style=\"width: 546px; height: 48px; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Model Internals<br \/>\nIntrinsic Feature Importance<\/td>\n<td style=\"width: 378px; height: 48px; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Rule Sets (tree structure)<\/td>\n<\/tr>\n<tr style=\"height: 48px;\">\n<td style=\"width: 102px; height: 48px; background-color: #bed8ed; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Model-agnostic<\/td>\n<td style=\"width: 546px; height: 48px; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Partial Dependence Plots<br \/>\nPermutation-based Feature Importance<br \/>\nGlobal Surrogate Models<\/td>\n<td style=\"width: 378px; height: 48px; padding: 5px; border-style: solid; border-color: #c4c4c4;\">Individual Conditional Expectations<br \/>\nLocal Surrogate Models<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Model internals refer to a self-describing structure of an interpretable model, like a decision tree or a linear model. Intrinsic feature importance estimates are usually calculated based on a specific model structure. 
This can be done by weighting split variables of a tree ensemble or comparing coefficients of a linear model.\u00a0Model-agnostic techniques are especially useful when dealing with more complex models. In the following, I&#8217;ll apply some rather basic yet powerful techniques to explain a blackbox model.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Explaining-a-Blackbox-Model\"><\/span>Explaining a Blackbox Model<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>In the following, we will create a blackbox model to solve a regression task and explain its behaviour by applying post-hoc interpretability techniques. We will use the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/iml\/\">iml<\/a> package which provides many tools to explain blackbox models in combination with the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/mlr\/index.html\">mlr<\/a> package which provides a comprehensive and unified interface for prototyping. Both packages work seamlessly together and support many classification and regression models, e.g. random forests, neural networks, or GBMs.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Data-Import-and-Analysis\"><\/span>Data Import and Analysis<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Let&#8217;s first import all relevant libraries:<\/p>\n<pre class=\"lang:r decode:true\">library(tidyverse)\r\n\r\nlibrary(psych)\r\n\r\nlibrary(gridExtra)\r\n\r\nlibrary(mlr)\r\n\r\nlibrary(kernlab)\r\n\r\nlibrary(mlbench)\r\n\r\nlibrary(iml)\r\n\r\ntheme_set(theme_minimal())\r\n\r\nset.seed(4711)<\/pre>\n<p>We will use the popular <a href=\"https:\/\/search.r-project.org\/CRAN\/refmans\/mlbench\/html\/BostonHousing.html\">Boston Housing<\/a> dataset which is also used in the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/iml\/vignettes\/intro.html\">iml tutorial<\/a>. The dataset contains housing data for 506 census tracts of Boston from the 1970 census. 
Each record describes a district of the city of Boston with features like the per capita crime rate by town (crim), the average number of rooms per dwelling (rm) and the nitric oxides concentration (nox). The variable we want to predict is the median house value (medv) of a particular district. It is basically a supervised regression task based on tabular data containing a reasonable number of features. The authors of the tutorial have chosen a random forest approach to predict the target variable. In contrast to that, we will try a kernel-based SVM.<\/p>\n<p>Let&#8217;s import the data and do some basic analysis:<\/p>\n<pre class=\"lang:r decode:true\">data(BostonHousing, package = \"mlbench\")\r\n\r\npsych::describe(BostonHousing) %&gt;% select(n, mean, sd, median, min, max, range)\r\n\r\nBostonHousing %&gt;%\r\n\r\n na.omit() %&gt;%\r\n\r\n ggplot(aes(x = medv, stat(count))) +\r\n\r\n geom_density(alpha = 0.2, colour = \"black\", fill=\"grey\") +\r\n\r\n guides(fill = FALSE) +\r\n\r\n xlab(\"Median House Value\") +\r\n\r\n ylab(NULL) +\r\n\r\n theme(axis.text.y=element_blank())<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15126 size-large\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_eda-1024x384.png\" alt=\"Basic Analysis\" width=\"1024\" height=\"384\" \/><\/p>\n<p>The dataset contains 506 records, 12 numerical features, a categorical feature (chas) and the target variable (medv). Fortunately, there are no missing values or extreme outliers. The median house value ranges from 5,000 USD to 50,000 USD, with most values between 15,000 USD and 25,000 USD. There is a small peak at the maximum which could be caused by setting an upper limit at 50,000 USD.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Data-Modelling\"><\/span>Data Modelling<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>We want to predict the median house value using an SVM based on a Gaussian Radial Basis Function (RBF). 
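<\/p>\n<p>As a reminder, the Gaussian RBF kernel scores the similarity of two records x and y as exp(-sigma * ||x - y||^2), so a larger sigma makes the kernel more local. The following minimal sketch compares kernlab&#8217;s rbfdot with the same value computed by hand; the vectors x and y and the value of sigma are made up purely for illustration:<\/p>\n<pre class=\"lang:r decode:true\">library(kernlab)\r\n\r\n# two made-up records and an arbitrary kernel parameter (illustration only)\r\nx &lt;- c(1, 2, 3)\r\ny &lt;- c(2, 0, 3)\r\nsigma &lt;- 0.1\r\n\r\n# kernel value as computed by kernlab\r\nrbf &lt;- rbfdot(sigma = sigma)\r\nrbf(x, y)\r\n\r\n# the same value computed by hand\r\nexp(-sigma * sum((x - y)^2))<\/pre>\n<p>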
For this approach, there are basically two hyperparameters to tune:<\/p>\n<ul>\n<li>C: the cost of constraints violation, which weights regularization<\/li>\n<li>sigma: the inverse kernel width for the Gaussian RBF<\/li>\n<\/ul>\n<p>Let&#8217;s perform a repeated 10-fold cross validation for each parameter set of a parameter grid in order to get appropriate values for these hyperparameters:<\/p>\n<pre class=\"lang:r decode:true\"># specify regression task\r\n\r\ntask &lt;- makeRegrTask(\"BB\", data = BostonHousing, target = \"medv\")\r\n\r\n# specify parameter grid\r\n\r\nparamGrid &lt;- makeParamSet(\r\n\r\n makeDiscreteParam(\"C\", values = 10^seq(-1, 1, by=.5)),\r\n\r\n makeDiscreteParam(\"sigma\", values = 10^seq(-1, 0, by=.1)))\r\n\r\n# specify evaluation strategy\r\n\r\nevalStrategy &lt;- makeResampleDesc(\"RepCV\", folds = 10, reps = 3, predict = \"both\")\r\n\r\n# perform grid search\r\n\r\nres &lt;- tuneParams(\r\n\r\n \"regr.ksvm\",\r\n\r\n task = task,\r\n\r\n resampling = evalStrategy,\r\n\r\n par.set = paramGrid,\r\n\r\n control = makeTuneControlGrid(),\r\n\r\n measures = list(\r\n\r\n   setAggregation(rmse, test.mean),\r\n\r\n   setAggregation(rmse, train.mean),\r\n\r\n   setAggregation(rsq, test.mean),\r\n\r\n   setAggregation(rsq, train.mean)),\r\n\r\n show.info = FALSE)\r\n\r\n# show best parameters according to mean rmse on the test data\r\n\r\ndata &lt;- generateHyperParsEffectData(res)$data\r\n\r\ndata[which.min(data$rmse.test.mean), ]<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15206 size-large\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_metrics-1024x86.png\" alt=\"\" width=\"1024\" height=\"86\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_metrics-1024x86.png 1024w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_metrics-300x25.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_metrics-768x65.png 768w, 
https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_metrics-400x34.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_metrics-360x30.png 360w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_metrics.png 1438w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p>Having determined appropriate hyperparameters, let&#8217;s fit the final model to the entire dataset:<\/p>\n<pre class=\"lang:r decode:true\">learner &lt;- makeLearner(\"regr.ksvm\", C=10, sigma=0.1)\r\n\r\nmod = mlr::train(learner, task)<\/pre>\n<h3><span class=\"ez-toc-section\" id=\"Interpretation\"><\/span>Interpretation<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>The SVM we have trained in the previous steps performs quite well in solving the regression task. Unfortunately, we are not able to explain it because of its complex structure. In these situations, model-agnostic techniques are helpful since they can be applied to any model. In this section, we will apply some rather basic, yet powerful approaches to explain the overall behaviour of the model.<\/p>\n<p>As a preliminary step, we will wrap the model along with its data in a Predictor object:<\/p>\n<pre class=\"lang:r decode:true\">features &lt;- BostonHousing %&gt;% select(-medv) %&gt;% as.data.frame()\r\n\r\nresponse &lt;- BostonHousing$medv\r\n\r\npredictor &lt;- Predictor$new(model = mod, data = features, y = response)<\/pre>\n<h4><span class=\"ez-toc-section\" id=\"Feature-Importance\"><\/span>Feature Importance<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>First, we will estimate the importance of each feature by evaluating its influence on the model&#8217;s performance. The importance of a feature is determined by repeatedly permuting its values and measuring the degradation in performance according to a given loss function, e.g. the mean squared error for this regression task. A feature is assumed to be important if the error significantly increases after a shuffle. 
In contrast to that, we assume that a permutation of a feature that is not important will not worsen the performance.<\/p>\n<pre class=\"lang:r decode:true\">FeatureImp$new(\r\n\r\n predictor = predictor,\r\n\r\n loss = \"mse\",\r\n\r\n n.repetitions = 20)$plot()<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15116\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_feature_importance-300x210.png\" alt=\"Feature Importance Ranking\" width=\"700\" height=\"490\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_feature_importance-300x210.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_feature_importance-400x280.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_feature_importance-360x252.png 360w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/00_feature_importance.png 500w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>In the feature importance plot above, we see that the two features rm (average number of rooms per dwelling) and lstat (percentage of lower status of the population) contribute by far the most to the model&#8217;s overall performance. Shuffling these features caused the MSE of the model to increase by around 15 on average. Since medv is measured in thousands of USD and the square root of 15 is roughly 3.9, this corresponds to an error of almost 4,000 USD. All other features caused the MSE to increase by less than 7 on average, which still corresponds to an error of about 2,600 USD.<\/p>\n<p>The advantage of this approach is that it produces a global, aggregated insight which is comparable across several types of models. 
Unfortunately, it is tied to a certain loss function, computationally intensive and not applicable to high-dimensional problems like NLP or computer vision.<\/p>\n<h4><span class=\"ez-toc-section\" id=\"Feature-Effects\"><\/span>Feature Effects<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>Now that we know which features are relevant, let&#8217;s determine how they influence the predictions by generating partial dependence plots (PDP) and individual conditional expectation curves (ICE). ICE curves show the dependence of the response on a feature per instance. An ICE curve is generated by varying the value of a single feature for a given instance, keeping the others fixed, and applying the model to the modified instance. This process is repeated so that an ICE curve is created for every instance of the dataset. We can draw conclusions about the overall marginal impact of a feature by looking at the PDP curve, which is generated by taking the point-wise average of the underlying ICE curves.<\/p>\n<p>The following plots show the combined Partial Dependence (yellow) and ICE curves\u00a0(black) for the most important features rm and lstat, anchored at the minimum of the respective feature so that all curves start at 0:<\/p>\n<pre class=\"lang:r decode:true\">FeatureEffect$new(\r\n\r\n predictor,\r\n\r\n feature=\"rm\",\r\n\r\n method=\"pdp+ice\",\r\n\r\n center.at=min(features$rm))$plot()\r\n\r\nFeatureEffect$new(\r\n\r\n predictor,\r\n\r\n feature=\"lstat\",\r\n\r\n method=\"pdp+ice\",\r\n\r\n center.at=min(features$lstat))$plot()<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15120 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/01_feature_effects.png\" alt=\"Feature Effects\" width=\"750\" height=\"320\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/01_feature_effects.png 750w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/01_feature_effects-300x128.png 300w, 
https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/01_feature_effects-400x171.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/01_feature_effects-360x154.png 360w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><\/p>\n<p>We see a relatively consistent pattern in the ICE curves of both features. Any irregularities would hint at interactions or dependencies among features (see <a href=\"https:\/\/en.wikipedia.org\/wiki\/Multicollinearity\">multicollinearity<\/a>) since contradicting trends in ICE curves cannot be explained by variation of a single feature alone. Generating a PDP curve based on inconsistent ICE curves could be misleading and therefore should be avoided. In other words, PDP curves rely on the assumption that the features are independent.<\/p>\n<p>We further see a monotonic increase on average of the predicted median house value when increasing rm (the average number of rooms per dwelling) and a monotonic decrease on average when increasing lstat (percentage of lower status of the population).<\/p>\n<h4><span class=\"ez-toc-section\" id=\"Interaction-Strength\"><\/span>Interaction Strength<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>Now that we have identified the most relevant features rm and lstat along with their kind of influence, let&#8217;s take the other features into account by measuring how strongly they interact with each other.\u00a0This is estimated by the <a href=\"https:\/\/projecteuclid.org\/download\/pdfview_1\/euclid.aoas\/1223908046\">H-statistic<\/a> which measures the amount of variation of the response that is caused by interactions. Its value will be greater than zero for a given feature if interactions with any of the other features are relevant to the model. 
If a particular feature does not interact with any other feature, the value will be zero.<\/p>\n<pre class=\"lang:r decode:true\">Interaction$new(predictor)$plot()<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15122 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/02_feature_interactions.png\" alt=\"Feature Interactions\" width=\"500\" height=\"360\" srcset=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/02_feature_interactions.png 500w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/02_feature_interactions-300x216.png 300w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/02_feature_interactions-400x288.png 400w, https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/02_feature_interactions-360x259.png 360w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/p>\n<p>We see that &#8216;dis&#8217; (weighted distances to five Boston employment centres) shows the highest overall interaction strength of approximately 0.3. In other words, 30 % of the variation of the predicted median house value caused by feature &#8216;dis&#8217; can be attributed to interactions with other features. This sounds reasonable since distance to employment centres alone is not that meaningful to the median house value. When taking other relevant features like &#8216;lstat&#8217; or &#8216;crim&#8217; into account, distance to employment centres becomes more informative. 
On the other hand, &#8216;crim&#8217;, the per capita crime rate by town, is relatively independent compared to other features.<\/p>\n<p>Let&#8217;s also take a look at the 2-way interactions of the most important features:<\/p>\n<pre class=\"lang:r decode:true\">Interaction$new(predictor, feature = \"rm\")$plot()\r\n\r\nInteraction$new(predictor, feature = \"lstat\")$plot()<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15123 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/03_individual_interactions.png\" alt=\"Two-Way Feature Interactions\" width=\"750\" height=\"350\" \/><\/p>\n<p>We see that the interactions between &#8216;rm&#8217; and &#8216;lstat&#8217; as well as between &#8216;lstat&#8217; and &#8216;dis&#8217; are most influential to the model&#8217;s predictions. What is also interesting is that interactions with &#8216;rad&#8217;,\u00a0the accessibility to radial highways, are quite pronounced, even though that feature was not identified as one of the most relevant features.<\/p>\n<h4><span class=\"ez-toc-section\" id=\"Global-Surrogate\"><\/span>Global Surrogate<span class=\"ez-toc-section-end\"><\/span><\/h4>\n<p>The last technique I want to demonstrate in this post is the Global Surrogate. The idea behind it is to approximate a complex model by a simpler model and to draw conclusions about its behaviour by looking at the structure of the simple model. In our case, the simple model will be a decision tree of depth 2. We will fit it to the original features and the outcome of the blackbox model (not the original target variable!). 
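<\/p>\n<p>To make the idea more tangible, here is a minimal sketch of how such a surrogate could be fitted by hand. It assumes the objects mod and features from above. The rpart package is swapped in purely for illustration and is not necessarily the tree implementation that iml relies on, and the names surrogate_data, y_hat, surrogate and tree_hat are made up for this example. The last expression estimates how much of the variance of the blackbox predictions the tree reproduces, which gives a rough idea of how faithful the surrogate is:<\/p>\n<pre class=\"lang:r decode:true\">library(rpart)\r\n\r\n# assumption: mod and features are the objects created earlier in this post\r\n# the predictions of the blackbox SVM become the target of the surrogate\r\nsurrogate_data &lt;- features\r\nsurrogate_data$y_hat &lt;- predict(mod, newdata = features)$data$response\r\n\r\n# fit a shallow regression tree to the blackbox predictions\r\nsurrogate &lt;- rpart(y_hat ~ ., data = surrogate_data, control = rpart.control(maxdepth = 2))\r\n\r\n# share of the variance of the blackbox predictions reproduced by the tree\r\ntree_hat &lt;- predict(surrogate, newdata = surrogate_data)\r\n1 - sum((surrogate_data$y_hat - tree_hat)^2) \/ sum((surrogate_data$y_hat - mean(surrogate_data$y_hat))^2)<\/pre>\n<p>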
The following plot shows the distribution of the predicted house values for each of the terminal nodes:<\/p>\n<pre class=\"lang:r decode:true \">TreeSurrogate$new(predictor, maxdepth = 2)$plot()<\/pre>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15124 size-full\" src=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/01\/04_global_surrogate.png\" alt=\"\" width=\"500\" height=\"450\" \/><\/p>\n<p>We see that lstat and rm were selected as the main split criteria, which is consistent with the feature importance estimate above. The distributions of the predicted outcome within the terminal nodes are clearly different, so the splits seem to be appropriate. The simple tree surrogate seems to capture the overall model&#8217;s behaviour quite well.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span>Conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>While performance metrics are crucial to evaluate a model, they lack explanations. In the vast majority of real-world tasks, it is impossible to capture every relevant aspect in a single numeric quantity to optimize for. As a consequence, one cannot be sure that a model provides sound decisions and behaves in an acceptable and predictable way. To address this issue, the prototyping workflow can be complemented by interpretability techniques in order to understand a model&#8217;s decision process. Furthermore, interpretability facilitates targeted debugging, since issues that arise from\u00a0<a href=\"https:\/\/machinelearningmastery.com\/data-leakage-machine-learning\/\">information leakage<\/a>, <a href=\"https:\/\/newonlinecourses.science.psu.edu\/stat501\/node\/343\/\">multicollinearity<\/a> or\u00a0<a href=\"https:\/\/xkcd.com\/882\/\">random correlations<\/a> can be identified (just to name a few). 
Especially when dealing with models that affect individuals or in a setting where automated decisions have significant impact, interpretability becomes critical.<\/p>\n<p>Interpretability is an active research area with thousands of academic papers already published. This blog post does not come close to covering all of its facets. I would like to mention that there are more advanced techniques like <a href=\"https:\/\/homes.cs.washington.edu\/~marcotcr\/blog\/lime\/\">LIME<\/a> and SHAP, which can be used to explain single predictions and can even be applied to image recognition and NLP tasks. Maybe these techniques will be the topic of a future blog post \ud83d\ude09<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Read-on\"><\/span>Read on<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>You might want to have a look at our <a href=\"https:\/\/www.inovex.de\/en\/our-services\/data-science-deep-learning\/\">deep learning portfolio<\/a>. If you&#8217;re looking for new challenges, you might also want to consider our <a href=\"https:\/\/www.inovex.de\/de\/karriere\/stellenangebote\/\">job offerings<\/a> for Data Scientists, ML Engineers or BI Developers.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Machine learning has a great potential to improve data products and business processes. It is used to propose products and news articles that we might be interested in as well as to steer autonomous vehicles and to challenge human experts in non-trivial games. 
Although machine learning models perform extraordinarily well in solving those tasks, we [&hellip;]<\/p>\n","protected":false},"author":103,"featured_media":15560,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"ep_exclude_from_search":false,"footnotes":""},"tags":[206,264],"service":[76,431],"coauthors":[{"id":103,"display_name":"Marcel Spitzer","user_nicename":"mspitzer"}],"class_list":["post-21108","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","tag-data-science","tag-ml-interpretability","service-artificial-intelligence","service-data-science"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.6 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Machine Learning Interpretability<\/title>\n<meta name=\"description\" content=\"Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/\" \/>\n<meta property=\"og:locale\" content=\"de_DE\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Machine Learning Interpretability\" \/>\n<meta property=\"og:description\" content=\"Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. 
Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/\" \/>\n<meta property=\"og:site_name\" content=\"inovex GmbH\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/inovexde\" \/>\n<meta property=\"article:published_time\" content=\"2019-02-13T10:10:18+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-03-19T06:30:05+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1920\" \/>\n\t<meta property=\"og:image:height\" content=\"1080\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Marcel Spitzer\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero-1024x576.png\" \/>\n<meta name=\"twitter:creator\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:site\" content=\"@inovexgmbh\" \/>\n<meta name=\"twitter:label1\" content=\"Verfasst von\" \/>\n\t<meta name=\"twitter:data1\" content=\"Marcel Spitzer\" \/>\n\t<meta name=\"twitter:label2\" content=\"Gesch\u00e4tzte Lesezeit\" \/>\n\t<meta name=\"twitter:data2\" content=\"18\u00a0Minuten\" \/>\n\t<meta name=\"twitter:label3\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data3\" content=\"Marcel Spitzer\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/\"},\"author\":{\"name\":\"Marcel Spitzer\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/5c0bae71c9067840a9ef27f9bf622b54\"},\"headline\":\"Machine Learning Interpretability: Do You Know What Your Model Is Doing?\",\"datePublished\":\"2019-02-13T10:10:18+00:00\",\"dateModified\":\"2025-03-19T06:30:05+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/\"},\"wordCount\":2735,\"commentCount\":2,\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/machine-learning-interpretability-hero.png\",\"keywords\":[\"Data Science\",\"ML Interpretability\"],\"articleSection\":[\"Analytics\",\"English Content\",\"General\"],\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/\",\"name\":\"Machine Learning 
Interpretability\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/machine-learning-interpretability-hero.png\",\"datePublished\":\"2019-02-13T10:10:18+00:00\",\"dateModified\":\"2025-03-19T06:30:05+00:00\",\"description\":\"Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#breadcrumb\"},\"inLanguage\":\"de\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/machine-learning-interpretability-hero.png\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2019\\\/02\\\/machine-learning-interpretability-hero.png\",\"width\":1920,\"height\":1080,\"caption\":\"A black box with a window cut 
out\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/machine-learning-interpretability\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Machine Learning Interpretability: Do You Know What Your Model Is Doing?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#website\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"name\":\"inovex GmbH\",\"description\":\"\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"de\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#organization\",\"name\":\"inovex GmbH\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/2021\\\/03\\\/inovex-logo-16-9-1.png\",\"width\":1921,\"height\":1081,\"caption\":\"inovex 
GmbH\"},\"image\":{\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/inovexde\",\"https:\\\/\\\/x.com\\\/inovexgmbh\",\"https:\\\/\\\/www.instagram.com\\\/inovexlife\\\/\",\"https:\\\/\\\/www.linkedin.com\\\/company\\\/inovex\",\"https:\\\/\\\/www.youtube.com\\\/channel\\\/UC7r66GT14hROB_RQsQBAQUQ\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/#\\\/schema\\\/person\\\/5c0bae71c9067840a9ef27f9bf622b54\",\"name\":\"Marcel Spitzer\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"de\",\"@id\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/cropped-22291_2-96x96.jpg029da078b4491202841b116405caf549\",\"url\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/cropped-22291_2-96x96.jpg\",\"contentUrl\":\"https:\\\/\\\/www.inovex.de\\\/wp-content\\\/uploads\\\/cropped-22291_2-96x96.jpg\",\"caption\":\"Marcel Spitzer\"},\"description\":\"As a Data\\\/ML Engineer at inovex, I build streaming and batch pipelines for data processing in distributed systems and use machine learning to make data products smarter.\",\"sameAs\":[\"https:\\\/\\\/www.linkedin.com\\\/in\\\/msp14\\\/\"],\"url\":\"https:\\\/\\\/www.inovex.de\\\/de\\\/blog\\\/author\\\/mspitzer\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Machine Learning Interpretability","description":"Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. 
Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/","og_locale":"de_DE","og_type":"article","og_title":"Machine Learning Interpretability","og_description":"Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.","og_url":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/","og_site_name":"inovex GmbH","article_publisher":"https:\/\/www.facebook.com\/inovexde","article_published_time":"2019-02-13T10:10:18+00:00","article_modified_time":"2025-03-19T06:30:05+00:00","og_image":[{"width":1920,"height":1080,"url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero.png","type":"image\/png"}],"author":"Marcel Spitzer","twitter_card":"summary_large_image","twitter_image":"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero-1024x576.png","twitter_creator":"@inovexgmbh","twitter_site":"@inovexgmbh","twitter_misc":{"Verfasst von":"Marcel Spitzer","Gesch\u00e4tzte Lesezeit":"18\u00a0Minuten","Written by":"Marcel Spitzer"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#article","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/"},"author":{"name":"Marcel 
Spitzer","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/5c0bae71c9067840a9ef27f9bf622b54"},"headline":"Machine Learning Interpretability: Do You Know What Your Model Is Doing?","datePublished":"2019-02-13T10:10:18+00:00","dateModified":"2025-03-19T06:30:05+00:00","mainEntityOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/"},"wordCount":2735,"commentCount":2,"publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero.png","keywords":["Data Science","ML Interpretability"],"articleSection":["Analytics","English Content","General"],"inLanguage":"de","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/","url":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/","name":"Machine Learning Interpretability","isPartOf":{"@id":"https:\/\/www.inovex.de\/de\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#primaryimage"},"image":{"@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#primaryimage"},"thumbnailUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero.png","datePublished":"2019-02-13T10:10:18+00:00","dateModified":"2025-03-19T06:30:05+00:00","description":"Unlike usual performance metrics, fairness, safety and transparency in machine learning models are much harder if not impossible to quantify. 
Here are some techniques (and examples) to provide interpretability, to make decision systems understandable not only for their creators, but also for their customers and users.","breadcrumb":{"@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#breadcrumb"},"inLanguage":"de","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/"]}]},{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#primaryimage","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero.png","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2019\/02\/machine-learning-interpretability-hero.png","width":1920,"height":1080,"caption":"A black box with a window cut out"},{"@type":"BreadcrumbList","@id":"https:\/\/www.inovex.de\/de\/blog\/machine-learning-interpretability\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.inovex.de\/de\/"},{"@type":"ListItem","position":2,"name":"Machine Learning Interpretability: Do You Know What Your Model Is Doing?"}]},{"@type":"WebSite","@id":"https:\/\/www.inovex.de\/de\/#website","url":"https:\/\/www.inovex.de\/de\/","name":"inovex GmbH","description":"","publisher":{"@id":"https:\/\/www.inovex.de\/de\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.inovex.de\/de\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"de"},{"@type":"Organization","@id":"https:\/\/www.inovex.de\/de\/#organization","name":"inovex 
GmbH","url":"https:\/\/www.inovex.de\/de\/","logo":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/2021\/03\/inovex-logo-16-9-1.png","width":1921,"height":1081,"caption":"inovex GmbH"},"image":{"@id":"https:\/\/www.inovex.de\/de\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/inovexde","https:\/\/x.com\/inovexgmbh","https:\/\/www.instagram.com\/inovexlife\/","https:\/\/www.linkedin.com\/company\/inovex","https:\/\/www.youtube.com\/channel\/UC7r66GT14hROB_RQsQBAQUQ"]},{"@type":"Person","@id":"https:\/\/www.inovex.de\/de\/#\/schema\/person\/5c0bae71c9067840a9ef27f9bf622b54","name":"Marcel Spitzer","image":{"@type":"ImageObject","inLanguage":"de","@id":"https:\/\/www.inovex.de\/wp-content\/uploads\/cropped-22291_2-96x96.jpg029da078b4491202841b116405caf549","url":"https:\/\/www.inovex.de\/wp-content\/uploads\/cropped-22291_2-96x96.jpg","contentUrl":"https:\/\/www.inovex.de\/wp-content\/uploads\/cropped-22291_2-96x96.jpg","caption":"Marcel Spitzer"},"description":"As a Data\/ML Engineer at inovex, I build streaming and batch pipelines for data processing in distributed systems and use machine learning to make data products 
smarter.","sameAs":["https:\/\/www.linkedin.com\/in\/msp14\/"],"url":"https:\/\/www.inovex.de\/de\/blog\/author\/mspitzer\/"}]}},"_links":{"self":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/21108","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/users\/103"}],"replies":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/comments?post=21108"}],"version-history":[{"count":5,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/21108\/revisions"}],"predecessor-version":[{"id":61297,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/posts\/21108\/revisions\/61297"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media\/15560"}],"wp:attachment":[{"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/media?parent=21108"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/tags?post=21108"},{"taxonomy":"service","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/service?post=21108"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.inovex.de\/de\/wp-json\/wp\/v2\/coauthors?post=21108"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}