Interpretability vs Explainability
Interpretability and explainability are two related concepts that describe the degree to which a human can understand and trust the outputs of a machine learning model or system.
Interpretability refers to the ability to understand the internal mechanics of a system or model. In the context of machine learning, an interpretable model is one whose predictions can be easily understood and traced back to the model's input features and parameters. Linear models, decision trees, and rule-based systems are often considered interpretable because their decision-making processes are transparent and easy to follow.
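As a minimal sketch of an inherently interpretable model, the example below trains a shallow decision tree with scikit-learn and prints its learned rules; the iris dataset and the depth limit are illustrative assumptions, not requirements.

# A shallow decision tree whose learned rules can be printed and followed by hand.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
feature_names = load_iris().feature_names

# Keep the tree shallow so the whole decision process stays readable.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# export_text prints the learned if/else rules, tracing every prediction
# back to thresholds on the input features.
print(export_text(tree, feature_names=feature_names))

Reading the printed rules is itself the interpretation: every prediction corresponds to a short chain of threshold tests on named input features.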
Explainability refers to the ability of a model to provide understandable reasons for its decisions. It focuses on post-hoc explanations or justifications for predictions made by models that are not inherently interpretable. This is particularly important for complex models, such as deep neural networks, whose internal workings are difficult to inspect directly. By highlighting relevant features or patterns in the input data, explainability tools and techniques aim to explain why a model made a specific prediction.
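As an illustrative sketch of a post-hoc explanation, the example below treats a random forest as a black box and uses scikit-learn's permutation importance to highlight which input features the trained model relies on; the breast cancer dataset and the choice of model are assumptions made for the example.

# Post-hoc explanation of a black-box model via permutation importance:
# shuffle each feature and measure how much the test score drops.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, random_state=0)

# Train the model first; the explanation is computed afterwards.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)

# Report the five features whose shuffling hurts performance the most.
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{data.feature_names[idx]}: {result.importances_mean[idx]:.3f}")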
Global and local explainability are two perspectives on how to understand and interpret the decisions made by machine learning models. They differ in the level of granularity at which explanations are provided. Global explainability aims to understand how a model makes decisions across the entire dataset, while local explainability focuses on a specific prediction or instance, explaining why the model produced a particular output for a single input.
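The following sketch contrasts the two views using a linear model, where both are easy to compute by hand: coefficient magnitudes give a global picture of which features drive predictions across the whole dataset, while per-feature contributions (coefficient times feature value) explain one specific prediction. The diabetes dataset is used purely for illustration.

# Global vs. local explanations for a linear regression model.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

data = load_diabetes()
model = LinearRegression().fit(data.data, data.target)

# Global view: which features drive predictions over the entire dataset?
global_importance = np.abs(model.coef_)
for idx in global_importance.argsort()[::-1][:3]:
    print(f"global  {data.feature_names[idx]}: |coef| = {global_importance[idx]:.1f}")

# Local view: why did the model predict this value for one specific instance?
x = data.data[0]
contributions = model.coef_ * x
for idx in np.abs(contributions).argsort()[::-1][:3]:
    print(f"local   {data.feature_names[idx]}: contribution = {contributions[idx]:.1f}")

# The local contributions add up exactly to the prediction for this instance.
print(f"prediction = intercept ({model.intercept_:.1f}) + sum of contributions "
      f"= {model.intercept_ + contributions.sum():.1f}")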