Explainable AI: What, Why and How


Artificial intelligence (AI) is a branch of computer science that aims to create machines and systems that can perform tasks that normally require human intelligence, such as perception, reasoning, learning, decision-making, and natural language processing. AI has been advancing rapidly in recent years, thanks to the availability of large amounts of data, powerful computing resources, and novel algorithms. AI applications have been transforming various domains, such as healthcare, education, finance, entertainment, and security.

However, as AI becomes more pervasive and influential in our society, there is also a growing demand for understanding how AI works and what the implications of its decisions are. Many AI systems are based on complex and opaque models, such as deep neural networks, that are hard for humans to interpret and explain. These models are often referred to as "black boxes" because their internal logic and reasoning are hidden from users and stakeholders. This poses several challenges and risks for the adoption and trust of AI, such as:

Accountability: Who is responsible for the outcomes and consequences of AI decisions? How can we ensure that AI systems comply with ethical principles, legal regulations, and social norms?

Fairness: How can we prevent and detect bias and discrimination in AI systems? How can we ensure that AI systems respect the diversity and equality of different groups of people?

Debugging: How can we identify and correct errors and flaws in AI systems? How can we improve the performance and robustness of AI systems?

Education: How can we teach and learn about AI systems? How can we foster the skills and competencies needed to interact with and benefit from AI systems?

To address these challenges and risks, there is a need for explainable AI (XAI), also known as interpretable AI or transparent AI. 

Explainable AI (XAI) is a set of methods and techniques that aim to make AI systems more understandable and interpretable by humans. XAI can help users and stakeholders gain insights into the behaviour, logic, and outcomes of AI systems, as well as their potential impact and limitations. XAI can also enable humans to provide feedback and guidance to AI systems, to improve their performance and alignment with human values.

Types of explainable AI 

Different types of explainability can be provided by XAI methods, depending on the target audience, the purpose, and the context of the explanation. Some common types of explainability are:

Global explainability: This refers to the overall understanding of how an AI system works, including its main components, features, parameters, assumptions, objectives, and limitations. Global explainability can help developers design and evaluate AI systems, and help regulators audit and monitor them.

Local explainability: This refers to the specific understanding of how an AI system produces a particular output or decision for a given input or situation. Local explainability can help users comprehend and trust the results and recommendations of AI systems, as well as challenge or appeal them if needed.

Contrastive explainability: This refers to the comparative understanding of why an AI system prefers one output or decision over another. Contrastive explainability can help users explore alternative scenarios and outcomes, as well as understand the trade-offs and preferences of AI systems.

Counterfactual explainability: This refers to the hypothetical understanding of how an AI system would change its output or decision if some aspects of the input or situation were different. Counterfactual explainability can help users identify the causal factors and influences of AI decisions, as well as discover ways to modify or optimize them.
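
To make the counterfactual idea concrete, the following minimal Python sketch trains a simple classifier on synthetic data and then searches for the smallest change to a single feature that flips the model's prediction. The data, feature indices, and step sizes are illustrative assumptions, not part of any particular XAI tool.

```python
# Minimal counterfactual sketch: nudge a single feature of one input until
# the model's prediction flips, revealing which change would alter the decision.
# All data and feature indices here are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                      # three synthetic features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)      # label depends mostly on feature 0

model = LogisticRegression().fit(X, y)

x = np.array([[-0.8, 0.1, 0.3]])                   # an instance predicted as class 0
original = model.predict(x)[0]

# Search over increasing values of feature 0 for the smallest change that flips the prediction.
for delta in np.linspace(0, 3, 61):
    candidate = x.copy()
    candidate[0, 0] += delta
    if model.predict(candidate)[0] != original:
        print(f"Prediction flips from class {original} when feature 0 increases by {delta:.2f}")
        break
```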

Methods of explainable AI 

Different methods of explainability can be applied by XAI techniques, depending on the type and complexity of the AI system, the level and granularity of the explanation, the format and medium of the explanation, etc. Some common methods of explainability are:

Intrinsic methods: These methods aim to make the AI system itself more interpretable and transparent by using simpler or more structured models, such as decision trees, linear models, or rule-based systems. Intrinsic methods can provide global or local explanations that are directly derived from the model structure or parameters. A short sketch of this approach appears after this list.

Post-hoc methods: These methods aim to generate explanations for an existing AI system without modifying it by using external techniques, such as feature importance, feature attribution, saliency maps, or surrogate models. Post-hoc methods can provide local or contrastive explanations that are based on analyzing the input-output relationship or approximating the model behaviour.

Interactive methods: These methods aim to facilitate explanations for an AI system through human-machine interaction by using dialogues, visualizations, or examples. Interactive methods can provide local or counterfactual explanations that are tailored to the user's needs or preferences.
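
As a minimal illustration of the intrinsic approach mentioned above, the sketch below fits a shallow decision tree with scikit-learn and prints its learned rules directly, so the model is effectively its own explanation. The dataset choice and depth limit are illustrative assumptions.

```python
# Intrinsic explainability sketch: a shallow decision tree whose learned rules
# can be printed as human-readable if/else statements over the input features.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# export_text renders the fitted tree as a plain-text rule list.
print(export_text(tree, feature_names=list(data.feature_names)))
```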

Examples of explainable AI

There are many examples of XAI applications and tools that have been developed and used in various domains and contexts. Some of them are:

IBM Explainable AI: This is a set of tools and frameworks that allows human users to comprehend and trust the results and output created by machine learning algorithms. IBM Explainable AI is used to describe an AI model, its expected impact and potential biases. It helps characterize model accuracy, fairness, transparency and outcomes in AI-powered decision-making.

Google Cloud Explainable AI: This is a set of tools and frameworks to help users understand and interpret predictions made by machine learning models, natively integrated with several of Google's products and services. Google Cloud Explainable AI provides feature attributions for model predictions in AutoML Tables, BigQuery ML and Vertex AI, and visually investigates model behavior using the What-If Tool.

LIME: This is a post-hoc method that explains the predictions of any machine learning classifier by learning an interpretable model locally around the prediction. LIME generates explanations by perturbing the input, observing the changes in the output, and then fitting a simple model (such as a linear model or a decision tree) to approximate the local behaviour of the complex model. A short usage sketch appears after this list.

SHAP: This is a post-hoc method that explains the output of any machine learning model using Shapley values, which are a concept from game theory that measures how much each feature contributes to the prediction. SHAP assigns each feature an importance score for a particular prediction, based on the average marginal contribution of that feature across all possible subsets of features.

XAI Project: This is a research project funded by the Defense Advanced Research Projects Agency (DARPA) that aims to create a suite of machine learning techniques that produce more explainable models while maintaining a high level of performance. The XAI Project also aims to design human-computer interfaces that enable human users to understand, trust, and manage the emerging generation of artificially intelligent systems.
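
Below is a minimal, hedged usage sketch for LIME on tabular data, assuming the open-source `lime` and `scikit-learn` packages are installed; the dataset and model are placeholders chosen only for illustration.

```python
# LIME usage sketch (assumes `pip install lime scikit-learn`).
# The model and dataset here are illustrative placeholders.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Explain one prediction: LIME perturbs this instance, observes how the model's
# probabilities change, and fits a local linear model as the explanation.
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=5
)
print(explanation.as_list())   # top features with their local weights
```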


Challenges of explainable AI


Some challenges of explainable AI are:

Complexity: AI systems are often based on complex and opaque models, such as deep neural networks, that are hard for humans to interpret and explain. These models may have millions of parameters, nonlinear interactions, and hidden layers that make their logic and reasoning difficult to understand. Moreover, AI systems may operate in dynamic and uncertain environments, where the input data, the output decisions, and the context may change over time. Therefore, providing clear and consistent explanations for AI systems may be challenging.

Trade-off: There may be a trade-off between the performance and the explainability of AI systems. Some AI systems may achieve high accuracy, robustness, and generalization by using complex and opaque models, but at the cost of losing explainability. On the other hand, some AI systems may enhance explainability by using simpler or more structured models, but at the cost of sacrificing performance. Therefore, finding a balance between the performance and the explainability of AI systems may be challenging.

Diversity: There may be a diversity of users, stakeholders, and purposes for explainable AI systems. Different users and stakeholders may have different backgrounds, expectations, preferences, and goals when interacting with AI systems. Different purposes may require different types of explanations, such as global or local, contrastive or counterfactual, intrinsic or post-hoc, interactive or static. Therefore, providing appropriate and tailored explanations for different users, stakeholders, and purposes may be challenging.

Evaluation: There may be a lack of standard methods and metrics for evaluating the quality and effectiveness of explainable AI systems. Different explainable AI methods may have different assumptions, limitations, and advantages that make them suitable for different scenarios and applications. Different evaluation criteria may include accuracy, consistency, completeness, fidelity, coherence, relevance, usefulness, satisfaction, trust, etc. Therefore, comparing and assessing different explainable AI methods and systems may be challenging.

One possible way to overcome the complexity challenge of explainable AI is to use post-hoc methods, which generate explanations for an existing AI system without modifying it. Post-hoc methods can provide local or contrastive explanations that are based on analyzing the input-output relationship or approximating the model behaviour. For example, LIME and SHAP are two popular post-hoc methods that explain the predictions of any machine learning model by using feature importance, feature attribution, or surrogate models. Post-hoc methods can help users and stakeholders gain insights into the behaviour and outcomes of complex AI systems, as well as their potential impact and limitations. However, post-hoc methods may also have some drawbacks, such as being computationally expensive, inconsistent, incomplete, or inaccurate. Therefore, post-hoc methods should be used with caution and validation.

Different methods and metrics can be used to evaluate the quality and effectiveness of post-hoc methods of explainable AI.

Some of the common criteria used to assess the explanations generated by post-hoc methods are:

Accuracy: This measures how well the explanation matches the output or decision of the original AI system. For example, one can use the mean squared error (MSE) or the R-squared (R2) score to measure the accuracy of a surrogate model that approximates the behaviour of a complex model.

Consistency: This measures how stable and robust the explanation is across different inputs or situations. For example, one can use the Lipschitz constant or the stability score to measure the consistency of a feature attribution method that assigns importance scores to each feature for a given prediction.

Completeness: This measures how much information the explanation provides about the AI system. For example, one can use the coverage or the sparsity score to measure the completeness of a saliency map method that highlights the relevant regions of an input image for a given prediction.

Fidelity: This measures how faithful and reliable the explanation is to the AI system. For example, one can use the faithfulness or the sensitivity score to measure the fidelity of a feature importance method that ranks the features according to their contribution to a given prediction.

Coherence: This measures how understandable and intuitive the explanation is to humans. For example, one can use the plausibility or the persuasiveness score to measure the coherence of a natural language explanation that describes the rationale behind a given prediction.

Relevance: This measures how useful and informative the explanation is for a specific purpose or context. For example, one can use the satisfaction or trust score to measure the relevance of an interactive explanation that allows users to ask questions or provide feedback about a given prediction.

These criteria can be measured quantitatively using various methods, such as mathematical formulas, statistical tests, simulation experiments, or user surveys. However, there is no single best method or metric that can capture all aspects of explainability, and different methods and metrics may have different assumptions, limitations, and advantages. Therefore, it is important to choose appropriate methods and metrics that suit the type and complexity of the AI system, the level and granularity of the explanation, and the format and medium of the explanation.
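
As one concrete illustration of the accuracy and fidelity criteria above, the sketch below fits a simple linear surrogate to a more complex model's predictions and reports the surrogate's R-squared score against the model; the choice of models and the synthetic data are illustrative assumptions.

```python
# Sketch of measuring surrogate fidelity: train a complex model, fit a simple
# surrogate to its predictions, and score how closely the surrogate tracks it (R^2).
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, _ = train_test_split(X, y, random_state=0)

complex_model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# The surrogate is trained on the complex model's outputs, not the true labels,
# because the goal is to explain the model rather than the data.
surrogate = LinearRegression().fit(X_train, complex_model.predict(X_train))

fidelity = r2_score(complex_model.predict(X_test), surrogate.predict(X_test))
print(f"Surrogate R^2 against the complex model: {fidelity:.3f}")
```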

Some examples of post-hoc methods of explainable AI are:

LIME: This is a method that explains the predictions of any machine learning model by learning an interpretable model locally around the prediction. LIME generates explanations by perturbing the input, observing the changes in the output, and then fitting a simple model (such as a linear model or a decision tree) to approximate the local behaviour of the complex model.

SHAP: This is a method that explains the output of any machine learning model using Shapley values, which are a concept from game theory that measures how much each feature contributes to the prediction. SHAP assigns each feature an importance score for a particular prediction, based on the average marginal contribution of that feature across all possible subsets of features. A minimal usage sketch appears after this list.

LRP: This is a method that explains the predictions of deep neural networks by propagating the relevance of the output back to the input layer. LRP assigns each neuron a relevance score that indicates how much it contributes to the prediction, based on the activation and weight values of the network.

Grad-CAM: This is a method that explains the predictions of convolutional neural networks for image classification by generating saliency maps that highlight the regions of the input image that are most relevant for the prediction. Grad-CAM computes the gradient of the output with respect to the feature maps of a convolutional layer, pools these gradients into per-channel weights, and then uses a weighted combination of the feature maps to produce a coarse localization map.
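
The following is a minimal usage sketch for SHAP on a tree ensemble, assuming the open-source `shap` and `scikit-learn` packages are installed; the dataset, model, and sample size are illustrative placeholders, and the exact shape of the returned attributions can vary between shap versions.

```python
# SHAP usage sketch (assumes `pip install shap scikit-learn`).
# Data and model choices are illustrative only.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

# TreeExplainer computes Shapley-value attributions efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data[:50])

# shap_values holds one attribution per feature per instance (per class for classifiers);
# larger absolute values mean a larger contribution to that prediction.
print(shap_values[0].shape if isinstance(shap_values, list) else shap_values.shape)
```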

Ensuring the fairness and accountability of XAI systems is a complex and multifaceted challenge that requires the collaboration of different stakeholders, such as developers, users, regulators, and society. 

Some possible ways to address this challenge are:

Designing AI systems with explainability in mind: 

This means choosing appropriate AI models, methods, and techniques that can provide meaningful and understandable explanations for their behaviour, logic, and outcomes. This also means considering the needs, preferences, and expectations of the target audience and the purpose and context of the explanation. For example, one can use intrinsic methods to make the AI system itself more interpretable and transparent or use post-hoc methods to generate explanations for an existing AI system without modifying it.

Evaluating the quality and effectiveness of explanations: 

This means measuring and assessing how well the explanations match the criteria of accuracy, consistency, completeness, fidelity, coherence, relevance, usefulness, satisfaction, trust, etc. This also means using appropriate methods and metrics to quantify and compare different explanations, such as mathematical formulas, statistical tests, simulation experiments, or user surveys.

Implementing ethical principles and legal regulations for AI systems: 

This means ensuring that AI systems comply with the values and norms of human society, such as fairness, privacy, safety, accountability, transparency, etc. This also means establishing clear and enforceable rules and standards for the development, deployment, and use of AI systems, such as ethical guidelines, legal frameworks, certification schemes, auditing mechanisms, etc.

Educating and empowering users and stakeholders about AI systems: 

This means providing accessible and comprehensible information and knowledge about AI systems and their explanations. This also means enabling users and stakeholders to interact with and influence AI systems, such as asking questions, providing feedback, challenging or appealing decisions, modifying or optimizing outcomes, etc.

Bias and discrimination are serious issues that can affect the fairness and trustworthiness of XAI systems. Bias and discrimination can occur when an XAI system produces outputs or decisions that are unfair, inaccurate, or harmful to certain groups of people, based on their characteristics, such as age, gender, race, ethnicity, religion, disability, etc. 

To prevent and detect bias and discrimination in XAI systems, several steps can be taken, such as:

Designing XAI systems with explainability in mind:

This means choosing appropriate AI models, methods, and techniques that can provide meaningful and understandable explanations for their behaviour, logic, and outcomes. This also means considering the needs, preferences, and expectations of the target audience and the purpose and context of the explanation. For example, one can use intrinsic methods to make the AI system itself more interpretable and transparent or use post-hoc methods to generate explanations for an existing AI system without modifying it.

Evaluating the quality and effectiveness of explanations: 

This means measuring and assessing how well the explanations match the criteria of accuracy, consistency, completeness, fidelity, coherence, relevance, usefulness, satisfaction, trust, etc. This also means using appropriate methods and metrics to quantify and compare different explanations, such as mathematical formulas, statistical tests, simulation experiments, or user surveys.

Implementing ethical principles and legal regulations for XAI systems:

This means ensuring that XAI systems comply with the values and norms of human society, such as fairness, privacy, safety, accountability, transparency, etc. This also means establishing clear and enforceable rules and standards for the development, deployment, and use of XAI systems, such as ethical guidelines, legal frameworks, certification schemes, auditing mechanisms, etc.

Educating and empowering users and stakeholders about XAI systems:

This means providing accessible and comprehensible information and knowledge about XAI systems and their explanations. This also means enabling users and stakeholders to interact with and influence XAI systems, such as asking questions, providing feedback, challenging or appealing decisions, modifying or optimizing outcomes, etc.

Investing more in diversifying the AI field itself:

This means increasing the representation and participation of different groups of people in the AI community, such as researchers, developers, users, etc. A more diverse AI community would be better equipped to anticipate, review, and spot bias and discrimination in XAI systems and engage communities affected.

Some examples of explainable AI applications in healthcare are:

Disease diagnosis: Explainable AI can help doctors and patients understand how an AI system reaches a diagnosis based on input data, such as medical images, symptoms, or test results. For example, LIME and SHAP are two post-hoc methods that explain the predictions of any machine learning model by using feature importance, feature attribution, or surrogate models.

Treatment protocol: Explainable AI can help doctors and patients understand how an AI system recommends a treatment plan based on the patient's condition, preferences, and goals. For example, IBM Explainable AI is a set of tools and frameworks that allows human users to comprehend and trust the results and output created by machine learning algorithms. IBM Explainable AI is used to describe an AI model, its expected impact and potential biases.

New drug development: Explainable AI can help researchers and regulators understand how an AI system discovers new drug candidates or optimizes existing ones based on the molecular structure, biological activity, and pharmacological properties. For example, DeepChem is a deep learning framework that uses neural networks to model chemical systems and generate explanations for their predictions.

Personalized medicine: Explainable AI can help doctors and patients understand how an AI system tailors a treatment plan to the individual characteristics and needs of each patient, such as their genetic profile, lifestyle, or response to previous treatments. For example, Deep Genomics is a company that uses deep learning to analyze genomic data and provide explanations for how genetic variants affect disease risk and drug response.

How to ensure the privacy and security of XAI systems

One possible way to ensure the privacy and security of XAI systems is to design them with privacy in mind. This can include using secure protocols to protect data and using anonymized data sets to ensure that sensitive information is not exposed. Organizations should also consider using data masking techniques to further protect sensitive data. 

Data masking is a process of transforming or replacing original data with fictitious but realistic data, such as replacing names, addresses, or phone numbers with fake ones. Data masking can help preserve the privacy of individuals and organizations, while still allowing XAI systems to provide meaningful and understandable explanations. Data masking can also help prevent data breaches, as it can make it harder for malicious actors to access or misuse sensitive data. Data masking can be applied at different stages of the data life cycle, such as during collection, storage, processing, or analysis. 

Data masking can also be applied at different levels of granularity, such as at the record level, the field level, or the value level. Data masking can also use different techniques, such as encryption, hashing, substitution, shuffling, or generalization. Data masking can be customized according to the specific needs and requirements of each XAI system and application. For example, one can use encryption or hashing to mask sensitive data that is not needed for explanation purposes, such as personal identifiers or passwords. One can also use substitution or shuffling to mask sensitive data that is needed for explanation purposes, but not in its original form, such as demographic attributes or medical records. One can also use generalization to mask sensitive data that is needed for explanation purposes, but only at a certain level of detail, such as geographic locations or income ranges. 
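
A toy Python sketch of some of these masking techniques (hashing, shuffling, and generalization) is shown below; the records, field names, and hash truncation are purely illustrative assumptions, not a production-ready privacy mechanism.

```python
# Data-masking sketch illustrating hashing, shuffling, and generalization on a
# toy record set. Field names and values are entirely fictitious.
import hashlib
import random

records = [
    {"name": "Alice Example", "age": 34, "city": "Springfield", "income": 52000},
    {"name": "Bob Example",   "age": 58, "city": "Shelbyville", "income": 87000},
]

def hash_value(value: str) -> str:
    """Hashing: replace an identifier with an irreversible digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]

def generalize_age(age: int) -> str:
    """Generalization: keep only a coarse range instead of the exact value."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

# Shuffling: permute a sensitive column across records so values stay realistic
# but are no longer linked to the right individual.
incomes = [r["income"] for r in records]
random.Random(0).shuffle(incomes)

masked = [
    {
        "name": hash_value(r["name"]),
        "age": generalize_age(r["age"]),
        "city": r["city"],
        "income": shuffled_income,
    }
    for r, shuffled_income in zip(records, incomes)
]
print(masked)
```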

Data masking is one of the possible solutions for ensuring the privacy and security of XAI systems. However, data masking may also have some limitations and challenges, such as:

Performance: Data masking may affect the performance and accuracy of XAI systems, as it may introduce noise or distortion into the data. For example, data masking may reduce the variability or diversity of the data, which may affect the quality and fidelity of the explanations. Data masking may also increase the complexity or cost of the data processing, which may affect the efficiency and scalability of the XAI systems.

Trade-off: Data masking may involve a trade-off between the privacy and the explainability of XAI systems. For example, data masking may increase the privacy of the data by reducing its identifiability or linkability, but it may also decrease the explainability of the data by reducing its interpretability or informativeness. Data masking may also increase the explainability of the data by preserving its relevance or usefulness, but it may also decrease the privacy of the data by increasing its vulnerability or exposure.

Diversity: Data masking may need to consider the diversity of users, stakeholders, and purposes for XAI systems. For example, different users and stakeholders may have different expectations and preferences for how their data is masked and how their explanations are provided. Different purposes may require different levels and types of data masking and explanation. Therefore, providing appropriate and tailored data masking and explanation for different users, stakeholders, and purposes may be challenging.

Therefore, data masking should be used with caution and validation when designing XAI systems with privacy in mind. 

Data masking should also be combined with other methods and techniques to ensure the privacy and security of XAI systems, such as:

Implementing ethical principles and legal regulations for XAI systems: This means ensuring that XAI systems comply with the values and norms of human society, such as fairness, privacy, safety, accountability, transparency, etc. This also means establishing clear and enforceable rules and standards for the development, deployment, and use of XAI systems.

Educating and empowering users and stakeholders about XAI systems: This means providing accessible and comprehensible information and knowledge about XAI systems and their explanations. This also means enabling users and stakeholders to interact with and influence XAI systems.

Some examples of explainable AI applications in finance are:

Credit scoring: Explainable AI can help lenders and borrowers understand how an AI system evaluates the creditworthiness of a loan applicant based on input data, such as income, expenses, assets, liabilities, or credit history. For example, FICO Explainable Machine Learning is a framework that combines machine learning models with decision trees to provide transparent and intuitive explanations for credit scores. A generic illustrative sketch appears after this list.

Fraud detection: Explainable AI can help banks and customers understand how an AI system detects fraudulent transactions based on input data, such as transaction amount, location, time, or device. For example, Mastercard Decision Intelligence is a platform that uses neural networks to analyze transaction data and provide explanations for fraud alerts.

Portfolio management: Explainable AI can help investors and advisors understand how an AI system recommends a portfolio allocation based on input data, such as risk profile, return objectives, market conditions, or personal preferences. For example, BlackRock Aladdin is a system that uses natural language processing to generate natural language explanations for portfolio decisions.
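
To give a generic flavour of the kind of per-feature reasoning that credit-scoring explanations expose (this is not a reproduction of FICO's or any other vendor's method), the sketch below trains a logistic regression on synthetic applicant data and reads each feature's contribution directly from the model's coefficients; all features, weights, and data are invented for illustration.

```python
# Generic credit-scoring explanation sketch: with a linear model, per-feature
# contributions are directly readable as coefficient * standardized value.
# All feature names and numbers are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
features = ["income", "debt_ratio", "late_payments", "credit_age_years"]
X = rng.normal(size=(1000, 4))
# Synthetic rule: high debt ratio and late payments hurt, income and credit age help.
y = ((X[:, 0] - X[:, 1] - X[:, 2] + 0.5 * X[:, 3]) > 0).astype(int)

scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

applicant = scaler.transform(X[:1])          # one applicant's standardized features
contributions = model.coef_[0] * applicant[0]
for name, c in sorted(zip(features, contributions), key=lambda t: -abs(t[1])):
    print(f"{name:>18}: {c:+.3f}")
print("approval probability:", model.predict_proba(applicant)[0, 1].round(3))
```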

XAI is not a different type of AI, but rather a set of methods and techniques that aim to make AI systems more understandable and interpretable by humans. XAI can help users and stakeholders gain insights into the behaviour, logic, and outcomes of AI systems, as well as their potential impact and limitations. XAI can also enable humans to provide feedback and guidance to AI systems, to improve their performance and alignment with human values. XAI methods can be classified into different types and categories, depending on the target audience, the purpose, the context, and the format of the explanation. 

For example, some common types of explainability are:

Global explainability: This refers to the overall understanding of how an AI system works, including its main components, features, parameters, assumptions, objectives, and limitations.

Local explainability: This refers to the specific understanding of how an AI system produces a particular output or decision for a given input or situation.

Contrastive explainability: This refers to the comparative understanding of why an AI system prefers one output or decision over another.

Counterfactual explainability: This refers to the hypothetical understanding of how an AI system would change its output or decision if some aspects of the input or situation were different.

Some common methods of explainability are:

Intrinsic methods: These methods aim to make the AI system itself more interpretable and transparent by using simpler or more structured models, such as decision trees, linear models, or rule-based systems.

Post-hoc methods: These methods aim to generate explanations for an existing AI system without modifying it by using external techniques, such as feature importance, feature attribution, saliency maps, or surrogate models.

Interactive methods: These methods aim to facilitate explanations for an AI system through human-machine interaction by using dialogues, visualizations, or examples.

Conclusion

Explainable AI is an important and emerging field that seeks to make AI systems more understandable and interpretable by humans. XAI can help users and stakeholders gain insights into the behaviour, logic, and outcomes of AI systems, as well as their potential impact and limitations. XAI can also enable humans to provide feedback and guidance to AI systems, to improve their performance and alignment with human values. XAI methods can be classified into different types and categories, depending on the target audience, the purpose, the context, and the format of the explanation. XAI applications and tools can be found in various domains and contexts, such as healthcare, education, finance, entertainment, and security. XAI is crucial for building trust and confidence when putting AI models into production, as well as adopting a responsible approach to AI development.
