
REVIEW article

Front. Artif. Intell.
Sec. Machine Learning and Artificial Intelligence
Volume 6 - 2023 | doi: 10.3389/frai.2023.974295

Interpretable Neural Networks: Principles and Applications

  • 1 Fudan University, China

The final, formatted version of the article will be published soon.

In recent years, the rapid development of deep learning has driven great progress in computer vision, image recognition, pattern recognition, and speech signal processing. However, because deep neural networks are black boxes, it is difficult to explain what their parameters represent or why they perform their assigned tasks so well. The interpretability of neural networks has therefore become a research hotspot in deep learning, covering a wide range of topics in speech and text signal processing, image processing, differential equation solving, and other fields, with subtle differences in how interpretability is defined across fields. In this paper, interpretable neural network (INN) methods are divided into two directions: model-decomposition neural networks and semantically interpretable neural networks. The former constructs an interpretable neural network by converting the analytical model of a conventional method into the layers of a network, combining the interpretability of the conventional model-based method with the powerful learning capability of the neural network. This type of INN is further classified into subtypes according to the kind of model it is derived from, i.e., mathematical models, physical models, and other models. The second type is an interpretable network that attaches visual semantic information for user understanding. Its basic idea is to visualize all or part of the network structure and assign semantic information to it, which further includes convolutional-layer output visualization, decision tree extraction, semantic graphs, etc. This type of method mainly uses human visual logic to explain the structure of a black-box neural network, so it is a post-network-design approach that assigns interpretability to a black-box network afterwards, as opposed to the pre-network-design approach of model-based INNs, which designs an interpretable network structure beforehand. This paper reviews recent progress in these areas as well as various application scenarios of INNs, and discusses existing problems and future development directions.
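
To make the model-decomposition direction concrete, one widely cited construction is algorithm unrolling in the style of LISTA: the iterative shrinkage-thresholding algorithm (ISTA) for sparse coding is rewritten as a fixed number of feed-forward layers, each reproducing one iteration, and the matrices and threshold that the model prescribes become learnable parameters. The sketch below is an illustration by the editors, not an example taken from the article; the dictionary A, sparsity weight lam, and layer count are placeholders chosen only for demonstration.

```python
import numpy as np

def soft_threshold(x, theta):
    """Elementwise shrinkage: the nonlinearity of each unrolled layer."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def unrolled_ista(y, A, lam=0.1, n_layers=5):
    """Run n_layers ISTA iterations written as feed-forward layers.

    In a LISTA-style interpretable network, W_e, S and theta below would be
    trained by back-propagation instead of being fixed by the analytical model.
    """
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the data-fit gradient
    W_e = A.T / L                          # "encoder" weight derived from the model
    S = np.eye(A.shape[1]) - A.T @ A / L   # recurrent weight derived from the model
    theta = lam / L                        # threshold derived from the model
    x = soft_threshold(W_e @ y, theta)     # layer 1
    for _ in range(n_layers - 1):          # layers 2..n_layers
        x = soft_threshold(W_e @ y + S @ x, theta)
    return x

# Toy usage on a random dictionary and a 3-sparse signal (illustrative only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 42]] = [1.0, -2.0, 0.5]
y = A @ x_true
x_hat = unrolled_ista(y, A, lam=0.1, n_layers=50)
print("estimated support:", np.nonzero(np.abs(x_hat) > 0.1)[0])
```

Because every layer corresponds to a step of a known algorithm, the learned weights retain the meaning assigned to them by the conventional model, which is exactly what the model-decomposition INNs surveyed here aim for.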
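For the semantic/visualization direction, a common entry point is inspecting the feature maps that a convolutional layer produces for a given input, for example via a forward hook in PyTorch. The tiny CNN, the hooked layer index, and the random input below are placeholders for the sketch; any trained convolutional classifier and real image could be substituted.

```python
import torch
import torch.nn as nn

# Small stand-in CNN; a trained model would normally be used here.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
)

captured = {}

def save_activation(module, inputs, output):
    """Forward hook: store the convolutional feature maps for later inspection."""
    captured["feature_maps"] = output.detach()

# Attach the hook to the second convolutional layer (index 2 in this toy model).
model[2].register_forward_hook(save_activation)

image = torch.randn(1, 3, 32, 32)        # placeholder input instead of a real image
logits = model(image)

fmaps = captured["feature_maps"][0]       # shape: (16 channels, 32, 32)
# Per-channel mean activation gives a crude picture of which filters respond to
# this input; plotting each channel as an image is the usual visualization step.
print(fmaps.shape, fmaps.mean(dim=(1, 2)))
```

Visualizations of this kind are the raw material that semantic INN methods (decision tree extraction, semantic graphs, etc.) organize into human-readable explanations of a black-box network.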

Keywords: Model decomposition, Semantic graph, Interpretable neural networks, Electromagnetic neural network, Interpretability

Received: 21 Jun 2022; Accepted: 25 Sep 2023.

Copyright: © 2023 Xu and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Mx. Feng Xu, Fudan University, Shanghai, China