Relational Explanations for Visual Domains : A Neural-symbolic Approach Combining ILP and CNNs
Rabold, Johannes (2024): Relational Explanations for Visual Domains : A Neural-symbolic Approach Combining ILP and CNNs, Bamberg: Otto-Friedrich-Universität, doi: 10.20378/irb-104269.
Author:
Rabold, Johannes
Publisher Information:
Bamberg: Otto-Friedrich-Universität
Year of publication:
2024
Pages:
Supervisor:
Language:
English
Remark:
Cumulative dissertation, Universität Bamberg, 2024
DOI:
10.20378/irb-104269
Abstract:
With powerful machine learning methods leaving the lab, the need for transparency in automatic decision processes becomes apparent. Only if humans have the ability to scrutinize how a model behaves and what the rationale behind a decision was will they gain trust in a system. However, current neural network architectures are mainly black boxes, so their decisions cannot easily be comprehended. Symbolic machine learning approaches that are inherently interpretable have existed in artificial intelligence research for a long time. However, most of them are not suitable for real-world applications, since they lack the ability to work with raw (image) data, and thus an accuracy-interpretability trade-off emerges.
The research branch of eXplainable Artificial Intelligence (XAI) proposes interpretable surrogate models as a possible solution: instead of abandoning a powerful model, it is kept as is, and an additional interpretable model is generated that mimics the behavior of the black box. The gold standard for explaining image-processing approaches such as convolutional neural networks is visual attribution. For an image instance, methods like LIME or Grad-CAM output heatmaps indicating regions that were influential (in a positive or negative way) for a particular model decision. These methods can give a first idea of which constituents of an image are important, and can pinpoint flaws in the trained model that stem from, e.g., bias in the training data. However, they lack expressiveness and can obscure the importance of, e.g., relations that hold between image parts. This becomes particularly important for relational domains, where classification depends not only on the presence of image parts but also on their (spatial) constellation.
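The perturbation idea behind such surrogate attribution methods can be illustrated with a minimal sketch (not the thesis's actual method): randomly switch interpretable features (e.g., superpixels) on and off, query a hypothetical black-box model, and score each feature by how much its presence shifts the model's output on average.

```python
import random

def attribution_scores(black_box, n_features, num_samples=500, seed=0):
    """Estimate per-feature importance for a black-box classifier by
    perturbation: sample random on/off masks over interpretable features
    and compare the model's mean output with each feature present vs.
    absent. A heavily simplified version of the surrogate/attribution
    idea used by methods like LIME."""
    rng = random.Random(seed)
    on_sum = [0.0] * n_features
    on_cnt = [0] * n_features
    off_sum = [0.0] * n_features
    off_cnt = [0] * n_features
    for _ in range(num_samples):
        mask = [rng.random() < 0.5 for _ in range(n_features)]
        score = black_box(mask)
        for i, on in enumerate(mask):
            if on:
                on_sum[i] += score
                on_cnt[i] += 1
            else:
                off_sum[i] += score
                off_cnt[i] += 1
    return [on_sum[i] / max(on_cnt[i], 1) - off_sum[i] / max(off_cnt[i], 1)
            for i in range(n_features)]

# Hypothetical black box: outputs 1.0 only if features 0 AND 2 are present.
toy_model = lambda mask: 1.0 if (mask[0] and mask[2]) else 0.0
scores = attribution_scores(toy_model, n_features=4)
# Features 0 and 2 receive clearly higher scores than the irrelevant 1 and 3.
```

Note that such per-feature scores cannot express that features 0 and 2 matter *jointly*, which is exactly the limitation in relational domains described above.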
This thesis describes and analyzes methods for generating expressive symbolic relational explanations in the form of first-order logic rules. These rules inherently highlight the importance not only of visual concepts but also of the relations between them. They can be easily interpreted by humans and can even be converted to natural language in a straightforward fashion. This work presents approaches for generating explanations in a variety of problem settings: not only is it possible to explain the decision for a single image, it is also possible to explain the model as a whole. Furthermore, when access to the model parameters is given, explanations can benefit from it.
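As an illustration of why such rules translate so directly to natural language, consider a toy encoding of a first-order rule (the rule, predicates, and templates here are invented for illustration, not taken from the thesis):

```python
# A relational explanation encoded as a Horn-clause-like rule:
# head holds if all body literals hold. Each literal is (predicate, args).
rule = {
    "head": ("tower", ["Img"]),
    "body": [
        ("contains", ["Img", "A"]),
        ("contains", ["Img", "B"]),
        ("cube", ["A"]),
        ("cube", ["B"]),
        ("on_top_of", ["A", "B"]),
    ],
}

def literal_to_text(pred, args):
    # One natural-language template per predicate.
    templates = {
        "contains": "{0} contains a part {1}",
        "cube": "{0} is a cube",
        "on_top_of": "{0} is on top of {1}",
    }
    return templates[pred].format(*args)

def rule_to_text(rule):
    head_pred, head_args = rule["head"]
    conds = " and ".join(literal_to_text(p, a) for p, a in rule["body"])
    return f"{head_args[0]} is classified as '{head_pred}' if {conds}."

print(rule_to_text(rule))
# -> Img is classified as 'tower' if Img contains a part A and
#    Img contains a part B and A is a cube and B is a cube and
#    A is on top of B.
```

The relation literal `on_top_of(A, B)` carries exactly the kind of spatial-constellation information that a heatmap cannot express.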
In an attempt to approach relational explanation generation holistically, this work examines methods for staying as close as possible to the behavior of the original model, and for quantifying this "fidelity". Additionally, this work revisits the generated explanations and asks what types of explanations are particularly useful for humans.
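One common, simple way to quantify fidelity (sketched here as an assumption about the general notion, not as the thesis's specific measure) is the agreement rate between the black-box model's predictions and the surrogate's predictions on a set of instances:

```python
def fidelity(black_box_preds, surrogate_preds):
    """Fraction of instances on which the interpretable surrogate agrees
    with the black-box model -- one basic way to quantify how faithfully
    an explanation mimics the original model."""
    assert len(black_box_preds) == len(surrogate_preds)
    agree = sum(b == s for b, s in zip(black_box_preds, surrogate_preds))
    return agree / len(black_box_preds)

# Toy example: the surrogate matches the black box on 4 of 5 instances.
print(fidelity([1, 0, 1, 1, 0], [1, 0, 0, 1, 0]))  # 0.8
```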
GND Keywords: Explainable Artificial Intelligence; Maschinelles Lernen; Künstliche Intelligenz; Neuronales Netz; Induktive logische Programmierung
Keywords: Explainable Machine Learning; Artificial Intelligence; Convolutional Neural Networks; Inductive Logic Programming; Neural-symbolic
DDC Classification:
RVK Classification:
Type:
Doctoral thesis
Activation date:
November 18, 2024
Permalink
https://fis.uni-bamberg.de/handle/uniba/104269