Deriving Temporal Prototypes from Saliency Map Clusters for the Analysis of Deep-Learning-based Facial Action Unit Classification





Faculty/Professorship: Cognitive Systems; Explainable Machine Learning; University of Bamberg
Author(s): Finzel, Bettina; Kollmann, Rene; Rieger, Ines; Pahl, Jaspar; Schmid, Ute
Title of the compilation: Proceedings of the LWDA 2021 Workshops: FGWM, KDML, FGWI-BIA, and FGIR
Conference: LWDA 2021 : Lernen, Wissen, Daten, Analysen 2021, Online, September 1-3, 2021
Publisher Information: Aachen, Germany : RWTH Aachen
Year of publication: 2021
Pages: 86-97
Language(s): English
URL: http://ceur-ws.org/Vol-2993/paper-09.pdf
Abstract: 
Reliably determining the emotional state of a person is a difficult task for both humans and machines. Automatic detection and evaluation of facial expressions is particularly important if people are unable to express their emotional state themselves, for example due to cognitive impairments. Identifying the presence of Action Units in a human face is a psychologically validated approach to quantifying which emotion is expressed. Neural networks have been trained to automate the detection of Action Units. However, the black-box nature of deep neural networks provides no insight into the features that are relevant to a decision. Approaches from Explainable Artificial Intelligence have to be applied to explain why the network came to a certain conclusion. In this work, "Layer-Wise Relevance Propagation" (LRP) is combined with the meta-analysis approach "Spectral Relevance Analysis" (SpRAy) to derive temporal prototypes from predictions in video sequences. Temporal prototypes provide an aggregated view of the network's predictions by grouping frames with similar relevance. Additionally, a specific visualization method for temporal prototypes is presented that highlights the most relevant areas for the prediction of an Action Unit. A quantitative evaluation of our approach shows that temporal prototypes aggregate temporal information well. The proposed method can be used to generate concise visual explanations for a sequence of interpretable saliency maps. This work thus provides the foundation for a new temporal analysis method and an explanation approach intended to help researchers and experts gain a deeper understanding of how the underlying network decides which Action Units are active in a particular emotional state.
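The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-frame relevance maps (for example from LRP) are already available as arrays, uses a Gaussian affinity with a hypothetical `sigma` parameter, and splits the frames into only two clusters using the sign of the second eigenvector of the normalized graph Laplacian. The function name `temporal_prototypes` is likewise hypothetical. Each prototype here is simply the mean relevance map of one cluster.

```python
import numpy as np


def temporal_prototypes(relevance_maps, sigma=1.0):
    """Bipartition frames by spectral clustering and average each cluster.

    relevance_maps: array-like of shape (n_frames, height, width),
    assumed to be precomputed saliency maps (one per video frame).
    sigma: width of the Gaussian affinity (an illustrative assumption).
    """
    frames = np.asarray(relevance_maps, dtype=float)
    n = frames.shape[0]
    flat = frames.reshape(n, -1)

    # Pairwise squared Euclidean distances between flattened maps.
    sq_dist = ((flat[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)

    # Gaussian affinity: similar relevance maps get weights close to 1.
    affinity = np.exp(-sq_dist / (2.0 * sigma ** 2))

    # Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(affinity.sum(axis=1))
    laplacian = np.eye(n) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1] * d_inv_sqrt

    # The sign of that vector assigns each frame to one of two clusters.
    labels = (fiedler > 0).astype(int)

    # One prototype per cluster: the mean relevance map of its frames.
    prototypes = {
        int(c): frames[labels == c].mean(axis=0) for c in np.unique(labels)
    }
    return labels, prototypes


# Toy sequence: three frames with relevance in the top-left corner,
# followed by three frames with relevance in the bottom-right corner.
top_left = np.zeros((4, 4))
top_left[:2, :2] = 1.0
bottom_right = np.zeros((4, 4))
bottom_right[2:, 2:] = 1.0
maps = np.stack(
    [top_left * s for s in (1.0, 0.9, 1.1)]
    + [bottom_right * s for s in (1.0, 0.9, 1.1)]
)
labels, prototypes = temporal_prototypes(maps)
```

In this toy sequence the first three frames end up in one cluster and the last three in the other; which cluster receives label 0 depends on the arbitrary sign of the eigenvector. A fuller treatment would choose the number of clusters from the eigenvalue spectrum rather than fixing it at two.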
GND Keywords: Prototyp; Gesichtserkennung; Neuronales Netz; Cluster-Analyse; Affective Computing; Korrelationsanalyse
Keywords: Temporal Prototypes, Facial Action Unit Detection, Layer-wise Relevance Propagation, Spectral Clustering, Affective Computing, Correlation Analysis
DDC Classification: 004 Computer science  
RVK Classification: ST 301   
Type: Conference object
URI: https://fis.uni-bamberg.de/handle/uniba/55037
Release Date: August 8, 2022