Towards Multimodal Emotion Recognition in German Speech Events in Cars using Transfer Learning
Cevher, Deniz; Zepf, Sebastian; Klinger, Roman (2019): Towards Multimodal Emotion Recognition in German Speech Events in Cars using Transfer Learning, in: Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019), German Society for Computational Linguistics & Language Technology, pp. 79–90.
Faculty/Chair:
Author:
Cevher, Deniz; Zepf, Sebastian; Klinger, Roman
Title of the compilation:
Proceedings of the 15th Conference on Natural Language Processing (KONVENS 2019)
Conference:
KONVENS
Publisher Information:
German Society for Computational Linguistics & Language Technology
Year of publication:
2019
Pages:
79–90
Language:
English
Abstract:
The recognition of emotions by humans is a complex process which considers multiple interacting signals such as facial expressions and both prosody and semantic content of utterances. Commonly, research on automatic recognition of emotions is, with few exceptions, limited to one modality. We describe an in-car experiment for emotion recognition from speech interactions for three modalities: the audio signal of a spoken interaction, the visual signal of the driver’s face, and the manually transcribed content of utterances of the driver. We use off-the-shelf tools for emotion detection in audio and face and compare that to a neural transfer learning approach for emotion recognition from text which utilizes existing resources from other domains. We see that transfer learning enables models based on out-of-domain corpora to perform well. This method contributes up to 10 percentage points in F1, with up to 76 micro-average F1 across the emotions joy, annoyance and insecurity. Our findings also indicate that off-the-shelf tools analyzing face and audio are not yet ready for emotion detection in in-car speech interactions without further adjustments.
GND Keywords:
Sprachverarbeitung (speech processing); Gefühl (emotion); Kraftwagen (motor vehicle)
Keywords:
Multimodal Emotion Recognition
DDC Classification:
RVK Classification:
Peer Reviewed:
Yes
Type:
Conference object
Activation date:
March 7, 2024
Permalink
https://fis.uni-bamberg.de/handle/uniba/93915