Natural Multimodal Interaction in the Car - Generating Design Support for Speech, Gesture, and Gaze Interaction while Driving

Faculty/Professorship: University of Bamberg  ; Fakultät Wirtschaftsinformatik und Angewandte Informatik: Abschlussarbeiten 
Author(s): Roider, Florian
Publisher Information: Bamberg : Otto-Friedrich-Universität
Year of publication: 2021
Pages: 18, 207 ; Illustrationen, Diagramme
Supervisor(s): Gross, Tom ; Wolter, Diedrich  
Language(s): English
Dissertation, Otto-Friedrich-Universität Bamberg, 2021
DOI: 10.20378/irb-51826
Licence: Creative Commons - CC BY - Attribution 4.0 International 
URN: urn:nbn:de:bvb:473-irb-518266
Driving a modern car is more than just maneuvering the vehicle on the road. At the same time, drivers want to listen to music, operate the navigation system, compose, and read messages and more. Future cars are turning from simple means for transportation into smart devices on wheels. This trend will continue in the next years together with the advent of automated vehicles. However, technical challenges, legal regulations, and high costs slow down the penetration of automated vehicles. For this reason, a great majority of people will still be driving manually at least for the next decade. Consequently, it must be ensured that all the features of novel infotainment systems can be used easily, efficiently without distracting the driver from the task of driving and still provide a high user experience.
A promising approach to cope with this challenge is multimodal in-car interaction. Multimodal interaction basically describes the combination of different input and output modalities for driver-vehicle interaction. Research has pointed out the potential to create a more flexible, efficient, and robust interaction. In addition to that, the integration of natural interaction modalities such as speech, gestures and gaze, the communication with the car could increase the naturalness of the interaction.
Based on these advantages, the researcher community in the field of automotive user interfaces has produced several interesting concepts for multimodal interaction in vehicles. The problem is that the resulting insights and recommendations are often easily applicable in the design process of other concepts because they too concrete or very abstract. At the same time, concepts focus on different aspects. Some aim to reduce distraction while others want to increase efficiency or provide a better user experience. This makes it difficult to give overarching recommendations on how to combine natural input modalities while driving. As a consequence, interaction designers of in-vehicle systems are lacking adequate design support that enables them to transfer existing knowledge about the design of multimodal in-vehicle applications to their own concepts.
This thesis addresses this gap by providing empirically validated design support for multimodal in-vehicle applications. It starts with a review of existing design support for automotive and multimodal applications. Based on that we report a series of user experiments that investigate various aspects of multimodal in-vehicle interaction with more than 200 participants in lab setups and driving simulators. During these experiments, we assessed the potentials of multimodality while driving, explored how user interfaces can support speech and gestures, and evaluated novel interaction techniques. The insights from these experiments extend existing knowledge from literature in order to create the first pattern collection for multimodal natural in-vehicle interaction. The collection contains 15 patterns that describe solutions for reoccurring problems when combining natural input with speech, gestures, or gaze in the car in a structured way. Finally, we present a prototype of an in-vehicle information system, which demonstrates the application of the proposed patterns and evaluate it in a driving-simulator experiment.
This work contributes to field of automotive user interfaces in three ways. First, it presents the first pattern collection for multimodal natural in-vehicle interaction. Second, it illustrates and evaluates interaction techniques that combine speech and gestures with gaze input. Third, it provides empirical results of a series of user experiments that show the effects of multimodal natural interaction on different factors such as driving performance, glance behavior, interaction efficiency, and user experience.
GND Keywords: Multimodales System; Interaktion; Spracheingabe; Gestenerkennung; Gesichtserkennung; Steuerungssystem
Keywords: multimodal interaction, in-vehicle interation, natural input, speech input, gesture input, gaze input
DDC Classification: 004 Computer science  
RVK Classification: ST 302   
Type: Doctoralthesis
Release Date: 8. December 2021

File Description SizeFormat  
fisba51826_A3a.pdf4.85 MBPDFView/Open