Natural Multimodal Interaction in the Car - Generating Design Support for Speech, Gesture, and Gaze Interaction while Driving

Driving a modern car is more than just maneuvering the vehicle on the road. At the same time, drivers want to listen to music, operate the navigation system, compose and read messages, and more. Cars are turning from simple means of transportation into smart devices on wheels. This trend will continue in the coming years with the advent of automated vehicles. However, technical challenges, legal regulations, and high costs slow down the penetration of automated vehicles. For this reason, a great majority of people will still be driving manually for at least the next decade. Consequently, it must be ensured that all the features of novel infotainment systems can be used easily and efficiently, without distracting the driver from the task of driving, while still providing a high-quality user experience.
A promising approach to cope with this challenge is multimodal in-car interaction, that is, the combination of different input and output modalities for driver-vehicle interaction. Research has pointed out its potential to create a more flexible, efficient, and robust interaction. In addition, integrating natural interaction modalities such as speech, gestures, and gaze could make communication with the car feel more natural.
Based on these advantages, the research community in the field of automotive user interfaces has produced several interesting concepts for multimodal interaction in vehicles. The problem is that the resulting insights and recommendations are often not easily applicable in the design process of other concepts because they are either too concrete or too abstract. At the same time, the concepts focus on different aspects: some aim to reduce distraction, while others want to increase efficiency or provide a better user experience. This makes it difficult to give overarching recommendations on how to combine natural input modalities while driving. As a consequence, interaction designers of in-vehicle systems lack adequate design support that enables them to transfer existing knowledge about the design of multimodal in-vehicle applications to their own concepts.
This thesis addresses this gap by providing empirically validated design support for multimodal in-vehicle applications. It starts with a review of existing design support for automotive and multimodal applications. Based on that review, we report a series of user experiments, conducted with more than 200 participants in lab setups and driving simulators, that investigate various aspects of multimodal in-vehicle interaction. In these experiments, we assessed the potential of multimodality while driving, explored how user interfaces can support speech and gestures, and evaluated novel interaction techniques. The insights from these experiments extend existing knowledge from the literature and form the basis of the first pattern collection for multimodal natural in-vehicle interaction. The collection contains 15 patterns that describe, in a structured way, solutions for recurring problems when combining natural input with speech, gestures, or gaze in the car. Finally, we present a prototype of an in-vehicle information system that demonstrates the application of the proposed patterns, and we evaluate it in a driving-simulator experiment.
This work contributes to the field of automotive user interfaces in three ways. First, it presents the first pattern collection for multimodal natural in-vehicle interaction. Second, it illustrates and evaluates interaction techniques that combine speech and gestures with gaze input. Third, it provides empirical results from a series of user experiments that show the effects of multimodal natural interaction on factors such as driving performance, glance behavior, interaction efficiency, and user experience.

GND Keywords:

Multimodales System; Interaktion; Spracheingabe; Gestenerkennung; Gesichtserkennung; Steuerungssystem

Keywords:

multimodal interaction, in-vehicle interaction, natural input, speech input, gesture input, gaze input

DDC Classification:

004 Computer science

RVK Classification:

ST 302

Type:

Doctoral thesis

URI:

https://fis.uni-bamberg.de/handle/uniba/51826

Activation date:

December 8, 2021
