Extending Challenge Sets to Uncover Gender Bias in Machine Translation : Impact of Stereotypical Verbs and Adjectives
Faculty/Professorship: | Cognitive Systems |
Author(s): | Troles, Jonas-Dario |
Title of the compilation: | Sixth Conference on Machine Translation : proceedings of the conference |
Editors: | Barrault, Loic |
Corporate Body: | Association for Computational Linguistics (ACL) |
Conference: | Sixth Conference on Machine Translation, November 10-11, 2021, Punta Cana |
Publisher Information: | Stroudsburg, PA : Association for Computational Linguistics (ACL) |
Year of publication: | 2021 |
Pages: | 531–541 |
ISBN: | 978-1-954085-94-7 |
Language(s): | English |
URL: | https://aclanthology.org/2021.wmt-1.61 |
Abstract: | Human gender bias is reflected in language and text production. Because state-of-the-art machine translation (MT) systems are trained on large corpora of text, mostly generated by humans, gender bias can also be found in MT. For instance, when occupations are translated from a language like English, which mostly uses gender-neutral words, to a language like German, which mostly uses a feminine and a masculine version for an occupation, a decision must be made by the MT system. Recent research showed that MT systems are biased towards stereotypical translation of occupations. In 2019, the first, and so far only, challenge set explicitly designed to measure the extent of gender bias in MT systems was published. In this set, measurement of gender bias is solely based on the translation of occupations. With our paper we present an extension of this challenge set, called WiBeMT, which adds gender-biased adjectives and sentences with gender-biased verbs. The resulting challenge set consists of over 70,000 sentences and has been translated with three commercial MT systems: DeepL Translator, Microsoft Translator, and Google Translate. Results show a gender bias for all three MT systems. This gender bias is significantly influenced by adjectives to a great extent and by verbs to a lesser extent. |
GND Keywords: | Computerunterstützte Übersetzung; Geschlechterrolle; Bias |
Keywords: | Gender Bias, Machine Translation, Challenge Set |
DDC Classification: | 004 Computer science; 300 Social sciences, sociology & anthropology |
RVK Classification: | ST 350 |
Peer Reviewed: | Yes |
International Distribution: | Yes |
Open Access Journal: | Yes |
Type: | Conference object |
URI: | https://fis.uni-bamberg.de/handle/uniba/57197 |
Release Date: | 12 December 2022 |

originated at the
University of Bamberg