Title: Female, white, 27? : Bias Evaluation on Data and Algorithms for Affect Recognition in Faces
Authors: Jaspar Pahl; Ines Rieger (ORCID: 0000-0002-8694-762X); Anna Möller; Thomas Wittenberg; Ute Schmid (ORCID: 0000-0002-1301-0326)
Date issued: 2022 (record created: 2022-08-08)
Type: Conference object
ISBN: 978-1-4503-9352-2
DOI: 10.1145/3531146.3533159
URI: https://fis.uni-bamberg.de/handle/uniba/55036
Language: English
Keywords: affective computing; action units; categorical emotions; metadata post-annotation; bias; fairness; data evaluation; algorithm evaluation
DDC: 004

Abstract: Nowadays, Artificial Intelligence (AI) algorithms show strong performance for many use cases, making them desirable for real-world scenarios where they provide high-impact decisions. However, one major drawback of AI algorithms is their susceptibility to bias and the resulting unfairness. This severely affects their application, as biased models exhibit a higher failure rate for certain subgroups. In this paper, we focus on the field of affective computing and particularly on the detection of bias in facial expression recognition. Depending on the deployment scenario, bias in facial expression models can have a detrimental impact, and it is therefore essential to evaluate a model's bias and limitations. In order to analyze the metadata distribution in affective computing datasets, we annotate several benchmark training datasets, covering both Action Units and categorical emotions, with age, gender, ethnicity, glasses, and beards. We show that the distributions are significantly skewed, particularly for ethnicity and age. Based on this metadata annotation, we evaluate two trained state-of-the-art affective computing algorithms. Our evaluation shows that the strongest bias concerns age: performance is best for persons under 34 and decreases sharply for older persons. Furthermore, we observe an ethnicity bias whose direction varies with the algorithm, a slight gender bias, and worse performance when facial regions are occluded by glasses.
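
The subgroup-wise evaluation outlined in the abstract can be illustrated with a minimal sketch: given model predictions, ground-truth labels, and the post-annotated metadata attributes, compute a per-group metric and compare the groups. All column names and data below are hypothetical illustrations, not taken from the paper or its datasets.

```python
# Minimal sketch of subgroup-wise bias evaluation, assuming a pandas
# DataFrame with predictions, ground-truth labels, and one of the
# post-annotated metadata attributes (here: a hypothetical age group).
import pandas as pd
from sklearn.metrics import f1_score

df = pd.DataFrame({
    "label":      ["happy", "sad", "happy", "angry", "sad",   "happy"],
    "prediction": ["happy", "sad", "sad",   "angry", "happy", "happy"],
    "age_group":  ["<34",   "<34", "34+",   "34+",   "<34",   "34+"],
})

# A large performance gap between groups (e.g. <34 vs. 34+)
# indicates a bias of the trained model toward one subgroup.
for group, part in df.groupby("age_group"):
    score = f1_score(part["label"], part["prediction"],
                     average="macro", zero_division=0)
    print(f"{group}: macro-F1 = {score:.2f}")
```

The same loop can be repeated over the other annotated attributes (gender, ethnicity, glasses, beards) to obtain the full per-attribute comparison the paper describes.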