Investigation of Automated and Augmented Covariate Imputation during the Estimation of two Confirmatory Multidimensional Bayesian Latent Variable Models

Faculty/Professorship: Statistics and Econometrics ; Faculty of Social Sciences, Economics, and Business Administration: Theses
Author(s): Schnapp, Thorsten
Publisher Information: Bamberg : Otto-Friedrich-Universität
Year of publication: 2019
Pages: vi, 370 ; illustrations, diagrams
Supervisor(s): Aßmann, Christian  ; Carstensen, Claus H. ; Saalfeld, Thomas  
Language(s): English
Dissertation, Otto-Friedrich-Universität Bamberg, 2019 ; Note that this thesis uses data from the National Educational Panel Study (NEPS): Starting Cohort 4 – 9th Grade, doi:10.5157/NEPS:SC4:9.1.0. From 2008 to 2013, NEPS data were collected as part of the Framework Programme for the Promotion of Empirical Educational Research funded by the German Federal Ministry of Education and Research (BMBF). As of 2014, the NEPS survey is carried out by the Leibniz Institute for Educational Trajectories (LIfBi) at the University of Bamberg in cooperation with a nationwide network.
DOI: 10.20378/irb-46497
Licence: German Act on Copyright 
URN: urn:nbn:de:bvb:473-irb-464970
The purpose of this thesis is to investigate parameter estimation and simultaneous imputation of covariates for two multidimensional Bayesian latent variable models: a multivariate confirmatory factor analytic model and a multivariate normal-ogive item response theory (IRT) model. The research is motivated by the positive findings of Aßmann et al. (2015) and Gaasch (2017) for IRT models with one latent variable, and by the research of Preising (2018) on dynamic linear panel models. Besides estimation, these approaches relieve the researcher of the burden of specifying an imputation model for missing values in covariates by using an automated procedure, namely imputation via classification and regression trees (CART).

The models considered in this thesis, and later extended, originate from the class of confirmatory factor analytic and normal-ogive IRT settings, which find application in psychometric modelling, the social and political sciences, medicine, and many other areas. Their features are described by discussing concepts, first considerations regarding estimation within a Bayesian framework, mathematical model formulations, the different terminologies used in varying contexts, and an overview of considerations regarding model and parameter identification.

By comparing different latent variable models (i.e. Aßmann et al., 2015; Gaasch, 2017; and Conti et al., 2014) and their applications, ways of including covariates and of handling missing values are discussed. Using the model of Conti et al. (2014) as a basis, our derivations transform their exploratory factor model into a confirmatory matrix-variate factor model and a normal-ogive IRT model, each with several latent variables and with covariates incorporated in two different ways. To handle missing values in these covariates, CART imputation is employed. In addition to binary and metric scaled measurements, estimation for ordinally scaled items is facilitated by an algorithm of Albert et al. (1997), which was already implemented and tested by Gaasch (2017) in his research on IRT models with one latent variable. To allow restrictions to be defined, the indicator matrix employed by Conti et al. (2014) is extended to specify equality and multiplicative structures for loadings and item discrimination parameters, respectively. In addition to the dedicated loading structure of Conti et al. (2014), this matrix can be used to specify cross-loadings, such that a measurement can be related to several latent variables simultaneously.
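
For orientation, a multidimensional normal-ogive IRT model for a binary item can be written in the generic textbook form below, together with the latent-response (data augmentation) representation that is commonly used for Gibbs sampling. This is only a sketch: the symbols (person index i, item index j, latent vector \theta_i, discrimination vector \alpha_j, difficulty \beta_j) are assumed here for illustration and are not taken from the thesis, and covariates and the ordinal case are omitted.

```latex
% Generic textbook form; notation is assumed, not the thesis's own.
% i: person, j: binary item, \theta_i: vector of latent variables,
% \alpha_j: item discrimination vector, \beta_j: item difficulty.
\begin{align}
  \Pr(y_{ij} = 1 \mid \theta_i)
    &= \Phi\!\left(\alpha_j^{\top}\theta_i - \beta_j\right), \\
  y_{ij}^{*}
    &= \alpha_j^{\top}\theta_i - \beta_j + \varepsilon_{ij},
    \qquad \varepsilon_{ij} \sim \mathcal{N}(0, 1), \\
  y_{ij}
    &= \mathbb{1}\!\left(y_{ij}^{*} > 0\right).
\end{align}
```

In a confirmatory specification of this kind, entries of \alpha_j are fixed at zero for latent variables that item j is not meant to measure; allowing more than one non-zero entry in \alpha_j corresponds to a cross-loading.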

To test these proposals, nine simulation studies for the factor analytic model and twelve for the IRT model are performed, together with an empirical application to competency data from the National Educational Panel Study (see Blossfeld et al., 2011). The findings indicate that estimation is generally feasible and that CART imputation is beneficial within the IRT approach. However, the factor analytic specification, although generally estimable without missing values, does not provide reasonable results compared to complete-case analysis when CART imputation is used. Future research could therefore investigate the reasons for, and alternatives to, these shortcomings of the factor analytic approach, as well as general extensions of the IRT model.
GND Keywords: Multivariate Daten ; Klassifikations- und Regressionsbaum ; Bayes-Verfahren ; Gibbs-sampling ; Probabilistische Testtheorie ; Faktorenanalyse
Keywords: Multidimensional Latent Variables, Classification and Regression Trees, Bayesian, Gibbs Sampling, MCMC, Item Response Theory, Factor Analysis
DDC Classification: 310 Statistics  
RVK Classification: QH 234   
Type: Doctoral thesis
Release Date: 19 December 2019
Project: Ein Bayesianischer Modellrahmen für die Auswertung von Daten aus längsschnittlichen Large-scale Assessments

File: fisba46497_A3a.pdf
Size: 1.88 MB
Format: PDF