Bayesian estimation of latent trait distributions considering hierarchical structures and partially missing covariate data

Faculty/Professorship: Statistics and Econometrics; Leibniz Institute for Educational Trajectories (LIfBi)
Author(s): Gaasch, Jean-Christoph
Publisher Information: Bamberg : opus
Year of publication: 2017
Pages: xii, 155; illustrations, diagrams
Supervisor(s): Rässler, Susanne; Carstensen, Claus H.  
Language(s): English
Dissertation, Otto-Friedrich-Universität Bamberg, 2017
DOI: 10.20378/irbo-50290
Licence: German Act on Copyright 
URN: urn:nbn:de:bvb:473-opus4-502904
Large-scale studies in the social sciences often measure latent constructs and then investigate their relationship with additional variables in subsequent analyses. In this context, the analyst faces three problems. First, the indicators that measure the trait of interest introduce measurement uncertainty. Second, large-scale studies typically exhibit hierarchical structures, caused by the sampling design or by a composite population consisting of clustered observations. Third, uncertainty arises from missing values in covariates related to the latent construct. This thesis provides a Bayesian estimation strategy that addresses all three issues simultaneously. I start from the class of latent regression item response models, which combine measurement models with structural analysis, and develop a novel algorithm based on data augmentation. Both binary and ordered polytomous items can be included in the analysis. Population heterogeneity is taken into account through multigroup, finite mixture, or random intercept specifications. Sampling from the posterior distribution of the parameters is extended by sampling from the full conditional distributions of missing values in person covariates. Approximations to the distributions of missing values are constructed from classification and regression trees, which allows metric and categorical variables as well as nonlinear relationships to be incorporated flexibly. The statistical accuracy of the proposed strategy is evaluated in two simulation studies that control the missing-data-generating mechanism. I show that the novel algorithm recovers all involved parameters in both scenarios and clearly outperforms stochastic regression imputation and complete-case analysis.
Two illustrations using data from the National Educational Panel Study, on the mathematical abilities and on eating disorders of ninth-grade students, demonstrate the empirical usefulness of the method. Finally, I introduce an R package that implements the estimation routines presented in the thesis.
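The abstract names data augmentation as the device behind the sampler. As a rough illustration of that idea only, the sketch below applies it to a plain probit regression in Python: each binary response gets a latent normal variable, and the sampler alternates between drawing those latent variables and drawing the regression coefficients. This is a deliberately stripped-down stand-in and not the thesis's algorithm or its R package: it has no item parameters, no hierarchical structure, and no missing covariates. The function names, the flat prior on the coefficients, and the simulated data are all assumptions made for the example.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf and inverse cdf


def draw_truncated_normal(mu, positive, rng):
    """Draw z ~ N(mu, 1) restricted to z > 0 (positive=True) or z <= 0."""
    p_zero = _STD.cdf(-mu)  # P(z <= 0) for the untruncated variable
    u = rng.uniform()
    # Map u into the probability range that belongs to the allowed side.
    p = p_zero + u * (1.0 - p_zero) if positive else u * p_zero
    # Numerical guard: inv_cdf requires a probability strictly in (0, 1).
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return mu + _STD.inv_cdf(p)


def gibbs_probit(y, X, n_iter=600, burn_in=200, seed=0):
    """Data-augmentation Gibbs sampler for probit regression (flat prior, assumed)."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)
    beta = np.zeros(k)
    draws = []
    for it in range(n_iter):
        # Step 1: augment the data with latent variables z given beta and y.
        mu = X @ beta
        z = np.array([
            draw_truncated_normal(float(m), bool(yi), rng)
            for m, yi in zip(mu, y)
        ])
        # Step 2: given z, beta has a normal full conditional (linear model).
        beta_hat = XtX_inv @ (X.T @ z)
        beta = beta_hat + chol @ rng.standard_normal(k)
        if it >= burn_in:
            draws.append(beta)
    return np.array(draws)


# Simulated example data (hypothetical, for illustration only).
sim_rng = np.random.default_rng(1)
n = 300
X = np.column_stack([np.ones(n), sim_rng.standard_normal(n)])
true_beta = np.array([-0.5, 1.0])
y = (X @ true_beta + sim_rng.standard_normal(n) > 0).astype(int)

draws = gibbs_probit(y, X)
posterior_mean = draws.mean(axis=0)
print("posterior mean of beta:", posterior_mean)
```

In a sampler of the kind the abstract describes, further steps would be added to the same loop, for example draws of item parameters, group or cluster effects, and missing covariate values; those steps are not shown here.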
GND Keywords: Markov-Modell; Monte-Carlo-Simulation; Probabilistische Testtheorie; Fehlende Daten; Bevölkerung; Heterogenität; Statistik; Datenverarbeitung
Keywords: item response theory, population heterogeneity, Markov chain Monte Carlo, multiple imputation, statistical computing
DDC Classification: 310 Statistics  
RVK Classification: QH 239; QH 250; QH 235
Type: Doctoral thesis
Publication date: 1 December 2017

File: DissGaaschOPUS_finkse_A3a.pdf; Size: 11.58 MB; Format: PDF