Options
Mixture density networks for the indirect estimation of reference intervals
Hepp, Tobias; Zierk, Jakob; Rauh, Manfred; u. a. (2022): Mixture density networks for the indirect estimation of reference intervals, in: BMC bioinformatics, Berlin ; Heidelberg: Springer, Jg. 23, Nr. 1, 307, S. 1–17, doi: 10.1186/s12859-022-04846-0.
Faculty/Chair:
Author:
Title of the Journal:
BMC bioinformatics
ISSN:
1471-2105
Publisher Information:
Year of publication:
2022
Volume:
23
Issue:
1, 307
Pages:
Language:
English
Abstract:
Background
Reference intervals represent the expected range of physiological test results in a healthy population and are essential to support medical decision making. Particularly in the context of pediatric reference intervals, where recruitment regulations make prospective studies challenging to conduct, indirect estimation strategies are becoming increasingly important. Established indirect methods enable robust identification of the distribution of “healthy” samples from laboratory databases, which include unlabeled pathologic cases, but are currently severely limited when adjusting for essential patient characteristics such as age. Here, we propose the use of mixture density networks (MDN) to overcome this problem and model all parameters of the mixture distribution in a single step.
Results
Estimated reference intervals from varying settings with simulated data demonstrate the ability to accurately estimate latent distributions from unlabeled data using different implementations of MDNs. Comparing the performance with alternative estimation approaches further highlights the importance of modeling the mixture component weights as a function of the input in order to avoid biased estimates for all other parameters and the resulting reference intervals. We also provide a strategy to generate partially customized starting weights to improve proper identification of the latent components. Finally, the application on real-world hemoglobin samples provides results in line with current gold standard approaches, but also suggests further investigations with respect to adequate regularization strategies in order to prevent overfitting the data.
Conclusions
Mixture density networks provide a promising approach capable of extracting the distribution of healthy samples from unlabeled laboratory databases while simultaneously and explicitly estimating all parameters and component weights as non-linear functions of the covariate(s), thereby allowing the estimation of age-dependent reference intervals in a single step. Further studies on model regularization and asymmetric component distributions are warranted to consolidate our findings and expand the scope of applications.
Reference intervals represent the expected range of physiological test results in a healthy population and are essential to support medical decision making. Particularly in the context of pediatric reference intervals, where recruitment regulations make prospective studies challenging to conduct, indirect estimation strategies are becoming increasingly important. Established indirect methods enable robust identification of the distribution of “healthy” samples from laboratory databases, which include unlabeled pathologic cases, but are currently severely limited when adjusting for essential patient characteristics such as age. Here, we propose the use of mixture density networks (MDN) to overcome this problem and model all parameters of the mixture distribution in a single step.
Results
Estimated reference intervals from varying settings with simulated data demonstrate the ability to accurately estimate latent distributions from unlabeled data using different implementations of MDNs. Comparing the performance with alternative estimation approaches further highlights the importance of modeling the mixture component weights as a function of the input in order to avoid biased estimates for all other parameters and the resulting reference intervals. We also provide a strategy to generate partially customized starting weights to improve proper identification of the latent components. Finally, the application on real-world hemoglobin samples provides results in line with current gold standard approaches, but also suggests further investigations with respect to adequate regularization strategies in order to prevent overfitting the data.
Conclusions
Mixture density networks provide a promising approach capable of extracting the distribution of healthy samples from unlabeled laboratory databases while simultaneously and explicitly estimating all parameters and component weights as non-linear functions of the covariate(s), thereby allowing the estimation of age-dependent reference intervals in a single step. Further studies on model regularization and asymmetric component distributions are warranted to consolidate our findings and expand the scope of applications.
GND Keywords: ; ; ;
Klinisch-psychologischer Test
Entscheidungsunterstützung
Indirekte Messung
Neuronales Netz
Keywords: ; ; ;
Mixture density networks
Reference intervals
Latent class regression
Distributional regression
DDC Classification:
RVK Classification:
Type:
Article
Activation date:
December 15, 2023
Versioning
Question on publication
Permalink
https://fis.uni-bamberg.de/handle/uniba/92450