Seeing the wood for the trees: Predictive margins for random forests
Sönning, Lukas; Grafmiller, Jason (2023): Seeing the wood for the trees: Predictive margins for random forests, in: Corpus Linguistics and Linguistic Theory, Berlin: de Gruyter, vol. 20, no. 1, pp. 153–181, doi: 10.1515/cllt-2022-0083.
Author:
Sönning, Lukas; Grafmiller, Jason
Title of the Journal:
Corpus Linguistics and Linguistic Theory
ISSN:
1613-7035
1613-7027
Publisher Information:
Berlin: de Gruyter
Year of publication:
2023
Volume:
20
Issue:
1
Pages:
153–181
Language:
English
Abstract:
Classification trees and random forests offer a number of attractive features to corpus data analysts. However, the way in which these models are typically reported – a decision tree and/or set of variable importance scores – offers insufficient information if interest centers on the (form of) relationship between (multiple) predictors and the outcome. This paper develops predictive margins as an interpretative approach to ensemble techniques such as random forests. These are model summaries in the form of adjusted predictions, which provide a clearer picture of patterns in the data and allow us to query a model on potential non-linear associations and interactions among predictor variables. The present paper outlines the general strategy for forming predictive margins and addresses methodological issues from an explicitly (corpus) linguistic perspective. For illustration, we use data on the English genitive alternation and provide an R package and code for their implementation.
GND Keywords:
Corpus (linguistics)
Decision tree
Classification and regression tree
Keywords:
average predictive comparisons
classification trees
interpretable machine learning
predictive modeling
random forests
Peer Reviewed:
Yes
International Distribution:
Yes
Type:
Article
Activation date:
May 24, 2023
Permalink
https://fis.uni-bamberg.de/handle/uniba/59523