The formal foundation of an evolutionary theory of reinforcement
|Faculty/Professorship:||Foundations in Education|
|Author(s):||Borgstede, Matthias ; Eggert, Frank|
|Publisher Information:||Bamberg : Otto-Friedrich-Universität|
|Year of publication:||2022|
|Source/Other editions:||Behavioural processes : an international journal of comparative and physiological ethology. 186 (2021), May, pp. 1-30. - DOI: 10.1016/j.beproc.2021.104370|
|is version of:||10.1016/j.beproc.2021.104370|
|Year of first publication:||2021|
|Licence:||Creative Commons - CC BY-NC-ND - Attribution - NonCommercial - NoDerivatives 4.0 International|
Reinforcement learning is often described by analogy to natural selection. However, there is no coherent theory relating reinforcement learning to evolution within a single formal model of selection. This paper provides the formal foundation of such a unified theory. The model is based on the most general description of natural selection, as given by the Price equation. We extend the Price equation to cover reinforcement learning as the result of a behavioral selection process within individuals. A multilevel model of behavioral selection then relates this process to the principle of natural selection via the concept of statistical fitness predictors.
The main result is the covariance-based law of effect, which describes reinforcement learning on a molar level by means of the covariance between behavioral allocation and a statistical fitness predictor. We further demonstrate how this abstract principle can be applied to derive theoretical explanations of various empirical findings, such as conditioned reinforcement, blocking, matching, and response deprivation.
Our model is the first to apply the abstract principle of selection to derive a unified description of reinforcement learning and natural selection within a single model. It provides a general analytical tool for behavioral psychology, much as the theory of natural selection does for evolutionary biology. We thus lay the formal foundation of a general theory of reinforcement as the result of behavioral selection on multiple levels.
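For readers unfamiliar with the Price equation named in the abstract, the following is a minimal numerical sketch of its standard textbook form, in which the change in a population's mean trait value splits into a selection term, Cov(w, z) / mean(w), and a transmission term, E(w Δz) / mean(w). It does not reproduce the paper's within-individual extension or its covariance-based law of effect. The function names `price_decomposition` and `direct_change` and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def price_decomposition(z, w, z_next):
    """Split the change in mean trait value into Price-equation terms.

    z      -- trait value of each parent entity (illustrative numbers)
    w      -- fitness (number of descendants) of each parent entity
    z_next -- mean trait value among each parent's descendants
    Returns (selection, transmission, total).
    """
    w_bar = fmean(w)
    z_bar = fmean(z)
    # Population covariance between fitness and trait value.
    cov_wz = fmean([wi * zi for wi, zi in zip(w, z)]) - w_bar * z_bar
    # Expected fitness-weighted change within lineages.
    exp_w_dz = fmean([wi * (zn - zi) for wi, zi, zn in zip(w, z, z_next)])
    selection = cov_wz / w_bar
    transmission = exp_w_dz / w_bar
    return selection, transmission, selection + transmission


def direct_change(z, w, z_next):
    """Compute the same change directly from the descendant population."""
    z_bar_next = sum(wi * zn for wi, zn in zip(w, z_next)) / sum(w)
    return z_bar_next - fmean(z)


# Hypothetical data: three entities; only the third changes in transmission.
z = [1.0, 2.0, 3.0]
w = [1.0, 2.0, 3.0]
z_next = [1.0, 2.0, 3.5]

selection, transmission, total = price_decomposition(z, w, z_next)
print(selection, transmission, total, direct_change(z, w, z_next))
```

The two routes to the total change agree because the Price equation is an algebraic identity rather than an empirical model. In the abstract's terms, the covariance term is the part that a selection account, whether natural or behavioral, builds on.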
|GND Keywords:||Verhaltenspsychologie; Operante Konditionierung|
|Keywords:||selection by consequences, behavioral selection, natural selection, reinforcement learning, Price equation, multilevel model of behavioral selection|
|DDC Classification:||150 Psychology|
|RVK Classification:||CP 8000|
|Release Date:||21 November 2022|
Originated at the University of Bamberg