Modular inverse reinforcement learning for visuomotor behavior

Author(s): Rothkopf, Constantin A.
Ballard, Dana H.
Keywords: BASAL GANGLIA; Computer Science; Computer Science, Cybernetics; DOPAMINE; Inverse reinforcement learning; Neurosciences; Neurosciences & Neurology; Spatial navigation; Task priorities; Visuomotor behavior
Publication date: 2013
Publisher: SPRINGER
Journal: BIOLOGICAL CYBERNETICS
Volume: 107
Issue: 4
Start page: 477
End page: 490
Abstract:
In a large variety of situations one would like to have an expressive and accurate model of observed animal or human behavior. While general-purpose mathematical models may successfully capture properties of observed behavior, it is desirable to root models in biological facts. Because of ample empirical evidence for reward-based learning in visuomotor tasks, we use a computational model based on the assumption that the observed agent is balancing the costs and benefits of its behavior to meet its goals. This leads to using the framework of reinforcement learning, which additionally provides well-established algorithms for learning visuomotor task solutions. To quantify the agent's goals as rewards implicit in the observed behavior, we propose to use inverse reinforcement learning. Based on the assumption of a modular cognitive architecture, we introduce a modular inverse reinforcement learning algorithm that estimates the relative reward contributions of the component tasks in navigation, consisting of following a path while avoiding obstacles and approaching targets. It is shown how to recover the component reward weights for individual tasks and that variability in observed trajectories can be explained succinctly through behavioral goals. It is demonstrated through simulations that good estimates can already be obtained with modest amounts of observation data, which in turn allows the prediction of behavior in novel configurations.
ISSN: 0340-1200
DOI: 10.1007/s00422-013-0562-6
