Scalable reinforcement learning through hierarchical decompositions for weakly-coupled problems

Author(s): Toutounji, H.
Rothkopf, C.A.
Triesch, J.
Keywords: Computational costs; Empirical studies; Hierarchical decompositions; Hierarchical reinforcement learning; Possible solutions; Task dimensions, Animals; Learning algorithms, Reinforcement learning
Publication date: 2011
Journal: 2011 IEEE International Conference on Development and Learning, ICDL 2011
Abstract:
Reinforcement learning, or reward-dependent learning, has been very successful at describing how animals and humans adjust their actions to increase their gains and reduce their losses in a wide variety of tasks. Empirical studies have furthermore identified numerous neuronal correlates of the quantities necessary for such computations. In general, however, it is too expensive for the brain to encode actions and their outcomes with respect to all available dimensions describing the state of the world. This suggests the existence of learning algorithms that take advantage of the independencies present in the world and thereby reduce the computational costs of representation and learning. A possible solution is to use separate learners for task dimensions with independent dynamics and rewards, but the condition of independence is usually too restrictive. Here, we propose a hierarchical reinforcement learning solution for the more general case in which the dynamics are not independent but weakly coupled, and show how to assign credit to the different modules, which solve the task jointly. © 2011 IEEE.
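The baseline mentioned in the abstract, separate learners for task dimensions with independent dynamics and rewards, can be illustrated with a small sketch. This is not the hierarchical method the paper proposes; it is a toy example of the fully independent case only. The environment (two 1-D chains of five positions), the class name DimLearner, the functions step_dim and train, and all parameter values are assumptions made for illustration.

```python
import random
from collections import defaultdict

N_POS = 5            # positions 0..4 per dimension (toy assumption)
MOVES = (-1, +1)     # action 0 = step left, action 1 = step right


def step_dim(pos, action):
    """Dynamics and reward of ONE dimension, independent of all others."""
    nxt = min(max(pos + MOVES[action], 0), N_POS - 1)
    if nxt == N_POS - 1:
        return 0, 1.0        # goal reached: reward 1, reset to the start
    return nxt, 0.0


class DimLearner:
    """Tabular Q-learner that sees only its own dimension (hypothetical helper)."""

    def __init__(self, alpha=0.5, gamma=0.9):
        self.alpha, self.gamma = alpha, gamma
        self.q = defaultdict(lambda: [0.0, 0.0])   # state -> [Q(left), Q(right)]

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning update.
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

    def greedy(self, s):
        return 1 if self.q[s][1] > self.q[s][0] else 0


def train(n_dims=2, n_steps=20000, seed=0):
    rng = random.Random(seed)
    learners = [DimLearner() for _ in range(n_dims)]
    state = [0] * n_dims
    for _ in range(n_steps):
        # Each dimension has its own action component, transition and reward,
        # so each learner is updated without looking at the other dimensions.
        for d, learner in enumerate(learners):
            a = rng.randrange(2)                   # uniformly random exploration
            nxt, r = step_dim(state[d], a)
            learner.update(state[d], a, r, nxt)
            state[d] = nxt
    return learners


learners = train()
policy = [[l.greedy(s) for s in range(N_POS - 1)] for l in learners]
print(policy)
```

In this toy setting each learner stores at most 4 x 2 values, whereas a single learner over the joint state and joint action would need a table of 16 x 4 values, and the gap grows with every added dimension. The sketch works only because the dimensions do not interact; once the dynamics are coupled, even weakly, a learner's transitions depend on the other dimensions, which is the case the abstract says the paper addresses.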
Description:
Conference of 2011 IEEE International Conference on Development and Learning, ICDL 2011; Conference Date: 24 August 2011 through 27 August 2011; Conference Code: 87020
ISBN: 9781612849904
DOI: 10.1109/DEVLRN.2011.6037351
External URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-80055002863&doi=10.1109%2fDEVLRN.2011.6037351&partnerID=40&md5=167003b600ed2ae77ace6f98bc3d7356
