Learning long-term dependencies with recurrent neural networks

Author(s): Schaefer, Anton Maximilian; Udluft, Steffen; Zimmermann, Hans-Georg
Keywords: backpropagation; Computer Science; Computer Science, Artificial Intelligence; inflation; long-term dependencies; memory; recurrent neural networks; state space model; system identification; universal approximators; vanishing gradient
Publication date: 2008
Publisher: Elsevier Science BV
Journal: Neurocomputing
Volume: 71
Issue: 13-15
Start page: 2481
End page: 2488
Abstract: 
Recurrent neural networks (RNN) unfolded in time are in theory able to map any open dynamical system. Still, they are often claimed to be unable to identify long-term dependencies in the data. In particular, when they are trained with backpropagation, it is claimed that RNN unfolded in time fail to learn inter-temporal influences more than 10 time steps apart. This paper refutes this often-cited statement by giving counter-examples. We show that basic time-delay RNN unfolded in time and formulated as state space models are indeed capable of learning time lags of at least 100 time steps. We point out that they even possess a self-regularisation characteristic, which adapts the internal error backflow, and analyse their optimal weight initialisation. In addition, we introduce the idea of inflation for modelling long- and short-term memory and demonstrate that this technique further improves the performance of RNN. (C) 2008 Elsevier B.V. All rights reserved.
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2007.12.036
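Note: The abstract refers to a time-delay RNN unfolded in time and written as a state space model. The sketch below illustrates that standard formulation, assuming the common transition s_{t+1} = tanh(A s_t + B x_t) with output y_t = C s_t and weights shared across all unfolded steps; the dimensions, sequence length, initialisation scale, and NumPy implementation are illustrative choices, not the authors' exact setup.

# Minimal sketch (illustrative, not the paper's implementation): an RNN
# unfolded in time as a state space model with shared weights A, B, C.
import numpy as np

rng = np.random.default_rng(0)

state_dim, input_dim, output_dim, T = 20, 3, 1, 100

# Hypothetical uniform initialisation scale; the paper analyses how the
# choice of this scale affects the error backflow over long time lags.
init_scale = 0.1
A = rng.uniform(-init_scale, init_scale, (state_dim, state_dim))
B = rng.uniform(-init_scale, init_scale, (state_dim, input_dim))
C = rng.uniform(-init_scale, init_scale, (output_dim, state_dim))

def unfold(x_seq, s0=None):
    """Apply the shared-weight state transition over T unfolded time steps."""
    s = np.zeros(state_dim) if s0 is None else s0
    outputs = []
    for x_t in x_seq:                      # identical weights at every step
        s = np.tanh(A @ s + B @ x_t)       # state transition s_{t+1}
        outputs.append(C @ s)              # output equation y_t
    return np.array(outputs)

x_seq = rng.standard_normal((T, input_dim))
y_seq = unfold(x_seq)                      # shape (T, output_dim)
print(y_seq.shape)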
