Presentation Title
Mean Field Analysis of Recurrent Neural Networks
Faculty Mentor
Maria Spiropolu
Start Date
18-11-2017 9:59 AM
End Date
18-11-2017 11:00 AM
Location
BSC-Ursa Minor 142
Session
Poster 1
Type of Presentation
Poster
Subject Area
Physical and Mathematical Sciences
Abstract
We analyze the dynamics of recurrent neural networks whose initial weights and biases are randomly distributed. Using mean field theory, we develop a theoretical basis for the length scales that limit the depth of signal propagation in a one-layer recurrent network with long input or prediction sequences. For specific hyperparameter choices, these length scales diverge. Building on the conclusions of Schoenholz et al. (1), we argue that random recurrent networks can be trained only if the input signal can propagate through them. This suggests that recurrent networks can be trained on arbitrarily long sequences provided they are initialized sufficiently close to criticality. We begin a search for empirical evidence supporting these conclusions and present preliminary results.
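A minimal numerical sketch of the kind of mean field calculation described above, assuming a vanilla tanh network with i.i.d. Gaussian weights of variance sigma_w^2/N and biases of variance sigma_b^2. The variance recursion and the depth scale xi = -1/log(chi_1) follow the mean field treatment of Schoenholz et al. (1); the function names and the hyperparameter values are illustrative, not the authors' actual code.

```python
import numpy as np

def variance_map(q, sigma_w, sigma_b, n=200_000, rng=np.random.default_rng(0)):
    """One step of the mean field variance recursion for a tanh network:
    q_{t+1} = sigma_w^2 * E_z[tanh(sqrt(q_t) z)^2] + sigma_b^2,  z ~ N(0, 1)."""
    z = rng.standard_normal(n)
    return sigma_w**2 * np.mean(np.tanh(np.sqrt(q) * z) ** 2) + sigma_b**2

def correlation_slope(qstar, sigma_w, n=200_000, rng=np.random.default_rng(1)):
    """Slope chi_1 of the correlation map at c = 1:
    chi_1 = sigma_w^2 * E_z[tanh'(sqrt(q*) z)^2].
    chi_1 = 1 marks the critical line separating the ordered and
    chaotic phases; the depth scale xi = -1/log(chi_1) diverges there."""
    z = rng.standard_normal(n)
    dphi = 1.0 - np.tanh(np.sqrt(qstar) * z) ** 2  # tanh'(x) = 1 - tanh(x)^2
    return sigma_w**2 * np.mean(dphi**2)

# Iterate the variance map to its fixed point q*, then estimate the
# depth (time-step) scale for one initialization in the ordered phase.
sigma_w, sigma_b = 0.95, 0.05
q = 1.0
for _ in range(100):               # iterate to an approximate fixed point
    q = variance_map(q, sigma_w, sigma_b)
chi1 = correlation_slope(q, sigma_w)
xi = -1.0 / np.log(chi1)           # steps over which input correlations survive
print(f"q* ~ {q:.4f}, chi_1 ~ {chi1:.4f}, depth scale xi ~ {xi:.1f}")
```

As sigma_w and sigma_b approach the critical line, chi_1 approaches 1 and xi grows without bound, which is the mean field statement behind the claim that networks initialized near criticality can propagate signals through arbitrarily long sequences.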