Hessian-Free Optimization: Supplementary Materials
Contents
1 Pseudo-code for the damped Gauss-Newton vector product
2 Details of the pathological synthetic problems
  2.1 The addition, multiplication, and XOR problem
  2.2 The temporal order problem
  2.3 The 3-bit temporal order problem
  2.4 The random permutation problem
  2.5 Noiseless memorization
3 Details of the natural problems
  3.1 The bouncing balls problem
  3.2 The MIDI dataset
  3.3 The speech dataset
1 Pseudo-code for the damped Gauss-Newton vector product
Algorithm 1 Computation of the matrix-vector product of the structurally-damped Gauss-Newton matrix with the vector $v$, for the case where the hidden non-linearity $e$ is tanh, the output non-linearity $g$ is the logistic sigmoid, and $D$ and $L$ are the corresponding matching loss functions. The notation reflects the "convex approximation" interpretation of the GN matrix, so that we apply the $\mathcal{R}$ operator to the forwards-backwards pass through the linearized and structurally damped objective $\tilde{k}$, and the desired matrix-vector product is given by $\mathcal{R}\frac{d\tilde{k}}{d\theta}$. All derivatives are implicitly evaluated at $\theta = \theta_n$. The previously defined parameter symbols $W_{ph}$, $W_{hx}$, $W_{hh}$, $b_h$, $b_p$, $b_h^{\mathrm{init}}$ correspond to the parameter vector $\theta_n$ when they carry no superscript, and to the input parameter vector $v$ when they carry the '$v$' superscript.
The $\mathcal{R}z$ notation follows Pearlmutter [1994], and for the purposes of reading the pseudo-code it can be interpreted as merely defining a new symbol. We assume that the intermediate quantities of the network (e.g. $h_i$) have already been computed (from $\theta_n$).
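As a rough illustration of the computation the caption describes, the sketch below implements a structurally-damped Gauss-Newton vector product for a simple tanh RNN with logistic outputs and cross-entropy (matching) loss, using Pearlmutter's R-operator. This is a minimal reading of the scheme, not the paper's exact pseudo-code: the function name, the parameter dictionary layout, and the combined damping coefficient `lam_mu` (standing for $\lambda\mu$) are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gauss_newton_vector_product(params, v, xs, lam_mu=0.0):
    """Sketch of G*v for a tanh RNN with logistic outputs, where G is the
    structurally-damped Gauss-Newton matrix.  `params` holds theta_n and
    `v` holds the input vector, both as dicts with the same keys."""
    Whx, Whh, Wph = params["Whx"], params["Whh"], params["Wph"]
    bh, bp = params["bh"], params["bp"]

    # Ordinary forward pass: these intermediate quantities (h_t, p_t)
    # are the ones assumed to be precomputed from theta_n.
    hs, ps = [params["h0"]], []
    for x in xs:
        hs.append(np.tanh(Whx @ x + Whh @ hs[-1] + bh))
        ps.append(sigmoid(Wph @ hs[-1] + bp))

    # R-forward pass: R{z} is the directional derivative of z along v.
    Rh, Rs = [v["h0"]], []
    for t, x in enumerate(xs):
        Ra = v["Whx"] @ x + v["Whh"] @ hs[t] + Whh @ Rh[-1] + v["bh"]
        Rh.append((1.0 - hs[t + 1] ** 2) * Ra)   # tanh'(a) = 1 - h^2
        Rs.append(v["Wph"] @ hs[t + 1] + Wph @ Rh[-1] + v["bp"])

    # Backward pass through the linearized objective.  For the matching
    # logistic / cross-entropy pair the output Hessian is diag(p(1-p)).
    # Structural damping injects lam_mu * tanh'(a) * R{a} = lam_mu * R{h}
    # at each hidden pre-activation (an assumption consistent with the
    # matching-loss Gauss-Newton form).
    Gv = {k: np.zeros_like(p) for k, p in params.items()}
    Rdh = np.zeros_like(params["h0"])
    for t in reversed(range(len(xs))):
        Rds = ps[t] * (1.0 - ps[t]) * Rs[t]
        Gv["Wph"] += np.outer(Rds, hs[t + 1])
        Gv["bp"] += Rds
        Rdh = Rdh + Wph.T @ Rds
        Rda = (1.0 - hs[t + 1] ** 2) * Rdh + lam_mu * Rh[t + 1]
        Gv["Whx"] += np.outer(Rda, xs[t])
        Gv["Whh"] += np.outer(Rda, hs[t])
        Gv["bh"] += Rda
        Rdh = Whh.T @ Rda
    Gv["h0"] = Rdh
    return Gv
```

Because the product has the form $G v = J^\top H J v$ with $H$ diagonal and positive semi-definite, any implementation should satisfy $u \cdot Gv = v \cdot Gu$ and $v \cdot Gv \ge 0$, which makes a convenient sanity check.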
References

J.S. Garofolo, L.F. Lamel, W.M. Fisher, J.G. Fiscus, D.S. Pallett, and N.L. Dahlgren. DARPA TIMIT: Acoustic-Phonetic Continuous Speech Corpus CD-ROM. US Dept. of Commerce, National Institute of Standards and Technology, 1993.

S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 1997.

J. Martens. Deep learning via Hessian-free optimization. In Proceedings of the 27th International Conference on Machine Learning (ICML), 2010.

B.A. Pearlmutter. Fast exact multiplication by the Hessian. Neural Computation, 1994.