Paper: Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning

Paper: Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning
Poster: Online & Untruncated Gradients for RNNs
Frederik Benzing*, Marcelo Matheus Gauy*, Asier Mujika, Anders Martinsson, Angelika Steger
Department for Computer Science, D-INFK

Recurrent Neural Nets (RNNs)
Model temporal and sequential data (RL, audio synthesis, language modelling, ...).
One of the key research challenges: learn long-term dependencies.

Training RNNs
Truncated Backprop Through Time (TBPTT) (Williams & Peng, 1990)
[Figure: RNN unrolled over time, showing input, hidden state (..., h_{t-1}, h_t, h_{t+1}) and output]
Introduces an arbitrary truncation horizon → no longer-term dependencies.
Parameter update lock during forward & backward pass.

Forward Computing Gradients
Real Time Recurrent Learning (RTRL) (Williams & Zipser, 1989)
Speech bubble: "It looks like you want to do RTRL."
Forward compute G_t with a recurrence [equations not recoverable from the slide].
Untruncated gradients.
Memory is independent of sequence length.
Online parameter updates (no update lock).
BUT: Need n^4 runtime and n^3 memory (for n hidden units) → infeasible.
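The forward recurrence named on this slide can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a vanilla RNN h_t = tanh(W h_{t-1} + U x_t), tracks only the gradient with respect to the recurrent weights W, and uses the hypothetical names `rtrl_step`, `run_rtrl` and `run_forward`. The matrix G holds dh_t/dvec(W) and has shape n x n^2, which is where the n^3 memory and n^4 runtime come from.

```python
import numpy as np

def rtrl_step(W, U, h_prev, x, G_prev):
    """One RTRL step for the assumed model h_t = tanh(W h_{t-1} + U x_t).

    G_prev is dh_{t-1}/dvec(W), shape (n, n*n), with vec(W) taken row-major.
    """
    n = W.shape[0]
    h = np.tanh(W @ h_prev + U @ x)
    d = 1.0 - h ** 2                                       # tanh derivative
    H = d[:, None] * W                                     # dh_t/dh_{t-1}, shape (n, n)
    F = d[:, None] * np.kron(np.eye(n), h_prev[None, :])   # immediate dh_t/dvec(W)
    G = H @ G_prev + F                                     # (n x n)(n x n^2): n^4 work
    return h, G

def run_rtrl(W, U, xs):
    """Run the RNN over xs while carrying the full gradient matrix forward."""
    n = W.shape[0]
    h = np.zeros(n)
    G = np.zeros((n, n * n))
    for x in xs:
        h, G = rtrl_step(W, U, h, x, G)
    return h, G

def run_forward(W, U, xs):
    """Plain forward pass, used only to check the gradient numerically."""
    h = np.zeros(W.shape[0])
    for x in xs:
        h = np.tanh(W @ h + U @ x)
    return h
```

Because G is updated at every step, a parameter update could be applied after any time step without a backward pass, and the storage does not grow with the sequence length.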

Approximate RTRL to save time & space
Unbiased Online Recurrent Optimization (UORO) (Tallec & Ollivier, 2017)
Speech bubble: "It looks like you want to do RTRL."
Idea: Don't store G_t precisely, but approximately as the product of an n x 1 and a 1 x n^2 factor, and unbiasedly approximate the recurrence equation.
➢ Memory: n^2
➢ Runtime: n^3
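To make "unbiasedly approximate" concrete, here is a small sketch of the random-sign trick that rank-one schemes of this kind rely on. It is an illustration under assumptions, not the UORO implementation: the variance-reducing scaling factors are left out, and the names `rank_one_estimate` and `expected_estimate` are hypothetical. A sum of k outer products is replaced by a single outer product whose expectation over independent random signs equals the original sum.

```python
import itertools
import numpy as np

def rank_one_estimate(us, vs, signs):
    """Rank-one estimate of sum_i outer(us[i], vs[i]) for one sign vector."""
    u = sum(s * ui for s, ui in zip(signs, us))   # n x 1 factor
    v = sum(s * vi for s, vi in zip(signs, vs))   # 1 x n^2 factor
    return np.outer(u, v)

def expected_estimate(us, vs):
    """Exact expectation: average over all equally likely signs in {-1, +1}^k."""
    all_signs = list(itertools.product([-1.0, 1.0], repeat=len(us)))
    return sum(rank_one_estimate(us, vs, s) for s in all_signs) / len(all_signs)
```

The cross terms carry a factor s_i * s_j with i != j, which averages to zero, so only the wanted terms outer(us[i], vs[i]) survive in expectation. Storing the two factors takes n + n^2 numbers instead of n^3.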

Does it work? Part I
UORO (Tallec & Ollivier, 2017) and KF-RTRL (Mujika et al., 2018)
[Figure: results on character-level PTB]

Does it work? Part II
Provably optimal approximation: Optimal Kronecker-Sum (OK) (our contribution)
[Figure: results on the Copy Task and on character-level PTB]
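For readers unfamiliar with the term, the sketch below shows what a Kronecker-sum approximation of an n x n^2 matrix looks like. It is not the OK algorithm from the paper, which is concerned with unbiased approximation of the recurrence. Instead it uses the classical rearrangement-plus-SVD construction, which gives the best r-term sum of Kronecker products in Frobenius norm. The function names `rearrange` and `kronecker_sum_approx` and the choice of factor shapes (1 x n and n x n) are assumptions made for illustration.

```python
import numpy as np

def rearrange(M, a_shape, b_shape):
    """Rearrange M so that kron(A, B) maps to outer(A.ravel(), B.ravel())."""
    (m1, n1), (m2, n2) = a_shape, b_shape
    return M.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)

def kronecker_sum_approx(M, a_shape, b_shape, r):
    """Best r-term approximation sum_i kron(A_i, B_i) of M in Frobenius norm."""
    U, s, Vt = np.linalg.svd(rearrange(M, a_shape, b_shape), full_matrices=False)
    # Each retained singular triple becomes one Kronecker term.
    As = [np.sqrt(s[i]) * U[:, i].reshape(a_shape) for i in range(r)]
    Bs = [np.sqrt(s[i]) * Vt[i].reshape(b_shape) for i in range(r)]
    return As, Bs
```

With factor shapes (1, n) and (n, n), each term stores n + n^2 numbers, so an r-term sum needs on the order of r * n^2 memory rather than the n^3 of the full matrix.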

What to remember
Truncated BPTT has problems (truncation, update lock).
RTRL as online & untruncated alternative, but too costly.
Our OK approx of RTRL reduces costs by factor n.
No performance loss.
Break update lock → faster convergence.
Theoretically optimal (for certain class of approx).
Still need to reduce computational costs.
Speech bubble: "It looks like you got interested in RTRL. Have a look at Poster #166."
