StrassenNets: Deep Learning with a Multiplication Budget
Michael Tschannen∗
michaelt@nari.ee.ethz.ch
13 July 2018
Joint work with Aran Khanna∗ and Anima Anandkumar∗
∗work done at Amazon AI
[Figure: results plot; legend entries BWN, TWN, TTQ, FP]
◮ Word-level decoder
◮ 2-layer LSTM, 650 units
◮ 2-layer highway network, 650 units
◮ Convolution layer, 1100 filters
◮ Character-level embedding
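As a rough illustration of one building block in the stack listed above, the following sketch computes a single highway layer in its standard form, y = t * g(W_H x + b_H) + (1 - t) * x with gate t = sigmoid(W_T x + b_T). The function name `highway_layer`, the parameter names, and the choice of ReLU for g are assumptions made for this example; the slide only states the layer type and width.

```python
import math

def highway_layer(x, W_H, b_H, W_T, b_T):
    """One highway layer on a plain list vector (illustrative sketch only)."""
    def affine(W, b):
        # Dense matrix-vector product plus bias.
        return [sum(w * v for w, v in zip(row, x)) + bias
                for row, bias in zip(W, b)]

    h = [max(0.0, a) for a in affine(W_H, b_H)]                  # candidate, ReLU assumed
    t = [1.0 / (1.0 + math.exp(-a)) for a in affine(W_T, b_T)]   # transform gate
    # The gate blends the transformed candidate with the untouched input.
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

# Tiny example: identity transform, gate fixed at 0.5 (zero gate weights and bias).
x = [1.0, -2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
y = highway_layer(x, identity, [0.0, 0.0], zeros, [0.0, 0.0])
```

Both dense products in such a layer are matrix-vector multiplications, which is the kind of operation a multiplication budget would apply to.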
[Figure: results plot; legend entries FP, TWN]
W_B = Quantize(W_B)
W_C = Quantize(W_C)
conv_out = Conv2d(data=in_data, weights=W_B, in_channels=c_in,
                  kernel_size=p - 1 + k, stride=p, groups=g)
mul_out = Multiply(data=conv_out, weights=a_tilde)
…(data=mul_out, weights=W_C, in_channels=r,
  kernel_size=p, stride=p)
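The pseudocode above can be made concrete with a small pure-Python sketch. It is a 1-D, single-group simplification (so `groups=g` is ignored), and several pieces are assumptions rather than details taken from the slides: `Quantize` is taken to produce ternary weights in {-1, 0, 1} via a magnitude threshold, `a_tilde` is taken to hold one scalar per hidden channel, and the final step, whose operator name is missing from the listing, is taken to be a transposed convolution. All function names (`ternarize`, `strided_conv1d`, `transposed_conv1d`, `strassen_conv1d`) are hypothetical.

```python
def ternarize(weights, scale=0.7):
    """Map weights to {-1, 0, 1} with a magnitude threshold (assumed rule)."""
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    thr = scale * mean_abs
    return [0 if abs(w) <= thr else (1 if w > 0 else -1) for w in weights]

def strided_conv1d(x, w, stride):
    """x: [c_in][length], w: [c_out][c_in][kernel]; no padding."""
    kernel = len(w[0][0])
    n_out = (len(x[0]) - kernel) // stride + 1
    return [[sum(w[o][c][j] * x[c][i * stride + j]
                 for c in range(len(x)) for j in range(kernel))
             for i in range(n_out)]
            for o in range(len(w))]

def transposed_conv1d(x, w, stride):
    """x: [c_in][n], w: [c_out][c_in][kernel]; scatters each input position."""
    kernel = len(w[0][0])
    n = len(x[0])
    out = [[0.0] * ((n - 1) * stride + kernel) for _ in w]
    for o in range(len(w)):
        for c in range(len(x)):
            for i in range(n):
                for j in range(kernel):
                    out[o][i * stride + j] += w[o][c][j] * x[c][i]
    return out

def strassen_conv1d(in_data, W_B, a_tilde, W_C, p):
    # Step 1: ternary W_B, so this convolution needs only additions.
    conv_out = strided_conv1d(in_data, W_B, p)
    # Step 2: the only real multiplications, one per hidden value.
    mul_out = [[a_tilde[r] * v for v in row] for r, row in enumerate(conv_out)]
    # Step 3: ternary W_C maps hidden channels back to outputs (assumed transposed conv).
    return transposed_conv1d(mul_out, W_C, p)

# With p = 1, k = 3, r = 3, one-hot W_B and all-ones W_C reproduce an
# ordinary 3-tap filter whose taps are stored in a_tilde.
x = [[1, 2, 3, 4, 5]]
W_B = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
W_C = [[[1], [1], [1]]]
result = strassen_conv1d(x, W_B, [0.5, -1.0, 2.0], W_C, p=1)
```

In this sketch the number of multiplications per output patch equals the hidden width r, so r acts as the budget; the trivial configuration in the example spends one multiplication per filter tap and therefore saves nothing over a direct convolution.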