

  1. Accelerating and Compressing LSTM-based Model for Online Handwritten Chinese Character Recognition. Reporter: Zecheng Xie, South China University of Technology. August 5th, 2018

  2. Outline • Motivation • Difficulties • Our approach • Experiments • Conclusion

  3. Motivation
     • Online handwritten Chinese character recognition (HCCR) is widely used in pen input and touch screen devices

  4. Motivation
     • Our goal: build fast and compact models for on-device inference
     • The difficulties of online HCCR
       • Large number of character classes
       • Similarity between characters
       • Diversity of writing styles
     • Deep learning models are powerful but raise other problems
       • Models are too large → require a large footprint and memory
       • Computationally expensive → consume much energy
     • The advantages of deploying models on mobile devices
       • Eases server pressure
       • Better service latency
       • Can work offline
       • Privacy protection
       • …

  5. Difficulties of deploying LSTM-based online HCCR models on mobile devices
     • 3755 classes → the model tends to be large
     • Dependencies between time steps → make the inference slow
       • This is the nature of RNNs and is unlikely to be changed
     (Figure: unrolling of an RNN [1])
     [1] http://colah.github.io/posts/2015-08-Understanding-LSTMs/

  6. Our approach
     • The proposed framework: the baseline model → reconstruct the baseline with SVD → prune redundant connections → cluster the remaining connections

  7. Our approach
     • Data preprocessing and augmentation
       • Randomly remove 30% of the points in each character
       • Perform coordinate normalization
       • Remove redundant points using the method proposed in [1]:
         • a point that is too close to the point before it
         • a middle point that nearly lies on the line through the points before and after it
     • Data transform & feature extraction [1] (see the sketch below):
       $(x_j, y_j, s_j),\ j = 1, 2, 3, \ldots \;\rightarrow\; (x_j, y_j, \Delta x_j, \Delta y_j, \mathbb{1}(s_j = s_{j+1}), \mathbb{1}(s_j \neq s_{j+1})),\ j = 1, 2, 3, \ldots$
     [1] X.-Y. Zhang et al., "Drawing and recognizing Chinese characters with recurrent neural network", TPAMI, 2017
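A minimal Python sketch of this transform, assuming each character arrives as an (N, 3) array of (x, y, stroke-id) points; the function name and array layout here are illustrative, not from the authors' code:

```python
import numpy as np

def to_features(points):
    """Map raw points (x_j, y_j, s_j) to the 6-dim features
    (x, y, dx, dy, same-stroke flag, stroke-end flag) of Zhang et al. [1].
    `points` is an (N, 3) array; the last point is dropped because the
    deltas and flags need a successor point."""
    pts = np.asarray(points, dtype=np.float32)
    x, y, s = pts[:-1, 0], pts[:-1, 1], pts[:-1, 2]
    dx = pts[1:, 0] - x                            # delta to the next point
    dy = pts[1:, 1] - y
    same = (s == pts[1:, 2]).astype(np.float32)    # 1 if the next point stays in the same stroke
    end = 1.0 - same                               # 1 at a stroke boundary
    return np.stack([x, y, dx, dy, same, end], axis=1)
```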

  8. Our approach
     • Data preprocessing and augmentation
     (Figure: examples of the preprocessing and augmentation steps [1])
     [1] X.-Y. Zhang et al., "Drawing and recognizing Chinese characters with recurrent neural network", TPAMI, 2017

  9. Our approach
     • Baseline model architecture: Input-100LSTM-512LSTM-512FC-3755FC-Output
     (Figure: the network unrolled from t=1 to t=T: input → 100-unit LSTM → 512-unit LSTM → 512-unit FC → 3755-unit FC)
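For concreteness, a minimal PyTorch sketch of an architecture matching this description; the slide gives layer sizes only, so the 6-dim input features, the ReLU after FC1, and classifying from the final time step are assumptions:

```python
import torch
import torch.nn as nn

class BaselineHCCR(nn.Module):
    """Input-100LSTM-512LSTM-512FC-3755FC, per the slide."""
    def __init__(self, input_dim=6, num_classes=3755):
        super().__init__()
        self.lstm1 = nn.LSTM(input_dim, 100, batch_first=True)
        self.lstm2 = nn.LSTM(100, 512, batch_first=True)
        self.fc1 = nn.Linear(512, 512)
        self.fc2 = nn.Linear(512, num_classes)

    def forward(self, x):                    # x: (batch, T, input_dim)
        h, _ = self.lstm1(x)
        h, _ = self.lstm2(h)
        h = torch.relu(self.fc1(h[:, -1]))   # classify from the final time step
        return self.fc2(h)
```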

  10. Our approach
      • Reconstruct the network with singular value decomposition (SVD)
      • LSTM update equations:
        $i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i)$
        $f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f)$
        $g_t = \tanh(W_{xg} x_t + W_{hg} h_{t-1} + b_g)$
        $o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o)$
        $c_t = f_t \ast c_{t-1} + i_t \ast g_t$
        $h_t = o_t \ast \tanh(c_t)$
      • The main computation is the stacked gate transform:
        $(i_t; f_t; g_t; o_t) = (\sigma; \sigma; \tanh; \sigma) \ast \left( (W_{xi}; W_{xf}; W_{xg}; W_{xo})\, x_t + (W_{hi}; W_{hf}; W_{hg}; W_{ho})\, h_{t-1} + (b_i; b_f; b_g; b_o) \right)$
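The equations above transcribed into a short NumPy sketch, with the four gate weights stacked into single matrices (gate order i, f, g, o); the two matrix-vector products on the first line are the main computation the slide refers to:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wx, Wh, b):
    """One LSTM step. Wx: (4H, D) input connections,
    Wh: (4H, H) hidden-hidden connections, b: (4H,)."""
    z = Wx @ x_t + Wh @ h_prev + b       # main computation: two mat-vec products
    H = h_prev.shape[0]
    i, f = sigmoid(z[:H]), sigmoid(z[H:2*H])
    g, o = np.tanh(z[2*H:3*H]), sigmoid(z[3*H:])
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c
```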

  11. Our approach
      • Reconstruct the network with singular value decomposition (SVD)
      • Write the stacked gate transform as $W_x x_t + W_h h_{t-1} + b$, where $W_x = (W_{xi}; W_{xf}; W_{xg}; W_{xo})$ and $W_h = (W_{hi}; W_{hf}; W_{hg}; W_{ho})$
      • Apply SVD to $W_x$ and $W_h$
        • $W_x$: input connections
        • $W_h$: hidden-hidden connections
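The same step after reconstruction, a sketch in which both stacked matrices are replaced by their truncated-SVD factors; the names Ux, Nx, Uh, Nh are hypothetical, standing for Wx ≈ Ux·Nx and Wh ≈ Uh·Nh:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_svd(x_t, h_prev, c_prev, Ux, Nx, Uh, Nh, b):
    """LSTM step with both weight matrices in factored form;
    four thin mat-vecs replace the two large ones."""
    z = Ux @ (Nx @ x_t) + Uh @ (Nh @ h_prev) + b
    H = h_prev.shape[0]
    i, f = sigmoid(z[:H]), sigmoid(z[H:2*H])
    g, o = np.tanh(z[2*H:3*H]), sigmoid(z[3*H:])
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c
```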

  12. Our approach
      • Efficiency analysis of the SVD method
        • Suppose $W \in \mathbb{R}^{m \times n}$; by SVD we have $W_{m \times n} = U_{m \times n} \Sigma_{n \times n} V^T_{n \times n}$
        • By reserving a proper number of singular values, $W_{m \times n} \approx U_{m \times r} \Sigma_{r \times r} V^T_{r \times n} = U_{m \times r} N_{r \times n}$
        • Replace $W_{m \times n}$ with $U_{m \times r} N_{r \times n}$: $Wx \rightarrow U(Nx)$
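A NumPy sketch of this reconstruction step; `factorize` is a hypothetical helper name, and the retained rank r is the tuning knob the next slide analyzes:

```python
import numpy as np

def factorize(W, r):
    """Low-rank factorization W ~ U_r @ N via truncated SVD;
    Sigma_r is absorbed into V^T to give a single factor N."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :r]                 # (m, r)
    N = S[:r, None] * Vt[:r]       # (r, n)
    return U_r, N

# Wx becomes U_r @ (N @ x): two thin mat-vecs instead of one large one.
W = np.random.randn(512, 128).astype(np.float32)
U_r, N = factorize(W, r=32)
x = np.random.randn(128).astype(np.float32)
approx = U_r @ (N @ x)
```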

  13. Our approach
      • Efficiency analysis of the SVD method
        • For a matrix-vector multiplication $Wx$ with $W \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^{n \times 1}$, the acceleration rate and compression rate with $r$ singular values reserved are
          $S_a = S_c = \dfrac{mn}{mr + rn}$
        • If $m = 512$, $n = 128$, $r = 32$, then $S_a = S_c = 3.2$
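The rate formula as a one-line check, reproducing the slide's example:

```python
def rates(m, n, r):
    """Acceleration/compression rate of replacing an m-by-n matrix-vector
    product with rank-r factors: mn multiplications become mr + rn."""
    return (m * n) / (m * r + r * n)

print(rates(512, 128, 32))  # 3.2, the slide's example
```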

  14. Our approach
      • Adaptive drop weight (ADW) [1]
        • An improvement on "Deep Compression" [2], in which a hard threshold is set
        • ADW gradually prunes away the redundant connections in each layer, i.e., those with small absolute values (by sorting them during retraining)
        • After ADW, the network becomes sparse; K-means based quantization is then applied to each layer to gain further compression (see the sketch after this slide)
      [1] X. Xiao, L. Jin, et al., "Building fast and compact convolutional neural networks for offline handwritten Chinese character recognition", Pattern Recognition, 2017
      [2] S. Han, et al., "Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding", ICLR, 2016
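A rough sketch of the two steps, under simplifying assumptions: one-shot magnitude pruning stands in for ADW's gradual schedule during retraining, and a plain k-means with linear initialization stands in for the quantizer; none of the names come from the authors' code:

```python
import numpy as np

def prune_smallest(W, frac):
    """Zero out the `frac` fraction of weights with the smallest magnitude.
    One pruning step only; ADW raises this fraction gradually while retraining."""
    k = int(frac * W.size)
    thresh = np.sort(np.abs(W).ravel())[k]
    return np.where(np.abs(W) < thresh, 0.0, W)

def kmeans_quantize(W, n_clusters=16, iters=20):
    """Cluster the surviving weights and replace each with its centroid,
    so only cluster indices plus a small codebook need to be stored."""
    vals = W[W != 0].ravel()
    centers = np.linspace(vals.min(), vals.max(), n_clusters)  # linear init
    for _ in range(iters):
        assign = np.argmin(np.abs(vals[:, None] - centers[None, :]), axis=1)
        for c in range(n_clusters):
            if np.any(assign == c):
                centers[c] = vals[assign == c].mean()
    Wq = W.copy()
    Wq[W != 0] = centers[assign]
    return Wq
```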

  15. Our approach
      • The proposed framework (review): the baseline model → reconstruct the baseline with SVD → prune redundant connections → cluster the remaining connections

  16. Experiments
      • Training set
        • CASIA OLHWDB1.0 & OLHWDB1.1
        • 720 writers, 2,693,183 samples, 3755 classes
      • Test set
        • ICDAR 2013 online competition dataset
        • 60 writers, 224,590 samples, 3755 classes
      • Data preprocessing and augmentation as mentioned before

  17. Experiments
      • Details of the baseline model
        • Main storage cost: LSTM2, FC1, FC2
        • Main computation cost: LSTM2

  18. Experiments
      • Experimental settings
        • In our experiments, we found that the LSTM is more sensitive to its input connections than to its hidden-hidden connections
        • Most of the computation latency is introduced by the hidden-hidden connections

  19. Experiments
      • Experimental results (Intel Core i7-4790, single thread)
        • After SVD, the model is 10× smaller and FLOPs are also reduced by 10×
        • After ADW & quantization, the model is 31× smaller and FLOPs are further reduced
        • Only a minor 0.5% drop in accuracy

  20. Experiments
      • Experimental results
        • Compared with [1], our model is 300× smaller and 4× faster on CPU
        • Compared with [2], our model is 52× smaller and 109× faster on CPU
      [1] W. Yang, L. Jin, et al., "DropSample: A new training method to enhance deep convolutional neural networks for large-scale unconstrained handwritten Chinese character recognition", Pattern Recognition, 2016
      [2] X.-Y. Zhang, et al., "Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark", Pattern Recognition, 2017

  21. Conclusion
      • SVD is efficient for accelerating computation
      • ADW also works well for LSTMs
      • By combining SVD and ADW, we can build fast and compact LSTM-based models for online HCCR

  22. Thank you!
      Lianwen Jin (金连文), Ph.D., Professor: eelwjin@scut.edu.cn, lianwen.jin@gmail.com
      Zecheng Xie (谢泽澄), Ph.D. student
      Yafeng Yang (杨亚锋), Master's student
      http://www.hcii-lab.net/
