Unconstrained Handwritten Text Recognition Reporter: Zecheng Xie - PowerPoint PPT Presentation

Distilling GRU with Data Augmentation for Unconstrained Handwritten Text Recognition
Reporter: Zecheng Xie
South China University of Technology
August 6, 2018

Outline
• Problem Definition
• Multi-layer Distilling GRU
• Data Augmentation
• Experiments
• Conclusion

Problem Definition: Motivation
• Handwritten texts in various styles, such as horizontal, overlapping, vertical, and multi-line texts, are commonly observed in practice.
• Most existing handwriting recognition methods concentrate on only one specific text style.
This motivates a new problem: unconstrained online handwritten text recognition.

Problem Definition: The New Unconstrained OHCTR Problem
[Figure: handwriting samples in the text styles covered: horizontal, vertical, overlap, multi-line, right-down, and screw-rotation.]

Problem Definition: Novel Perspective
Why not focus on the variation between adjacent points [14, 15]?
• It is more stable than the pen-tip coordinate: in most situations it stays within a specific bound.
• Unconstrained texts of multiple styles share a very similar feature pattern; the only difference between text styles is the pen-tip movement between characters.
[14] X. Zhang, et al., "Drawing and recognizing Chinese characters with recurrent neural network," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
[15] L. Sun, et al., "Deep LSTM networks for online Chinese handwriting recognition," in ICFHR, 2016.

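Multi-layer Distilling GRU: Feature Extraction
Online text is a sequence of sampling points (x_t, y_t) grouped into strokes. Two features are extracted at each step:
• Pen-tip movement
• Pen down/up state
[Figure: sampling points of an online text with the i-th stroke highlighted and a binary pen-state sequence, 1 0 0 1 0 0 1 1 0 1, shown alongside.]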
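The feature extraction step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `extract_features`, the input layout (a list of strokes, each a list of (x, y) points), and the pen-state convention (1 when the movement is drawn inside a stroke, 0 when the pen is lifted between strokes) are all assumptions.

```python
def extract_features(strokes):
    """Turn strokes into (dx, dy, pen_state) triples, one per pair of adjacent points.

    strokes: list of strokes, each a non-empty list of (x, y) sampling points.
    pen_state (assumed convention): 1 = movement drawn with the pen down,
    0 = pen lifted to travel to the next stroke.
    """
    # Flatten all strokes, remembering which points end a stroke.
    points = []
    for stroke in strokes:
        for k, (x, y) in enumerate(stroke):
            ends_stroke = k == len(stroke) - 1
            points.append((x, y, ends_stroke))

    features = []
    for (x0, y0, ends_stroke), (x1, y1, _) in zip(points, points[1:]):
        pen_state = 0 if ends_stroke else 1
        features.append((x1 - x0, y1 - y0, pen_state))
    return features


strokes = [[(0, 0), (1, 2)], [(5, 5), (6, 5)]]
print(extract_features(strokes))
```

Because only differences between adjacent points are kept, translating the whole text leaves the features unchanged, which is the stability property argued for under "Novel Perspective".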

Multi-layer Distilling GRU: Distilling GRU
• A GRU can only output a feature sequence with the same number of time steps as its input, which greatly burdens the framework if applied directly to text recognition.
• How can the training process be accelerated without sacrificing performance?

Multi-layer Distilling GRU: Distilling GRU
[Figure: the distilling operation. The GRU output h = (h_1, h_2, ..., h_T), computed from the inputs and hidden states at time steps t-3, t-2, t-1, t, passes through a ReLU layer to give the shorter sequence h' = (h'_1, h'_2, ..., h'_{T/N}).]

Multi-layer Distilling GRU: Distilling GRU
• Unlike a traditional pooling layer, our distilling operation does not lose information from the GRU output.
• It accelerates the training process without sacrificing any performance.
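The slides do not spell out the distilling operation itself. One reading consistent with the ReLU in the figure and with h' having T/N time steps is to concatenate every N adjacent GRU outputs and project them through a ReLU layer. The sketch below follows that reading; the function `distill`, its parameters, and the choice to drop trailing steps that do not fill a group are assumptions, not the authors' definition.

```python
def distill(h, weights, bias, n):
    """Shorten a GRU output sequence from T steps to T // n steps.

    h:       list of T feature vectors (lists of floats), the GRU output.
    weights: one row per output unit, each row of length n * len(h[0]).
    bias:    one float per output unit.
    n:       number of adjacent time steps merged into one (assumed N).
    """
    groups = len(h) // n  # trailing steps that do not fill a group are dropped
    out = []
    for g in range(groups):
        # Concatenate n adjacent outputs so that no value is discarded,
        # in contrast to max or average pooling.
        merged = [v for step in h[g * n:(g + 1) * n] for v in step]
        out.append([
            max(0.0, sum(w * v for w, v in zip(row, merged)) + b)  # ReLU
            for row, b in zip(weights, bias)
        ])
    return out


h = [[float(2 * t), float(2 * t + 1)] for t in range(8)]  # T = 8, 2 features
weights = [[1.0] * 8, [-1.0] * 8]                          # 2 output units
h_short = distill(h, weights, [0.0, 0.0], n=4)
print(len(h), "->", len(h_short))
```

Every later layer then processes a sequence that is n times shorter, which is where the training speed-up would come from under this reading.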

Multi-layer Distilling GRU: Transcription
[Figure: per-time-step output probabilities over the character set plus 'blank', with several alignment paths π that map to the same label sequence.]
Example paths ("_" denotes 'blank'):
π: _备_受_观观_众_期期_待_
π: _备_受_观_众_期_待
π: _备_受_观_众_期期期_待
The mapping B collapses each of these paths to the label sequence 备受观众期待 ("highly anticipated by the audience").
The probability of a label sequence l given the input s sums over all paths that collapse to it:
P(l | s) = \sum_{π : B(π) = l} P(π | s)
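The transcription step is the standard CTC formulation, which can be illustrated with a small sketch. The names `collapse` and `label_probability` are illustrative, and the brute-force enumeration is only feasible for toy inputs; practical systems use the forward-backward dynamic program instead.

```python
from itertools import product

BLANK = "_"


def collapse(path):
    """The mapping B: merge repeated symbols, then remove blanks."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != BLANK:
            out.append(sym)
        prev = sym
    return "".join(out)


def label_probability(probs, alphabet, label):
    """P(l | s): sum of path probabilities over all paths that collapse to `label`.

    probs[t][k] is the probability of alphabet[k] at time step t.
    Enumerates every path, so the cost is len(alphabet) ** len(probs).
    """
    total = 0.0
    for path in product(range(len(alphabet)), repeat=len(probs)):
        if collapse(alphabet[k] for k in path) == label:
            p = 1.0
            for t, k in enumerate(path):
                p *= probs[t][k]
            total += p
    return total


print(collapse("_备_受_观观_众_期期_待_"))
print(collapse("_备_受_观_众_期期期_待"))
```

Both example paths print the same label sequence, 备受观众期待.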

Multi-layer Distilling GRU: Multi-layer Distilling GRU
[Figure: the multi-layer distilling GRU network, annotated with h' = (h'_1, h'_2, ..., h'_{T/N}).]

Data Augmentation
Synthesized text styles: horizontal, vertical, overlapping, multi-lines, screw rotation, right-down.
• Δx_i, Δy_i: pen movement between the i-th and (i+1)-th characters.
• x_i^min, x_i^max: the minimum and maximum x-coordinate values of the i-th character.
• x_i^f, x_i^l: the x-coordinate values of the first and last points of the i-th character.
• Δ_r: a random bias drawn from a uniform distribution on (-2, 13).
• Δ_line: the text line length, which can be adjusted according to the practical situation.
All of the above definitions also apply to the y-axis.
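The quantities defined above are simple to compute from a character's points. The sketch below does so and then shows one hypothetical way to combine them for horizontal text. The per-style formulas from the original slide figure did not survive, so `horizontal_pen_movement` is an assumed placement rule (the next character's left edge lands Δ_r to the right of the current character's right edge), not the authors' formula.

```python
import random


def char_extents(xs):
    """Min, max, first and last x-coordinates of one character's points."""
    return {"min": min(xs), "max": max(xs), "first": xs[0], "last": xs[-1]}


def random_bias(rng, low=-2.0, high=13.0):
    """Random bias drawn uniformly between -2 and 13, as on the slide."""
    return rng.uniform(low, high)


def horizontal_pen_movement(cur_xs, next_xs, bias):
    """Hypothetical x-movement from the last point of the current character
    to the first point of the next one for horizontal text (assumed rule)."""
    cur = char_extents(cur_xs)
    nxt = char_extents(next_xs)
    # Travel to the right edge, add the gap, then reach the first point
    # of the next character relative to its own left edge.
    return (cur["max"] - cur["last"]) + bias + (nxt["first"] - nxt["min"])


rng = random.Random(0)
dx = horizontal_pen_movement([0, 10, 4], [3, 0, 8], random_bias(rng))
print(dx)
```

The same helpers apply unchanged to y-coordinates, matching the note that the definitions also hold for the y-axis.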

Experiments
• Training data: CASIA-OLHWDB2.0-2.2 [1]; synthetic unconstrained data generated from CASIA-OLHWDB1.0-1.2 [1]
• Testing data: ICDAR2013 test dataset [2]; synthetic unconstrained data generated from CASIA-OLHWDB1.0-1.2 [1]
• Network: 2-layer distilling GRU, distilling rate = 0.25
• Hardware: GeForce Titan-X GPU
• Convergence time: 208 h → 95 h
[1] C. Liu, et al., "CASIA online and offline Chinese handwriting databases," 2011 International Conference on Document Analysis and Recognition (ICDAR), pp. 37-41, 2011.
[2] F. Yin, et al., "ICDAR 2013 Chinese handwriting recognition competition," ICDAR 2013, pp. 1464-1470.

Experiments
[3] X. Zhou, et al., IEEE TPAMI, vol. 35, no. 10, pp. 2413-2426, 2013.
[4] X. Zhou, et al., Pattern Recognition, vol. 47, no. 5, pp. 1904-1916, 2014.
[29] Z. Xie, et al., IEEE TPAMI, 2017.
[30] K. Chen, et al., in ICDAR 2017, vol. 1, IEEE, 2017, pp. 1068-1073.

Experiments: Demo

Conclusion
• A new unconstrained text recognition problem is proposed to advance the handwritten text recognition community.
• A special perspective on the pen-tip trajectory is suggested to reduce the difference between texts of multiple styles.
• A new data augmentation method is developed to synthesize unconstrained handwritten texts of multiple styles.
• A multi-layer distilling GRU is proposed to process the input data in a sequential manner.
• The method not only achieves state-of-the-art results on the ICDAR2013 text competition dataset but also shows robust performance on our synthesized handwritten test sets.

Q & A
Thank you!
Lianwen Jin (金连文), Ph.D., Professor, eelwjin@scut.edu.cn, lianwen.jin@gmail.com
Zecheng Xie (谢泽澄), Ph.D. student
Manfei Liu (刘曼飞), Master's student
http://www.hcii-lab.net/
