Parsimonious HMMs for Offline Handwritten Chinese Text Recognition

Wenchao Wang, Jun Du and Zi-Rui Wang
University of Science and Technology of China

ICFHR 2018, Niagara Falls, USA, Aug. 5-8, 2018


Background


  • Offline handwritten Chinese text recognition (OHCTR) is challenging
    – No trajectory information, in contrast to the online case
    – Large vocabulary of Chinese characters
    – Sequential recognition with a potential segmentation problem
  • Approaches
    – Oversegmentation approaches
      – Character oversegmentation/classification
    – Segmentation-free approaches
      – GMM-HMM: Gaussian mixture model - hidden Markov model
      – MDLSTM-RNN: multidimensional LSTM-RNN + CTC
      – DNN-HMM: deep neural network - hidden Markov model


Review of HMM Approach for OHCTR


  • A left-to-right HMM is adopted to represent each Chinese character.
  • The character HMMs are concatenated to model a text line.

[Figure: the text line 得到反映 modeled by a sequence of concatenated character HMMs over the observation sequence of sliding windows]
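The concatenation can be illustrated with transition matrices. The sketch below is not the authors' code; it assumes 5 emitting states per character and an arbitrary self-loop probability `p_stay`:

```python
import numpy as np

def left_to_right_transitions(n_states=5, p_stay=0.6):
    """Transition matrix of a left-to-right HMM: each state either
    stays (p_stay) or moves to the next state (1 - p_stay)."""
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = p_stay
        A[i, i + 1] = 1.0 - p_stay
    A[-1, -1] = 1.0  # last state loops until the character ends
    return A

def concatenate_characters(chars, n_states=5, p_stay=0.6):
    """Model a text line by chaining one left-to-right HMM per character:
    the exit state of character k feeds the entry state of character k+1."""
    total = n_states * len(chars)
    A = np.zeros((total, total))
    for k, _ in enumerate(chars):
        off = k * n_states
        A[off:off + n_states, off:off + n_states] = \
            left_to_right_transitions(n_states, p_stay)
        if k + 1 < len(chars):
            # redirect the last state's remaining mass to the next character
            A[off + n_states - 1, off + n_states - 1] = p_stay
            A[off + n_states - 1, off + n_states] = 1.0 - p_stay
    return A

line_A = concatenate_characters(["得", "到", "反", "映"])
```

Each row of `line_A` remains a valid transition distribution, so the whole text line behaves as one large left-to-right HMM.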


Review of DNN-HMM Approach for OHCTR


  • The Bayesian framework:

    \hat{C} = \arg\max_{C} P(C \mid X) = \arg\max_{C} p(X \mid C) P(C)

    – Character modeling: p(X|C) is computed by the concatenated character HMMs
    – Output distribution: a DNN calculates the state posterior probability
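In the hybrid setup the decoder needs the emission likelihood p(x|s), while the DNN outputs the posterior P(s|x); the standard conversion divides the posterior by the state prior. A toy sketch with made-up numbers (not from the paper):

```python
import numpy as np

def scaled_log_likelihoods(log_posteriors, state_priors):
    """Hybrid DNN-HMM conversion: the DNN outputs state posteriors
    P(s|x); subtracting log P(s) gives log of a quantity proportional
    to the emission likelihood p(x|s), which HMM decoding needs."""
    return log_posteriors - np.log(state_priors)

# toy example: 3 frames, 4 tied states (illustrative values only)
post = np.array([[0.7, 0.1, 0.1, 0.1],
                 [0.2, 0.5, 0.2, 0.1],
                 [0.1, 0.1, 0.2, 0.6]])
priors = np.array([0.4, 0.3, 0.2, 0.1])
ll = scaled_log_likelihoods(np.log(post), priors)
```

The priors are normally estimated from the state-level alignment counts of the training data.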


Motivation

  • High memory and computation demand from the DNN output layer
  • Model redundancy due to similarities among different characters
  • Parsimonious HMMs are proposed to address these two problems
  • A decision-tree-based two-step approach generates the tied-state pool


[Figure: 5-state HMMs for the similar characters 练, 冻 and 缴 sharing states from a common tied-state pool]


Binary Decision Tree for State Tying


  • The parent set O_1 has a distribution P_1(x); the total log-likelihood of all observations in O_1 under this distribution is:

    L(O_1) = \sum_{x \in O_1} \log P_1(x)

  • The child set O_2 has a distribution P_2(x); the total log-likelihood of all observations in O_2 under this distribution is:

    L(O_2) = \sum_{x \in O_2} \log P_2(x)

  • The child set O_3 has a distribution P_3(x); the total log-likelihood of all observations in O_3 under this distribution is:

    L(O_3) = \sum_{x \in O_3} \log P_3(x)

  • The split satisfies O_1 = O_2 \cup O_3, and the total increase in set-conditioned log-likelihood of the observations due to partitioning by one question is:

    \Delta L = L(O_2) + L(O_3) - L(O_1)
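The gain \Delta L can be made concrete with one-dimensional Gaussian set models. This is an illustrative sketch (scalar observations, ML-fitted Gaussians), not the paper's multivariate implementation:

```python
import numpy as np

def set_log_likelihood(x):
    """Total log-likelihood of samples x under a single Gaussian
    fitted to x by maximum likelihood (mean and variance of x)."""
    mu, var = x.mean(), x.var() + 1e-8  # small floor for stability
    return np.sum(-0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var))

def split_gain(parent, child_a, child_b):
    """Delta L = L(O2) + L(O3) - L(O1): the increase in set-conditioned
    log-likelihood from splitting the parent set by one yes/no question."""
    return (set_log_likelihood(child_a) + set_log_likelihood(child_b)
            - set_log_likelihood(parent))

rng = np.random.default_rng(0)
a = rng.normal(-2.0, 1.0, 200)   # observations answering "yes"
b = rng.normal(+2.0, 1.0, 200)   # observations answering "no"
gain = split_gain(np.concatenate([a, b]), a, b)
```

A question that separates two well-separated modes yields a large positive gain; a useless question yields a gain near zero.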


Step 1: Clustering Characters with Decision Tree


[Figure: a fragment of the binary decision tree for tying the first state of the character HMMs; non-leaf nodes ask yes/no questions such as "Is the character in 愧 怀 怳 忧 快 忱 恍 恢 悦 惋 惯?", and the states reaching each leaf node are tied]

  • All states at the same HMM position are initially grouped together at the root node.
  • Each node is then recursively partitioned, using the question set, to maximize the increase in expected log-likelihood.
  • All states in the leaves of the decision tree are tied together.
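Choosing the split at a node can be sketched as a greedy search over the question set. The questions, states and `gain_fn` below are hypothetical stand-ins (here states are scalar statistics and the gain is a simple sum-of-squares reduction), not the paper's actual question set:

```python
def sse(xs):
    """Sum of squared deviations: a toy stand-in for negative log-likelihood."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def best_question(states, questions, gain_fn):
    """Greedy node split: try every yes/no question and keep the one
    that maximizes the likelihood gain of the resulting partition."""
    best_q, best_gain = None, 0.0
    for q in questions:
        yes = [s for s in states if q(s)]
        no = [s for s in states if not q(s)]
        if not yes or not no:        # a question must actually split the node
            continue
        g = gain_fn(states, yes, no)
        if g > best_gain:
            best_q, best_gain = q, g
    return best_q, best_gain

# toy node with four states and two hypothetical questions
states = [-2.1, -1.9, 2.0, 2.2]
questions = [lambda s: s < 0, lambda s: s < 2.1]
gain = lambda p, y, n: sse(p) - sse(y) - sse(n)
q, g = best_question(states, questions, gain)
```

The first question cleanly separates the two groups, so it wins; recursing on each child until no question gives a sufficient gain yields the tree.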


Step 2: Bottom-up Re-clustering


[Figure: tied-state leaf nodes of the decision tree re-clustered bottom-up, via a minimum priority queue over cluster pairs, into the final tied-state pool]

  • 1. Calculate the objective-function decrease of clustering each pair of leaf nodes, and push these pairs into a minimum priority queue.
  • 2. While #clusters > N: calculate the objective-function decrease of clustering the clusters, and re-cluster the two clusters with the minimum decrease into a new cluster.
  • 3. Generate the tied-state pool.

  • In the second step, the clusters in the leaf nodes obtained in the first step are re-clustered by a bottom-up procedure using sequential greedy optimization.
  • The expected log-likelihood decrease from combining every two clusters is calculated.
  • A minimum priority queue is maintained to re-cluster the two clusters with the minimum log-likelihood decrease into a new cluster.
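The bottom-up step can be sketched with a priority queue over pairwise merge costs. The SSE-based cost below is an illustrative stand-in for the paper's expected log-likelihood decrease, with clusters as lists of scalar observations:

```python
import heapq

def sse(xs):
    """Sum of squared deviations from the cluster mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)

def merge_cost(a, b):
    """Objective-function decrease of merging two clusters,
    approximated here by the increase in summed squared error."""
    return sse(a + b) - sse(a) - sse(b)

def bottom_up_recluster(clusters, n_target):
    """Greedily merge the cheapest pair of clusters until only
    n_target clusters remain, using a min priority queue per round."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > n_target:
        heap = []
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                heapq.heappush(heap, (merge_cost(clusters[i], clusters[j]), i, j))
        cost, i, j = heapq.heappop(heap)  # minimum-decrease merge
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters

pool = bottom_up_recluster([[-2.1], [-1.9], [2.0], [2.2]], 2)
```

A production version would update the queue lazily instead of rebuilding it each round, but the greedy order is the same.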


Training Procedure for Parsimonious HMMs


  • 1. Train a conventional GMM-HMM system.
  • 2. Calculate the first-order and second-order statistics based on state-level forced alignment.
  • 3. Run the two-step algorithm:
    – First step: build the state-tying tree.
    – Second step: re-cluster the tied states based on the first step.
  • 4. Train parsimonious GMM-HMMs based on the tied states.
  • 5. Train parsimonious DNN-HMMs based on the tied states.
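Step 2 can be sketched as follows; the names and shapes are illustrative, assuming per-frame feature vectors and a state id per frame from the forced alignment:

```python
import numpy as np

def accumulate_state_stats(frames, state_ids, n_states):
    """Accumulate per-state occupancy counts, first-order (sum of x)
    and second-order (sum of x^2) statistics from a state-level forced
    alignment; the two-step tying algorithm only needs these sums to
    evaluate Gaussian log-likelihoods of candidate state sets."""
    dim = frames.shape[1]
    occ = np.zeros(n_states)
    first = np.zeros((n_states, dim))
    second = np.zeros((n_states, dim))
    for x, s in zip(frames, state_ids):
        occ[s] += 1
        first[s] += x
        second[s] += x ** 2
    return occ, first, second

# toy alignment: 3 frames of 2-dim features aligned to 2 states
frames = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
occ, first, second = accumulate_state_stats(frames, [0, 1, 1], 2)
```

From these sums, per-set means and variances (and hence the split gains) follow in closed form without revisiting the data.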

Experiments

  • Training set
    – CASIA-HWDB database, including HWDB1.0, HWDB1.1 and HWDB2.0-HWDB2.2
  • Test set
    – ICDAR-2013 competition set
  • Vocabulary: 3980 character classes
  • GMM-HMM system
    – Each character modeled by a left-to-right HMM with 40-component GMMs
    – Gradient-based features followed by PCA to obtain a 50-dimensional vector
  • DNN-HMM system
    – 350-2048-2048-2048-2048-2048-2048-(3980*N), where N is the number of states per character HMM
  • DNN-PHMM system
    – 350-2048-2048-2048-2048-2048-2048-M, where M is the size of the tied-state pool



HMM vs. PHMM


  • Performance saturates as the number of states per character increases
  • PHMM outperforms HMM under the same tied-state number
  • The best PHMM is more parsimonious than the best HMM
  • This demonstrates the reasonableness of the proposed state-tying algorithm

HMM vs. PHMM


  • The model becomes much more compact by setting the number of tied states per character below 1
  • DNN-PHMM (Ns=0.5, 9.52%) outperforms DNN-HMM (Ns=1, 11.09%)

Memory and Computation Costs


With the (1024, 4) setting, DNN-PHMM achieved a CER comparable to DNN-HMM with the (2048, 6) setting, while reducing the model size by 75% and the run-time latency by 72%.
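The savings come from shrinking both the hidden layers and the output layer. A rough parameter-count sketch; the values N = 5 and the pool size M = 2000 are hypothetical, not the paper's exact numbers:

```python
def dnn_params(input_dim, hidden, depth, out_dim):
    """Total weights + biases of a fully connected DNN with `depth`
    hidden layers of `hidden` units each."""
    dims = [input_dim] + [hidden] * depth + [out_dim]
    return sum(dims[i] * dims[i + 1] + dims[i + 1] for i in range(len(dims) - 1))

# DNN-HMM with the (2048, 6) setting and a 3980*N output layer (N assumed 5)
hmm = dnn_params(350, 2048, 6, 3980 * 5)
# DNN-PHMM with the (1024, 4) setting and a hypothetical tied-state pool M
phmm = dnn_params(350, 1024, 4, 2000)
saving = 1 - phmm / hmm   # fraction of parameters removed
```

Even with these guessed output sizes, the output layer dominates the count, which is why tying states shrinks the model so effectively.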


State Tying Result Analysis


[Table: tied characters grouped by radical structure (left-right, top-bottom, surround, left-surround, bottom-left-surround, top-surround, top-right-surround, cross) and their shared similar parts, e.g. 口 in 喷 喻 嗅 嗡 吃 咆 哦 哨 嘈 嘲 噬 嚼, 宀 in 客 害 容 密 寇 蜜, 气 in 氛 氢 氦 氨, and the 匚, 辶 and 门 surround-structure groups]

Chinese characters with the same or similar radicals were easily tied by the proposed algorithm. This is why the proposed DNN-PHMM can maintain high recognition performance despite its very compact design.



Thanks!