Connectionist Temporal Classification with Maximum Entropy Regularization (PowerPoint Presentation)


SLIDE 1

Connectionist Temporal Classification with Maximum Entropy Regularization

Hu Liu Sheng Jin Changshui Zhang

Department of Automation Tsinghua University

NeurIPS, 2018

Hu Liu Sheng Jin Changshui Zhang CTC with Maximum Entropy Regularization NeurIPS, 2018 1 / 9

SLIDE 2

Introduction to Connectionist Temporal Classification (CTC)

The CTC probability of a labeling sums over all frame-level paths that collapse to it (merge consecutive repeats, then remove blanks; '_' denotes blank):

P('dog' | X) = P(dd_oo___g) + P(doo____gg) + P(_____dogg) + P(dddoooggg) + ...

The error signal concentrates on the currently most probable paths:

$$\frac{\partial \mathcal{L}_{ctc}}{\partial y_t^k} = -\frac{1}{p(\boldsymbol{l}|X)\, y_t^k} \sum_{\{\pi \mid \pi \in \mathcal{B}^{-1}(\boldsymbol{l}),\, \pi_t = k\}} p(\pi|X)$$

CTC drawbacks:

  • Positive feedback in the error signal: the most suitable path keeps reinforcing itself.
  • CTC lacks exploration and is prone to fall into worse local minima.
  • Outputs overconfident paths (overfitting).
  • Outputs paths with a peaky distribution.
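To make the collapsing map B and the marginal p(l|X) concrete, here is a brute-force sketch. The per-frame distributions are made-up toy values and the enumeration is exponential in T; this is for illustration only, not the paper's implementation:

```python
from itertools import product

BLANK = "_"

def collapse(path):
    """The CTC map B: merge consecutive repeats, then drop blanks."""
    merged = [s for i, s in enumerate(path) if i == 0 or s != path[i - 1]]
    return "".join(s for s in merged if s != BLANK)

def ctc_prob(label, y, alphabet):
    """Brute-force p(l|X): sum p(pi|X) over every length-T path pi
    whose collapse B(pi) equals the label."""
    total = 0.0
    for path in product(range(len(alphabet)), repeat=len(y)):
        if collapse([alphabet[k] for k in path]) == label:
            p = 1.0
            for t, k in enumerate(path):
                p *= y[t][k]  # p(pi|X) = product of per-frame probabilities
            total += p
    return total

# Toy per-frame distributions over (_, d, o, g); values are made up.
alphabet = ["_", "d", "o", "g"]
y = [[0.1, 0.7, 0.1, 0.1],
     [0.1, 0.2, 0.6, 0.1],
     [0.6, 0.1, 0.2, 0.1],
     [0.1, 0.1, 0.1, 0.7],
     [0.2, 0.1, 0.1, 0.6]]
print(collapse("dd_oo___g"))  # -> dog
print(round(ctc_prob("dog", y, alphabet), 4))
```

Because each path probability factorizes over frames, the gradient above routes almost all of the error signal to whichever paths are currently most probable, which is the positive-feedback loop the slide points at.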


SLIDE 3

Maximum Conditional Entropy Regularization for CTC (EnCTC)

EnCTC:

$$\mathcal{L}_{enctc} = \mathcal{L}_{ctc} - \beta H(p(\pi|\boldsymbol{l}, X))$$

[Figure: CTC concentrates probability on a single peaky alignment of 'd', 'o', 'g'; the expected behavior is a high-entropy spread over feasible alignments.]

$$H(p(\pi|\boldsymbol{l}, X)) = -\sum_{\pi \in \mathcal{B}^{-1}(\boldsymbol{l})} p(\pi|X, \boldsymbol{l}) \log p(\pi|X, \boldsymbol{l})$$

Entropy-based regularization:

  • Better generalization and exploration.
  • Solves the peaky distribution problem.
  • Depicts ambiguous segmentation boundaries.
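A brute-force sketch of the entropy term H(p(π|l, X)), again with made-up toy inputs and exponential enumeration for illustration only: a model that commits to one alignment yields low path entropy, while a flatter model yields high entropy, which is exactly what the regularizer rewards.

```python
from itertools import product
from math import log

BLANK = "_"

def collapse(path):
    """CTC map B: merge consecutive repeats, then drop blanks."""
    merged = [s for i, s in enumerate(path) if i == 0 or s != path[i - 1]]
    return "".join(s for s in merged if s != BLANK)

def path_entropy(label, y, alphabet):
    """H(p(pi|l,X)) = -sum_pi p(pi|X,l) log p(pi|X,l) over pi in B^{-1}(l),
    where p(pi|X,l) = p(pi|X) / p(l|X)."""
    probs = []
    for path in product(range(len(alphabet)), repeat=len(y)):
        if collapse([alphabet[k] for k in path]) == label:
            p = 1.0
            for t, k in enumerate(path):
                p *= y[t][k]
            probs.append(p)
    z = sum(probs)  # p(l|X)
    return -sum((p / z) * log(p / z) for p in probs if p > 0)

alphabet = ["_", "a"]
flat  = [[0.5, 0.5]] * 3                      # spreads mass over alignments
peaky = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]  # commits to one alignment
print(path_entropy("a", flat, alphabet))   # higher
print(path_entropy("a", peaky, alphabet))  # lower
```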


SLIDE 4

Equal Spacing CTC (EsCTC)

The spacing of two consecutive elements is nearly the same in many sequential tasks.

[Figure: alignments of 'd', 'o', 'g' across frames, with roughly equal spacing versus highly unequal spacing.]

  • We adopt equal spacing as a pruning method to rule out unreasonable CTC paths.
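As an illustrative sketch of "spacing" (the first-occurrence gaps used here are a simplification, not the paper's exact segmentation definition):

```python
BLANK = "_"

def spacings(path):
    """Frame gaps between the first occurrences of consecutive label
    symbols in a path. Equal spacing means these gaps are nearly uniform."""
    firsts, prev = [], None
    for t, s in enumerate(path):
        if s != BLANK and s != prev:  # a new label symbol starts here
            firsts.append(t)
        prev = s
    return [b - a for a, b in zip(firsts, firsts[1:])]

print(spacings("dd_oo___g"))  # -> [3, 5] (unequal)
print(spacings("d__o__g__"))  # -> [3, 3] (equal)
```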


SLIDE 5

Equal Spacing CTC (EsCTC)

[Figure: a path for 'dog' divided into segments z1, z2, z3; candidate paths with equal versus unequal segment lengths.]

Theorem 3.1. Among all segmentation sequences, the equal spacing one has the maximum entropy. Equal spacing is therefore the best prior in the absence of any subjective assumptions:

$$\boldsymbol{z}_{es} = \operatorname*{argmax}_{\boldsymbol{z}} \max_{p} H(p(\pi|\boldsymbol{z}, \boldsymbol{l}, X))$$

Relaxing exact equal spacing, EsCTC keeps only the segmentations whose segment lengths satisfy $z_s \le \tau T / |\boldsymbol{l}|$:

$$\mathcal{L}_{esctc} = -\log \sum_{\boldsymbol{z} \in C_{\tau, T}} \sum_{\pi \in \mathcal{B}_{\boldsymbol{z}}^{-1}(\boldsymbol{l})} p(\pi|X)$$
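A sketch of the pruning idea under a simplified segment definition (segment s runs from the first frame of symbol s to the frame before symbol s+1, with the last segment extending to the end; this is a hypothetical construction for illustration, not the paper's exact one). Paths with any segment longer than τT/|l| are ruled out:

```python
BLANK = "_"

def collapse(path):
    """CTC map B: merge consecutive repeats, then drop blanks."""
    merged = [s for i, s in enumerate(path) if i == 0 or s != path[i - 1]]
    return "".join(s for s in merged if s != BLANK)

def segment_lengths(path):
    """Split a path at the first frame of each label symbol; the last
    segment extends to the end of the path."""
    firsts, prev = [], None
    for t, s in enumerate(path):
        if s != BLANK and s != prev:
            firsts.append(t)
        prev = s
    bounds = firsts + [len(path)]
    return [bounds[i + 1] - bounds[i] for i in range(len(firsts))]

def es_prune(paths, label, tau):
    """Keep paths collapsing to the label whose every segment length
    satisfies z_s <= tau * T / |l| (the EsCTC-style constraint)."""
    T, L = len(paths[0]), len(label)
    return [p for p in paths
            if collapse(p) == label and max(segment_lengths(p)) <= tau * T / L]

paths = ["d_o_g_", "ddddog"]
print(es_prune(paths, "dog", 1.0))  # -> ['d_o_g_'] ("ddddog" is too skewed)
```

Raising τ loosens the bound: with τ = 2 both example paths survive, so τ trades off pruning strength against coverage of plausible alignments.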


SLIDE 6

Algorithm and Complexity Analysis

We propose dynamic programming algorithms for EnCTC, EsCTC and EnEsCTC.

EnCTC. The entropy term is accumulated with a forward variable $\gamma$ over the padded label $\boldsymbol{l}'$ (blanks inserted around each symbol), whose transition pattern matches the standard CTC recursion:

$$\bar{\gamma}(t, s) = \begin{cases} \gamma(t-1, s) + \gamma(t-1, s-1) & \text{if } l'_s = b \text{ or } l'_{s-2} = l'_s \\ \gamma(t-1, s) + \gamma(t-1, s-1) + \gamma(t-1, s-2) & \text{otherwise} \end{cases}$$

$$Q(\boldsymbol{l}) = \gamma(T, |\boldsymbol{l}'|) + \gamma(T, |\boldsymbol{l}'| - 1)$$

EsCTC. The forward variable $\alpha_\tau$ advances one whole segment (at most $\tau T / |\boldsymbol{l}|$ frames) at a time:

$$p_\tau(\boldsymbol{l}|X_{1:T}) = \alpha_\tau(T, |\boldsymbol{l}|) + \sum_{t'=1}^{\tau T/|\boldsymbol{l}|} \alpha_\tau(T - t', |\boldsymbol{l}|)\, \sigma(T - t' + 1, T, 0)$$

$$\alpha_\tau(t, s) = \begin{cases} \sum_{t'=1}^{\tau T/|\boldsymbol{l}|} \alpha_\tau(t - t', s - 1)\, \sigma(t - t' + 1, t, s) & \text{if } l_{s-1} \neq l_s \\ \sum_{t'=2}^{\tau T/|\boldsymbol{l}|} \alpha_\tau(t - t', s - 1)\, y_b^{t - t' + 1}\, \sigma(t - t' + 2, t, s) & \text{otherwise} \end{cases}$$

EnEsCTC combines both:

$$Q_\tau(\boldsymbol{l}) = \gamma_\tau(T, |\boldsymbol{l}|) + \sum_{t'=1}^{\tau T/|\boldsymbol{l}|} \Big[ \gamma_\tau(T - t', |\boldsymbol{l}|)\, \sigma(T - t' + 1, T, 0) + \alpha_\tau(T - t', |\boldsymbol{l}|)\, \sigma(T - t' + 1, T, 0) \log \sigma(T - t' + 1, T, 0) \Big]$$

$$\gamma_\tau(t, s) = \begin{cases} \sum_{t'=1}^{\tau T/|\boldsymbol{l}|} \big[ \gamma_\tau(t - t', s - 1)\, \sigma(t - t' + 1, t, s) + \alpha_\tau(t - t', s - 1)\, \eta(t - t' + 1, t, s) \big] & \text{if } l_{s-1} \neq l_s \\ \sum_{t'=2}^{\tau T/|\boldsymbol{l}|} \big[ \gamma_\tau(t - t', s - 1)\, y_b^{t - t' + 1}\, \sigma(t - t' + 2, t, s) + \alpha_\tau(t - t', s - 1)\, y_b^{t - t' + 1}\, \eta(t - t' + 2, t, s) + \alpha_\tau(t - t', s - 1)\, y_b^{t - t' + 1} \log y_b^{t - t' + 1}\, \sigma(t - t' + 2, t, s) \big] & \text{otherwise} \end{cases}$$
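The two-or-three-predecessor case split in the recursions above is the standard CTC forward pass. A minimal self-contained sketch of that base recursion on a hand-checkable toy input (not the paper's EnCTC/EsCTC code):

```python
BLANK = "_"

def ctc_forward(label, y, alphabet):
    """Standard CTC forward DP over the padded label l' = _ l1 _ l2 _ ... _.
    alpha(t,s) sums two or three predecessors depending on whether l'_s
    is blank or repeats l'_{s-2}, the same case split as gamma-bar above."""
    sym = {s: k for k, s in enumerate(alphabet)}
    lp = [BLANK]
    for c in label:
        lp += [c, BLANK]
    T, S = len(y), len(lp)
    alpha = [[0.0] * S for _ in range(T)]
    alpha[0][0] = y[0][sym[BLANK]]
    alpha[0][1] = y[0][sym[lp[1]]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1][s]
            if s >= 1:
                a += alpha[t - 1][s - 1]
            if s >= 2 and lp[s] != BLANK and lp[s] != lp[s - 2]:
                a += alpha[t - 1][s - 2]  # skip the blank between symbols
            alpha[t][s] = a * y[t][sym[lp[s]]]
    return alpha[T - 1][S - 1] + alpha[T - 1][S - 2]

# Toy case, label 'a', T = 2 frames. Paths collapsing to 'a':
# _a, a_, aa  ->  0.4*0.7 + 0.6*0.3 + 0.6*0.7 = 0.88
print(round(ctc_forward("a", [[0.4, 0.6], [0.3, 0.7]], ["_", "a"]), 6))  # -> 0.88
```

The EnCTC/EsCTC variants keep this O(T |l'|) skeleton but carry extra accumulators (the entropy terms) or segment-length-bounded jumps alongside it.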


SLIDE 7

Qualitative Analysis

Error Signal in Training

[Figure, panels (a)-(d): error signal during training.]

Alignment Evaluation

[Figure: alignments of the word 'parity' under CTC, EnCTC, EsCTC and EnEsCTC.]


SLIDE 8

Results on Scene Text Recognition Benchmarks

Evaluation of model generalization:

Method     Synth5K
CTC        38.1
CTC + LS   42.9
CTC + CP   44.4
EnCTC      45.5
EsCTC      46.3
EnEsCTC    47.2

Comparisons with the state-of-the-art methods:

Method     IC03   IC13   IIIT5K   SVT
CRNN       89.4   86.7   78.2     80.8
STAR-Net   89.9   89.1   83.3     83.6
R2AM       88.7   90.0   78.4     80.7
RARE       90.1   88.6   81.9     81.9
EnCTC      90.8   90.0   82.6     81.5
EsCTC      92.6   87.4   81.7     81.5
EnEsCTC    92.0   90.6   82.0     80.6


SLIDE 9

For more results and analyses, please come to our poster.

Poster: Room 210 & 230 AB #106
Code: https://github.com/liuhu-bigeye/enctc.crnn
