Character and Text Recognition of Khmer Historical Palm Leaf Manuscripts. Dona Valy, Michel Verleysen, Sophea Chhun, and Jean-Christophe Burie. The 16th International Conference on Frontiers in Handwriting Recognition, August 5-8, 2018.


SLIDE 1

CHARACTER AND TEXT RECOGNITION OF KHMER HISTORICAL PALM LEAF MANUSCRIPTS

Dona Valy, Michel Verleysen, Sophea Chhun, and Jean-Christophe Burie

THE 16TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2018)

August 5-8, 2018

SLIDE 2

Overview

• Khmer Palm Leaf Manuscripts
• Task 1: Isolated Character Classification
• Task 2: Word/Text Recognition
• Conclusion

SLIDE 3

KHMER PALM LEAF MANUSCRIPTS

SLIDE 4

Introduction

• Palm Leaf Manuscripts, or Sleuk Rith in Khmer
  ◦ [Sleuk: leaf] + [Rith: to bind/tie together]

KHMER PALM LEAF MANUSCRIPTS | TASK 1 | TASK 2 | CONCLUSION

SLIDE 5

Challenges

• Degradations and defects

SLIDE 6

Challenges

• Ambiguity of certain characters
  ◦ Khmer alphabet (approximately 70 symbols)
  ◦ Similarity between characters

SLIDE 7

Challenges

• Sequential order of characters composing a word
  ◦ Khmer alphabet (approximately 70 symbols)
  ◦ Characters are combined into words in irregular ways

Example (figure label): SA-SUBDA-AEU-NGO

SLIDE 8

SleukRith Set

• A collection of annotated data created from 657 pages of digitized Khmer palm leaf manuscripts
• Composed of 3 types of annotated data:
  ◦ Character/Glyph
  ◦ Word
  ◦ Line

[Figures: "Annotating a character", "Annotating a word"; label shown: KA]

Available at https://github.com/donavaly/SleukRith-Set

SLIDE 9

SleukRith Set

• Statistics of SleukRith Set

Data                           Quantity
Annotated characters/glyphs    301,626
Annotated words                 73,359
Text lines                       3,245

• Character and word image patches

SLIDE 10

TASK 1: ISOLATED CHARACTER CLASSIFICATION

System

[Figure: system diagram with output $c_1 : p_1,\ c_2 : p_2,\ \dots,\ c_n : p_n$]

SLIDE 11

Isolated Character Dataset

• Data normalization
• Dataset:
  ◦ Train: ~113k
  ◦ Test: ~91k
  ◦ Number of classes: 111

Figure: (a) Original image, (b) Converted to grayscale and resized to 48x48, (c) Normalized
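The three preprocessing steps in the figure can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the slide does not say which interpolation or which normalization was used, so nearest-neighbour resizing and per-image zero-mean, unit-variance scaling are assumptions, and all function names are hypothetical.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an (H, W, 3) RGB array to (H, W) using ITU-R BT.601 luma weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(np.float64) @ weights

def resize_nearest(img, size=48):
    """Resize a 2-D array to (size, size) with nearest-neighbour sampling (assumed)."""
    h, w = img.shape
    rows = np.arange(size) * h // size   # source row for each target row
    cols = np.arange(size) * w // size   # source column for each target column
    return img[rows[:, None], cols[None, :]]

def normalize(img, eps=1e-8):
    """Assumed normalization: zero mean and unit variance per image."""
    return (img - img.mean()) / (img.std() + eps)

def preprocess(rgb, size=48):
    """(a) original image -> (b) grayscale, 48x48 -> (c) normalized."""
    return normalize(resize_nearest(to_grayscale(rgb), size))
```

Calling `preprocess` on any (H, W, 3) array returns a 48x48 float array; a real pipeline would likely use a library resampler with anti-aliasing instead of nearest-neighbour sampling.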

SLIDE 12

Network 1.1: CNN

SLIDE 13

Network 1.2: Column LSTM

SLIDE 14

Network 1.3: Row-Column LSTM

SLIDE 15

Network 1.4: CNN-LSTM

SLIDE 16

Experiments and Results

• Training configurations:
  ◦ Batch size: 300
  ◦ Samples are reshuffled after each epoch
  ◦ Stop condition:
    ◼ the average loss does not improve for $N = 10$ consecutive tests
    ◼ a test is run every 50 iterations
• Results: top-k error rate

Architecture                    Top-5 error (%)   Top-1 error (%)
Network 1.1: CNN                0.65              6.29
Network 1.2: Column LSTM        1.05              8.49
Network 1.3: Row-Column LSTM    0.82              7.00
Network 1.4: CNN-LSTM           0.46              5.01
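For readers unfamiliar with the metric, the top-k error rate reported above is the percentage of test samples whose true class is not among the k highest-scoring classes. A minimal sketch follows; the function name and the array layout (one row of class scores per sample) are assumptions for illustration.

```python
import numpy as np

def top_k_error(scores, labels, k):
    """Percentage of samples whose true label is not among the k highest scores.

    scores: (n_samples, n_classes) array of class scores or probabilities.
    labels: (n_samples,) array of true class indices.
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k largest scores per row; order inside the top k is irrelevant.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hit = (top_k == labels[:, None]).any(axis=1)
    return 100.0 * (1.0 - hit.mean())
```

With k = 1 this reduces to the ordinary classification error, which is why the top-1 column is always at least as large as the top-5 column.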

SLIDE 17

TASK 2: WORD/TEXT RECOGNITION

System

[Figure: system diagram; example output: PO EI EI SA]

SLIDE 18

Annotated Word Dataset

• Character-Class Map
• Dataset:
  ◦ Train: ~16k
  ◦ Test: ~8k
• Number of character-classes: 134 (including 1 token class for background or blank space)
• $I_h$, $I_w$: height and width of the image (after possible padding), with $I_h = 72$
• $c_h$, $c_w$: cell height and width
• $n_{row} = I_h / c_h$, $n_{col} = I_w / c_w$

Figure: (a) Original word image patch, (b) Annotated character information in the word: polygon boundaries of all characters, (c) Character-class map
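The grid arithmetic above can be illustrated with a small sketch that rasterizes character annotations into a character-class map. Several things here are assumptions made only for illustration: the cell size of 8x8 pixels (the slide gives no value), the use of axis-aligned boxes instead of the polygon boundaries in the dataset, the rule that a cell takes the class covering its centre, and class index 0 for background.

```python
import math
import numpy as np

BACKGROUND = 0  # assumed index of the background/blank token class

def class_map_from_boxes(boxes, image_w, image_h=72, cell_h=8, cell_w=8):
    """Build an (n_row, n_col) grid of class ids.

    boxes: list of (class_id, x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    Each cell takes the class of the last box containing the cell centre,
    or BACKGROUND when no box contains it.
    """
    # Pad the width up to a multiple of the cell width so n_col is an integer.
    padded_w = math.ceil(image_w / cell_w) * cell_w
    n_row, n_col = image_h // cell_h, padded_w // cell_w
    grid = np.full((n_row, n_col), BACKGROUND, dtype=np.int64)
    for class_id, x0, y0, x1, y1 in boxes:
        for r in range(n_row):
            for c in range(n_col):
                cy, cx = (r + 0.5) * cell_h, (c + 0.5) * cell_w
                if x0 <= cx < x1 and y0 <= cy < y1:
                    grid[r, c] = class_id
    return grid
```

For a 30-pixel-wide word image this yields a 9x4 grid, since the width is padded to 32 and 72 / 8 = 9; handling real polygon annotations would need a point-in-polygon test in place of the box check.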

SLIDE 19

General Architecture

SLIDE 20

Network 2.1: 1D-LSTM

• LSTM layer of Network 2.1

SLIDE 21

Network 2.2: 2D-LSTM

• LSTM layer of Network 2.2

SLIDE 22

Experiments

• Training configurations:
  ◦ Batch size: 30
  ◦ Samples are sorted and batched according to their width
  ◦ Stop condition:
    ◼ the average loss does not improve for $N = 30$ consecutive tests
    ◼ a test is run every 50 iterations
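The stop condition can be sketched as a small patience counter. This is only one plausible reading of the slide: the class name, the callable used to obtain the average loss, and the choice of which data the loss is averaged over are all assumptions.

```python
class EarlyStopper:
    """Stop when the average loss has not improved for `patience` consecutive tests."""

    def __init__(self, patience=30, test_every=50):
        self.patience = patience        # N on the slide
        self.test_every = test_every    # iterations between two tests
        self.best_loss = float("inf")
        self.bad_tests = 0

    def should_stop(self, iteration, average_loss_fn):
        """Call once per training iteration; returns True when training should stop."""
        if iteration % self.test_every != 0:
            return False
        loss = average_loss_fn()  # hypothetical callable returning the current average loss
        if loss < self.best_loss:
            self.best_loss = loss
            self.bad_tests = 0
        else:
            self.bad_tests += 1
        return self.bad_tests >= self.patience
```

With `patience=30` and `test_every=50`, training continues for at least 1,500 iterations after the last improvement before it is stopped.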

Figure: (a) Initial sample order, (b) Sort by the width of each sample, (c) Pad each sample to the maximum width in the batch, (d) Shuffle batch order
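Steps (a) to (d) can be sketched as below. The function name is hypothetical, and padding on the right with a constant value is an assumption, since the slide does not say where or with what the padding is applied.

```python
import random
import numpy as np

def make_width_batches(samples, batch_size=30, pad_value=0.0, seed=None):
    """Group variable-width images into padded batches.

    samples: list of 2-D arrays shaped (H, W_i), all with the same height H.
    Returns a list of arrays shaped (B, H, W_max), one per batch.
    """
    # (b) Sort by width so that each batch holds samples of similar width.
    ordered = sorted(samples, key=lambda s: s.shape[1])
    batches = []
    for start in range(0, len(ordered), batch_size):
        group = ordered[start:start + batch_size]
        max_w = max(s.shape[1] for s in group)
        # (c) Pad each sample on the right to the maximum width in its batch.
        padded = [
            np.pad(s, ((0, 0), (0, max_w - s.shape[1])), constant_values=pad_value)
            for s in group
        ]
        batches.append(np.stack(padded))
    # (d) Shuffle the order of the batches, not the samples inside them.
    random.Random(seed).shuffle(batches)
    return batches
```

Sorting before batching keeps the amount of padding per batch small, while shuffling whole batches preserves some randomness in the training order. The target character-class maps would need the same grouping and padding.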

SLIDE 23

Results

• Measurement
  ◦ Top-k error rate: average error rate of all cells in the predicted character-class map

Architecture            Top-5 error (%)   Top-1 error (%)
Network 2.1: 1D-LSTM    8.46              32.01
Network 2.2: 2D-LSTM    2.40              20.49

Figure: (a) Original word image, (b) Ground truth character-class map, (c) Result predicted by Network 2.1, (d) Result predicted by Network 2.2
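A cell-level version of the metric might look like the sketch below. It assumes that errors are pooled over every cell of every map (rather than averaged per word first), which the slide does not specify; the function name and array shapes are also assumptions.

```python
import numpy as np

def cell_top_k_error(score_maps, label_maps, k):
    """Top-k error (%) pooled over all cells of all character-class maps.

    score_maps: list of (n_row, n_col_i, n_classes) arrays of class scores.
    label_maps: list of (n_row, n_col_i) integer arrays of ground-truth classes.
    """
    misses, cells = 0, 0
    for scores, labels in zip(score_maps, label_maps):
        flat_scores = scores.reshape(-1, scores.shape[-1])  # one row per cell
        flat_labels = labels.reshape(-1)
        top_k = np.argsort(-flat_scores, axis=1)[:, :k]
        hit = (top_k == flat_labels[:, None]).any(axis=1)
        misses += int((~hit).sum())
        cells += flat_labels.size
    return 100.0 * misses / cells
```

Because maps of different widths contribute different numbers of cells, pooling weights wide words more heavily than a per-word average would.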

SLIDE 24

CONCLUSION

SLIDE 25

Conclusion

• We present different approaches for two tasks on medium-sized datasets constructed from Khmer palm leaf manuscripts:
  ◦ Isolated character classification
  ◦ Word/text recognition
• The predicted character-class map from Task 2 can be used further to generate the final transcription of the word image
  ◦ CTC and/or encoder-decoder mechanism
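The slides name CTC only as a possible next step and do not describe how it would be applied. As background, the standard best-path decoding rule used with CTC-trained models collapses consecutive repeated labels and then removes blanks. The sketch below shows that rule alone; the function name, the blank index 0, and the idea of feeding it one label per frame are assumptions, not the authors' method.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse consecutive repeats, then drop blanks (CTC best-path decoding)."""
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank token.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

A blank between two identical labels keeps them as separate characters, so the frame sequence 3, 3, blank, 3 decodes to two characters of class 3.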

SLIDE 26

Thank you for your attention!
