February 19-20, 2004, IEICE-PRMU, George Nagy

Classifiers that improve with use

George Nagy, DocLab, Rensselaer Polytechnic Institute


Argument

In-house training sets are never large enough, and never representative enough. We must therefore augment them with samples from actual (real-time, real-world) OCR operation. We present some methods to this end.



Outline

Non-representative training sets
Supervised learning (continuing classifier education)
"Unsupervised" adaptation: self-corrective, decision-directed, auto-label
Symbolic Indirect Correlation (SIC) (new)
Style-constrained classification
Weakly-constrained data distributions (new)
Linguistic context
Recommendations


Representation

[Figure: feature space with two features x1 and x2; class samples (O and X); equiprobability contours; decision boundary]


How representative is the training set?

[Figure: training versus test distributions for five cases: (1) representative, (2) adaptable (long fields), (3) discrete styles, (4) continuous styles (short fields), (5) weakly constrained]


Traditional open-loop OCR System

[Diagram: training set (patterns and labels) → parameter estimation → classifier parameters; operational data (bitmaps) → CLASSIFIER → transcript; meta-parameters (e.g. regularization, estimators); correction and reject entry handle rejects]


Some classifiers

Nearest neighbor
Gaussian quadratic
Linear Bayes
Multilayer neural network
Support vector machine
Simple perceptron


Supervised learning

[Diagram: Generic OCR system that makes use of post-processed rejects and errors. Training set → parameter estimation → classifier parameters; operational data (bitmaps) → CLASSIFIER → transcript; keyboarded labels of rejects and errors are fed back to parameter estimation; meta-parameters; correction, reject entry]


Adaptation (DHS: “Decision directed approximation”)

[Diagram: training set → parameter estimation → classifier parameters; operational data (bitmaps) → CLASSIFIER → transcript; classifier-assigned labels are fed back to parameter estimation; meta-parameters; correction, reject entry]

Field estimation, singlet classification


Self-corrective recognition (1966)

[Diagram: source document → scanner → feature extractor → categorizer; accepted characters drive a reference generator that updates the initial references with new references; rejected characters are set aside]
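The 1966 loop can be sketched in a few lines of Python. This is a toy reconstruction, not the original system: references are prototype vectors, acceptance is a nearest-versus-runner-up margin test, and accepted samples replace each reference by their mean (the distance, threshold, and update rule are all assumptions):

```python
import math

def self_corrective_pass(samples, references, labels, reject_threshold=0.5):
    # One pass of a self-corrective loop, as a toy sketch (not the 1966 system):
    # classify each sample against the current references; confidently accepted
    # samples are averaged into new references; ambiguous samples are rejected.
    accepted = {c: [] for c in labels}
    rejects = []
    for x in samples:
        ranked = sorted((math.dist(x, r), c) for r, c in zip(references, labels))
        (best, best_class), (runner_up, _) = ranked[0], ranked[1]
        if runner_up - best > reject_threshold:
            accepted[best_class].append(x)
        else:
            rejects.append(x)
    new_references = []
    for r, c in zip(references, labels):
        pts = accepted[c]
        # replace the reference by the mean of its accepted samples, if any
        new_references.append(tuple(sum(v) / len(pts) for v in zip(*pts)) if pts else r)
    return new_references, rejects
```

Iterating such passes lets the references drift toward the test font, which is the mechanism behind the error reductions on the next slide.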


Results: self-corrective recognition (Shelton & Nagy 1966)

Training set: 9 fonts, 500 characters/font, U/C
Test set: 12 fonts, 1500 characters/font, U/C
96 n-tuple features, ternary reference vectors
Initial error and reject rates: 3.5% / 15.2%
After self-correction: 0.7% / 3.7%


Decision-directed adaptation

aka self-corrective recognition, auto-label adaptation, semi-supervised learning, …

[Figure: an omnifont classifier adapted to a single font; example field: 1 1 1 7 7 7 7 1]


Results - Baird & Nagy (DR&R 1994)

100 fonts, 80 symbols each, from Baird's defect model (6,400,000 characters)

Size (pt)   Error reduction   % fonts improved   Best    Worst
6           ×1.4              100                ×4      ×1.0
10          ×2.5              93                 ×11     ×0.8
12          ×4.4              98                 ×34     ×0.9
16          ×7.2              98                 ×141    ×0.8


Results: adapting both means and variances

(Harsha Veeramachaneni 2003) NIST hand-printed digit classes, with 50 "Hitachi features"

                       % Error
Train     Test    Before   Adapt means   Adapt variances
SD3       SD3     1.1      0.7           0.6
          SD7     5.0      2.6           2.2
SD7       SD3     1.7      0.9           0.8
          SD7     2.4      1.6           1.7
SD3+SD7   SD3     0.9      0.6           0.6
          SD7     3.2      1.9           1.8
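The idea behind these results can be sketched for one-dimensional Gaussian classes. This is an illustration of decision-directed re-estimation, not Veeramachaneni's exact procedure; the classes, data, and update rule are invented for the example:

```python
import math

def gauss_logpdf(x, mu, var):
    # log density of a 1-D Gaussian
    return -0.5 * ((x - mu) ** 2 / var + math.log(2 * math.pi * var))

def adapt_means_and_variances(test_data, means, variances):
    # Label the test data with the current classifier, then re-estimate each
    # class's mean and variance from the samples assigned to it.
    assigned = {c: [] for c in means}
    for x in test_data:
        label = max(means, key=lambda c: gauss_logpdf(x, means[c], variances[c]))
        assigned[label].append(x)
    new_means, new_vars = dict(means), dict(variances)
    for c, pts in assigned.items():
        if len(pts) > 1:
            m = sum(pts) / len(pts)
            new_means[c] = m
            new_vars[c] = sum((p - m) ** 2 for p in pts) / (len(pts) - 1)
    return new_means, new_vars
```

When the test distribution is shifted relative to the training set (as with SD3 versus SD7), the self-labeled re-estimates track the shift, which is why adapting the means, and then also the variances, lowers the error.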


InkLink

Adnan El-Nasan (2003)

Constrained localized polygram matching: one unknown word against many reference words, using a lexicon of legal words. The reference set does not include most of the lexicon words!

On-line handwriting recognition


From electronic ink to feature string


Feature matching


(tX8j5XnNeEWXwBXEeNnWwSsXXTwSsXnTRwSsnTBnNewXsXXeNnWwSsX!tWwSsXTwSs)(#$%)(nNewBnNewSsXEeNnWwSNeEj65) (XLXsEeNnWwSsXTwSsnTwBnNeXBTewB)(XWBETnWwSsnTXBnNewS)(EXNnWBsXt)(nNeXEsSwWnNeEBsnTwSXETnWwSsNewBnNeBsXF!tFwSs)(LwS)

Polygram feature match. Unknown query word: "founding"; a reference word: "amendment".


(tX8j5XnNeEWXwBXEeNnWwSsXXTwSsXnTRwSsnTBnNewXsXXeNnWwSsX!tWwSsXTwSs)(#$%)(nNewBnNewSsXEeNnWwSNeEj65) (XLXsEeNnWwSsXTwSsnTwBnNeXBTewB)(XWBETnWwSsnTXBnNewS)(EXNnWBsXt)(nNeXEsSwWnNeEBsnTwSXETnWwSsNewBnNeBsXF!tFwSs)(LwS)

Query hypothesized as “contract”: poor match


(tX8j5XnNeEWXwBXEeNnWwSsXXTwSsXnTRwSsnTBnNewXsXXeNnWwSsX!tWwSsXTwSs)(#$%)(nNewBnNewSsXEeNnWwSNeEj65) (XLXsEeNnWwSsXTwSsnTwBnNeXBTewB)(XWBETnWwSsnTXBnNewS)(EXNnWBsXt)(nNeXEsSwWnNeEBsnTwSXETnWwSsNewBnNeBsXF!tFwSs)(LwS)

Query hypothesized as “founding”: good match


Localized Viterbi trellis search

[Figure: trellis with the reference letters "a m e n d m e n t" and "c o n t r a c t" along one axis]
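The slide shows only the trellis itself. As a stand-in for the localized trellis search, here is a Smith-Waterman-style local alignment in Python; the scoring weights are assumptions for illustration, not the talk's actual scoring:

```python
def local_align(a, b, match=2, mismatch=-1, gap=-1):
    # Dynamic-programming local alignment: find the score of the
    # best-matching local region of the two strings. Cells are floored
    # at zero so an alignment can start anywhere ("localized").
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

The same recurrence applies to feature strings like those above; characters here merely keep the example short.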


InkLink classification algorithm

1. The expected location where the unknown matches each reference word is pre-computed.
2. The feature matches of the unknown against the reference words are found by string matching.
3. The hypothesis that corresponds best to the expected length and location of the matches is chosen.
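Step 3 can be sketched as follows. The squared-error score, the words, and the match lengths are illustrative assumptions; the talk does not specify the exact comparison:

```python
def choose_hypothesis(observed, predicted):
    # Pick the lexicon hypothesis whose predicted match lengths against the
    # reference words agree best with the observed ones (lower score = better).
    def score(expected):
        return sum((observed[ref] - length) ** 2 for ref, length in expected.items())
    return min(predicted, key=lambda hyp: score(predicted[hyp]))

# Toy numbers: match lengths of the unknown word against two reference words.
observed = {"amendment": 4, "contract": 2}
predicted = {
    "founding": {"amendment": 4, "contract": 2},  # expected if the unknown is "founding"
    "contract": {"amendment": 2, "contract": 8},  # expected if the unknown is "contract"
}
```

With these numbers the hypothesis "founding" matches the observations exactly, mirroring the good-match / poor-match contrast on the preceding slides.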


Our most/least favorite writers


Comparison with external system

(four writers we like) 100-word lexicons


Self-corrective recognition (1966)

[Diagram: source document → scanner → feature extractor → categorizer; accepted characters drive a reference generator that updates the initial references with new references; rejected characters are set aside]


Auto-label adaptation

Results of adaptation ("auto-label") in InkLink

Error rate dropped from 28% to 7%. As good with 100 reference words as with 500 reference words without adaptation.

[Plot: error rate (%) versus iteration number, over five iterations]

Outline

Non-representative training sets
Supervised learning (continuing classifier education)
"Unsupervised" adaptation: self-corrective, decision-directed, auto-label
Symbolic Indirect Correlation (SIC)
Style-constrained classification
Weakly-constrained data distributions
Linguistic context
Recommendations


Symbolic Indirect Correlation (SIC)

[Figure: lexical graphs for ~PERIOD, ~EVER, ~PEOPLE~, ~PERPLEX~, and ~LEVER~; the signal graph of the reference string "~PERIOD~EVER~PEOPLE~"; and the match graphs obtained by comparing each lexicon word to the signal]


Signal graph of lever compared to reference signal graph

[Figure: the signal graph of "~LEVER~" compared to the reference signal graph of "~PERIOD~EVER~PEOPLE~"]


Matching Match Graphs

The best-matching subgraphs preserve edge crossings, i.e., we look for order-isomorphic subgraphs.
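The crossing-preservation test can be made concrete. In this sketch (a simplification; finding the best common subgraph is harder than checking one candidate pair) each match graph is a list of edges over integer positions, and two graphs are order-isomorphic when corresponding edge pairs interleave the same way:

```python
def relation(e, f):
    # Classify how two edges (a, b) with a < b interleave:
    # "disjoint", "nested", or "crossing".
    (a, b), (c, d) = sorted([e, f])
    if b <= c:
        return "disjoint"
    if d <= b:
        return "nested"
    return "crossing"

def order_isomorphic(edges1, edges2):
    # Two equally long edge lists are order-isomorphic when every pair of
    # corresponding edges has the same interleaving relation, i.e. edge
    # crossings are preserved.
    if len(edges1) != len(edges2):
        return False
    for i in range(len(edges1)):
        for j in range(i + 1, len(edges1)):
            if relation(edges1[i], edges1[j]) != relation(edges2[i], edges2[j]):
                return False
    return True
```

A search over candidate subgraphs would then keep the largest pair of subgraphs passing this test.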

[Figure: the best-matching, order-isomorphic subgraphs of the match graphs of "~LEVER~" and of the reference string "~PERIOD~EVER~PEOPLE~"]


Signal and lexical graphs (handwriting)


Results: SIC (simulation only!)

Weighted noise model:

Probability of wrong match:   0.8   1.0   0.6   0.4
Reference string size:        750   600   500   390
% correct recognition:         96   100   100   100

1000 most common words of the Brown Corpus. Adaptation: we can add recognized words to the reference string, as in InkLink.


Outline

Non-representative training sets
Supervised learning (continuing classifier education)
"Unsupervised" adaptation: self-corrective, decision-directed, auto-label
Symbolic Indirect Correlation (SIC)
Style-constrained classification
Weakly-constrained data distributions
Linguistic context
Recommendations


Style consistency: Field estimation, field classification

[Diagram: field-labeled training set → parameter estimation → classifier field parameters; batched test-set fields → style-constrained CLASSIFIER → transcript; meta-parameters; correction, reject entry]


Single-class and multi-class style

SINGLE CLASS STYLE MULTI-CLASS STYLE

Source 1: 29/05/1925   25/07/1922
Source 2: 15/05/1990   05/05/1925
Source 3: 21/06/1943   02/06/1943
Source 4: 05/29/1945   02/25/1942

Styles are induced in a collection of documents by multiple sources*.

* fonts, printers, scanners, writers, speakers, microphones, ...


Multi-mode parametric classifier for single-class style constraints (example field: 5 5 5 6 6 6 6)


Field classifier for strong style constraints: "implicit font recognition" (example fields: 524 524 524, 653 653 653 653)
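A minimal sketch of field classification under strong style constraints, assuming one-dimensional Gaussian features and invented style parameters: label the whole field under each candidate style and keep the style with the highest field likelihood, which is exactly "implicit font recognition":

```python
import math

def gauss_logpdf(x, mu, sigma):
    # log density of a 1-D Gaussian
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def field_classify(field, style_means, sigma=0.5):
    # style_means: style -> {class: mean}. For each style, label every
    # character under that style's class-conditional densities; keep the
    # style whose labeling has the highest total log-likelihood.
    best = None
    for style, class_means in style_means.items():
        labels, loglik = [], 0.0
        for x in field:
            c = max(class_means, key=lambda cls: gauss_logpdf(x, class_means[cls], sigma))
            labels.append(c)
            loglik += gauss_logpdf(x, class_means[c], sigma)
        if best is None or loglik > best[0]:
            best = (loglik, style, labels)
    return best[1], best[2]
```

In the test below, the feature value 2.0 is, by itself, exactly ambiguous between style A's "6" and style B's "5"; the rest of the field resolves the ambiguity in favor of style B.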


Field-trained classification versus style-constrained classification

Training set for field classification: 0000, 0001, 0010, …, 9998, 9999 (10^4 classes)
Training set for style classification: 00, 01, 02, …, 98, 99 (10^2 classes)
Field length = 4. Classifier parameters for longer field lengths are computed from the pair parameters (because jointly Gaussian variables are completely defined by their means and covariances).
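The covariance remark can be made concrete. In this sketch (1-D features per character; all names and numbers are illustrative) the covariance matrix of a length-L field is assembled entirely from pair statistics:

```python
def field_covariance(labels, var, cross):
    # Assemble the covariance matrix of a length-L field from pair parameters.
    # var[c] is the within-class variance; cross[(c1, c2)] is the style-induced
    # covariance between characters of classes c1 and c2 in the same field.
    # This is why training on pairs (10^2 classes) suffices for fields of any
    # length in the Gaussian model.
    L = len(labels)
    cov = [[0.0] * L for _ in range(L)]
    for i in range(L):
        for j in range(L):
            if i == j:
                cov[i][j] = var[labels[i]]
            else:
                pair = (labels[i], labels[j])
                # look the pair up in either order, since covariance is symmetric
                cov[i][j] = cross.get(pair, cross.get((pair[1], pair[0])))
    return cov
```

The resulting matrix is symmetric by construction, and the field mean is simply the concatenation of the per-class means.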


Results: style-constrained classification - short fields (Harsha V.)

Continuous style-constrained classifier, trained on ~17,000 characters and tested on ~17,000 characters; 25 top principal components of "Hitachi" blurred directional features.

Field error rate (%):

                 L = 2                   L = 5
Test data   w/o style   with style   w/o style   with style
SD3         1.4         1.3          3.0         2.5
SD7         2.7         2.4          5.3         4.5


Outline

Non-representative training sets
Supervised learning (continuing classifier education)
"Unsupervised" adaptation: self-corrective, decision-directed, auto-label
Symbolic Indirect Correlation (SIC)
Style-constrained classification
Weakly-constrained data distributions
Linguistic context
Recommendations


Weakly-constrained data

[Figure: training and test distributions; 3 classes, 4 multi-class styles]

given p(x), find p(y), where y=g(x)


Are weak constraints enough?

[Figure: training sets and a test field containing the digits 4, 9, 5, 6]


Outline

Non-representative training sets
Supervised learning (continuing classifier education)
"Unsupervised" adaptation: self-corrective, decision-directed, auto-label
Symbolic Indirect Correlation (SIC)
Style-constrained classification
Weakly-constrained data distributions
Linguistic context
Recommendations


Language context: decoding a substitution cipher

Cluster the bitmaps: 1 2 5 …
Cipher text: 1 2 . 2 . 2 . . 2 5 2 . . 5 2 . 5
DECODER with LANGUAGE MODEL: n-gram frequencies, lexicon, … (Nagy & Casey, 1966; Nagy & Seth, 1987)
Decoded: 1 → a, 2 → n, 5 → e ("an unknown sentence")
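The lexicon half of the decoder can be sketched as a backtracking search. This is a simplified illustration only: the cited systems also exploited n-gram frequencies, which are omitted here, and the cluster IDs and lexicon are invented for the example:

```python
def decode_cipher(cipher_words, lexicon):
    # Find an injective cluster-ID-to-letter mapping under which every cipher
    # word (a tuple of cluster IDs) spells a lexicon word, by backtracking.
    def extend(mapping, cipher_word, word):
        trial, used = dict(mapping), set(mapping.values())
        for cid, ch in zip(cipher_word, word):
            if cid in trial:
                if trial[cid] != ch:
                    return None          # contradicts an earlier assignment
            elif ch in used:
                return None              # mapping must stay one-to-one
            else:
                trial[cid] = ch
                used.add(ch)
        return trial

    def search(i, mapping):
        if i == len(cipher_words):
            return mapping
        for word in lexicon:
            if len(word) == len(cipher_words[i]):
                trial = extend(mapping, cipher_words[i], word)
                if trial is not None:
                    found = search(i + 1, trial)
                    if found is not None:
                        return found
        return None

    return search(0, {})
```

On the slide's example, the lexicon constraint alone pins down the mapping: "an unknown sentence" shares cluster IDs across words, so inconsistent hypotheses are pruned.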


Language context [with Tin Ho 2000]


Text printed with Spitz glyphs


Decoded text

chapter I 2 LOOMINGS Call me Ishmael. Some years ago – never mind how long precisely – having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have ...

chapter i _ bee_inds _all me ishmaels some years ago__never mind how long precisely __having little or no money in my purses and nothing particular to interest me on shores i thought i would sail about a little and see the watery part of the worlds it is a way i have ...


Style context versus language context

Two digits in an isogenous field: … 5 6 …, with feature vectors x_i, x_j and class labels C_i, C_j.

Style context: P(x_i, x_j | C_i=5, C_j=6) ≠ P(x_i | C_i=5) P(x_j | C_j=6)
Language context: P(C_i=5, C_j=6) ≠ P(C_i=5) P(C_j=6)


Recommendations for OCR systems that improve with use

Never let the machine rest: design it so that it puts every coffee-break to good use.
Don't throw away edits (corrected labels): use them.
Classify style-consistent fields, not characters: adapt on long fields, exploit multi-class style in short fields.
Use order rather than position.
Let the machine guess: lazy decisions.
Make use of all possible context: language, shape, layout, and function.

Every document can be read by the right reader.


The End

I am grateful to Hitachi CRL for technical and financial support since 1992. Special thanks to

  • Drs. Fujisawa, Liu, Sako, and Shima, and to
  • Messrs. Koga, Marukawa, and Mine.

I also learned much from my former and present students Jung, Sarkar, Veeramachaneni, and El-Nasan. But any misconceptions in this presentation are entirely my own!

Thank you!


Non-supervised classification is always semi-supervised classification: only the supervisory constraints are hidden!

Clustering: cardinality, diameter, population, or sequence of clusters.
ML / EM: form of distribution, sigma, number of mixture components.
Trees, neural networks: validation set.


InkLink classification algorithm

P(l_ij | c_i) = P(c_i | l_ij) P(l_ij) / P(c_i)

c* = arg max_c P(l_ij , l̂_ij | M_ij = 1)

where l_ij is the observed feature match length, l̂_ij the predicted feature match length, and M_ij the match between unknown i and reference j.

Four classifiers, based on different feature sets, are combined by Borda count.
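The Borda count combination rule itself is a few lines of Python. This is a generic sketch of the voting rule, not the four InkLink feature-set classifiers, whose rankings are invented here:

```python
def borda_combine(rankings):
    # Combine several classifiers' rankings of the same hypotheses by Borda
    # count: in each ranking of n hypotheses, the one at position p earns
    # (n - 1 - p) points; the hypothesis with the highest total wins.
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    return max(scores, key=lambda c: scores[c])
```

Because it uses rank order rather than raw scores, the rule needs no calibration across the classifiers being combined.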


A letter from George Washington

Feature-string matching: "complete" vs. "are to be made to the aid de camp."


CONFERENCE REPORT ON ICDAR '07

George Nagy, Professor Emeritus (??), Rensselaer Polytechnic Institute

OUTLINE: VENUE, INVITED SPEAKERS, AWARDS, SESSION TOPICS, NOTABLE CONTRIBUTIONS, LARGE-SCALE PROJECTS, OUTLOOK

(Slides from my closing talk at ICDAR 1997 in Ulm, Germany)


INVITED SPEAKERS (all second generation)

  • Prof. Tin Kam Ho: "Web-wide Voting Network for Weak Classifiers." Exploratory data analysis at Lucent Bell Labs.
  • Dr. Tapas Kanungo: "Multimedia Document Devastation Models." DIA at IBM Almaden Research Center.
  • Dr. Rainer Hoch: "High-speed Context Switches." Just moved from SAP to the Mannheim Berufsakademie.
  • Dr. Abdelwahab Zramdini: "Web Typography." Top-secret research at a bank in Geneva.
  • Prof. Omid Kia: "Processing Compressed Multimedia Documents." Founded a company, then disappeared!


III. Devices

Storage: Docupin, DocuPinCushion (DPC); 750 MB/DP, 18 GB/DPC; not addressable.

SONY: "... Event announcement: 'Memory Stick Xmas Magic.' Introducing various fun and money-saving campaigns!! ... Latest-news magazine: Memory Stick Update. ..."
SANDISK: "SANDISK UNVEILS THE HIGHEST CAPACITY MEMORY STICK PRO STORAGE CARD IN THE WORLD—2 GIGABYTES."


E-Scap

Cholesteric polymer-dispersed hybrid substrate; 10 microns thick (100 sheets per cm); ten-megacell array provides 300 dpi resolution; 10^6 read-write cycles; ecologically benign (silicon-, not carbon-, based); no color so far.


Radiotablet

Lambertian high-contrast reflective display; satellite cyberband channel; pentameric (soft, 4 mm thick elastomere) vary-keyboard; 100 MB flash-cache; Docupin port (USB).

Tablet PCs: "Toshiba's Convertible Tablet PC paves the way ..."


VI. Oral Documents

AUDIO DOCUMENTS: speech-to-print, print-to-speech. The convergence of speech and document image processing began in the 1990's with the adoption of HMMs in OCR. So far, primarily for entertainment (music).

(almost everything is primarily for entertainment!)

Some natural sounds (bird calls, frogs, dolphins); oral history (eye-witness accounts); personal note-taking (key-chain recorders); instructional audios ("prompter").

???


Handwriting

Now used mainly for self-communication: letters have already been replaced by email, and forms are being replaced by e-forms. Children learn to type in pre-school! Therefore personal electronic-ink recognition is becoming more important than scanned handwriting recognition.


Other questions?