

SLIDE 1

Using a Hidden-Markov Model in Semi-Automatic Indexing of Historical Handwritten Records

Thomas Packer, Oliver Nina, Ilya Raykhel Computer Science Brigham Young University

SLIDE 2

The Challenge: Indexing Handwriting

  • Millions of historical documents.
  • Many hours of manual indexing.
  • Years to complete using hundreds of thousands of volunteers.
  • Previous transcriptions not fully leveraged.

SLIDE 3

FamilySearch Indexing Tool

SLIDE 4

A Solution: On-Line Machine Learning

  • Holistic handwritten word recognition using a Hidden Markov Model (HMM), based on Lavrenko et al. (2004).
  • The HMM selects words to maximize a joint probability built from two models (sketched below):
  • Word-feature probability model: predicts a word from its visual features.
  • Word-transition probability model: predicts a word from its neighboring word.
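
In the standard first-order HMM factorization (the form used by Lavrenko et al., 2004), the joint probability of a word sequence $w_1,\dots,w_n$ and its feature vectors $f_1,\dots,f_n$ is

$$P(w_1,\dots,w_n,\,f_1,\dots,f_n) \;=\; P(w_1)\,P(f_1 \mid w_1)\,\prod_{i=2}^{n} P(w_i \mid w_{i-1})\,P(f_i \mid w_i),$$

where $P(f_i \mid w_i)$ is the word-feature model (SLIDE 10) and $P(w_i \mid w_{i-1})$ is the word-transition model (SLIDE 9). The Viterbi algorithm is the standard way to find the maximizing word sequence; a minimal log-space sketch follows (generic HMM decoding, not the authors' code):

    import numpy as np

    def viterbi(log_prior, log_trans, log_obs):
        """Most likely word sequence under a first-order HMM.

        log_prior[j]  : log P(first word = j)
        log_trans[i,j]: log P(word j | previous word i)
        log_obs[t,j]  : log P(features at position t | word j)
        """
        T, V = log_obs.shape
        delta = log_prior + log_obs[0]           # best score ending in each word
        back = np.zeros((T, V), dtype=int)       # backpointers
        for t in range(1, T):
            scores = delta[:, None] + log_trans  # V x V: previous word -> next word
            back[t] = scores.argmax(axis=0)
            delta = scores.max(axis=0) + log_obs[t]
        path = [int(delta.argmax())]             # recover the best path backwards
        for t in range(T - 1, 0, -1):
            path.append(int(back[t, path[-1]]))
        return path[::-1]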

SLIDE 5

The Process

[Pipeline diagram]
Census Images → Word Rectangles → Feature Vectors
Feature Vectors + Transcriptions → Labeled Examples → Training Examples / Test Examples
Training Examples → Learner → Model
Model + Test Examples → Classifier → Results

SLIDE 6

Census Images

  • 3 US Census images
  • Same census taker
  • Preprocessing: Kittler's algorithm to threshold the images (sketched below)
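
Kittler's method (Kittler & Illingworth, 1986) fits two Gaussians, ink and background, to the grayscale histogram and picks the threshold that minimizes a classification-error criterion. A minimal sketch, not the paper's preprocessing code:

    import numpy as np

    def kittler_threshold(gray):
        """Kittler-Illingworth minimum-error threshold for an 8-bit grayscale image."""
        hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
        p = hist / hist.sum()
        g = np.arange(256, dtype=float)
        best_t, best_j = 128, np.inf
        for t in range(1, 255):
            p1, p2 = p[:t].sum(), p[t:].sum()
            if p1 == 0 or p2 == 0:
                continue  # one class empty: skip this threshold
            m1 = (g[:t] * p[:t]).sum() / p1
            m2 = (g[t:] * p[t:]).sum() / p2
            v1 = ((g[:t] - m1) ** 2 * p[:t]).sum() / p1
            v2 = ((g[t:] - m2) ** 2 * p[t:]).sum() / p2
            if v1 <= 0 or v2 <= 0:
                continue  # degenerate variance: criterion undefined
            # Criterion J = 1 + 2(P1 ln s1 + P2 ln s2) - 2(P1 ln P1 + P2 ln P2),
            # written with variances below since 2 ln s = ln v.
            j = 1 + p1 * np.log(v1) + p2 * np.log(v2) \
                  - 2 * (p1 * np.log(p1) + p2 * np.log(p2))
            if j < best_j:
                best_j, best_t = j, t
        return best_t  # pixels <= best_t are taken as ink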

SLIDE 7

Extracted Fields

  • Manually copied bounding rectangles
  • 3 columns (vocabulary size in parentheses):
  • 1. Relationship to Head (14)
  • 2. Sex (2)
  • 3. Marital Status (4)
  • 123 rows total
  • N-fold cross validation with N = 24, about 5 rows per test fold (protocol sketched below)
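
A minimal sketch of the evaluation protocol, with an assumed contiguous fold assignment (the paper's split may differ):

    import numpy as np

    rows = np.arange(123)                        # one example row per census line
    for test_rows in np.array_split(rows, 24):   # 24 folds of 5-6 rows each
        train_rows = np.setdiff1d(rows, test_rows)
        # ... train the learner on train_rows, score it on test_rows ...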
SLIDE 8

Examples to Feature Vectors

25 Numeric Features Extracted (4 scalars + 3 profiles × 7 DFT values; a code sketch follows):

  • Scalar features:
  • height (h)
  • width (w)
  • aspect ratio (w / h)
  • area (w * h)
  • Profile features, each reduced to its 7 lowest scalar values from the DFT:
  • projection profile
  • upper/lower word profile
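
The count works out as 4 scalar features plus 7 DFT values for each of the three profiles (projection, upper, lower). A hedged reconstruction in code; the paper's exact profile definitions and normalizations may differ:

    import numpy as np

    def word_features(word_img, k=7):
        """25-dimensional feature vector for one binarized word image."""
        ink = word_img > 0                     # assume nonzero pixels are ink
        h, w = ink.shape
        scalars = np.array([h, w, w / h, w * h], dtype=float)

        projection = ink.sum(axis=0)           # ink pixels per column
        has_ink = ink.any(axis=0)
        # Row of the first/last ink pixel per column (defaults for empty columns).
        upper = np.where(has_ink, ink.argmax(axis=0), h)
        lower = np.where(has_ink, h - 1 - ink[::-1].argmax(axis=0), 0)

        def low_dft(profile):
            # Magnitudes of the k lowest-frequency DFT coefficients;
            # zero-pad very narrow words so k values always exist.
            return np.abs(np.fft.rfft(profile, n=max(len(profile), 2 * k)))[:k]

        return np.concatenate([scalars, low_dft(projection),
                               low_dft(upper), low_dft(lower)])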

SLIDE 9

HMM and Transition Probability Model

  • Probability model: Hidden Markov Model
  • State transition probabilities, estimated from prior transcriptions (sketched below)
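
One typical estimator for the transition probabilities interpolates bigram counts of neighboring words with unigram frequencies (an assumed smoothing scheme, not necessarily the paper's exact formula):

$$P(w_i \mid w_{i-1}) \;\approx\; \lambda\,\frac{c(w_{i-1}, w_i)}{c(w_{i-1})} \;+\; (1-\lambda)\,\frac{c(w_i)}{N},$$

where $c(\cdot)$ counts occurrences in the transcribed training rows, $N$ is the total number of training words, and $0 \le \lambda \le 1$ is a smoothing weight.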
SLIDE 10

Observation Probability Model

  • Multivariate normal distribution over the 25-dimensional feature vector:
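
This is the standard density, with a mean $\mu_w$ and covariance $\Sigma_w$ estimated per word class $w$ from the labeled examples (the estimation details are assumed, not taken from the slide):

$$P(f \mid w) \;=\; \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma_w\rvert^{1/2}}\, \exp\!\left(-\tfrac{1}{2}\,(f-\mu_w)^{\top}\Sigma_w^{-1}\,(f-\mu_w)\right), \qquad d = 25.$$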
SLIDE 11

Accuracies with and without HMM

SLIDE 12

Accuracies for Separate Columns with and without HMM

SLIDE 13

Accuracies of HMM for Varying Numbers of Training Examples
SLIDE 14

Accuracies of “Relationship to Head” for Varying Numbers of Examples

SLIDE 15

Conclusions and Future Work

  • Conclusion: a 10% correction rate on the chosen columns after one page of training.
  • Future work:
  • Measure indexing time.
  • Update models in real time.
  • Columns with larger vocabularies.
  • More image preprocessing.
  • More visual features.
  • More dependencies among words (in different rows).
  • More training data.
SLIDE 16

Questions?