Latent Models: Sequence Models Beyond HMMs and Machine Translation Alignment
CMSC 473/673, UMBC
Outline
- Review: EM for HMMs
- Machine Translation Alignment
- Limited Sequence Models
  - Maximum Entropy Markov Models
  - Conditional Random Fields
- Recurrent Neural Networks
  - Basic Definitions
  - Example in PyTorch
Why Do We Need Both the Forward and Backward Algorithms?

Compute posteriors:
α(i, s) * p(s’ | s) * p(obs at i+1 | s’) * β(i+1, s’) = total probability of paths through the s→s’ arc (at time i)
α(i, s) * β(i, s) = total probability of paths through state s at step i
p(z_j = t | w_1, ⋯, w_N) = α(j, t) * β(j, t) / α(N+1, END)
p(z_j = t, z_{j+1} = t′ | w_1, ⋯, w_N) = α(j, t) * p(t′ | t) * p(obs_{j+1} | t′) * β(j+1, t′) / α(N+1, END)
EM for HMMs
Two step, iterative algorithm
- 0. Assume some value for your parameters
- 1. E-step: count under uncertainty, assuming these parameters
- 2. M-step: maximize log-likelihood, assuming these uncertain (estimated) counts

estimated counts → p_obs(w | s), p_trans(s′ | s)
p*(z_j = t | w_1, ⋯, w_N) = α(j, t) * β(j, t) / α(N+1, END)
p*(z_j = t, z_{j+1} = t′ | w_1, ⋯, w_N) = α(j, t) * p(t′ | t) * p(obs_{j+1} | t′) * β(j+1, t′) / α(N+1, END)
EM For HMMs (Baum-Welch Algorithm)
α = computeForwards()
β = computeBackwards()
L = α[N+1][END]
for (i = N; i ≥ 0; --i) {
    for (next = 0; next < K; ++next) {
        c_obs(obs_{i+1} | next) += α[i+1][next] * β[i+1][next] / L
        for (state = 0; state < K; ++state) {
            u = p_obs(obs_{i+1} | next) * p_trans(next | state)
            c_trans(next | state) += α[i][state] * u * β[i+1][next] / L
        }
    }
}
update p_obs, p_trans using c_obs, c_trans
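For reference, a minimal numpy sketch of the E-step counts above (a reconstruction, not the course's code). It assumes alpha and beta are (N+1) × K tables from the forward and backward passes, p_trans[s, s2] = p_trans(s2 | s), p_obs[s, w] = p_obs(w | s), obs is a length-N list of word ids, and L is the total likelihood; the start and END transitions are omitted for brevity.

import numpy as np

def expected_counts(alpha, beta, p_trans, p_obs, obs, L):
    N, K = len(obs), p_trans.shape[0]
    c_obs = np.zeros_like(p_obs)      # expected emission counts
    c_trans = np.zeros_like(p_trans)  # expected transition counts
    for i in range(N):                # position i+1 emits obs[i]
        for s2 in range(K):
            # gamma: posterior of being in state s2 while emitting obs[i]
            c_obs[s2, obs[i]] += alpha[i + 1, s2] * beta[i + 1, s2] / L
            if i + 1 < N:
                for s in range(K):
                    # xi: posterior of the s -> s2 arc between positions i+1 and i+2
                    u = p_obs[s2, obs[i + 1]] * p_trans[s, s2]
                    c_trans[s, s2] += alpha[i + 1, s] * u * beta[i + 2, s2] / L
    return c_obs, c_trans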
Semi-Supervised Learning
[Figure: a corpus in which most examples are unlabeled ("?") and only a few are labeled.]

labeled data: human annotated; relatively small/few examples
unlabeled data: raw, not annotated; plentiful

EM lets us combine both.
Expected Transition Counts:
        N     V     end
start   1.8   .1    .1
N       1.5   .8    .1
V       1.4   1.1   .4

Expected Emission Counts:
        w1    w2    w3    w4
N       .4    .3    .2    .2
V       .1    .6    .3    .3

Transition Counts:
        N     V     end
start   2     0     0
N       1     2     2
V       2     1     0

Emission Counts:
        w1    w2    w3    w4
N       2     0     1     2
V       0     2     1     0
Semi-Supervised Parameter Estimation for HMMs
Mixed Transition Counts (labeled counts + expected counts):
        N     V     end
start   3.8   .1    .1
N       2.5   2.8   2.1
V       3.4   2.1   .4

Mixed Emission Counts (labeled counts + expected counts):
        w1    w2    w3    w4
N       2.4   .3    1.2   2.2
V       .1    2.6   1.3   .3
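The combination step itself is just addition followed by renormalization; a minimal numpy sketch using the transition tables above (the array layout is an assumption of this sketch):

import numpy as np

# rows: start, N, V; columns: N, V, end
labeled_trans = np.array([[2.0, 0.0, 0.0],
                          [1.0, 2.0, 2.0],
                          [2.0, 1.0, 0.0]])
expected_trans = np.array([[1.8, 0.1, 0.1],
                           [1.5, 0.8, 0.1],
                           [1.4, 1.1, 0.4]])
mixed_trans = labeled_trans + expected_trans  # matches the mixed table above
# M-step: renormalize each row into updated transition probabilities
p_trans = mixed_trans / mixed_trans.sum(axis=1, keepdims=True)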
Outline
- Review: EM for HMMs
- Machine Translation Alignment
- Limited Sequence Models
  - Maximum Entropy Markov Models
  - Conditional Random Fields
- Recurrent Neural Networks
  - Basic Definitions
  - Example in PyTorch
Warren Weaver’s Note
When I look at an article in Russian, I say “This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.” (Warren Weaver, 1947)
http://www.mt-archive.info/Weaver-1949.pdf
Slides in this section courtesy Rebecca Knowles
Noisy Channel Model
[Figure: decoding the observed Russian (noisy) text "язы́к" into clean English. A translation/decode model proposes candidate English words ("language," "speak," "text," "word"); a (clean) English language model reranks them, yielding output written in (clean) English.]
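The two components in the figure correspond to the standard noisy-channel decomposition, stated here for reference since the slide only shows the picture:

\hat{e} = \arg\max_e P(e \mid f) = \arg\max_e P(f \mid e) \, P(e)

where P(f | e) is the translation/decode model and P(e) is the (clean) language model.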
Translation

Translate French (observed) into English:

Le chat est sur la chaise. → The cat is on the chair.
Alignment

The cat is on the chair. ↔ Le chat est sur la chaise.

[Figure: word-by-word alignment links: The-Le, cat-chat, is-est, on-sur, the-la, chair-chaise.]
Parallel Texts
Whereas recognition of the inherent dignity and of the equal and inalienable rights of all members of the human family is the foundation of freedom, justice and peace in the world, Whereas disregard and contempt for human rights have resulted in barbarous acts which have outraged the conscience of mankind, and the advent of a world in which human beings shall enjoy freedom of speech and belief and freedom from fear and want has been proclaimed as the highest aspiration of the common people, Whereas it is essential, if man is not to be compelled to have recourse, as a last resort, to rebellion against tyranny and oppression, that human rights should be protected by the rule of law, Whereas it is essential to promote the development of friendly relations between nations, …
http://www.un.org/en/universal-declaration-human-rights/
Yolki, pampa ni tlatepanitalotl, ni tlasenkauajkayotl iuan ni kuali nemilistli ipan ni tlalpan, yaya ni moneki moixmatis uan monemilis, ijkinoj nochi kuali tiitstosej ika touampoyouaj. Pampa tlaj amo tikixmatij tlatepanitalistli uan tlen kuali nemilistli ipan ni tlalpan, yeka onkatok kualantli, onkatok tlateuilistli, onkatok majmajtli uan sekinok tlamantli teixpanolistli; yeka moneki ma kuali timouikakaj ika nochi touampoyouaj, ma amo onkaj majmajyotl uan teixpanolistli; moneki ma onkaj yejyektlalistli, ma titlajtlajtokaj uan ma tijneltokakaj tlen tojuantij tijnekij tijneltokasej uan amo tlen ma topanti, kenke, pampa tijnekij ma onkaj tlatepanitalistli. Pampa ni tlatepanitalotl moneki ma tiyejyekokaj, ma tijchiuakaj uan ma tijmanauikaj; ma nojkia kiixmatikaj tekiuajtinij, uejueyij tekiuajtinij, ijkinoj amo onkas nopeka se akajya touampoj san tlen ueli kinekis techchiuilis, technauatis, kinekis technauatis ma tijchiuakaj se tlamantli tlen amo kuali; yeka ni tlatepanitalotl tlauel moneki ipan tonemilis ni tlalpan. Pampa nojkia tlauel moneki ma kuali timouikakaj, ma tielikaj keuak tiiknimej, nochi tlen tlakamej uan siuamej tlen tiitstokej ni tlalpan.
…
http://www.ohchr.org/EN/UDHR/Pages/Language.aspx?LangID=nhn
Preprocessing
- Sentence align
- Clean corpus
- Tokenize
- Handle case
- Word segmentation (morphological, BPE, etc.)
- Language-specific preprocessing (example: pre-reordering)
- ...
Alignments
If we had word-aligned text, we could easily estimate P(f|e). But we don’t usually have word alignments, and they are expensive to produce by hand… If we had P(f|e) we could produce alignments automatically.
IBM Model 1 (1993)
- Lexical Translation Model
- Word Alignment Model
- The simplest of the original IBM models
- For all IBM models, see the original paper (Brown et al., 1993): http://www.aclweb.org/anthology/J93-2003
Simplified IBM 1
We'll work through an example with a simplified version of IBM Model 1.
Figures and examples are drawn from A Statistical MT Tutorial Workbook, Section 27 (Knight, 1999).
Simplifying assumption: each source word must translate to exactly one target word, and vice versa.
IBM Model 1 (1993)

f: vector of French words
e: vector of English words
a: vector of alignment indices
t(f_j | e_i): translation probability of the word f_j given the word e_i

[Figure: "Le chat est sur la chaise verte" aligned to "The cat is on the green chair", with alignment indices a = (0, 1, 2, 3, 4, 6, 5): "chaise" aligns to "chair" (index 6) and "verte" to "green" (index 5).]
Model and Parameters
Want: P(f|e). But we don't know how to train this directly…
Solution: use P(a, f|e), where a is an alignment.
Remember: P(f|e) = Σ_a P(a, f|e), so we can marginalize the alignments back out.
Model and Parameters: Intuition
Translation probability: t(f_j | e_i)
Interpretation: how probable is it that we see f_j given e_i?
Model and Parameters: Intuition
Alignment/translation probability: P(a, f | e)
Example (visual representation of a): a crossed alignment of "le chat" to "the cat" is less probable than the parallel one:
P(crossed a, "le chat" | "the cat") < P(parallel a, "le chat" | "the cat")
Interpretation: how probable are the alignment a and the translation f (given e)?
Model and Parameters: Intuition
Alignment probability: P(a | e, f)
Example:
P(crossed a | "le chat", "the cat") < P(parallel a | "le chat", "the cat")
Interpretation: how probable is alignment a (given e and f)?
Model and Parameters
How to compute (in the simplified model, where each French word f_j aligns to exactly one English word e_{a_j}):
P(a, f | e) = ∏_j t(f_j | e_{a_j})
P(a | e, f) = P(a, f | e) / Σ_{a′} P(a′, f | e)
Parameters

For IBM Model 1, we can compute all parameters given the translation parameters t(f|e).
How many of these are there? |French vocabulary| × |English vocabulary|
Data
Two sentence pairs:

English: b c    French: x y
English: b      French: y
All Possible Alignments
(French: x, y) (English: b, c)

[Figure: the two possible alignments of "b c" with "x y", namely (b-x, c-y) and (b-y, c-x), plus the single alignment of "b" with "y".]

Remember: simplifying assumption that each word must be aligned exactly once.
Expectation Maximization (EM)
Two step, iterative algorithm
- 0. Assume some value for t(f|e) and compute other parameter values
- 1. E-step: count alignments and translations under uncertainty, assuming these parameters
- 2. M-step: maximize log-likelihood (update parameters), using uncertain estimated counts

[Figure: estimated counts come from the alignment posteriors P(parallel a | "le chat", "the cat") and P(crossed a | "le chat", "the cat").]
Review of IBM Model 1 & EM
- Iteratively learned an alignment/translation model from sentence-aligned text (without "gold standard" alignments)
- The model can now be used for alignment and/or word-level translation
- We explored a simplified version of this; full IBM Model 1 allows more types of alignments
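For reference, a minimal Python sketch of EM for IBM Model 1 (the full model, without the one-to-one simplification; this is a reconstruction, not the course's code, and it omits the NULL word):

from collections import defaultdict

def ibm1_em(corpus, iterations=10):
    # corpus: list of (french_words, english_words) pairs
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # t[(f, e)] ~ t(f|e), uniform init
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f, e)
        total = defaultdict(float)  # expected counts c(e)
        for fs, es in corpus:
            for f in fs:
                # E-step: spread one count for f over the English words,
                # in proportion to the current t(f|e)
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    count[(f, e)] += t[(f, e)] / z
                    total[e] += t[(f, e)] / z
        for (f, e) in count:
            # M-step: renormalize expected counts into probabilities
            t[(f, e)] = count[(f, e)] / total[e]
    return t

# the two sentence pairs from the Data slide
t = ibm1_em([(["x", "y"], ["b", "c"]), (["y"], ["b"])])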
Why is Model 1 insufficient?
Why won’t this produce great translations?
- Indifferent to order (a language model may help?)
- Translates one word at a time
- Translates each word in isolation
- ...
Uses for Alignments
- Component of machine translation systems
- Produce a translation lexicon automatically
- Cross-lingual projection/extraction of information
- Supervision for training other models (for example, neural MT systems)
Evaluating Machine Translation
Human evaluations:
- Test set (source, human reference translations, MT output)
- Humans judge the quality of MT output (in one of several possible ways)
Koehn (2017), http://mt-class.org/jhu/slides/lecture-evaluation.pdf
Evaluating Machine Translation
Automatic evaluations:
- Test set (source, human reference translations, MT output)
- Aim to mimic (correlate with) human evaluations

Many metrics:
- TER (Translation Error/Edit Rate)
- HTER (Human-Targeted Translation Edit Rate)
- BLEU (Bilingual Evaluation Understudy)
- METEOR (Metric for Evaluation of Translation with Explicit Ordering)
Machine Translation Alignment Now
- Explicitly, with fancier IBM models
- Implicitly, learned jointly with attention in recurrent neural networks (RNNs)
Outline
- Review: EM for HMMs
- Machine Translation Alignment
- Limited Sequence Models
  - Maximum Entropy Markov Models
  - Conditional Random Fields
- Recurrent Neural Networks
  - Basic Definitions
  - Example in PyTorch
Recall: N-gram to Maxent to Neural Language Models

predict the next word w_i given some context (w_{i-3}, w_{i-2}, w_{i-1})… compute beliefs about what is likely…

n-gram LM: p(w_j | w_{j-3}, w_{j-2}, w_{j-1}) ∝ count(w_{j-3}, w_{j-2}, w_{j-1}, w_j)
maxent LM: p(w_j | w_{j-3}, w_{j-2}, w_{j-1}) = softmax(θ · f(w_{j-3}, w_{j-2}, w_{j-1}, w_j))
Hidden Markov Model Representation
p(z_1, w_1, z_2, w_2, …, z_N, w_N)
  = p(z_1 | z_0) p(w_1 | z_1) ⋯ p(z_N | z_{N-1}) p(w_N | z_N)
  = ∏_j p(w_j | z_j) p(z_j | z_{j-1})

emission probabilities/parameters: p(w_j | z_j)
transition probabilities/parameters: p(z_j | z_{j-1})

[Figure: the HMM as a graph, a chain of states z_1 → z_2 → z_3 → z_4 → …, each state z_j emitting its word w_j.]

represent the probabilities and independence assumptions in a graph
A Different Model's Representation

represent the probabilities and independence assumptions in a graph

[Figure: the same chain of states z_1 … z_4 over words w_1 … w_4, but now each observed word w_j feeds into its state z_j: the states are predicted from the observations.]

p(z_1, z_2, …, z_N | w_1, w_2, …, w_N)
  = p(z_1 | z_0, w_1) ⋯ p(z_N | z_{N-1}, w_N)
  = ∏_j p(z_j | z_{j-1}, w_j)

Maximum Entropy Markov Model (MEMM)

p(z_j | z_{j-1}, w_j) ∝ exp(θ^T f(w_j, z_{j-1}, z_j))
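A minimal sketch of one such locally normalized factor in Python (feats and theta are illustrative names, not from the slides):

import numpy as np

def local_distribution(theta, feats, w, z_prev, labels):
    # p(z | z_prev, w) ∝ exp(theta^T f(w, z_prev, z)), normalized over labels
    scores = np.array([theta @ feats(w, z_prev, z) for z in labels])
    scores -= scores.max()            # subtract max for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()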
MEMMs
- Discriminative: don't care about generating the observed sequence at all
- Maxent: use features
- Problem: the label-bias problem
Label-Bias Problem

[Figure: a single state z_i with its observation w_i; arrows show probability mass flowing through the state.]

- incoming mass must sum to 1
- outgoing mass must sum to 1
- we observe, but do not generate (explain), the observation

Take-aways:
- the model can learn to ignore observations
- the model can get itself stuck on "bad" paths
Outline
- Review: EM for HMMs
- Machine Translation Alignment
- Limited Sequence Models
  - Maximum Entropy Markov Models
  - Conditional Random Fields
- Recurrent Neural Networks
  - Basic Definitions
  - Example in PyTorch
(Linear Chain) Conditional Random Fields

- Discriminative: don't care about generating the observed sequence at all
- Condition on the entire observed word sequence w_1 … w_N
- Maxent: use features
- Solves the label-bias problem

[Figure: linear-chain CRF, the state chain z_1 … z_4 with every state connected to the entire observation sequence w_1 w_2 w_3 w_4 ….]

p(z_1, …, z_N | w_1, …, w_N) ∝ ∏_j exp(θ^T f(z_{j-1}, z_j, w_1, …, w_N))

condition on the entire sequence
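A minimal sketch of the unnormalized score of one label sequence under this model (feats and theta are again illustrative; the normalizer Z(w) would require a forward pass over all label sequences):

import numpy as np

def crf_score(theta, feats, label_seq, words):
    # returns exp(sum_j theta^T f(z_{j-1}, z_j, words, j)),
    # which is proportional to p(z_1..z_N | w_1..w_N)
    total = 0.0
    z_prev = "START"                  # illustrative start label
    for j, z in enumerate(label_seq):
        total += theta @ feats(z_prev, z, words, j)
        z_prev = z
    return np.exp(total)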
Conditional vs. Sequence

CRF Tutorial, Fig 1.2, Sutton & McCallum (2012)
Outline
- Review: EM for HMMs
- Machine Translation Alignment
- Limited Sequence Models
  - Maximum Entropy Markov Models
  - Conditional Random Fields
- Recurrent Neural Networks
  - Basic Definitions
  - Example in PyTorch
Recall: N-gram to Maxent to Neural Language Models
predict the next word given some context… compute beliefs about what is likely…

p(w_j | w_{j-3}, w_{j-2}, w_{j-1}) = softmax(θ_{w_j} · f(w_{j-3}, w_{j-2}, w_{j-1}))

create/use "distributed representations" e_{i-3}, e_{i-2}, e_{i-1}… combine these representations with a matrix-vector product…
A More Typical View of Recurrent Neural Language Modeling

[Figure: an unrolled RNN, words w_{i-3} … w_i fed in one at a time, producing hidden states h_{i-3} … h_i; each hidden state predicts the next word (w_{i-2} … w_{i+1}).]

- observe these words one at a time
- predict the next word from these hidden states
- each repeated unit is a "cell"
A Simple Recurrent Neural Network Cell

[Figure: a single cell, where the current word w_i and the previous hidden state h_{i-1} are encoded (matrices U and W) into the new hidden state h_i, which is decoded (matrix S) into a prediction ŵ_{i+1} for the next word.]

encoding: h_i = σ(W h_{i-1} + U w_i), where σ(x) = 1 / (1 + exp(−x))
decoding: ŵ_{i+1} = softmax(S h_i)

must learn matrices U, S, W
suggested solution: gradient descent on prediction ability
problem: they're tied across inputs/timesteps
good news for you: many toolkits do this automatically
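A minimal numpy sketch of this cell (the shapes are assumptions of the sketch: w_i is a one-hot vector of size V, U is H×V, W is H×H, S is V×H):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def rnn_step(W, U, S, h_prev, w_onehot):
    h = sigmoid(W @ h_prev + U @ w_onehot)  # encoding: new hidden state h_i
    w_next = softmax(S @ h)                 # decoding: distribution over w_{i+1}
    return h, w_next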
Why Is Training RNNs Hard?

Conceptually, it can get strange, but really getting the gradient just requires many applications of the chain rule for derivatives.

Vanishing (and exploding) gradients: because the same matrices are multiplied in at each timestep, the gradients involve products of many matrices.

One solution (for the exploding case): clip the gradients to a max value, as in the sketch below.
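A minimal PyTorch sketch of gradient clipping (the model, data, and loss here are stand-ins for illustration):

import torch

model = torch.nn.RNN(input_size=8, hidden_size=16)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(5, 1, 8)        # (seq_len, batch, input_size), random stand-in data
out, h = model(x)
loss = out.pow(2).mean()        # stand-in loss

optimizer.zero_grad()
loss.backward()
# clip the gradient norm to a max value before the update
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()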
Outline
- Review: EM for HMMs
- Machine Translation Alignment
- Limited Sequence Models
  - Maximum Entropy Markov Models
  - Conditional Random Fields
- Recurrent Neural Networks
  - Basic Definitions
  - Example in PyTorch
Natural Language Processing

from torch import *
from keras import *
Pick Your Toolkit
PyTorch, Deeplearning4j, TensorFlow, DyNet, Caffe, Keras, MxNet, Gluon, CNTK, …

Comparisons:
- https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software
- https://deeplearning4j.org/compare-dl4j-tensorflow-pytorch
- https://github.com/zer0n/deepframeworks (older, from 2015)
Defining A Simple RNN in Python (Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

[Figure: screenshots of the tutorial's RNN class, highlighting the encode step (input + previous hidden state → new hidden state) and the decode step (→ output prediction), alongside the unrolled-cell diagram.]
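The screenshots do not survive this export; here is a sketch of the RNN class along the lines of the linked tutorial (a reconstruction, so details may differ slightly from the slides):

import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        # encode: combine the current input with the previous hidden state
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        # decode: map the same combination to scores over output classes
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, input, hidden):
        combined = torch.cat((input, hidden), 1)
        hidden = self.i2h(combined)                 # encode
        output = self.softmax(self.i2o(combined))   # decode
        return output, hidden

    def initHidden(self):
        return torch.zeros(1, self.hidden_size)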
Training A Simple RNN in Python (Modified Very Slightly)

http://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html

[Figure: screenshots of the tutorial's training loop, highlighting: the negative log-likelihood loss, getting predictions, evaluating predictions, computing the gradient, and performing SGD.]
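Similarly, a sketch of the training step matching the highlighted pieces (again a reconstruction along the lines of the tutorial, not the slides' exact code):

import torch.nn as nn

criterion = nn.NLLLoss()   # negative log-likelihood
learning_rate = 0.005

def train(rnn, category_tensor, line_tensor):
    hidden = rnn.initHidden()
    rnn.zero_grad()
    for i in range(line_tensor.size(0)):            # get predictions
        output, hidden = rnn(line_tensor[i], hidden)
    loss = criterion(output, category_tensor)       # eval predictions
    loss.backward()                                 # compute gradient
    for p in rnn.parameters():                      # perform SGD update
        p.data.add_(p.grad.data, alpha=-learning_rate)
    return output, loss.item()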
Another Solution: LSTMs/GRUs
LSTM: Long Short-Term Memory (Hochreiter & Schmidhuber, 1997)
GRU: Gated Recurrent Unit (Cho et al., 2014)

Basic idea: learn to forget.

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

[Figure: LSTM cell diagram from the post above, showing the "forget" line and the cell-state (representation) line.]