Models w/ Latent Random Variables (Chunting Zhou, Junxian He)

CS11-747 Neural Networks for NLP • Models w/ Latent Random Variables • Chunting Zhou, Junxian He • Site: https://phontron.com/class/nn4nlp2019/ • Slides from Graham Neubig

Discriminative vs. Generative Models • Discriminative model: calculate the probability of output given input P(Y|X) • Generative model: calculate the probability of a variable P(X), or multiple variables P(X,Y) • Which of the following models are discriminative vs. generative? • Standard BiLSTM POS tagger • Globally normalized CRF POS tagger • Language model

Types of Variables • Observed vs. Latent: • Observed: something that we can see from our data, e.g. X or Y • Latent: a variable that we assume exists, but we aren’t given the value • Deterministic vs. Random: • Deterministic: variables that are calculated directly according to some deterministic function • Random (stochastic): variables that obey a probability distribution, and may take any of several (or infinite) values

Quiz: What Types of Variables? • In an attentional sequence-to-sequence model using MLE/teacher forcing, are the following variables observed or latent? Deterministic or random? • The input word ids f • The encoder hidden states h • The attention values a • The output word ids e

Goal of Latent Random Variable Modeling • Specify structural relationships in the context of unknown variables, to learn interpretable structure • Inject inductive bias / prior knowledge

What is a Latent Random Variable Model? • Older latent variable models • Topic models (unsupervised) • Hidden Markov Model (unsupervised tagger) • Some tree-structured model (unsupervised parsing)

Why Latent Random Variable • Specify structure, but interpretable structure is often discrete • There is always a tradeoff between interpretability and flexibility

What is a Latent Random Variable Model? • Deep latent variable models • Variational Autoencoders (VAEs) • Generative Adversarial Networks (GANs) • Flow-based generative models

Variational Auto-encoders (Kingma and Welling 2014)

A Latent Variable Model • We observe an output x (assume a continuous vector for now) • We have a latent variable z generated from a Gaussian: z ~ N(0, I) • We have a function f, parameterized by Θ, that maps from z to x, where this function is usually a neural net: x = f(z; Θ) • [Figure: graphical model over z, Θ, and x, with a plate of size N]
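The generative story on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the function f is taken to be a one-hidden-layer tanh network without biases, the layer sizes are arbitrary, and the parameters are random rather than learned.

```python
import math
import random


def f(z, theta):
    """Hypothetical decoder: a one-hidden-layer tanh network from z to x."""
    W1, W2 = theta  # W1: hidden x latent, W2: output x hidden (biases omitted)
    h = [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in W1]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]


def sample_example(theta, latent_dim, rng):
    """Draw z ~ N(0, I), then compute x = f(z; theta) deterministically."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return z, f(z, theta)


rng = random.Random(0)
latent_dim, hidden_dim, out_dim = 2, 4, 3  # arbitrary illustrative sizes

# Random (untrained) parameters, standing in for the learned Theta.
theta = (
    [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(hidden_dim)],
    [[rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for _ in range(out_dim)],
)

samples = [sample_example(theta, latent_dim, rng) for _ in range(5)]
for z, x in samples:
    print([round(v, 3) for v in z], "->", [round(v, 3) for v in x])
```

All randomness here enters through z; given z and the parameters, x is a deterministic function of them.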

An Example (Doersch 2016) • [Figure: the function f maps latent samples z to outputs x]

What is Our Loss Function? • We would like to maximize the corpus log likelihood: $\log P(X) = \sum_{x \in X} \log P(x; \theta)$ • For a single example, the marginal likelihood is: $P(x; \theta) = \int P(x \mid z; \theta) P(z) \, dz$ • We can approximate this by sampling zs then summing: $S(x) := \{ z' ; z' \sim P(z) \}$, $P(x; \theta) \approx \sum_{z \in S(x)} P(x \mid z; \theta)$ (up to the constant factor $1/|S(x)|$)
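The sampling approximation can be checked numerically on a toy model where the marginal is known in closed form. The model below is an assumption made for illustration, not taken from the slides: a scalar z ~ N(0, 1) with x | z ~ N(w·z, σ²), so that exactly x ~ N(0, w² + σ²). The sketch averages over the samples (divides the sum by the number of samples) so that the estimate is on the same scale as the true marginal.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def mc_marginal(x, w, sigma, num_samples, rng):
    """Monte Carlo estimate of P(x) = integral of P(x | z) P(z) dz."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(0.0, 1.0)               # z ~ P(z) = N(0, 1)
        total += normal_pdf(x, w * z, sigma)  # P(x | z) = N(x; w * z, sigma^2)
    return total / num_samples


rng = random.Random(0)
w, sigma, x = 2.0, 1.0, 1.5  # hypothetical toy values

estimate = mc_marginal(x, w, sigma, 100_000, rng)
exact = normal_pdf(x, 0.0, math.sqrt(w ** 2 + sigma ** 2))
print(f"Monte Carlo estimate: {estimate:.4f}, exact marginal: {exact:.4f}")
```

With a neural-net decoder no closed form is available, and samples drawn from the prior rarely land where P(x | z) is large, which is one motivation for the variational approach that follows.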

Variational Inference • $\log p(x) \geq \mathrm{ELBO}$ • The inequality holds for any q(z|x), but the lower bound is tight only if q(z|x) = p(z|x) • p(z|x) is intractable
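Both claims about the bound can be checked on a toy model where everything is available in closed form. The setup is an assumption for illustration only: a scalar z ~ N(0, 1), x | z ~ N(w·z, σ²), and a Gaussian q(z|x) = N(m, s²). The ELBO is taken in its standard form, E_q[log p(x|z)] + E_q[log p(z)] + H(q), and for this model the true posterior is Gaussian with mean w·x / (w² + σ²) and variance σ² / (w² + σ²).

```python
import math


def log_normal_pdf(x, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2) at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def elbo(x, w, sigma, q_mean, q_std):
    """Closed-form ELBO for the toy model with q(z|x) = N(q_mean, q_std^2)."""
    # E_q[log p(x | z)], using E_q[(x - w z)^2] = (x - w m)^2 + w^2 s^2
    expected_log_lik = (
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - ((x - w * q_mean) ** 2 + (w * q_std) ** 2) / (2 * sigma ** 2)
    )
    # E_q[log p(z)] under the N(0, 1) prior
    expected_log_prior = -0.5 * math.log(2 * math.pi) - (q_mean ** 2 + q_std ** 2) / 2
    # Entropy of the Gaussian q
    entropy = 0.5 * math.log(2 * math.pi * math.e * q_std ** 2)
    return expected_log_lik + expected_log_prior + entropy


w, sigma, x = 2.0, 1.0, 1.5  # hypothetical toy values

# Exact log marginal: x ~ N(0, w^2 + sigma^2)
log_px = log_normal_pdf(x, 0.0, math.sqrt(w ** 2 + sigma ** 2))

# Exact posterior p(z | x) for this linear-Gaussian model
post_mean = w * x / (w ** 2 + sigma ** 2)
post_std = sigma / math.sqrt(w ** 2 + sigma ** 2)

loose = elbo(x, w, sigma, 0.0, 1.0)            # q set to the prior
tight = elbo(x, w, sigma, post_mean, post_std)  # q set to the true posterior
print(f"log p(x) = {log_px:.4f}, ELBO(q = prior) = {loose:.4f}, "
      f"ELBO(q = posterior) = {tight:.4f}")
```

In this toy case the posterior is known, so the bound can be made tight directly; in the neural-net setting p(z|x) is intractable, so q can only approximate it.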

Practice • Prove that $\log p(x) \geq \mathrm{ELBO}$ • Hint: use Jensen's inequality
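One way to work the exercise, sketched here with the ELBO taken in its standard form $\mathbb{E}_{q(z|x)}[\log p(x, z) - \log q(z|x)]$ (the slide's own equation did not survive extraction, so this form is an assumption): multiply and divide by q(z|x) inside the integral, then apply Jensen's inequality to the concave log.

```latex
\begin{align*}
\log p(x)
  &= \log \int p(x, z) \, dz \\
  &= \log \int q(z \mid x) \, \frac{p(x, z)}{q(z \mid x)} \, dz \\
  &= \log \mathbb{E}_{q(z \mid x)}\!\left[ \frac{p(x, z)}{q(z \mid x)} \right] \\
  &\geq \mathbb{E}_{q(z \mid x)}\!\left[ \log \frac{p(x, z)}{q(z \mid x)} \right]
   = \mathrm{ELBO}
\end{align*}
% The gap is a KL divergence, which is why the bound is tight only at the true posterior:
\begin{equation*}
\log p(x) - \mathrm{ELBO}
  = \mathrm{KL}\!\left( q(z \mid x) \,\|\, p(z \mid x) \right) \geq 0
\end{equation*}
```

The KL term is zero exactly when q(z|x) = p(z|x), which matches the tightness condition stated on the previous slide.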
