  1. Directed Probabilistic Graphical Models CMSC 678 UMBC

  2. Announcement 1: Assignment 3 due Wednesday, April 11th, 11:59 AM. Any questions?

  3. Announcement 2: Progress report on project due Monday, April 16th, 11:59 AM. Build on the proposal: update it to address comments; discuss the progress you've made; discuss what remains to be done; discuss any new blockers you've experienced (or anticipate experiencing). Any questions?

  4. Outline: Recap of EM; Math: Lagrange multipliers for constrained optimization; Probabilistic modeling example: die rolling; Directed graphical models: Naïve Bayes, Hidden Markov Models; Message passing (directed graphical model inference): most likely sequence, total (marginal) probability; EM in D-PGMs

  5. Recap from last time…

  6. Expectation Maximization (EM): a two-step, iterative algorithm. 0. Assume some value for your parameters. 1. E-step: count under uncertainty, assuming these parameters: compute the posterior $q^{(t)}(z_i)$ and the estimated (expected) counts $q^{(t)}(z_i) \cdot \mathrm{count}(z_i, x_i)$. 2. M-step: maximize the log-likelihood, assuming these uncertain counts; the updated parameters give the next posterior $q^{(t+1)}(z)$, and the cycle repeats.
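To make the two steps concrete, here is a minimal EM sketch in Python for the two-coin model that appears later on slides 23–24 (the function name, initial parameter values, and iteration count are our choices, not from the slides): the E-step computes posteriors over the hidden coin, and the M-step renormalizes the resulting soft counts.

```python
import numpy as np

def em_two_coins(x, n_iters=50):
    """EM sketch for the two-coin story of slides 23-24 (hypothetical names):
    z_i ~ Bernoulli(lam); x_i ~ Bernoulli(gam) if z_i = H, else Bernoulli(psi).
    x is a 0/1 array of observed flips; z is never observed."""
    x = np.asarray(x, dtype=float)
    lam, gam, psi = 0.6, 0.7, 0.3                  # step 0: assume some parameters
    for _ in range(n_iters):
        # E-step: posterior q(z_i = H | x_i) under the current parameters
        joint_h = lam * gam**x * (1 - gam)**(1 - x)
        joint_t = (1 - lam) * psi**x * (1 - psi)**(1 - x)
        q = joint_h / (joint_h + joint_t)
        # M-step: re-estimate parameters from the expected ("soft") counts
        lam = q.mean()
        gam = (q * x).sum() / q.sum()
        psi = ((1 - q) * x).sum() / (1 - q).sum()
    return lam, gam, psi

print(em_two_coins([1, 1, 1, 0, 1, 0, 0, 1, 1, 1]))
```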

  7. EM Math. E-step: count under uncertainty using the old parameters $\theta^{(t)}$, i.e., compute the posterior distribution $p_{\theta^{(t)}}(\cdot \mid x)$. M-step: maximize the expected complete-data log-likelihood to get the new parameters: $\theta^{(t+1)} = \operatorname{argmax}_\theta \, \mathbb{E}_{y \sim p_{\theta^{(t)}}(\cdot \mid x)}\left[\log p_\theta(y, x)\right]$. Write $\mathcal{C}_\theta$ = log-likelihood of the complete data $(X, Y)$; $\mathcal{P}_\theta$ = posterior log-likelihood of the incomplete data $Y$; $\mathcal{L}_\theta$ = marginal log-likelihood of the observed data $X$. Then $\mathcal{L}_\theta = \mathbb{E}_{Y \sim \theta^{(t)}}[\mathcal{C}_\theta \mid X] - \mathbb{E}_{Y \sim \theta^{(t)}}[\mathcal{P}_\theta \mid X]$, from which it follows that EM does not decrease the marginal log-likelihood.
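The decomposition on this slide can be checked numerically. The sketch below (our example; the model probabilities are arbitrary toy values) verifies that $\log p(x) = \mathbb{E}_{y \sim q}[\log p(x, y)] - \mathbb{E}_{y \sim q}[\log p(y \mid x)]$ holds for any weighting distribution $q$ over the latent variable, since the bracketed difference equals $\log p(x)$ pointwise; the slide's version takes $q$ to be the posterior under the old parameters.

```python
import numpy as np

# Two-state check of the slide-7 identity (toy numbers, our choice):
# log p(x) = E_{y~q}[log p(x, y)] - E_{y~q}[log p(y | x)], for ANY q over y.
p_y = np.array([0.4, 0.6])               # p(y): prior over the latent variable
p_x1_given_y = np.array([0.9, 0.2])      # p(x = 1 | y)
joint = p_y * p_x1_given_y               # p(x = 1, y), for each y
marginal = joint.sum()                   # p(x = 1)
posterior = joint / marginal             # p(y | x = 1)

q = np.array([0.5, 0.5])                 # stand-in for the old posterior
lhs = np.log(marginal)
rhs = q @ np.log(joint) - q @ np.log(posterior)
print(lhs, rhs)                          # both print the same number
```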

  8. Outline: Recap of EM; Math: Lagrange multipliers for constrained optimization; Probabilistic modeling example: die rolling; Directed graphical models: Naïve Bayes, Hidden Markov Models; Message passing (directed graphical model inference): most likely sequence, total (marginal) probability; EM in D-PGMs

  9. Lagrange multipliers. Assume an original optimization problem: maximize $f(\theta)$ subject to a constraint $g(\theta) = 0$.

  10. Lagrange multipliers. Assume an original optimization problem; we convert it to a new, unconstrained optimization problem: maximize the Lagrangian $F(\theta, \lambda) = f(\theta) - \lambda \, g(\theta)$ over $\theta$ and the new multiplier variable $\lambda$.

  11.–13. Lagrange multipliers: an equivalent problem? Setting $\partial F / \partial \lambda = 0$ recovers exactly the original constraint $g(\theta) = 0$, and setting $\partial F / \partial \theta = 0$ ties the objective's gradient to the constraint's, $\nabla f(\theta) = \lambda \nabla g(\theta)$; stationary points of the new, unconstrained problem are feasible stationary points of the original, constrained one.
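Since slides 11–13 relied on figures, here is a tiny, concrete instance of the recipe (our example, solved with sympy): maximize $3 \log \theta_1 + 2 \log \theta_2$ subject to $\theta_1 + \theta_2 = 1$. This is the die-rolling MLE of the next section in miniature, with counts 3 and 2.

```python
import sympy as sp

t1, t2, lam = sp.symbols("t1 t2 lam", positive=True)
# Lagrangian: objective minus lambda times the constraint
F = 3*sp.log(t1) + 2*sp.log(t2) - lam*(t1 + t2 - 1)

# A stationary point: every partial derivative of F vanishes.
sol = sp.solve([sp.diff(F, v) for v in (t1, t2, lam)], (t1, t2, lam), dict=True)
print(sol)  # [{t1: 3/5, t2: 2/5, lam: 5}] -- counts normalized; lam = total count
```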

  14. Outline: Recap of EM; Math: Lagrange multipliers for constrained optimization; Probabilistic modeling example: die rolling; Directed graphical models: Naïve Bayes, Hidden Markov Models; Message passing (directed graphical model inference): most likely sequence, total (marginal) probability; EM in D-PGMs

  15. Probabilistic Estimation of Rolling a Die. $N$ different (independent) rolls, e.g. $x_1 = 1$, $x_2 = 5$, $x_3 = 4$, ...: $p(x_1, x_2, \ldots, x_N) = p(x_1) \, p(x_2) \cdots p(x_N) = \prod_i p(x_i)$. Generative story: for roll $i = 1$ to $N$: $x_i \sim \mathrm{Cat}(\theta)$, where $\theta$ is a probability distribution over the 6 sides of the die: $0 \le \theta_k \le 1 \ \forall k$ and $\sum_{k=1}^{6} \theta_k = 1$.

  16. Probabilistic Estimation of Rolling a Die. $N$ different (independent) rolls: $p(x_1, \ldots, x_N) = \prod_i p(x_i)$. Generative story: for roll $i = 1$ to $N$: $x_i \sim \mathrm{Cat}(\theta)$. Maximize the log-likelihood: $\mathcal{L}(\theta) = \sum_i \log p_\theta(x_i) = \sum_i \log \theta_{x_i}$.

  17. Probabilistic Estimation of Rolling a Die. Maximize the log-likelihood $\mathcal{L}(\theta) = \sum_i \log \theta_{x_i}$. Q: What's an easy way to maximize this, as written exactly (even without calculus)?

  18. Probabilistic Estimation of Rolling a Die. Q: What's an easy way to maximize $\mathcal{L}(\theta) = \sum_i \log \theta_{x_i}$, as written exactly (even without calculus)? A: Just keep increasing $\theta_k$ (we know $\theta$ must be a distribution, but the objective as written doesn't specify it).

  19. Probabilistic Estimation of Rolling a Die. Maximize the log-likelihood, with distribution constraints: $\mathcal{L}(\theta) = \sum_i \log \theta_{x_i}$ s.t. $\sum_{k=1}^{6} \theta_k = 1$. (We could also include the inequality constraints $0 \le \theta_k$, but they complicate the problem and, right now, are not needed.) Solve using Lagrange multipliers.

  20. Probabilistic Estimation of Rolling a Die. The Lagrangian is $F(\theta, \lambda) = \sum_i \log \theta_{x_i} - \lambda \left( \sum_{k=1}^{6} \theta_k - 1 \right)$, with partial derivatives $\frac{\partial F}{\partial \theta_k} = \sum_{i : x_i = k} \frac{1}{\theta_k} - \lambda = \frac{\mathrm{count}(k)}{\theta_k} - \lambda$ and $\frac{\partial F}{\partial \lambda} = -\sum_{k=1}^{6} \theta_k + 1$.

  21. Probabilistic Estimation of Rolling a Die. Setting the derivatives to zero gives $\theta_k = \frac{\sum_{i : x_i = k} 1}{\lambda}$; $\lambda$ is optimal when $\sum_{k=1}^{6} \theta_k = 1$.

  22. Probabilistic Estimation of Rolling a Die. Since $\sum_{k} \sum_{i : x_i = k} 1 = N$, the constraint $\sum_{k=1}^{6} \theta_k = 1$ forces $\lambda = N$, giving the closed-form MLE $\theta_k = \frac{\sum_{i : x_i = k} 1}{N} = \frac{\mathrm{count}(k)}{N}$.
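A quick numeric check of the closed form (the rolls below are a made-up sample): counts normalized by $N$ satisfy the constraint, and moving probability mass away from the MLE can only lower the log-likelihood.

```python
import math
from collections import Counter

rolls = [1, 5, 4, 1, 6, 1, 3, 5]                 # hypothetical observed rolls
N = len(rolls)
counts = Counter(rolls)
theta = {k: counts[k] / N for k in range(1, 7)}  # theta_k = count(k) / N
assert abs(sum(theta.values()) - 1.0) < 1e-12    # the constraint holds

def loglik(th):
    return sum(math.log(th[x]) for x in rolls)

# Shift a little mass between two sides: the log-likelihood drops.
perturbed = dict(theta)
perturbed[1] -= 0.01
perturbed[5] += 0.01
assert loglik(perturbed) < loglik(theta)
print(theta)
```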

  23. Example: Conditionally Rolling a Die. Start from $p(x_1, \ldots, x_N) = \prod_i p(x_i)$ and add complexity, a latent variable $z_i$ per observation, to better explain what we see: $p(z_1, x_1, z_2, x_2, \ldots, z_N, x_N) = p(z_1) \, p(x_1 \mid z_1) \cdots p(z_N) \, p(x_N \mid z_N) = \prod_i p(x_i \mid z_i) \, p(z_i)$, e.g. $z_1 = H, \ x_1 = 1$; $z_2 = T, \ x_2 = 5$; ... First flip a coin with $p(\text{heads}) = \lambda$, $p(\text{tails}) = 1 - \lambda$; the outcome selects one of two further coins, with $p(\text{heads}) = \gamma$, $p(\text{tails}) = 1 - \gamma$ or $p(\text{heads}) = \psi$, $p(\text{tails}) = 1 - \psi$.

  24. Example: Conditionally Rolling a Die. $p(z_1, x_1, \ldots, z_N, x_N) = \prod_i p(x_i \mid z_i) \, p(z_i)$. Generative story: $\lambda$ = distribution over the penny; $\gamma$ = distribution for the dollar coin; $\psi$ = distribution over the dime. For item $i = 1$ to $N$: $z_i \sim \mathrm{Bernoulli}(\lambda)$; if $z_i = H$: $x_i \sim \mathrm{Bernoulli}(\gamma)$; else: $x_i \sim \mathrm{Bernoulli}(\psi)$.
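The generative story doubles as a sampler. A minimal forward-sampling sketch (the function name and parameter values are arbitrary choices for illustration):

```python
import random

def sample_coin_story(N, lam=0.5, gam=0.8, psi=0.2, seed=0):
    """Forward-sample slide 24's story: the penny z_i picks dollar coin or dime."""
    rng = random.Random(seed)
    data = []
    for _ in range(N):
        z = "H" if rng.random() < lam else "T"   # z_i ~ Bernoulli(lam)
        p = gam if z == "H" else psi             # choose the second coin
        x = "H" if rng.random() < p else "T"     # x_i ~ Bernoulli(gam or psi)
        data.append((z, x))
    return data

print(sample_coin_story(5))
```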

  25. Outline: Recap of EM; Math: Lagrange multipliers for constrained optimization; Probabilistic modeling example: die rolling; Directed graphical models: Naïve Bayes, Hidden Markov Models; Message passing (directed graphical model inference): most likely sequence, total (marginal) probability; EM in D-PGMs

  26. Classify with Bayes Rule: $\operatorname{argmax}_Y \, p(Y \mid X) = \operatorname{argmax}_Y \, \log p(X \mid Y) + \log p(Y)$, i.e. likelihood plus prior (dropping the constant $p(X)$ does not change the argmax).

  27.–29. The Bag of Words Representation. [figures illustrating the bag-of-words representation] Adapted from Jurafsky & Martin (draft).

  30. Bag of Words Representation. A document $d$ is reduced to a vector of word counts (e.g. seen: 2, sweet: 1, whimsical: 1, recommend: 1, happy: 1, ...), which the classifier $\gamma$ maps to a class: $\gamma(d) = c$. Adapted from Jurafsky & Martin (draft).
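A bag-of-words featurizer is a few lines (a sketch; real tokenization is more careful than this regex):

```python
import re
from collections import Counter

def bag_of_words(text):
    """Map a document to unordered word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

print(bag_of_words("I loved it. Sweet, whimsical, and happy; I recommend it."))
```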

  31. Naïve Bayes: A Generative Story. Global parameters: $\pi$ = distribution over $K$ labels; for label $k = 1$ to $K$: $\theta_k$ = generate (label-specific) parameters.

  32. Naïve Bayes: A Generative Story. $\pi$ = distribution over $K$ labels; for label $k = 1$ to $K$: $\theta_k$ = generate parameters; for item $i = 1$ to $N$: $y_i \sim \mathrm{Cat}(\pi)$. [diagram: label node $y$]

  33. Naïve Bayes: A Generative Story. $\pi$ = distribution over $K$ labels; for label $k = 1$ to $K$: $\theta_k$ = generate parameters; for item $i = 1$ to $N$: $y_i \sim \mathrm{Cat}(\pi)$, and for each feature $j$: $x_{ij} \sim F_j(\theta_{y_i})$ (local variables). [diagram: label node $y$ with feature nodes $x_{i1}, \ldots, x_{i5}$]
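Putting slides 26 and 31–33 together: estimate $\pi$ and $\theta$ by counting, then classify with $\operatorname{argmax}_y \log p(x \mid y) + \log p(y)$. A minimal categorical sketch (the function names, toy data, and add-one smoothing are our choices, not prescribed by the slides):

```python
import math
from collections import Counter, defaultdict

def fit_nb(docs, labels):
    """Estimate pi (label counts) and theta_y (per-label word counts)."""
    label_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for words, y in zip(docs, labels):
        word_counts[y].update(words)
    vocab = {w for words in docs for w in words}
    return label_counts, word_counts, vocab

def predict_nb(words, model, alpha=1.0):
    """argmax_y log p(words | y) + log p(y), with add-alpha smoothing."""
    label_counts, word_counts, vocab = model
    N = sum(label_counts.values())
    best, best_score = None, -math.inf
    for y, ny in label_counts.items():
        total = sum(word_counts[y].values()) + alpha * len(vocab)
        score = math.log(ny / N)                  # log prior, log p(y)
        for w in words:
            if w in vocab:                        # ignore unseen words
                score += math.log((word_counts[y][w] + alpha) / total)
        if score > best_score:
            best, best_score = y, score
    return best

docs = [["fun", "couple", "love"], ["fast", "furious", "shoot"],
        ["couple", "fly", "fast", "fun"], ["furious", "shoot", "shoot"]]
labels = ["comedy", "action", "comedy", "action"]
model = fit_nb(docs, labels)
print(predict_nb(["fast", "couple", "shoot"], model))  # -> "action"
```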
