Vegan fleas, movie ratings, and the EM algorithm - Carlos Cotrini

  1. Vegan fleas, movie ratings, and the EM algorithm. Carlos Cotrini, Department of Computer Science, ETH Zürich, ccarlos@inf.ethz.ch. March 25, 2019. Carlos Cotrini (ETH Zürich) The EM algorithm March 25, 2019 1 / 36

  2. Overview: (1) The vegan-flea optimization problem. (2) Building a movie recommendation system. (3) The EM algorithm.

  3. The vegan-flea optimization problem

  4. A two-dimensional dog

  5. The dog’s cardiovascular system

  6. The dog’s cardiovascular system

  7. The flea, the dog’s skin, and the vessel’s upper border

  8. Animation

  9. Formalization

  10. Assumptions
      (a) For any $x \in [0,1]$ and any two time points $t_1, t_2 \in [0,\infty)$, $\mathrm{skin}(x,t_1) - \mathrm{vessel}(x,t_1) = \mathrm{skin}(x,t_2) - \mathrm{vessel}(x,t_2)$.
      (b) For any $x \in [0,1]$ and any $t \in [0,\infty)$, there is $t' \ge t$ such that $\mathrm{vessel}(x,t')$ is a maximum of $\mathrm{vessel}(\cdot,t')$.
      (c) For any $t \in [0,\infty)$, the flea can efficiently compute a point $x^*$ that maximizes $\mathrm{skin}(\cdot,t)$.
      (d) For any $x \in [0,1]$ and any $t \in [0,\infty)$, the flea can efficiently compute $\hat{t} \ge t$ such that $\mathrm{vessel}(x,\hat{t})$ is a maximum of $\mathrm{vessel}(\cdot,\hat{t})$.

  11. Objective: Can the flea compute $x^*$ such that $d(x^*) \ge d(x_0)$, where $x_0$ is the flea’s current position?

  12. Optimization algorithm

  13. Optimization algorithm

  14. Optimization algorithm

  15. Why does this work?
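
One way to see why a single step cannot decrease $d$ is sketched below. It relies only on the assumptions of slide 10 and assumes that $d(x) = \mathrm{skin}(x,t) - \mathrm{vessel}(x,t)$, which is well defined because that difference does not depend on $t$. The two-step procedure named in the comments is an assumption reconstructed from those assumptions, not a transcription of the algorithm slides.

```latex
% Assumed step 1: at the current position x_0, pick \hat{t} such that
%                 vessel(., \hat{t}) attains its maximum at x_0.
% Assumed step 2: move to a point x^* that maximizes skin(., \hat{t}).
\begin{align*}
d(x^*) &= \mathrm{skin}(x^*, \hat{t}) - \mathrm{vessel}(x^*, \hat{t}) \\
       &\ge \mathrm{skin}(x_0, \hat{t}) - \mathrm{vessel}(x^*, \hat{t})
          && \text{since $x^*$ maximizes $\mathrm{skin}(\cdot, \hat{t})$} \\
       &\ge \mathrm{skin}(x_0, \hat{t}) - \mathrm{vessel}(x_0, \hat{t})
          && \text{since $x_0$ maximizes $\mathrm{vessel}(\cdot, \hat{t})$} \\
       &= d(x_0).
\end{align*}
```

Repeating the two steps therefore yields a sequence of positions along which $d$ never decreases.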

  16. A movie recommendation system

  17. A simple dataset of movie ratings

  18. A simple dataset of movie ratings

  19. A simple dataset of movie ratings

  20. A simple dataset of movie ratings

  21. A probability model for movie ratings

  22. A probability model for movie ratings

  23. A probability model for movie ratings

  24. A probability model for movie ratings

  25. A probability model for movie ratings

  26. A probability model for movie ratings

  27. A probability model for movie ratings

  28. Notation
      - $X = (x_{i,j})_{i \le N,\, j \le D}$. Here, $x_{i,j} \in \{0,1\}$ indicates whether person $i$ liked movie $j$ or not.
      - $\bar{\mu} = (\mu_{k,j})_{k \le K,\, j \le D}$. Here, $\mu_{k,j} \in [0,1]$ denotes the probability that someone in category $k$ likes movie $j$.
      - $\bar{\nu} = (\nu_k)_{k \le K}$. Here, $\nu_k \in [0,1]$ denotes the probability that a person belongs to category $k$.
      - $\bar{z} = (z(i))_{i \le N}$. Here, $z(i) \in \{1, \ldots, K\}$ indicates person $i$’s category.
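
The notation reads as a generative story: draw a category for each person from $\bar{\nu}$, then draw each like/dislike from the corresponding row of $\bar{\mu}$. The sketch below samples a small ratings matrix that way. The function name `sample_ratings`, the parameter values, and the 0-based category indices are assumptions made for illustration; they do not come from the slides.

```python
import random

def sample_ratings(nu, mu, n_people, seed=0):
    """Sample (X, z) from the category model described by the notation.

    nu[k]    : probability that a person belongs to category k (0-based here)
    mu[k][j] : probability that someone in category k likes movie j
    """
    rng = random.Random(seed)
    categories = range(len(nu))
    X, z = [], []
    for _ in range(n_people):
        # Draw the person's hidden category according to nu.
        k = rng.choices(categories, weights=nu)[0]
        # Draw one independent like/dislike per movie according to mu[k].
        row = [1 if rng.random() < m else 0 for m in mu[k]]
        z.append(k)
        X.append(row)
    return X, z

# Hypothetical parameters: K = 2 categories, D = 3 movies.
nu = [0.6, 0.4]
mu = [[0.9, 0.8, 0.1],
      [0.2, 0.1, 0.9]]
X, z = sample_ratings(nu, mu, n_people=5)
```

In the learning problem only `X` is observed; `z` stays hidden, which is what makes the maximization hard.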

  29. How to mine a probability model from X?
      Maximum-likelihood approach: solve the following problem.
      $$\arg\max_{\bar{\mu}, \bar{\nu}} \; \log p(X \mid \bar{\mu}, \bar{\nu}) \quad \text{s.t.} \quad \sum_{k \le K} \nu_k = 1.$$
      Incomplete-data log likelihood: $\log p(X \mid \bar{\mu}, \bar{\nu})$.
      Complete-data log likelihood: $\log p(X, \bar{z} \mid \bar{\mu}, \bar{\nu})$.

  30. How to mine a probability model from X?
      Maximum-likelihood approach: solve the following problem.
      $$\arg\max_{\bar{\mu}, \bar{\nu}} \; \sum_{i \le N} \log \sum_{z(i)} \nu_{z(i)} \prod_{j \le D} \mu_{z(i),j}^{x_{i,j}} \left(1 - \mu_{z(i),j}\right)^{1 - x_{i,j}} \quad \text{s.t.} \quad \sum_{k \le K} \nu_k = 1.$$
      The objective above is the incomplete-data log likelihood.
      Complete-data log likelihood:
      $$\sum_{i \le N} \Big( \log \nu_{z(i)} + \sum_{j \le D} \big( x_{i,j} \log \mu_{z(i),j} + (1 - x_{i,j}) \log (1 - \mu_{z(i),j}) \big) \Big).$$
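
To make the difference between the two objectives on this slide concrete, the following sketch evaluates both for a tiny dataset. The function names, the toy data, and the 0-based category indices are assumptions for illustration, and every $\mu_{k,j}$ is assumed to lie strictly between 0 and 1 so that the logarithms are finite.

```python
import math

def complete_log_lik(X, z, nu, mu):
    """log p(X, z | mu, nu): the categories z are treated as known."""
    total = 0.0
    for x_i, k in zip(X, z):
        total += math.log(nu[k])
        for x_ij, m in zip(x_i, mu[k]):
            total += x_ij * math.log(m) + (1 - x_ij) * math.log(1 - m)
    return total

def incomplete_log_lik(X, nu, mu):
    """log p(X | mu, nu): each person's category is summed out."""
    total = 0.0
    for x_i in X:
        p_i = 0.0
        for k in range(len(nu)):
            joint = nu[k]
            for x_ij, m in zip(x_i, mu[k]):
                joint *= m if x_ij == 1 else 1 - m
            p_i += joint
        # The sum over categories sits inside the log, so it does not decouple.
        total += math.log(p_i)
    return total

# Hypothetical toy data: 2 people, 3 movies, 2 categories.
X = [[1, 1, 0], [0, 0, 1]]
z = [0, 1]
nu = [0.6, 0.4]
mu = [[0.9, 0.8, 0.1],
      [0.2, 0.1, 0.9]]
print(incomplete_log_lik(X, nu, mu), complete_log_lik(X, z, nu, mu))
```

The complete-data version is a plain sum of logarithms, which is why it is easy to maximize in closed form, whereas the incomplete-data version has a sum inside each logarithm.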

  31. The dilemma: We are caught between a problem we want to solve but don’t know how to solve, and a problem we know how to solve but don’t want to solve. Let’s try to connect them.

  34. Connecting incomplete-data and complete-data log likelihoods
      Let $\theta = (\bar{\mu}, \bar{\nu})$. How can we connect $\log p(X \mid \theta)$ and $\log p(X, \bar{z} \mid \theta)$?
      We can start with the following:
      $$p(\bar{z} \mid X, \theta) = \frac{p(X, \bar{z} \mid \theta)}{p(X \mid \theta)}.$$
      From here, we can derive that:
      $$\log p(X \mid \theta) = \log p(X, \bar{z} \mid \theta) - \log p(\bar{z} \mid X, \theta).$$
      But we don’t know the value of $\bar{z}$.

  35. Take expectations on both sides with respect to $\bar{z}$, using some pdf $\tilde{p}(\bar{z})$ for $\bar{z}$.
      $$\int \tilde{p}(\bar{z}) \log p(X \mid \theta) \, d\bar{z} = \int \tilde{p}(\bar{z}) \log p(X, \bar{z} \mid \theta) \, d\bar{z} - \int \tilde{p}(\bar{z}) \log p(\bar{z} \mid X, \theta) \, d\bar{z}.$$

  36. Since $\log p(X \mid \theta)$ does not depend on $\bar{z}$, we get
      $$\log p(X \mid \theta) = \int \tilde{p}(\bar{z}) \log p(X, \bar{z} \mid \theta) \, d\bar{z} - \int \tilde{p}(\bar{z}) \log p(\bar{z} \mid X, \theta) \, d\bar{z}.$$
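
The identity can be checked numerically. Because the categories are discrete, the integrals become sums over $k$; the sketch below does this for a single person's rating vector and several arbitrary choices of $\tilde{p}$. The function names, parameter values, and 0-based category indices are assumptions for illustration.

```python
import math

def joint(x, k, nu, mu):
    """p(x, z = k | theta) for one person's rating vector x."""
    p = nu[k]
    for x_j, m in zip(x, mu[k]):
        p *= m if x_j == 1 else 1 - m
    return p

def decomposition(x, p_tilde, nu, mu):
    """Return (log p(x|theta), E[log p(x,z|theta)], E[log p(z|x,theta)]),
    with both expectations taken under the arbitrary distribution p_tilde."""
    categories = range(len(nu))
    joints = [joint(x, k, nu, mu) for k in categories]
    evidence = sum(joints)  # p(x | theta)
    expected_joint = sum(p_tilde[k] * math.log(joints[k]) for k in categories)
    expected_posterior = sum(
        p_tilde[k] * math.log(joints[k] / evidence) for k in categories
    )
    return math.log(evidence), expected_joint, expected_posterior

# Hypothetical parameters: K = 2 categories, D = 3 movies.
nu = [0.6, 0.4]
mu = [[0.9, 0.8, 0.1],
      [0.2, 0.1, 0.9]]
x = [1, 0, 1]

for p_tilde in ([0.5, 0.5], [0.9, 0.1], [0.2, 0.8]):
    log_ev, exp_joint, exp_post = decomposition(x, p_tilde, nu, mu)
    # The difference equals log p(x | theta) whatever p_tilde is.
    print(p_tilde, log_ev, exp_joint - exp_post)
```

Both expectations change with `p_tilde`, but their difference does not.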

  41. In other words,
      $$\log p(X \mid \theta) = \mathbb{E}_{\tilde{p}(\bar{z})} \log p(X, \bar{z} \mid \theta) - \mathbb{E}_{\tilde{p}(\bar{z})} \log p(\bar{z} \mid X, \theta).$$
      Does this look familiar?
      $$d(\theta) = \mathrm{skin}(\theta, \tilde{p}) - \mathrm{vessel}(\theta, \tilde{p}).$$
      Like a vegan flea, we want to find the value of $\theta$ that maximizes the distance between $\mathbb{E}_{\tilde{p}(\bar{z})} \log p(X, \bar{z} \mid \theta)$ and $\mathbb{E}_{\tilde{p}(\bar{z})} \log p(\bar{z} \mid X, \theta)$!
      It turns out that all assumptions hold! We can apply our optimization algorithm to approximately maximize $\log p(X \mid \theta)$ with respect to $\theta$.
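
A minimal sketch of the resulting procedure for the movie-rating model follows. Setting $\tilde{p}(\bar{z}) = p(\bar{z} \mid X, \theta)$ plays the role of the flea's choice of $\hat{t}$ (the E-step), and maximizing the expected complete-data log likelihood plays the role of moving to $x^*$ (the M-step). The function names, toy ratings, initial values, and 0-based category indices are assumptions for illustration; the code applies no smoothing and assumes every category keeps some probability mass.

```python
import math

def joint(x, k, nu, mu):
    """p(x, z = k | theta) for one person's rating vector x."""
    p = nu[k]
    for x_j, m in zip(x, mu[k]):
        p *= m if x_j == 1 else 1.0 - m
    return p

def log_likelihood(X, nu, mu):
    """Incomplete-data log likelihood log p(X | theta)."""
    return sum(
        math.log(sum(joint(x, k, nu, mu) for k in range(len(nu)))) for x in X
    )

def em_step(X, nu, mu):
    N, K, D = len(X), len(nu), len(mu[0])
    # E-step: resp[i][k] = p(z(i) = k | x_i, theta), the chosen p-tilde.
    resp = []
    for x in X:
        joints = [joint(x, k, nu, mu) for k in range(K)]
        evidence = sum(joints)
        resp.append([p / evidence for p in joints])
    # M-step: closed-form maximizer of the expected complete-data log likelihood.
    mass = [sum(resp[i][k] for i in range(N)) for k in range(K)]
    new_nu = [mass[k] / N for k in range(K)]
    new_mu = [
        [sum(resp[i][k] * X[i][j] for i in range(N)) / mass[k] for j in range(D)]
        for k in range(K)
    ]
    return new_nu, new_mu

# Hypothetical ratings: 6 people, 4 movies, two visible taste groups.
X = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0],
     [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
nu = [0.5, 0.5]
mu = [[0.6, 0.6, 0.4, 0.4],
      [0.4, 0.4, 0.6, 0.6]]

history = [log_likelihood(X, nu, mu)]
for _ in range(20):
    nu, mu = em_step(X, nu, mu)
    history.append(log_likelihood(X, nu, mu))
print(history[0], history[-1])
```

Each iteration leaves `log_likelihood` unchanged or higher, mirroring the guarantee $d(x^*) \ge d(x_0)$ from the flea problem; like the flea, the procedure may stop at a local rather than a global maximum.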
