DETERMINANTAL POINT PROCESSES FOR NATURAL LANGUAGE PROCESSING


  1. DETERMINANTAL POINT PROCESSES FOR NATURAL LANGUAGE PROCESSING. Jennifer Gillenwater. Joint work with Alex Kulesza and Ben Taskar.

  6. OUTLINE: Motivation & background on DPPs; Large-scale settings; Structured summarization; Other potential NLP applications

  7. MOTIVATION & BACKGROUND

  8. SUMMARIZATION

  12. SUMMARIZATION. Quality: relevance to the topic. Diversity: coverage of core ideas.

  13. SUBSET SELECTION

  23. AREA AS SET-GOODNESS. Items are vectors $B_i$, $B_j$ in feature space. quality $= \sqrt{B_i^\top B_i}$; similarity $= B_i^\top B_j$; area of the parallelogram spanned by $B_i$ and $B_j$ $= \sqrt{\|B_i\|_2^2 \, \|B_j\|_2^2 - (B_i^\top B_j)^2}$

  28. VOLUME AS SET-GOODNESS. length $= \|B_i\|_2$; area $= \sqrt{\|B_i\|_2^2 \, \|B_j\|_2^2 - (B_i^\top B_j)^2}$; volume = base × height: $\mathrm{vol}(B) = \|B_1\|_2 \, \mathrm{vol}\big(\mathrm{proj}_{\perp B_1}(B_{2:N})\big)$
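
The base × height recursion on this slide can be sketched in a few lines of NumPy. This is only an illustration, not code from the talk: the function name `volume_recursive` and the example matrix `B_demo` are made up here, and the columns of `B` are assumed to be the item vectors $B_1, \ldots, B_N$.

```python
import numpy as np

def volume_recursive(B):
    """Volume of the parallelotope spanned by the columns of B (D x N),
    using vol(B) = ||B_1||_2 * vol(proj_{perp B_1}(B_{2:N}))."""
    B = np.asarray(B, dtype=float)
    if B.shape[1] == 0:
        return 1.0  # empty product: nothing left to multiply
    b1 = B[:, 0]
    norm1 = np.linalg.norm(b1)
    if norm1 == 0.0:
        return 0.0  # a zero vector collapses the volume
    u = b1 / norm1
    rest = B[:, 1:]
    # Remove from each remaining column its component along B_1.
    rest_perp = rest - np.outer(u, u @ rest)
    return norm1 * volume_recursive(rest_perp)

# Hypothetical example: three item vectors in a 3-dimensional feature space.
B_demo = np.array([[2.0, 1.0, 0.0],
                   [0.0, 3.0, 1.0],
                   [0.0, 0.0, 4.0]])
vol_demo = volume_recursive(B_demo)
print(vol_demo)
```

For this triangular example the heights are 2, 3 and 4, so the recursion multiplies them to give a volume of 24.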

  31. AREA AS A DET. area $= \sqrt{\|B_i\|_2^2 \, \|B_j\|_2^2 - (B_i^\top B_j)^2} = \det\begin{pmatrix} \|B_i\|_2^2 & B_i^\top B_j \\ B_i^\top B_j & \|B_j\|_2^2 \end{pmatrix}^{1/2} = \det\left( \begin{pmatrix} B_i^\top \\ B_j^\top \end{pmatrix} \begin{pmatrix} B_i & B_j \end{pmatrix} \right)^{1/2}$

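  33. VOLUME AS A DET. $\mathrm{vol}(B_{\{i,j\}}) = \det\left( \begin{pmatrix} B_i^\top \\ B_j^\top \end{pmatrix} \begin{pmatrix} B_i & B_j \end{pmatrix} \right)^{1/2}$; $\mathrm{vol}(B) = \det\left( \begin{pmatrix} B_1^\top \\ \vdots \\ B_N^\top \end{pmatrix} \begin{pmatrix} B_1 & \cdots & B_N \end{pmatrix} \right)^{1/2}$; $\mathrm{vol}(B)^2 = \det(B^\top B) = \det(L)$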
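
As a quick numerical check of the area and volume formulas, the sketch below compares the direct parallelogram area with the square root of the Gram determinant. The helper names (`area_direct`, `volume_gram`) and the two example vectors are assumptions made for illustration only.

```python
import numpy as np

def area_direct(bi, bj):
    """sqrt(||B_i||^2 ||B_j||^2 - (B_i^T B_j)^2), the parallelogram area."""
    q = (bi @ bi) * (bj @ bj) - (bi @ bj) ** 2
    return np.sqrt(max(q, 0.0))  # clamp tiny negative rounding errors

def volume_gram(B):
    """sqrt(det(B^T B)) for a matrix B whose columns are item vectors."""
    L = B.T @ B
    return np.sqrt(max(np.linalg.det(L), 0.0))

# Hypothetical item vectors: base of length 3, height 2.
b_i = np.array([3.0, 0.0, 0.0])
b_j = np.array([1.0, 2.0, 0.0])
B_pair = np.column_stack([b_i, b_j])

print(area_direct(b_i, b_j), volume_gram(B_pair))
```

Both expressions evaluate to 6 here (base 3 times height 2), and making `b_j` more similar to `b_i` shrinks the area, which is the sense in which the determinant rewards diversity.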

  45. COMPLEX STATISTICS. $N$ items $\Rightarrow$ $2^N$ sets

  50. EFFICIENT COMPUTATION. Determinant ($\det$): $O(N^3)$

  54. POINT PROCESSES. $\mathcal{Y} = \{1, \ldots, N\}$; $\mathcal{P}(\text{[subset shown in figure]}) = 0.2$

  62. DETERMINANTAL. $\mathcal{P}(\{2, 3, 5\}) = \dfrac{\det\begin{pmatrix} L_{22} & L_{23} & L_{25} \\ L_{32} & L_{33} & L_{35} \\ L_{52} & L_{53} & L_{55} \end{pmatrix}}{\det(L + I)}$, where the $3 \times 3$ block is the submatrix of the $5 \times 5$ kernel $L = (L_{ij})$ indexed by rows and columns 2, 3, 5
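
The definition on this slide can be tried out on a small kernel. The following is a minimal sketch, assuming a kernel built as $L = B^\top B$ from random features (the features, seed, and function name `dpp_prob` are illustrative, not from the talk). It evaluates $\mathcal{P}(\{2,3,5\})$ and checks by brute force over all $2^5$ subsets that dividing by $\det(L + I)$ normalizes the distribution.

```python
import itertools
import numpy as np

def dpp_prob(L, subset):
    """P(Y = subset) = det(L_subset) / det(L + I); indices are 0-based."""
    idx = list(subset)
    # The determinant of the empty submatrix is taken to be 1.
    num = np.linalg.det(L[np.ix_(idx, idx)]) if idx else 1.0
    return num / np.linalg.det(L + np.eye(L.shape[0]))

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 5))   # hypothetical features: D = 4, N = 5
L = B.T @ B                   # 5 x 5 positive semidefinite kernel

# Items {2, 3, 5} in the slide's 1-based numbering are indices 1, 2, 4.
p_235 = dpp_prob(L, [1, 2, 4])

# Brute-force normalization check over all 2^5 subsets.
total = sum(dpp_prob(L, s)
            for r in range(6)
            for s in itertools.combinations(range(5), r))
print(p_235, total)
```

The brute-force sum is only feasible for tiny $N$; the point of the $\det(L + I)$ normalizer is that it replaces this exponential sum with a single determinant.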

  68. EFFICIENT INFERENCE, each in $O(N^3)$. Normalizing: $\mathcal{P}_L(\boldsymbol{Y} = Y)$. Marginalizing: $\mathcal{P}(Y \subseteq \boldsymbol{Y})$. Conditioning: $\mathcal{P}_L(\boldsymbol{Y} = B \mid A \subseteq \boldsymbol{Y})$ and $\mathcal{P}_L(\boldsymbol{Y} = B \mid A \cap \boldsymbol{Y} = \emptyset)$. Sampling: $Y \sim \mathcal{P}_L$.
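
To make the marginalization item concrete, here is a small sketch. The slide does not show the formula, so the code uses the standard DPP identity $K = L(L + I)^{-1} = I - (L + I)^{-1}$ with $\mathcal{P}(A \subseteq \boldsymbol{Y}) = \det(K_A)$; treat that choice, along with the function names and the random kernel, as assumptions of this example. The brute-force routine is included only to check the closed form on a tiny ground set.

```python
import itertools
import numpy as np

def marginal_kernel(L):
    """K = I - (L + I)^{-1}, one O(N^3) matrix inversion."""
    n = L.shape[0]
    return np.eye(n) - np.linalg.inv(L + np.eye(n))

def inclusion_prob(K, A):
    """P(A is a subset of Y) = det(K_A) for a non-empty index list A."""
    idx = list(A)
    return np.linalg.det(K[np.ix_(idx, idx)])

def inclusion_prob_bruteforce(L, A):
    """Same quantity by summing det(L_Y) over every Y that contains A."""
    n = L.shape[0]
    Z = np.linalg.det(L + np.eye(n))
    total = 0.0
    for r in range(1, n + 1):
        for Y in itertools.combinations(range(n), r):
            if set(A) <= set(Y):
                total += np.linalg.det(L[np.ix_(Y, Y)])
    return total / Z

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 6))   # hypothetical features
L = B.T @ B
K = marginal_kernel(L)
A = [0, 2]
print(inclusion_prob(K, A), inclusion_prob_bruteforce(L, A))
```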

  69. LARGE-SCALE SETTINGS

  75. DUAL KERNEL, KULESZA AND TASKAR (NIPS 2010). With $B = \begin{pmatrix} B_1 & \cdots & B_N \end{pmatrix}$ of size $D \times N$: $L = B^\top B$ is $N \times N$; the dual kernel $C = B B^\top$ is $D \times D$.

  81. DUAL INFERENCE. $L = V \Lambda V^\top$, $C = \hat{V} \Lambda \hat{V}^\top$, $V = B^\top \hat{V} \Lambda^{-1/2}$. Normalizing ($\sum_Y \det(L_Y)$): $O(D^3)$. Marginalizing & conditioning: $O(D^3 + D^2 k^2)$. Sampling ($Y \sim \mathcal{P}_L$): $O(N D^2 k)$.
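
The relationship between the primal and dual eigendecompositions can be checked numerically. This is a sketch under assumed sizes ($D = 3$, $N = 8$) and random features, not code from the talk. It confirms that $V = B^\top \hat{V} \Lambda^{-1/2}$ gives orthonormal eigenvectors of $L$ and that the normalizer can be computed from the $D \times D$ matrix $C$ alone.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N = 3, 8                      # hypothetical sizes, D much smaller than N
B = rng.normal(size=(D, N))
L = B.T @ B                      # N x N primal kernel
C = B @ B.T                      # D x D dual kernel

# Eigendecomposition of the small dual kernel: C = V_hat diag(lam) V_hat^T.
lam, V_hat = np.linalg.eigh(C)

# Map dual eigenvectors to primal ones; dividing by sqrt(lam) scales
# column j by lam_j^{-1/2} (lam > 0 for generic random features).
V = B.T @ V_hat / np.sqrt(lam)

# Normalizer det(L + I) from the primal (N x N) and dual (D x D) side.
log_Z_primal = np.linalg.slogdet(L + np.eye(N))[1]
log_Z_dual = np.linalg.slogdet(C + np.eye(D))[1]
print(log_Z_primal, log_Z_dual)
```

The two log-normalizers agree because $L$ and $C$ share their nonzero eigenvalues, so only the $O(D^3)$ dual computation is needed.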

  84. EXPONENTIAL N. We want to select a diverse set of parses. $N = O(\{\text{sentence length}\}^{\{\text{sentence length}\}})$; $N = O(\{\text{node degree}\}^{\{\text{path length}\}})$

  93. STRUCTURE FACTORIZATION, KULESZA AND TASKAR (NIPS 2010). Each item decomposes into parts, $i = \{i_\alpha\}_{\alpha \in F}$, with $c = 2$ in the example. $B_i = q(i)\,\phi(i)$ (quality times similarity features) factorizes as $B_i = \left[ \prod_{\alpha \in F} q(i_\alpha) \right] \left[ \sum_{\alpha \in F} \phi(i_\alpha) \right]$. Sampling $Y \sim \mathcal{P}_L$ goes from $O(N D^2 k)$ to $O(D^2 k^3 + D k^2 M^c R)$, where $M^c R = 4^2 \cdot 12 = 192 \ll N = 4^{12} = 16{,}777{,}216$.
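
The arithmetic in the comparison on this slide is easy to reproduce. In the sketch below, the values $M = 4$, $c = 2$, $R = 12$ are read off the slide's expression $4^2 \cdot 12$; the variable names are illustrative.

```python
# Values read off the slide's example: M^c R = 4^2 * 12 and N = 4^12.
M, c, R = 4, 2, 12

structured_term = M**c * R   # the M^c R factor in the structured sampling cost
N = 4**12                    # number of items; equals M**R for these values

print(structured_term, N, N // structured_term)
```

The structured term is 192 against roughly 16.8 million items, which is why the cost that depends on $M^c R$ instead of $N$ is tractable.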

  96. LARGE FEATURE SETS?

                            N = # of items
      D = # of features     Large      Exponential
      Small                 dual       dual + structure
      Large                 ?          ?

  100. RANDOM PROJECTIONS, GILLENWATER, KULESZA, AND TASKAR (EMNLP 2012). [Figure: the feature matrix $\Phi$ is $D \times N$, or $D \times M^c R$ under structure factorization; a $d \times D$ projection matrix multiplies $\Phi$ to reduce the feature dimension from $D$ to $d$.]
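
A random projection of the feature matrix can be sketched as follows. The slides do not show how the projection matrix is drawn, so the Gaussian entries scaled by $1/\sqrt{d}$ used here are an assumption, as are the sizes, the seed, and the variable names.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d, n_items = 2000, 500, 5     # hypothetical sizes with d much smaller than D

Phi = rng.normal(size=(D, n_items))        # D x N feature matrix (columns = items)
G = rng.normal(size=(d, D)) / np.sqrt(d)   # assumed Gaussian d x D projection
Phi_small = G @ Phi                        # d x N projected features

# Column norms are approximately preserved, so ratios should be close to 1.
norm_ratio = np.linalg.norm(Phi_small, axis=0) / np.linalg.norm(Phi, axis=0)
print(Phi_small.shape, norm_ratio)
```

Since the dual kernel is built from the features, working with the $d$-dimensional `Phi_small` shrinks the $D \times D$ matrix to $d \times d$ at the price of an approximation.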
