


  1. multiresolution analysis for the statistical analysis of incomplete rankings
     Eric Sibony, Anna Korba, Stéphan Clémençon
     NIPS Workshop on Multiresolution Methods for Large-Scale Learning, December 12, 2015
     LTCI UMR 5141, Telecom ParisTech/CNRS

  2. introduction: Why rankings?
     Ranking data naturally appear in a wide variety of situations:
     ∙ elections
     ∙ survey answers
     ∙ expert judgments
     ∙ race results
     ∙ competition rankings
     ∙ customer behaviors
     ∙ user preferences
     ∙ …

  3. introduction: Probabilistic modeling on rankings
     Catalog of items: $\llbracket n \rrbracket := \{1, \dots, n\}$
     Full ranking $a_1 \succ \dots \succ a_n$ ⇔ permutation $\sigma \in S_n$ that maps an item to its rank: $\sigma(a_i) = i$
     The variability of full rankings is therefore modeled by a probability distribution $p$ over the set of permutations $S_n$; $p$ is called a ranking model.
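
The correspondence between full rankings and permutations can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names, the tuple encoding (`sigma[i-1]` is the rank of item `i`) and the uniform toy model are all assumptions made for the example.

```python
from itertools import permutations

def ranking_to_permutation(ranking):
    """Map a full ranking a_1 > ... > a_n (best item first) to sigma with sigma(a_i) = i.

    Assumed encoding: the result is a tuple where entry i-1 is the rank of item i.
    """
    sigma = [0] * len(ranking)
    for rank, item in enumerate(ranking, start=1):
        sigma[item - 1] = rank
    return tuple(sigma)

def permutation_to_ranking(sigma):
    """Inverse map: list the items from rank 1 to rank n."""
    ranking = [0] * len(sigma)
    for item, rank in enumerate(sigma, start=1):
        ranking[rank - 1] = item
    return tuple(ranking)

# A toy ranking model: the uniform distribution over S_3 (hypothetical example).
n = 3
S_n = list(permutations(range(1, n + 1)))
p = {sigma: 1 / len(S_n) for sigma in S_n}
```

With this encoding, the ranking 5 ≻ 1 ≻ 4 ≻ 3 ≻ 2 is sent to the tuple `(2, 5, 4, 3, 1)`, and `permutation_to_ranking` undoes the map.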

  4. introduction: Example: probability distribution over $S_5$ (APA dataset) [figure]

  5. introduction: Probabilistic modeling on rankings
     “Parametric” models (psychological interpretation):
     ∙ Thurstone
     ∙ Mallows
     ∙ Plackett-Luce
     ∙ …
     “Nonparametric” approaches (mathematical interpretation):
     ∙ Distance-based
     ∙ Independence modeling
     ∙ Fourier analysis
     ∙ …
     Why multiresolution analysis? To exploit another relevant structure of rankings.

  6. fourier analysis on the symmetric group

  7. abstract fourier analysis
     Fourier analysis consists in decomposing a signal into projections on subspaces that are stable under translations.
     Example: Fourier series. For $e_k(x) = e^{2i\pi kx}$, the space $\mathbb{C} e_k$ is stable under the translations $T_a : f \mapsto f(\cdot - a)$ for all $a \in \mathbb{R}/\mathbb{Z}$.
     The Fourier coefficient $\hat f(k)$ is defined by
     $\hat f(k) = \langle f, e_k \rangle = \int_0^1 f(x)\, e^{2i\pi kx}\, dx$
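
The stability claim can be checked directly. The short derivation below is a sketch that uses only the definitions of $e_k$ and $T_a$ given on this slide:

```latex
\begin{aligned}
(T_a e_k)(x) &= e_k(x - a) = e^{2i\pi k (x - a)} \\
             &= e^{-2i\pi k a}\, e^{2i\pi k x} = e^{-2i\pi k a}\, e_k(x).
\end{aligned}
```

So $T_a e_k$ is a scalar multiple of $e_k$ for every $a \in \mathbb{R}/\mathbb{Z}$, which is exactly the statement that the line $\mathbb{C} e_k$ is stable under all translations.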

  8. abstract fourier analysis: the symmetric group
     Let $L(S_n) := \{ f : S_n \to \mathbb{R} \}$. Translations on $L(S_n)$ are the operators $T_\tau : f \mapsto f(\cdot\, \tau^{-1})$ defined for $\tau \in S_n$.
     Theorem (from group representation theory):
     $L(S_n) \cong \bigoplus_{\lambda \vdash n} d_\lambda S^\lambda$
     ∙ $\lambda \vdash n$: indexes of the irreducible representations of $S_n$
     ∙ $S^\lambda$: space of the irreducible representation indexed by $\lambda$
     ∙ $d_\lambda = \dim S^\lambda$
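
One consequence of this decomposition is a dimension count: since $\dim L(S_n) = n!$, the dimensions must satisfy $\sum_{\lambda \vdash n} d_\lambda^2 = n!$. The sketch below checks this numerically. It is not from the talk: it assumes the standard hook length formula for $d_\lambda$, and the helper names `partitions` and `dim_irrep` are hypothetical.

```python
from math import factorial

def partitions(n, max_part=None):
    """Yield the partitions of n (n >= 1) as non-increasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dim_irrep(lam):
    """d_lambda from the hook length formula: n! divided by the product of hook lengths."""
    n = sum(lam)
    # Column lengths of the Young diagram (the conjugate partition).
    columns = [sum(1 for part in lam if part > j) for j in range(lam[0])]
    hook_product = 1
    for i, row_length in enumerate(lam):
        for j in range(row_length):
            arm = row_length - j - 1   # cells to the right in the same row
            leg = columns[j] - i - 1   # cells below in the same column
            hook_product *= arm + leg + 1
    return factorial(n) // hook_product

n = 4
dims = {lam: dim_irrep(lam) for lam in partitions(n)}
total = sum(d * d for d in dims.values())  # should equal n!
```

For $n = 4$ the five partitions have dimensions 1, 3, 2, 3, 1, and the squares sum to $24 = 4!$.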

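  9. abstract fourier transform
     Let $\rho_\lambda : S_n \to \mathbb{R}^{d_\lambda \times d_\lambda}$ be a representative of the irreducible representation indexed by $\lambda$.
     $\hat f(\lambda) = \sum_{\sigma \in S_n} f(\sigma)\, \rho_\lambda(\sigma) \in \mathbb{R}^{d_\lambda \times d_\lambda}$, informally “$= \langle f, \rho_\lambda \rangle$”, the “projection on $d_\lambda S^\lambda$”.
     The Fourier transform is then defined by
     $\mathcal{F} : f \mapsto \big( \hat f(\lambda) \big)_{\lambda \vdash n}$
     It satisfies the classic properties:
     ∙ Parseval identity
     ∙ Inverse Fourier transform
     ∙ Turns convolution into (matrix) product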
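
The convolution property can be illustrated on $S_3$ with one matrix-valued coefficient. The sketch below is an assumption-laden toy, not the authors' code: it builds the representation indexed by $(n-1,1)$ as the action of permutations on the sum-zero subspace, written in the non-orthogonal integer basis $v_i = e_i - e_{n-1}$ (0-indexed), so the matrices are equivalent to, but not the same as, an orthogonal representative. All function names are hypothetical, and convolution is taken as $(f * g)(\sigma) = \sum_\tau f(\sigma\tau^{-1})\, g(\tau)$.

```python
from itertools import permutations

n = 3
S_n = list(permutations(range(n)))  # s[i] is the image of i (0-indexed)

def compose(s, t):
    """(s o t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    inv = [0] * n
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

def rho_standard(s):
    """Matrix of s on the sum-zero subspace, in the basis v_i = e_i - e_{n-1}.

    s sends v_i to v_{s(i)} - v_{s(n-1)}, with the convention v_{n-1} = 0.
    """
    d = n - 1
    mat = [[0] * d for _ in range(d)]
    for i in range(d):  # column i holds the coordinates of s . v_i
        if s[i] < d:
            mat[s[i]][i] += 1
        if s[d] < d:
            mat[s[d]][i] -= 1
    return mat

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def fourier(f, rho):
    """Fourier coefficient: sum over sigma of f(sigma) * rho(sigma)."""
    d = len(rho(S_n[0]))
    out = [[0] * d for _ in range(d)]
    for s in S_n:
        r = rho(s)
        for i in range(d):
            for j in range(d):
                out[i][j] += f[s] * r[i][j]
    return out

def convolve(f, g):
    """(f * g)(s) = sum over t of f(s o t^{-1}) * g(t)."""
    return {s: sum(f[compose(s, inverse(t))] * g[t] for t in S_n) for s in S_n}

# Two arbitrary integer-valued functions on S_3 (hypothetical data).
f = {s: k + 1 for k, s in enumerate(S_n)}
g = {s: (k * k) % 5 for k, s in enumerate(S_n)}

lhs = fourier(convolve(f, g), rho_standard)
rhs = matmul(fourier(f, rho_standard), fourier(g, rho_standard))
```

Because the entries are integers, `lhs == rhs` holds exactly, with no floating-point tolerance needed. The Parseval identity would additionally require an orthogonal representative, which this basis does not provide.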

  10. specificities
      Fourier coefficients are matrices.
      “Frequencies” $\lambda$ are not numbers (no canonical total order). They are partitions of $n$: tuples $(\lambda_1, \dots, \lambda_r) \in \mathbb{N}^r$ such that $\lambda_1 \ge \dots \ge \lambda_r$ and $\sum_{i=1}^r \lambda_i = n$:
      $(n),\ (n-1,1),\ (n-2,2),\ (n-2,1,1),\ \dots$
      The canonical partial order on partitions, however, orders the Fourier coefficients by “levels of smoothness”.
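
To see why the order on frequencies is only partial, the sketch below enumerates partitions and compares them. It assumes that the partial order meant here is the usual dominance order (compare partial sums), which the slide does not name explicitly. The helper names are hypothetical.

```python
from itertools import accumulate

def partitions(n, max_part=None):
    """Yield the partitions of n as non-increasing tuples, largest first part first."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dominates(mu, lam):
    """True if every partial sum of mu is >= the matching partial sum of lam.

    Assumption: this dominance order is the partial order referred to on the slide.
    """
    length = max(len(mu), len(lam))
    sums_mu = accumulate(mu + (0,) * (length - len(mu)))
    sums_lam = accumulate(lam + (0,) * (length - len(lam)))
    return all(a >= b for a, b in zip(sums_mu, sums_lam))

def comparable(mu, lam):
    return dominates(mu, lam) or dominates(lam, mu)

# For n = 6, list the pairs of partitions that the order cannot compare.
parts = list(partitions(6))
incomparable = [(mu, lam) for i, mu in enumerate(parts)
                for lam in parts[i + 1:] if not comparable(mu, lam)]
```

For $n \le 5$ every pair of partitions is comparable. The first incomparable pairs appear at $n = 6$, for instance $(3,1,1,1)$ and $(2,2,2)$.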

  11. utilizations
      Classic methods of Fourier analysis apply to ranking data:
      ∙ Band-limited approximation (e.g. [Huang et al., 2009])
      ∙ Phase-magnitude decomposition (e.g. [Kakarala, 2011])
      ∙ Analysis of random walks (e.g. [Diaconis, 1988])
      ∙ Construction of kernels (e.g. [Kondor and Barbosa, 2010])
      ∙ Hypothesis testing (e.g. [Diaconis, 1989])

  12. looking for a new representation

  13. natural extension
      As in classic Fourier analysis, Fourier coefficients contain global information on $S_n$:
      $\hat f(\lambda) = \sum_{\sigma \in S_n} f(\sigma)\, \rho_\lambda(\sigma)$
      ⇒ The Fourier transform can only characterize the global smoothness of a function.
      Probability distributions over $S_n$ may show local irregularities.
      ⇒ One needs some form of multiresolution analysis to characterize the local smoothness of a function.

  14. construction of a multiresolution analysis
      Natural attempt:
      ∙ Fourier analysis is constructed from translations
      ∙ Multiresolution analysis should be constructed from translations and dilations
      Problem: there is no equivalent of dilations in a discrete setting.

  15. “space-scale” decomposition
      Relevant approach: directly construct a multiresolution analysis that can characterize local singularities.
      The multiresolution analysis introduced in [Kondor and Dempsey, 2012] characterizes singularities $f$ localized both
      ∙ in “space”: $f$ has a small support in $S_n$
      ∙ in “scale/frequency”: $\mathcal{F} f$ has a small support in $\{\lambda \vdash n\}$

  16. item localization
      Some modern applications require a different type of localization. In these applications, observed rankings are incomplete: they only involve small subsets of items among the catalog $\llbracket n \rrbracket$,
      $a_1 \succ \dots \succ a_k$ with $k \ll n$ (e.g. user preferences).
      Such applications require “item localization”.

  17. our purpose: “item-scale” decomposition
      Does Fourier analysis offer some “item localization”? No.
      ⇒ We introduce a multiresolution analysis that characterizes singularities $f$ localized both
      ∙ in “items”: $f$ only “impacts” the rankings of a subset of items
      ∙ in “scale/frequency”: $\mathcal{F} f$ has a small support in $\{\lambda \vdash n\}$

  18. our purpose: “item-scale” decomposition
      What do we mean by “item localization”?

  19. rank information localization

  20. rank information
      Permutation $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 4 & 3 & 1 \end{pmatrix}$ ↔ Ranking $5 \succ 1 \succ 4 \succ 3 \succ 2$
      Absolute rank information:
      ∙ What is the rank $\sigma(3)$ of item 3? 4
      ∙ What item $\sigma^{-1}(2)$ is ranked at 2nd position? 1
      ∙ What are the ranks $\sigma(\{2,4,5\})$ of items $\{2,4,5\}$? $\{5,3,1\}$
      Relative rank information:
      ∙ How are items 1 and 3 relatively ordered? $1 \succ 3$
      ∙ How are the items of the subset $\{2,4,5\}$ relatively ordered? $5 \succ 4 \succ 2$
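
The five questions on this slide can be answered mechanically. The sketch below is only an illustration under an assumed encoding (a tuple `sigma` where `sigma[i-1]` is the rank of item `i`); the function names are hypothetical.

```python
# The permutation from the slide: item 1 -> rank 2, item 2 -> rank 5, and so on.
sigma = (2, 5, 4, 3, 1)

def rank_of(sigma, item):
    """Absolute information: the rank sigma(item)."""
    return sigma[item - 1]

def item_at(sigma, rank):
    """Absolute information: the item sigma^{-1}(rank) found at a given position."""
    return sigma.index(rank) + 1

def ranks_of(sigma, items):
    """Absolute information: the set of ranks occupied by a subset of items."""
    return {sigma[i - 1] for i in items}

def induced_ranking(sigma, items):
    """Relative information: the items of the subset, ordered best first."""
    return tuple(sorted(items, key=lambda i: sigma[i - 1]))
```

Here `rank_of(sigma, 3)` gives 4 and `induced_ranking(sigma, {2, 4, 5})` gives `(5, 4, 2)`, matching the answers on the slide. The induced ranking forgets which ranks the items held, which is what separates relative from absolute information.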

  21. rank information
      Permutation $\sigma$ ↔ Ranking $\sigma^{-1}(1) \succ \dots \succ \sigma^{-1}(n)$
      Absolute rank information:
      ∙ What is the rank $\sigma(i)$ of item $i$?
      ∙ What item $\sigma^{-1}(j)$ is ranked at $j$th position?
      ∙ What are the ranks $\sigma(\{i,j,k\})$ of items $\{i,j,k\}$?
      Relative rank information:
      ∙ How are items $a$ and $b$ relatively ordered?
      ∙ How are the items of the subset $A$ relatively ordered?

  22. rank information
      Random permutation $\Sigma$ ↔ Random ranking $\Sigma^{-1}(1) \succ \dots \succ \Sigma^{-1}(n)$
      Absolute rank information:
      ∙ What is the law of the rank $\Sigma(i)$ of item $i$?
      ∙ What is the law of the item $\Sigma^{-1}(j)$ ranked at $j$th position?
      ∙ What is the law of the ranks $\Sigma(\{i,j,k\})$ of items $\{i,j,k\}$?
      Relative rank information:
      ∙ What is the probability $\mathbb{P}[\Sigma(a) < \Sigma(b)]$ that $a$ is ranked higher than $b$?
      ∙ What is the law of the ranking $\Sigma_{|A}$ induced by $\Sigma$ on the subset $A$?

  23. marginals of a ranking model
      For a random permutation $\Sigma$ drawn from a ranking model $p$, all these laws are marginals of $p$.
      Example:
      $\mathbb{P}[\Sigma(i) = j] = \sum_{\sigma \in S_n,\ \sigma(i) = j} p(\sigma)$
      $\mathbb{P}[\Sigma(a) < \Sigma(b)] = \sum_{\sigma \in S_n,\ \sigma(a) < \sigma(b)} p(\sigma)$
      Associated marginal operators:
      $M^{(n-1,1)}_i : p \mapsto$ law of $\Sigma(i)$
      $M_{\{a,b\}} : p \mapsto$ law of $\mathbb{I}\{\Sigma(a) < \Sigma(b)\}$
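
Both example sums can be computed by brute force on a small model. The sketch below uses a made-up, non-uniform ranking model on $S_3$ (the weights are hypothetical) stored as a dictionary from permutations to exact fractions, with `sigma[i-1]` the rank of item `i`. The function names are illustrative, not the authors'.

```python
from fractions import Fraction
from itertools import permutations

n = 3
S_n = list(permutations(range(1, n + 1)))
weights = [1, 2, 3, 4, 5, 6]  # hypothetical, unnormalized weights
p = {s: Fraction(w, sum(weights)) for s, w in zip(S_n, weights)}

def absolute_marginal(p, i):
    """Law of Sigma(i): maps each rank j to P[Sigma(i) = j]."""
    law = {}
    for sigma, prob in p.items():
        j = sigma[i - 1]
        law[j] = law.get(j, 0) + prob
    return law

def pairwise_marginal(p, a, b):
    """P[Sigma(a) < Sigma(b)]: probability that item a is ranked higher than item b."""
    return sum(prob for sigma, prob in p.items() if sigma[a - 1] < sigma[b - 1])
```

Each marginal is a plain sum of $p(\sigma)$ over the permutations satisfying the constraint, so `absolute_marginal(p, i)` always sums to 1 and `pairwise_marginal(p, a, b) + pairwise_marginal(p, b, a)` equals 1.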

  24. marginals of a ranking model
      Absolute marginals: for $\lambda \vdash n$,
      $M^\lambda_{A_1,\dots,A_r} : p \mapsto$ law of $(\Sigma(A_1), \dots, \Sigma(A_r))$
      where $(A_1, \dots, A_r)$ is a partition of $\llbracket n \rrbracket$ such that $|A_i| = \lambda_i$.
      Relative marginals: for $A \subset \llbracket n \rrbracket$ with $|A| \ge 2$,
      $M_A : p \mapsto$ law of $\Sigma_{|A}$
      where $\Sigma_{|A}$ is the ranking induced by $\Sigma$ on the items of $A$.

  25. marginals localize nested levels of rank information
      Example for absolute marginals: the knowledge of all $(n-2,1,1)$ marginals induces the knowledge of all $(n-1,1)$ marginals.
      $M^{(n-1,1)}_i p(j) = \mathbb{P}[\Sigma(i) = j] = \sum_{j' \neq j} \mathbb{P}[\Sigma(i) = j, \Sigma(i') = j'] = \sum_{j' \neq j} M^{(n-2,1,1)}_{(i,i')} p(j, j')$ for all $i' \neq i$.
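
The nesting of absolute marginals can be verified numerically. In the sketch below, the ranking model on $S_4$ is a made-up example and the function names are hypothetical. The joint law of the ranks of two items plays the role of the $(n-2,1,1)$ marginal, and summing out the second rank should give back the $(n-1,1)$ marginal.

```python
from fractions import Fraction
from itertools import permutations

n = 4
S_n = list(permutations(range(1, n + 1)))  # sigma[i-1] is the rank of item i
total = sum(range(1, len(S_n) + 1))
p = {s: Fraction(w, total) for w, s in enumerate(S_n, start=1)}  # hypothetical model

def first_order(p, i):
    """Law of Sigma(i): j -> P[Sigma(i) = j]."""
    law = {}
    for sigma, prob in p.items():
        law[sigma[i - 1]] = law.get(sigma[i - 1], 0) + prob
    return law

def second_order(p, i, i2):
    """Joint law of (Sigma(i), Sigma(i2)): (j, j') -> P[Sigma(i) = j, Sigma(i2) = j']."""
    law = {}
    for sigma, prob in p.items():
        key = (sigma[i - 1], sigma[i2 - 1])
        law[key] = law.get(key, 0) + prob
    return law

def sum_out_second(law2):
    """Sum a second-order marginal over j' to recover a first-order marginal."""
    law = {}
    for (j, _), prob in law2.items():
        law[j] = law.get(j, 0) + prob
    return law
```

For any pair of distinct items, `sum_out_second(second_order(p, i, i2))` equals `first_order(p, i)`, whichever auxiliary item `i2` is chosen. This is the "for all $i' \neq i$" on the slide.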

  26. marginals localize nested levels of rank information
      Example for relative marginals: the knowledge of the marginal on $\{a,b,c\}$ induces the knowledge of the marginal on $\{b,c\}$.
      $M_{\{b,c\}} p(b \succ c) = \mathbb{P}[\Sigma(b) < \Sigma(c)]$
      $= \mathbb{P}[\Sigma(a) < \Sigma(b) < \Sigma(c)] + \mathbb{P}[\Sigma(b) < \Sigma(a) < \Sigma(c)] + \mathbb{P}[\Sigma(b) < \Sigma(c) < \Sigma(a)]$
      $= M_{\{a,b,c\}} p(a \succ b \succ c) + M_{\{a,b,c\}} p(b \succ a \succ c) + M_{\{a,b,c\}} p(b \succ c \succ a)$
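
The same kind of check works for relative marginals. The sketch below is an illustration with a made-up ranking model on $S_4$ and hypothetical function names: it computes the law of the induced ranking on a subset, then recovers the marginal on a pair from the marginal on a triple by dropping the extra item.

```python
from fractions import Fraction
from itertools import permutations

n = 4
S_n = list(permutations(range(1, n + 1)))  # sigma[i-1] is the rank of item i
total = sum(range(1, len(S_n) + 1))
p = {s: Fraction(w, total) for w, s in enumerate(S_n, start=1)}  # hypothetical model

def relative_marginal(p, A):
    """Law of the ranking induced on A, keyed by tuples of items (best first)."""
    law = {}
    for sigma, prob in p.items():
        key = tuple(sorted(A, key=lambda i: sigma[i - 1]))
        law[key] = law.get(key, 0) + prob
    return law

def restrict(law, B):
    """Push a marginal on A down to a subset B of A by deleting the items outside B."""
    out = {}
    for ranking, prob in law.items():
        key = tuple(i for i in ranking if i in B)
        out[key] = out.get(key, 0) + prob
    return out

triple = relative_marginal(p, {1, 2, 3})
pair_from_triple = restrict(triple, {2, 3})
```

With $a, b, c = 1, 2, 3$, the entry `pair_from_triple[(2, 3)]` is the sum of the three triple probabilities in which 2 precedes 3, and it agrees with `relative_marginal(p, {2, 3})` computed directly from `p`.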

  27. fourier analysis localizes absolute rank information
      A classic result from $S_n$ representation theory (Young’s rule) says informally that:
      1. Absolute marginals are nested according to the canonical order on partitions.
      2. The part of information of a function $f : S_n \to \mathbb{R}$ that is specific to its $\lambda$-marginals $M^\lambda f$ is contained in its Fourier coefficient $\hat f(\lambda)$:
      $M^\lambda_{A_1,\dots,A_r} f$ “=” $M^\lambda_{A_1,\dots,A_r}\, \mathcal{F}^{-1}\Big[ \hat f(\lambda) + \sum_{\mu \triangleleft \lambda} K_{\mu,\lambda}\, \hat f(\mu) \Big]$

  28. fourier analysis localizes absolute rank information
      [Figure: illustration from Jonathan Huang’s thesis]

  29. fourier analysis does not localize relative rank information
      [Diagram for $n = 4$: $f$; its marginals on triples $M_{\{1,2,3\}} f$, $M_{\{1,2,4\}} f$, $M_{\{1,3,4\}} f$, $M_{\{2,3,4\}} f$; its marginals on pairs $M_{\{1,2\}} f$, $M_{\{1,3\}} f$, $M_{\{1,4\}} f$, $M_{\{2,3\}} f$, $M_{\{2,4\}} f$, $M_{\{3,4\}} f$]

  30. the mra representation
