
Mixtures of Weighted Distance-Based Models for Ranking Data



  1. Mixtures of Weighted Distance-Based Models for Ranking Data
     Paul H. Lee ∗, Philip L. H. Yu
     The University of Hong Kong

  2. Outline of presentation
     ■ Introduction
     ■ Distance-Based Models for Ranking Data
     ■ Weighted Distance-based Models (with application)
     ■ Simulation Studies
     ■ Conclusions and Further Research
     ■ Question & Answer

  3. Introduction

  4. Introduction
     ■ What is ranking data?
       ◆ Rank a set of items
       ◆ Types of soft drinks
         Coke, 7-up, Fanta
       ◆ Political goals
       ◆ Election candidates
         World footballer of the year

  5. Introduction
     ■ Notation used in the ranking literature
       ◆ $\pi$: ranking
         $\pi(i)$ is the rank assigned to item $i$
         $\pi = (2,4,1,3)$: item 1 ranks 2nd, item 2 ranks 4th
       ◆ $\pi^{-1}$: ordering
         $\pi^{-1}(i)$ is the item having rank $i$
         $\pi^{-1} = (2,4,1,3)$: item 2 ranks 1st, item 4 ranks 2nd
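The two notations are inverse permutations of each other, which can be sketched in a few lines. The function names below are illustrative and not from the slides. Note that the slide uses the same vector (2,4,1,3) for two separate examples; the ordering that corresponds to the ranking (2,4,1,3) is a different vector.

```python
def ranking_to_ordering(ranking):
    """Convert a ranking (rank of each item) into an ordering (item at each rank).

    Items and ranks are both numbered from 1, as on the slide.
    """
    ordering = [0] * len(ranking)
    for item, rank in enumerate(ranking, start=1):
        ordering[rank - 1] = item
    return tuple(ordering)


def ordering_to_ranking(ordering):
    """Inverse conversion; inverting a permutation uses the same procedure."""
    return ranking_to_ordering(ordering)


pi = (2, 4, 1, 3)                # item 1 ranks 2nd, item 2 ranks 4th, ...
print(ranking_to_ordering(pi))   # (3, 1, 4, 2): item 3 ranks 1st, item 1 ranks 2nd, ...
```

Applying the conversion twice returns the original vector, which is a quick sanity check when data arrive in mixed notation.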

  6. Examples of Ranking Data
     ■ Marketing research:
       ◆ Green and Rao (1972): to rank 15 breakfast snack food items including toast, donut, etc.
     ■ Travel behavior and mode of transportation:
       ◆ Beggs, et al. (1981), Hausman, et al. (1987): to rank order 16 car designs which differed over 9 attributes.

  7. Examples of Ranking Data
     ■ Politics:
       ◆ Croon (1989): to rank 4 political goals: Order, Say, Price, and Freedom.
     ■ Horse racing:
       ◆ Lo et al. (1994): to predict the top two winning horses.

  8. Types of Ranking Data
     Given a set of $J$ items, there are two types of ranking data:
     ■ Complete rankings (rank all $J$ items)
     ■ Incomplete (or partial) rankings
       ◆ Top $q$ rankings (select the top $q$ items and rank them)
         When $q = 1$, top $q$ ranking = discrete choice
       ◆ Subset rankings (select a subset of $m$ items and rank them)
         When $m = 2$, subset ranking = paired comparison
         When $m = 3$, subset ranking = triple ranking

  9. Problems of Interest
     ■ Graphical representation of ranking data
       ◆ visualize rankings given by judges, preferably in a low-dimensional space
       ◆ existing work: dual scaling (Nishisato, 1994), vector models (Tucker, 1960; Carroll, 1980; Yu and Chan, 2001), ideal point models (Coombs, 1950; De Soete, et al., 1986; Yu, Chung and Leung, 2008), polyhedron representation (Thompson, 2003)

  10. Problems of Interest
      ■ Factor analysis
        ◆ identify latent factors that affect ranking decisions.
        ◆ existing work: Yu, Lam and Lo (2005)
      ■ Cluster analysis / Latent class analysis
        ◆ find groups of judges with similar rank-order preferences within clusters.
        ◆ recent work: Murphy and Martin (2003), Lee and Yu (2010)

  11. Problems of Interest
      ■ Modelling
        ◆ determine the probabilistic structure of observing a ranking
        ◆ existing work: a lot, see Marden (1995) for a review, Yu (2000)
        ◆ Different types of statistical models for ranking data
          ■ Order-statistics
          ■ Paired comparison
          ■ Distance-based
          ■ Multistage
        ◆ This talk: a weighted distance-based model?
        ◆ mixture models?

  12. Introduction
      ■ Properties of a distance measure
        ◆ $d(\pi_i, \pi_i) = 0$
        ◆ $d(\pi_i, \pi_j) = d(\pi_j, \pi_i)$
        ◆ $d(\pi_i, \pi_j) > 0$ if $\pi_i \neq \pi_j$
      ■ Property of a metric
        Triangle inequality: $d(\pi_i, \pi_k) \le d(\pi_i, \pi_j) + d(\pi_j, \pi_k)$

  13. Distance-Based Models for Ranking Data

  14. Distance-Based Models for Ranking Data
      ■ Model assumption:
        ◆ The probability of observing a ranking $\pi$ depends on its distance to the modal ranking $\pi_0$
        ◆ The effect of distance is controlled by the dispersion parameter $\lambda$
      ■ Model specification:
        ◆ $P(\pi \mid \lambda, \pi_0) = C(\lambda)\, e^{-\lambda d(\pi, \pi_0)}$
        ◆ $\lambda > 0$, imposed for identifiability
        ◆ $d(\pi, \pi_0)$ is the distance between $\pi$ and $\pi_0$
        ◆ $C(\lambda)$ is the proportionality constant
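As a rough illustration of the model specification, the sketch below enumerates all rankings of a small item set and normalizes the weights $e^{-\lambda d(\pi, \pi_0)}$. The choice of Kendall's tau as $d$, and the names `kendall_distance` and `distance_model_pmf`, are assumptions made for this example only; any other distance function with the same signature could be passed in.

```python
from itertools import permutations
from math import exp


def kendall_distance(pi, pi0):
    """Number of item pairs that pi and pi0 order differently (assumed choice of d)."""
    k = len(pi)
    return sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if (pi[i] - pi[j]) * (pi0[i] - pi0[j]) < 0
    )


def distance_model_pmf(pi0, lam, distance=kendall_distance):
    """Return {ranking: probability} under P(pi) = C(lam) * exp(-lam * d(pi, pi0))."""
    rankings = list(permutations(range(1, len(pi0) + 1)))
    weights = [exp(-lam * distance(pi, pi0)) for pi in rankings]
    total = sum(weights)  # this is 1 / C(lam)
    return {pi: w / total for pi, w in zip(rankings, weights)}


pmf = distance_model_pmf(pi0=(1, 2, 3), lam=1.0)
# Rankings closer to the modal ranking receive higher probability.
for pi, p in sorted(pmf.items(), key=lambda kv: -kv[1]):
    print(pi, round(p, 4))
```

With $\lambda = 0$ every ranking gets the same weight, so the model reduces to the uniform distribution; larger $\lambda$ concentrates the mass around $\pi_0$.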

  15. Distance-Based Models for Ranking Data
      ■ Different types of distance
        ◆ Kendall's tau
          $T(\pi, \pi_0) = \sum_{i<j} I\{[\pi(i) - \pi(j)][\pi_0(i) - \pi_0(j)] < 0\}$
          Used in Mallows' $\phi$-model (1957):
          $P(\pi \mid \phi, \pi_0) = C(\phi)\, \phi^{T(\pi, \pi_0)}$
        ◆ Equals the minimum number of pairwise adjacent transpositions needed to transform $\pi$ to $\pi_0$
        ◆ Spearman's rho square
          $R^2(\pi, \pi_0) = \sum_i [\pi(i) - \pi_0(i)]^2$
          Used in Mallows' $\theta$-model (1957):
          $P(\pi \mid \theta, \pi_0) = C(\theta)\, \theta^{R^2(\pi, \pi_0)}$
          A distance but not a metric
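The two distances above are short to compute, and a small example makes the "distance but not a metric" remark concrete. The sketch below is illustrative only (function names are not from the slides); the three rankings were picked by hand so that Spearman's rho square violates the triangle inequality while Kendall's tau satisfies it.

```python
def kendall_tau(pi, pi0):
    """Count item pairs (i, j), i < j, that pi and pi0 order in opposite ways."""
    k = len(pi)
    return sum(
        (pi[i] - pi[j]) * (pi0[i] - pi0[j]) < 0
        for i in range(k)
        for j in range(i + 1, k)
    )


def spearman_rho_square(pi, pi0):
    """Sum of squared rank differences over items."""
    return sum((a - b) ** 2 for a, b in zip(pi, pi0))


a, b, c = (1, 2, 3), (1, 3, 2), (2, 3, 1)

# Kendall's tau: 2 <= 1 + 1, so the triangle inequality holds on this triple.
print(kendall_tau(a, c), kendall_tau(a, b) + kendall_tau(b, c))

# Spearman's rho square: 6 > 2 + 2, so the triangle inequality fails.
print(spearman_rho_square(a, c),
      spearman_rho_square(a, b) + spearman_rho_square(b, c))
```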

  16. Distance-Based Models for Ranking Data
      ■ Different types of distance
        ◆ Spearman's rho
          $R(\pi, \pi_0) = \left( \sum_i [\pi(i) - \pi_0(i)]^2 \right)^{0.5}$
          A metric
        ◆ Spearman's footrule
          $F(\pi, \pi_0) = \sum_i |\pi(i) - \pi_0(i)|$
        ◆ Cayley's distance
          $C(\pi, \pi_0)$ = minimum number of transpositions needed to transform $\pi$ to $\pi_0$
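A minimal sketch of these three distances follows. The function names are illustrative. For Cayley's distance the code relies on a standard fact that is not stated on the slide: the minimum number of transpositions equals the number of items minus the number of cycles of the permutation that maps the ranks under $\pi_0$ to the ranks under $\pi$.

```python
from math import sqrt


def spearman_rho(pi, pi0):
    """Square root of the sum of squared rank differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(pi, pi0)))


def spearman_footrule(pi, pi0):
    """Sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(pi, pi0))


def cayley_distance(pi, pi0):
    """Minimum number of (not necessarily adjacent) transpositions from pi to pi0.

    Computed as k minus the number of cycles of the permutation sigma
    that sends each item's rank under pi0 to its rank under pi.
    """
    sigma = {r0: r for r, r0 in zip(pi, pi0)}
    seen, cycles = set(), 0
    for start in sigma:
        if start not in seen:
            cycles += 1
            node = start
            while node not in seen:
                seen.add(node)
                node = sigma[node]
    return len(pi) - cycles


pi, pi0 = (2, 3, 1), (1, 2, 3)
print(spearman_footrule(pi, pi0))  # 4
print(cayley_distance(pi, pi0))    # 2 (a single 3-cycle needs two swaps)
```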

  17. Distance-Based Models for Ranking Data
      ■ Different types of distance
        ◆ The proportionality constant $C(\lambda)$ is difficult to compute
        ◆ A closed-form solution is available only for:
          Kendall's tau
          Cayley's distance
        ◆ It can be computed numerically by
          $C(\lambda) = \dfrac{1}{\sum_{i=1}^{k!} e^{-\lambda d(\pi_i, \pi_0)}}$
      ■ Computational time increases at least exponentially with the number of items, since the sum has $k!$ terms
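The sketch below contrasts the brute-force sum over all $k!$ rankings with a closed form for Kendall's tau. The closed form used here, $\sum_{\pi} e^{-\lambda T(\pi, \pi_0)} = \prod_{j=1}^{k} \frac{1 - e^{-j\lambda}}{1 - e^{-\lambda}}$, is the standard result for the Mallows $\phi$-model and is an assumption of this example, since the slide does not write it out. Function names are illustrative, and $\pi_0$ is taken to be the identity ranking because the constant does not depend on $\pi_0$ for Kendall's tau.

```python
from itertools import permutations
from math import exp, prod


def kendall_tau(pi, pi0):
    """Count item pairs that pi and pi0 order in opposite ways."""
    k = len(pi)
    return sum(
        (pi[i] - pi[j]) * (pi0[i] - pi0[j]) < 0
        for i in range(k)
        for j in range(i + 1, k)
    )


def constant_brute_force(k, lam, distance=kendall_tau):
    """C(lam) by summing exp(-lam * d) over all k! rankings."""
    pi0 = tuple(range(1, k + 1))
    total = sum(exp(-lam * distance(pi, pi0)) for pi in permutations(pi0))
    return 1.0 / total


def constant_kendall_closed_form(k, lam):
    """C(lam) for Kendall's tau via the product formula; requires lam > 0."""
    q = exp(-lam)
    total = prod((1 - q ** j) / (1 - q) for j in range(1, k + 1))
    return 1.0 / total


# The two values agree up to floating-point error; the brute-force
# version already visits 5! = 120 rankings for k = 5.
print(constant_brute_force(5, 0.7))
print(constant_kendall_closed_form(5, 0.7))
```

The brute-force routine works for any distance but becomes impractical quickly, which is the motivation for preferring distances with a closed-form constant.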

  18. Distance-Based Models for Ranking Data
      ■ φ-component model
        ◆ Extension of Mallows' $\phi$-model (Fligner and Verducci, 1988)
        ◆ For a ranking of $k$ items, Kendall's tau can be decomposed as
          $T(\pi, \pi_0) = \sum_{i=1}^{k-1} V_i$
          All $V$'s are independent
          ■ $V_1 = m$ means the $(m+1)$-th best item, with reference to $\pi_0$, is chosen in $\pi$
          ■ This item is dropped and will not be considered anymore
          ■ $V_2 = m$ means the $(m+1)$-th best item is chosen among the remaining items
          ■ The process is repeated until all items are ranked
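The stage-wise process described above can be sketched as follows. This is one possible reading of the slide, with illustrative function names: at stage $i$ the item that $\pi$ ranks $i$-th is picked, and $V_i$ counts how many of the still-remaining items $\pi_0$ prefers to it.

```python
from itertools import permutations


def kendall_tau(pi, pi0):
    """Count item pairs that pi and pi0 order in opposite ways."""
    k = len(pi)
    return sum(
        (pi[i] - pi[j]) * (pi0[i] - pi0[j]) < 0
        for i in range(k)
        for j in range(i + 1, k)
    )


def kendall_stages(pi, pi0):
    """Return (V_1, ..., V_{k-1}) for ranking pi relative to modal ranking pi0."""
    k = len(pi)
    # Item indices in the order pi ranks them, best first.
    picked_order = sorted(range(k), key=lambda item: pi[item])
    remaining = set(range(k))
    stages = []
    for item in picked_order[:-1]:  # the last stage has no choice left
        # How many remaining items does pi0 rank ahead of the picked one?
        stages.append(sum(pi0[other] < pi0[item] for other in remaining))
        remaining.remove(item)
    return tuple(stages)


pi, pi0 = (2, 4, 1, 3), (1, 2, 3, 4)
print(kendall_stages(pi, pi0))   # (2, 0, 1)
print(kendall_tau(pi, pi0))      # 3, the sum of the stages

# The decomposition holds for every ranking of 4 items.
assert all(
    sum(kendall_stages(p, pi0)) == kendall_tau(p, pi0)
    for p in permutations((1, 2, 3, 4))
)
```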

  19. Distance-Based Models for Ranking Data
      ■ φ-component model
        ◆ The $V$'s can be weighted: $\sum_{i=1}^{k-1} \lambda_i V_i$
        ◆ The resulting model is:
          $P(\pi \mid \lambda, \pi_0) = C(\lambda)\, e^{-\sum_{i=1}^{k-1} \lambda_i V_i}$, where $\lambda = \{\lambda_i,\ i = 1, \ldots, k-1\}$
        ◆ Also called the $(k-1)$-parameter model
        ◆ Under the re-parameterization $\phi_i = e^{-\lambda_i}$, $i = 1, \ldots, k-1$, the resulting model is:
          $P(\pi \mid \phi, \pi_0) = C(\phi) \prod_{i=1}^{k-1} \phi_i^{V_i}$

  20. Distance-Based Models for Ranking Data
      ■ The model has a closed-form proportionality constant if the $V$'s are independent
      ■ Only Kendall's tau and Cayley's distance can be decomposed in such a form
      ■ The extension based on Cayley's distance is named the cyclic structure model
      ■ The model based on the decomposition of Kendall's tau is more commonly used than the one based on Cayley's distance
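To make the closed-form claim concrete for the weighted Kendall decomposition, the sketch below assumes that each $V_i$ ranges independently over $0, \ldots, k-i$, so that $1 / C(\lambda) = \prod_{i=1}^{k-1} \sum_{v=0}^{k-i} e^{-\lambda_i v}$. That factorization is an assumption spelled out here rather than a formula from the slides, and the function names are illustrative. The final lines check it against brute-force enumeration.

```python
from itertools import permutations
from math import exp, prod


def kendall_stages(pi, pi0):
    """Return (V_1, ..., V_{k-1}) for ranking pi relative to modal ranking pi0."""
    k = len(pi)
    picked_order = sorted(range(k), key=lambda item: pi[item])
    remaining = set(range(k))
    stages = []
    for item in picked_order[:-1]:
        stages.append(sum(pi0[other] < pi0[item] for other in remaining))
        remaining.remove(item)
    return tuple(stages)


def weighted_constant(lams):
    """Closed-form C(lambda) for k = len(lams) + 1 items (assumed factorization)."""
    k = len(lams) + 1
    total = prod(
        sum(exp(-lam * v) for v in range(k - i + 1))  # V_i takes values 0..k-i
        for i, lam in enumerate(lams, start=1)
    )
    return 1.0 / total


def weighted_probability(pi, pi0, lams):
    """P(pi | lambda, pi0) = C(lambda) * exp(-sum_i lambda_i * V_i)."""
    stages = kendall_stages(pi, pi0)
    weight = exp(-sum(lam * v for lam, v in zip(lams, stages)))
    return weighted_constant(lams) * weight


lams = (1.0, 0.5, 0.2)   # hypothetical weights for k = 4 items
pi0 = (2, 4, 1, 3)
total = sum(weighted_probability(pi, pi0, lams)
            for pi in permutations((1, 2, 3, 4)))
print(total)  # should be 1 up to floating-point error
```

Setting all weights equal recovers the unweighted Kendall model, since the exponent then reduces to $\lambda\, T(\pi, \pi_0)$.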
