Mixtures of Weighted Distance-Based Models for Ranking Data
Paul H. Lee ∗, Philip L. H. Yu
The University of Hong Kong

Outline of presentation
■ Introduction
■ Distance-Based Models for Ranking Data
■ Weighted Distance-based Models (with application)
■ Simulation Studies
■ Conclusions and Further Research
■ Question & Answer

Introduction

Introduction
■ What is ranking data?
◆ Rank a set of items
◆ Types of soft drinks
   Coke, 7-up, Fanta
◆ Political goals
◆ Election candidates
   World footballer of the year

Introduction
■ Notations used in ranking literature
◆ $\pi$: ranking
   $\pi(i)$ is the rank assigned to item $i$
   Example: $\pi = (2, 4, 1, 3)$ means item 1 ranks 2nd, item 2 ranks 4th
◆ $\pi^{-1}$: ordering
   $\pi^{-1}(i)$ is the item having rank $i$
   Example: $\pi^{-1} = (2, 4, 1, 3)$ means item 2 ranks 1st, item 4 ranks 2nd
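The two notations are inverse permutations of each other, so converting between them is a single pass over the vector. The sketch below is illustrative only; the function names are hypothetical and not from the presentation.

```python
def invert(perm):
    """Invert a permutation of 1..k given as a tuple (1-based values)."""
    inverse = [0] * len(perm)
    for position, value in enumerate(perm, start=1):
        inverse[value - 1] = position
    return tuple(inverse)


def ranking_to_ordering(ranking):
    """ranking[i-1] is the rank of item i; the result lists items from 1st to last."""
    return invert(ranking)


def ordering_to_ranking(ordering):
    """ordering[r-1] is the item with rank r; the result gives each item's rank."""
    return invert(ordering)


# The ranking from the slide: item 1 ranks 2nd, item 2 ranks 4th, ...
print(ranking_to_ordering((2, 4, 1, 3)))
```

Note that the slide reuses the same vector (2, 4, 1, 3) for two separate examples; as a ranking and as an ordering it describes two different preferences.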

Examples of Ranking Data
■ Marketing research:
◆ Green and Rao (1972): to rank 15 breakfast snack food items including toast, donut, etc.
■ Travel behavior and mode of transportation:
◆ Beggs, et al. (1981), Hausman, et al. (1987): to rank order 16 car designs which differed over 9 attributes.

Examples of Ranking Data
■ Politics:
◆ Croon (1989): to rank 4 political goals: Order, Say, Price, and Freedom.
■ Horse racing:
◆ Lo et al. (1994): to predict the top two winning horses.

Types of Ranking Data
Given a set of J items, there are two types of ranking data:
■ Complete rankings (rank all J items)
■ Incomplete (or partial) rankings
◆ Top q rankings (select the top q items and rank them)
   When q = 1, top q ranking = discrete choice
◆ Subset rankings (select a subset of m items and rank them)
   When m = 2, subset ranking = paired comparison
   When m = 3, subset ranking = triple ranking

Problems of Interest
■ Graphical representation of ranking data
◆ visualize rankings given by judges, preferably in a low-dimensional space
◆ existing work: dual scaling (Nishisato, 1994), vector models (Tucker, 1960; Carroll, 1980; Yu and Chan, 2001), ideal point models (Coombs, 1950; De Soete, et al., 1986; Yu, Chung and Leung, 2008), polyhedron representation (Thompson, 2003)

Problems of Interest
■ Factor analysis
◆ identify latent factors that affect ranking decisions.
◆ existing work: Yu, Lam and Lo (2005)
■ Cluster analysis / Latent class analysis
◆ find groups (clusters) of judges with similar rank-order preferences.
◆ recent work: Murphy and Martin (2003), Lee and Yu (2010)

Problems of Interest
■ Modelling
◆ determine the probabilistic structure governing the probability of observing a ranking
◆ existing work: extensive; see Marden (1995) for a review, and Yu (2000)
◆ Different types of statistical models for ranking data
   ■ Order-statistics
   ■ Paired comparison
   ■ Distance-based
   ■ Multistage
◆ This talk: a weighted distance-based model?
◆ mixture models?

Introduction
■ Properties of distance measure
◆ $d(\pi_i, \pi_i) = 0$
◆ $d(\pi_i, \pi_j) = d(\pi_j, \pi_i)$
◆ $d(\pi_i, \pi_j) > 0$ if $\pi_i \neq \pi_j$
■ Property of metric
◆ Triangular inequality: $d(\pi_i, \pi_k) \le d(\pi_i, \pi_j) + d(\pi_j, \pi_k)$

Distance-Based Models for Ranking Data

Distance-Based Models for Ranking Data
■ Model assumption:
◆ Probability of observing a ranking $\pi$ depends on its distance to the modal ranking $\pi_0$
◆ The effect of distance is controlled by the dispersion parameter $\lambda$
■ Model specification:
◆ $P(\pi \mid \lambda, \pi_0) = C(\lambda)\, e^{-\lambda d(\pi, \pi_0)}$
◆ $\lambda > 0$, to avoid an identification problem
◆ $d(\pi, \pi_0)$ is the distance between $\pi$ and $\pi_0$
◆ $C(\lambda)$ is the proportionality constant

Distance-Based Models for Ranking Data
■ Different types of distance
◆ Kendall's tau
   $T(\pi, \pi_0) = \sum_{i<j} I\{[\pi(i) - \pi(j)][\pi_0(i) - \pi_0(j)] < 0\}$
   Equals the minimum number of pairwise adjacent transpositions needed to transform $\pi$ to $\pi_0$
   Used in Mallows' $\phi$-model (1957): $P(\pi \mid \phi, \pi_0) = C(\phi)\, \phi^{T(\pi, \pi_0)}$
◆ Spearman's rho square
   $R^2(\pi, \pi_0) = \sum_i [\pi(i) - \pi_0(i)]^2$
   Used in Mallows' $\theta$-model (1957): $P(\pi \mid \theta, \pi_0) = C(\theta)\, \theta^{R^2(\pi, \pi_0)}$
   A distance but not a metric

Distance-Based Models for Ranking Data
■ Different types of distance
◆ Spearman's rho
   $R(\pi, \pi_0) = \left( \sum_i [\pi(i) - \pi_0(i)]^2 \right)^{0.5}$
   A metric
◆ Spearman's footrule
   $F(\pi, \pi_0) = \sum_i |\pi(i) - \pi_0(i)|$
◆ Cayley's distance
   $C(\pi, \pi_0)$ = minimum number of transpositions needed to transform $\pi$ to $\pi_0$
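The distances listed on these slides can be sketched in a few lines each. This is a minimal illustration, assuming rankings are tuples with entry i-1 holding the rank of item i; the function names and the brute-force triangle-inequality check are hypothetical additions, not part of the presentation. Cayley's distance is computed with the standard identity "k minus the number of cycles" of the permutation that carries one ranking to the other.

```python
from itertools import permutations
from math import sqrt


def kendall(p, q):
    """Number of item pairs that p and q order in opposite directions."""
    k = len(p)
    return sum(1 for i in range(k) for j in range(i + 1, k)
               if (p[i] - p[j]) * (q[i] - q[j]) < 0)


def spearman_rho_sq(p, q):
    """Sum of squared rank differences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def spearman_rho(p, q):
    """Square root of Spearman's rho square (Euclidean distance between rank vectors)."""
    return sqrt(spearman_rho_sq(p, q))


def footrule(p, q):
    """Sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def cayley(p, q):
    """Minimum number of (not necessarily adjacent) transpositions: k minus cycle count."""
    mapping = {a: b for a, b in zip(p, q)}  # rank under p -> rank under q
    seen, cycles = set(), 0
    for start in mapping:
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = mapping[x]
    return len(p) - cycles


def satisfies_triangle(dist, k):
    """Brute-force check of the triangle inequality over all rankings of k items."""
    perms = list(permutations(range(1, k + 1)))
    return all(dist(a, c) <= dist(a, b) + dist(b, c) + 1e-12
               for a in perms for b in perms for c in perms)


# Spearman's rho square fails the triangle inequality, so it is not a metric.
a, b, c = (1, 2, 3), (1, 3, 2), (2, 3, 1)
print(spearman_rho_sq(a, c), spearman_rho_sq(a, b) + spearman_rho_sq(b, c))
```

The brute-force check enumerates all triples of rankings, so it is only practical for very small k.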

Distance-Based Models for Ranking Data
■ Proportionality constant
◆ Proportionality constant $C(\lambda)$ is difficult to compute
◆ Closed-form solution available only for:
   Kendall's tau
   Cayley's distance
◆ Can be solved numerically by
   $C(\lambda) = \dfrac{1}{\sum_{i=1}^{k!} e^{-\lambda d(\pi_i, \pi_0)}}$
■ Computation time grows factorially with the number of items $k$, since the sum runs over all $k!$ rankings
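As a sketch of the numerical route, the constant can be computed by enumerating all k! rankings. For Kendall's tau the result can be compared with the well-known closed form for Mallows' φ-model, $1/C(\lambda) = \prod_{j=1}^{k} (1 - e^{-j\lambda})/(1 - e^{-\lambda})$; that formula is not shown on the slides and is quoted here as a standard result. The function names are hypothetical, and the modal ranking is taken to be the identity, which is harmless for Kendall's tau because the constant does not depend on the modal ranking.

```python
from itertools import permutations
from math import exp


def kendall(p, q):
    """Number of item pairs that p and q order in opposite directions."""
    k = len(p)
    return sum(1 for i in range(k) for j in range(i + 1, k)
               if (p[i] - p[j]) * (q[i] - q[j]) < 0)


def constant_brute_force(lam, k, dist):
    """C(lambda) by summing exp(-lambda * d) over all k! rankings."""
    modal = tuple(range(1, k + 1))
    total = sum(exp(-lam * dist(p, modal)) for p in permutations(modal))
    return 1.0 / total


def constant_kendall_closed_form(lam, k):
    """C(lambda) for Kendall's tau via the product formula (requires lam > 0)."""
    total = 1.0
    for j in range(1, k + 1):
        total *= (1 - exp(-j * lam)) / (1 - exp(-lam))
    return 1.0 / total


def probability(p, modal, lam, dist):
    """P(pi | lambda, pi_0) for the distance-based model, using enumeration."""
    return constant_brute_force(lam, len(p), dist) * exp(-lam * dist(p, modal))


print(constant_brute_force(0.7, 5, kendall))
print(constant_kendall_closed_form(0.7, 5))
```

The closed form needs k multiplications, whereas the enumeration visits k! rankings, which is why the enumeration becomes impractical beyond roughly ten items.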

Distance-Based Models for Ranking Data
■ $\phi$-component model
◆ Extension of Mallows' $\phi$-model (Fligner and Verducci, 1988)
◆ For a ranking of $k$ items, Kendall's tau can be decomposed as
   $T(\pi, \pi_0) = \sum_{i=1}^{k-1} V_i$
   All $V$'s are independent
■ $V_1 = m$ means the $(m+1)$-th best item, with reference to $\pi_0$, is chosen in $\pi$
■ This item is dropped and will not be considered anymore
■ $V_2 = m$ means the $(m+1)$-th best item is chosen among the remaining items
■ The process is repeated until all items are ranked
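The stagewise description above can be sketched directly: walk through the items in the order the judge ranked them, record how many better items (according to the modal ranking) are still available, then drop the chosen item. The function names are hypothetical, and rankings are assumed to be tuples with entry i-1 holding the rank of item i.

```python
from itertools import permutations


def kendall(p, q):
    """Number of item pairs that p and q order in opposite directions."""
    k = len(p)
    return sum(1 for i in range(k) for j in range(i + 1, k)
               if (p[i] - p[j]) * (q[i] - q[j]) < 0)


def stage_counts(ranking, modal):
    """Return [V_1, ..., V_{k-1}] for `ranking` relative to `modal`."""
    k = len(ranking)
    # Item indices listed from best to worst under each ranking.
    chosen_order = sorted(range(k), key=lambda i: ranking[i])
    remaining = sorted(range(k), key=lambda i: modal[i])
    counts = []
    for item in chosen_order[:-1]:       # the last item is forced, so no V_k
        position = remaining.index(item)  # number of better items still available
        counts.append(position)
        remaining.pop(position)           # drop the chosen item
    return counts


v = stage_counts((2, 4, 1, 3), (1, 2, 3, 4))
print(v, sum(v), kendall((2, 4, 1, 3), (1, 2, 3, 4)))
```

For the ranking (2, 4, 1, 3) against the modal ranking (1, 2, 3, 4), the judge first picks item 3 (third best under the modal ranking, so V_1 = 2), then item 1 (best of what remains, V_2 = 0), then item 4 (second best of the remaining two, V_3 = 1).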

Distance-Based Models for Ranking Data
■ $\phi$-component model
◆ The $V$'s can be weighted:
   $\sum_{i=1}^{k-1} \lambda_i V_i$
◆ The resulting model is:
   $P(\pi \mid \lambda, \pi_0) = C(\lambda)\, e^{-\sum_{i=1}^{k-1} \lambda_i V_i}$, where $\lambda = \{\lambda_i,\ i = 1, \ldots, k-1\}$
◆ Also named the $k-1$ parameter model
◆ Under the re-parameterization $\phi_i = e^{-\lambda_i}$, $i = 1, \ldots, k-1$, the resulting model is:
   $P(\pi \mid \phi, \pi_0) = C(\phi) \prod_{i=1}^{k-1} \phi_i^{V_i}$

Distance-Based Models for Ranking Data
■ The model has a closed-form proportionality constant if the $V$'s are independent
■ Only Kendall's tau and Cayley's distance can be decomposed in such form
■ The extension based on Cayley's distance is named the cyclic structure model
■ The model based on the decomposition of Kendall's tau is more commonly used than the one based on Cayley's distance
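A minimal sketch of why independence gives a closed form, for the Kendall-based weighted model: since $V_i$ ranges over $0, \ldots, k-i$ and every combination of values corresponds to exactly one ranking, the sum over all rankings factorizes into a product of $k-1$ short sums. The function names are hypothetical, and the factorized expression is derived here from the decomposition rather than quoted from the slides; the code checks it against full enumeration.

```python
from itertools import permutations
from math import exp


def stage_counts(ranking, modal):
    """Return [V_1, ..., V_{k-1}] for `ranking` relative to `modal`."""
    k = len(ranking)
    chosen_order = sorted(range(k), key=lambda i: ranking[i])
    remaining = sorted(range(k), key=lambda i: modal[i])
    counts = []
    for item in chosen_order[:-1]:
        position = remaining.index(item)
        counts.append(position)
        remaining.pop(position)
    return counts


def weighted_distance(ranking, modal, lam):
    """Weighted sum of the stage counts: sum_i lambda_i * V_i."""
    return sum(l * v for l, v in zip(lam, stage_counts(ranking, modal)))


def constant_closed_form(lam):
    """C(lambda) as a product over stages; V_i takes values 0..k-i."""
    k = len(lam) + 1
    total = 1.0
    for i, l in enumerate(lam, start=1):
        total *= sum(exp(-l * v) for v in range(k - i + 1))
    return 1.0 / total


def constant_brute_force(lam):
    """C(lambda) by enumerating all k! rankings."""
    k = len(lam) + 1
    modal = tuple(range(1, k + 1))
    total = sum(exp(-weighted_distance(p, modal, lam)) for p in permutations(modal))
    return 1.0 / total


lam = (1.2, 0.5, 0.1)  # illustrative weights for k = 4 items
print(constant_closed_form(lam), constant_brute_force(lam))
```

Setting all weights equal recovers the unweighted Kendall model, so the same check also covers Mallows' φ-model.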
