Promotion Analysis in Multi-Dimensional Space - Tianyi Wu (UIUC) - PowerPoint PPT Presentation



SLIDE 1

Promotion Analysis in Multi-Dimensional Space

Tianyi Wu (UIUC), Dong Xin (Microsoft Research), Qiaozhu Mei (University of Michigan), Jiawei Han (UIUC)

SLIDE 2

Outline

• Introduction
• Query execution algorithms
• Spurious promotion
• Experiment
• Conclusion

2

SLIDE 3

Outline

• Introduction
• Query execution algorithms
• Spurious promotion
• Experiment
• Conclusion

3

SLIDE 4

Promotion analysis: introduction

• Formulate and study a useful function: promotion analysis through ranking
  • General goal: promote a given object by leveraging subspace ranking
• Motivating example: a marketing manager at a book retailer
  • Basic fact: the retailer's book sales rank 30th among 100 retailers, which is not particularly interesting!
  • After promotion analysis, he discovered the retailer ranked 1st in the {college students, science and technology} area
  • This finding informs further advertising and marketing decisions
• Another example: person promotion

"Let's promote our brand!"

4

SLIDE 5

Promotion query

THE PROMOTION QUERY PROBLEM
Given: an object (e.g., product, person)
Goal: discover the most interesting subspaces where the object is highly ranked

Observation:
• Global rank (full space): compares the object to all other objects in all aspects; may not be interesting; low cost (a single SQL query)
• Local rank (subspaces): compares objects within certain areas; can be more interesting; high cost (many subspaces)

5

SLIDE 6

Subspace rank: why interesting

• Discover merit and competitive strengths
  • E.g., a bestselling car model among hybrid cars
• Enhance image
  • E.g., a Fortune 500 company
• Facilitate decision making
  • E.g., a marketing plan that focuses on college students
• Deliver specific information
  • E.g., "top-3 university in biomedical research" vs. "top-20 university"
• Extensively practiced in marketing
  • Market segmentation
  • Customer targeting and product positioning

6

SLIDE 7

Challenges

• Current systems
  • Given a condition, find the top-k objects
  • Sophisticated early-termination and pruning algorithms
• Promotion queries: not well supported
  • The user falls back on manual search and navigation: trial and error
• Computationally expensive
  • The rank measure is holistic
  • The number of subspaces blows up

"It should be good at ..." "Let me try some queries..."

7

SLIDE 8

Promotion analysis

Multidimensional data model

• Fact table:

  Location  Time    Object  Score
  Lyon      July    T       0.5
  Chicago   July    T       0.8
  Chicago   August  S       1.0
  Chicago   July    S       1.0
  Lyon      August  V       0.3
  Chicago   August  V       0.6
  Chicago   July    V       0.7

  (Location, Time: subspace dimensions; Object: object dimension; Score: score dimension)

8

SLIDE 9

(Fact table as on the previous slide.)

Given the target object T, its subspaces are {*}, {Lyon}, {Chicago}, {July}, {Lyon, July}, and {Chicago, July}. {*} is the special case: the full space.

Aggregate and compute the target object's rank in each subspace:

  Subspace         SUM(T)  Rank(T)
  {*}              1.3     3rd of 3
  {Lyon}           0.5     1st of 2
  {Chicago}        0.8     3rd of 3
  {July}           1.3     1st of 3
  {Lyon, July}     0.5     1st of 1
  {Chicago, July}  0.8     2nd of 3

9
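The subspace aggregation above can be sketched in a few lines of Python. This is only an illustration of the example, assuming the slide's toy fact table; the function name and interface are hypothetical, not from the paper.

```python
# Toy fact table from the slide: (Location, Time, Object, Score)
FACTS = [
    ("Lyon", "July", "T", 0.5), ("Chicago", "July", "T", 0.8),
    ("Chicago", "August", "S", 1.0), ("Chicago", "July", "S", 1.0),
    ("Lyon", "August", "V", 0.3), ("Chicago", "August", "V", 0.6),
    ("Chicago", "July", "V", 0.7),
]

def rank_in_subspace(target, location=None, time=None):
    """SUM-aggregate each object's score in the given subspace
    (None means '*'), then return (SUM(target), rank, object count)."""
    sums = {}
    for loc, t, obj, score in FACTS:
        if location in (None, loc) and time in (None, t):
            sums[obj] = sums.get(obj, 0.0) + score
    ordered = sorted(sums, key=sums.get, reverse=True)
    return sums[target], ordered.index(target) + 1, len(ordered)

print(rank_in_subspace("T"))               # full space {*}: 3rd of 3
print(rank_in_subspace("T", time="July"))  # subspace {July}: 1st of 3
```

Enumerating all subspaces and calling this per subspace reproduces the table above; the paper's algorithms avoid exactly this brute-force recomputation.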

SLIDE 10

Query model

• Given a target object T, find the top subspaces which are promotive
• "Promotiveness": a class of measures quantifying how well a subspace S can promote T

  P(S, T) = f(Rank(S, T)) * g(Sig(S))

  • Higher rank => more promotive
  • More significant subspace (e.g., more objects) => more promotive
• Example instantiations
  • Simple ranking: P(S, T) = Rank^-1(S, T)
  • Iceberg condition: P(S, T) = Rank^-1(S, T) * I(ObjCount(S) > MinSig)
  • Percentile ranking: P(S, T) = ObjCount(S) / Rank(S, T)
  • ...

10

SLIDE 11

Query model

• Given a target object T, find the top subspaces which are promotive
• "Promotiveness": a class of measures quantifying how well a subspace S can promote T

  P(S, T) = f(Rank(S, T)) * g(Sig(S))

  • Higher rank => more promotive
  • More significant subspace (e.g., more objects) => more promotive
• Example instantiations
  • Simple ranking: P(S, T) = Rank^-1(S, T)
  • Iceberg condition: P(S, T) = Rank^-1(S, T) * I(ObjCount(S) > MinSig)
  • Percentile ranking: P(S, T) = ObjCount(S) / Rank(S, T)
  • ...

THE PROMOTION QUERY PROBLEM
Input: a target object T
Output: top-R subspaces with the largest P(S, T) scores /* assume simple ranking */

11
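The three instantiations can be written directly from their formulas. A minimal sketch; the function names and the MinSig default are illustrative choices, not from the paper:

```python
# Promotiveness instantiations from the slide. Rank(S, T) and
# ObjCount(S) are assumed to be precomputed for the subspace S.

def simple_ranking(rank, obj_count):
    return 1.0 / rank                       # P = Rank^-1(S, T)

def iceberg(rank, obj_count, min_sig=10):
    # Rank^-1 gated by an indicator that the subspace is significant
    return (1.0 / rank) if obj_count > min_sig else 0.0

def percentile_ranking(rank, obj_count):
    return obj_count / rank                 # P = ObjCount(S) / Rank(S, T)

# A subspace where T ranks 2nd among 100 objects
print(simple_ranking(2, 100))      # 0.5
print(iceberg(2, 100))             # 0.5 (passes MinSig)
print(percentile_ranking(2, 100))  # 50.0
```

Note how percentile ranking rewards being 2nd among 100 objects far more than being 2nd among 3, which matches the intuition that larger subspaces are more significant.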

SLIDE 12

Outline

• Introduction
• Query execution algorithms
  • (1) PromoRank framework
    • (a) Subspace pruning
    • (b) Object pruning
  • (2) Promotion cubes
• Spurious promotion
• Experiment
• Conclusion

12

SLIDE 13

The PromoRank framework

Idea: use a recursive process to partition and aggregate the data, computing the target object's rank in each subspace.

Related: the bottom-up cubing method [Beyer99].

Target object's subspace lattice (for dimensions A, B, C, D):

  {*}
  {A} {B} {C} {D}
  {AB} {AC} {AD} {BC} {BD} {CD}
  {ABC} {ABD} {ACD} {BCD}
  {ABCD}

13

SLIDE 14

PromoRank: recursive process

1. Compute T's rank in {*}
   Method: create a hash table, HashTable[object] = AggregateScore
2. Partition the data based on dimension A
   Method: sorting
3. Compute T's rank in {A}
4. Recursively repeat for deeper subspaces ({AB}, {ABC}, ...), visiting the lattice nodes in depth-first order

Top-R promotive subspaces are maintained in a priority queue.
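The recursion above can be sketched as follows. This is a minimal, unoptimized rendering (no pruning, no promotion cube), assuming simple ranking as the promotiveness measure; all names are illustrative:

```python
from collections import defaultdict
import heapq

def promo_rank(rows, target, dims, top_r, prefix=()):
    """rows: ((dim_value_0, dim_value_1, ...), object, score) tuples already
    restricted to the current subspace; dims: dimension indices still
    available for partitioning; prefix: (dim, value) pairs naming the
    current subspace."""
    sums = defaultdict(float)
    for vals, obj, score in rows:
        sums[obj] += score                  # aggregate the current subspace
    results = []
    if target in sums:                      # only subspaces containing T matter
        rank = 1 + sum(1 for s in sums.values() if s > sums[target])
        results.append((1.0 / rank, prefix))  # simple ranking: Rank^-1
        for d in dims:                      # partition on dimension d, recurse
            parts = defaultdict(list)
            for row in rows:
                parts[row[0][d]].append(row)
            for value, part in parts.items():
                results += promo_rank(part, target,
                                      [e for e in dims if e > d],
                                      top_r, prefix + ((d, value),))
    return heapq.nlargest(top_r, results)   # top-R promotive subspaces

rows = [(("Lyon", "July"), "T", 0.5), (("Chicago", "July"), "T", 0.8),
        (("Chicago", "August"), "S", 1.0), (("Chicago", "July"), "S", 1.0),
        (("Lyon", "August"), "V", 0.3), (("Chicago", "August"), "V", 0.6),
        (("Chicago", "July"), "V", 0.7)]
print(promo_rank(rows, "T", [0, 1], top_r=3))
```

Recursing only on dimensions with a larger index than the one just fixed ensures every subspace is visited exactly once, mirroring the depth-first lattice traversal on the slide.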

SLIDE 15

(1.1) Subspace pruning

• Idea: reuse previous results
• Goal: prune out unseen subspaces by bounding their promotiveness scores
  • Sig(S): bounded
  • Rank(S, T): bounded

(Illustrated on the subspace lattice: bounds obtained from an already-seen subspace prune subspaces not yet visited.)

15

SLIDE 16

Subspace pruning

• Key: compute T's highest possible rank in an unseen subspace, LBRank
• Use the monotonicity of the aggregate measure (e.g., SUM, MAX)

Example: the subspace {AB} has been seen (aggregated): SUM(V) = 5.5, SUM(S) = 2.2, SUM(T) = 1.1, so Rank(T) = 3rd of 3. How to prune the unseen subspace {B}? T's aggregate there is SUM(T) = 1.9; by monotonicity, SUM(V) >= 5.5 > SUM(T) and SUM(S) >= 2.2 > SUM(T) in {B}, so both V and S must outrank T there.

Thus LBRank(T) = |{V, S}| + 1 = 3rd. Any unseen subspace with a low LBRank(T) can be pruned.
16
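The bound amounts to one pass over the seen subspace's aggregates. A sketch using the slide's numbers; the function name and interface are illustrative:

```python
def lb_rank(seen_sums, target, target_sum_unseen):
    """Lower-bound the target's rank in an UNSEEN ancestor subspace: by
    monotonicity of SUM, any object whose aggregate in the seen (smaller)
    subspace already exceeds the target's aggregate in the unseen one
    must also outrank the target there."""
    beaten = sum(1 for obj, s in seen_sums.items()
                 if obj != target and s > target_sum_unseen)
    return beaten + 1

# Seen {AB}: SUM(V)=5.5, SUM(S)=2.2, SUM(T)=1.1; T sums to 1.9 in unseen {B}
print(lb_rank({"V": 5.5, "S": 2.2, "T": 1.1}, "T", 1.9))  # LBRank = 3
```

If the best achievable promotiveness implied by this bound cannot enter the current top-R priority queue, the unseen subspace is skipped without aggregating it.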

SLIDE 17

(1.2) Object pruning

Idea: avoid computing objects that cannot affect T's rank. Goal: reduce the partitioning and aggregation cost.

Example: the seen (aggregated) subspace {A} has SUM(S) = 6.5, SUM(T) = 2.2, SUM(U) = 1.5, SUM(W) = 1.0, SUM(Z) = 0.8. In the unseen subtree, T's aggregates are SUM(T) = 1.9 in {AB}, 1.2 in {AC}, and 1.1 in {ABC}, so MinScore(T) = 1.1. Since SUM(W) < MinScore(T) and SUM(Z) < MinScore(T), W and Z can be pruned!

Power-law distribution: objects at the long tail can be pruned.

17
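The pruning test is a single comparison per object. A sketch with the slide's numbers; names are illustrative:

```python
def prunable_objects(seen_sums, target, target_min_score):
    """Objects whose aggregate in the seen subspace is already below the
    target's smallest aggregate anywhere in the unseen subtree can never
    outrank the target there (SUM is monotone), so they need not be
    partitioned or aggregated."""
    return sorted(obj for obj, s in seen_sums.items()
                  if obj != target and s < target_min_score)

seen = {"S": 6.5, "T": 2.2, "U": 1.5, "W": 1.0, "Z": 0.8}
print(prunable_objects(seen, "T", 1.1))  # ['W', 'Z']
```

Under a power-law score distribution most objects sit in the long tail below MinScore(T), which is why this simple filter removes a large fraction of the aggregation work.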

SLIDE 18

(2) Promotion cubes

• Method: promotion cube, materialized offline
• Structure
  • One cell for each subspace S with Sig(S) > MinSig (parameter: MinSig)
  • Each cell materializes a selected sample of the top-k aggregate scores in the subspace (parameters: k and k')
• Observation: (1) T tends to be highly ranked in a top subspace; (2) a top subspace is likely to contain many objects

18

SLIDE 19

Promotion cell

For each "significant" subspace S (one passing the MinSig threshold), create a "promotion cell" PCell(S) over its sorted list of object aggregate scores (e.g., k = 9, k' = 3).

• Promotion cell:
  • Stores aggregate scores only; no object IDs
  • Parameters MinSig, k, and k' are chosen to yield a space-time tradeoff; application dependent
  • Does not restrict query processing
19
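One plausible reading of the cell structure, sketched below. The slide does not specify how the tail of the top-k list is sampled, so the every-other-score scheme here is purely illustrative, as are the function and field names:

```python
def build_promo_cell(agg_scores, k=9, k_prime=3):
    """A promotion cell for one significant subspace: keep the top-k'
    aggregate scores exactly, plus a sample of the remaining scores down
    to rank k. Only scores are stored, never object IDs."""
    ranked = sorted(agg_scores, reverse=True)
    exact = ranked[:k_prime]
    sampled = ranked[k_prime:k:2]  # illustrative sampling of the tail
    return {"exact_top": exact, "tail_sample": sampled}

scores = [9.0, 8.5, 7.0, 6.0, 5.5, 4.0, 3.0, 2.5, 1.0, 0.5]
print(build_promo_cell(scores))
```

Storing only scores keeps the cube small (the experiments later report ~310 KB for DBLP) while still letting the query processor bound T's rank in each cell's subspace.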

SLIDE 20

Query execution using the promotion cube:

• Step 1: Compute T's aggregate scores in all of its subspaces (the figure annotates each node of the subspace lattice, from {*} down to {ABCD}, with its SUM(T) value)
• Step 2: Compute LBRanks and UBRanks and do pruning, using the promotion cube
• Step 3: Call PromoRank on the remaining subspaces

20

SLIDE 21

Query execution using the promotion cube:

• Step 1: Compute T's aggregate scores
• Step 2: Compute LBRanks and UBRanks and do pruning, using the promotion cube (the figure annotates each lattice node with T's [LBRank, UBRank] interval, e.g. [11, 19], [51, 59], [20, 20], or [61, infinity))
• Step 3: Call PromoRank

21

SLIDE 22

Outline

• Introduction
• Query execution algorithms
• Spurious promotion
• Experiment
• Conclusion

22

SLIDE 23

The spurious promotion problem

• Spurious promotion: the target object is highly ranked in a subspace due to random perturbation, so the result is not meaningful
• Example: Michael Jordan (NBA player)

  Rank  Subspace                   Assessment
  #1    {Year = 1995}              OK
  #1    {MonthOfBirth = February}  Spurious (due to random perturbation)
  #1    {Weather = Sunny}          Spurious

23

SLIDE 24

Avoid spurious promotion

• How to avoid such meaningless subspaces?
• Observation: for a spuriously promotive dimension, the mean aggregate scores tend to be similar across different dimension values

[Bar charts: mean aggregate score (average rebounds per player) vs. the dimension "BirthMonth" (similar across months) and vs. the dimension "Position" (clearly different across Center, Forward, Guard).]

24

SLIDE 25

Preprocessing to filter out spurious dimensions

Method:
• ANOVA (analysis of variance) test, applied to each subspace dimension A
  • |A| groups of scores, one group per dimension value
  • SS_B: between-group sum of squared deviations
  • SS_W: within-group sum of squared deviations
  • F-ratio(A) = SS_B / SS_W
  • F-ratio too small: H0 (equal group means) cannot be rejected, i.e., the dimension shows no correlation with the score and is spurious

Pipeline: for each subspace dimension, run the ANOVA test; if spurious, remove the dimension; then run query execution to obtain the top-R non-spurious subspaces.

25

  SS_B = sum_i n_i * (mean_i - mean)^2
  SS_W = sum_i sum_j (s_ij - mean_i)^2

(n_i: size of group i; mean_i: mean score of group i; mean: grand mean; s_ij: the j-th score in group i)
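The filter can be sketched directly from these definitions. Toy data below; in the paper's setting each group would hold the scores observed under one value of a dimension such as BirthMonth or Position:

```python
def f_ratio(groups):
    """F-ratio(A) = SS_B / SS_W for one subspace dimension, where each
    group holds the scores observed under one value of that dimension."""
    scores = [s for g in groups for s in g]
    grand = sum(scores) / len(scores)                 # grand mean
    means = [sum(g) / len(g) for g in groups]         # per-group means
    ss_b = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_w = sum((s - m) ** 2 for g, m in zip(groups, means) for s in g)
    return ss_b / ss_w

# Similar group means (a spurious dimension) give a tiny F-ratio;
# clearly different means give a large one.
print(f_ratio([[10, 11, 9], [10, 9, 11], [11, 10, 9]]))   # 0.0
print(f_ratio([[30, 31, 29], [10, 9, 11], [50, 51, 49]])) # 400.0
```

A dimension whose F-ratio falls below the chosen critical value is flagged as spurious and removed before query execution.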

SLIDE 26

Outline

• Introduction
• Query execution algorithms
• Spurious promotion
• Experiment
• Conclusion

26

SLIDE 27

Experiment

• Evaluation
  • Effectiveness (case study)
  • Efficiency (space-time tradeoff)
• Data sets: NBA, DBLP, TPC-H
• Methods
  • PromoRank
  • PromoRank++ (with the pruning methods)
  • PromoCube
• Implementation
  • Pentium 3 GHz CPU / 2 GB memory
  • Windows XP / Microsoft Visual C# 2008 (in-memory)

27

SLIDE 28

DBLP data set

• Subspace dimensions
  • Conference (2,506)
  • Year (50)
  • Database (boolean, derived from the paper title)
  • Data mining (boolean, derived from the paper title)
  • Information retrieval (boolean, derived from the paper title)
  • Machine learning (boolean, derived from the paper title)
• Object dimension: Author (450K)
• Score dimension: paper count
• Base tuples (1.76M)

28

SLIDE 29

A case study on DBLP

Query object  Subspace          Rank    Authors  Top-%
David DeWitt  {*}               376th   451,316  0.08%
              {Database}        16th    65,321   0.02%
              {1990}            2nd     13,170   0.02%
              {SIGMOD}          2nd     3,519    0.06%
Yufei Tao     {*}               3325th  451,316  0.74%
              {Database, 2003}  11th    6,707    0.16%
              {Database, 2004}  18th    8,877    0.20%
              {ICDE}            30th    4,822    0.62%

For each author, {*} gives the full-space rank and the next three rows are the top-3 promotive subspaces. The promotiveness measure is decided by the rank together with a penalty for small subspaces.

29

SLIDE 30

Query execution time (DBLP)

[Charts comparing PromoRank, PromoRank++, and PromoCube as top-R grows from 1 to 20: query execution time (sec.) vs. top-R, subspaces aggregated vs. top-R, and objects aggregated vs. top-R.]

Promotion cube size = 310 KB (most aggregate scores are small integers).

30

SLIDE 31

ANOVA test: effectiveness

• NBA data
  • 3,460 players (objects)
  • Rebounds (score)
  • 18,050 base tuples

[Chart: F-value per subspace dimension on a log scale (0.1 to 1000), against the critical value.]

31

SLIDE 32

TPCH benchmark

• 6M tuples
• 6 subspace dimensions
• 10,000 objects
• Promotion cube
  • k = 1000, k' = 8
  • Size < 1 MB

[Charts comparing PromoRank, PromoRank++, and PromoCube: query execution time (sec.) vs. top-R (1 to 30), and vs. the number of objects (10K to 200K).]

32

SLIDE 33

Outline

• Introduction
• Query execution algorithms
• Spurious promotion
• Experiment
• Conclusion

33

SLIDE 34

Conclusion

34

• Promotion analysis: a new direction
• Related work
  • Search-based advertising: [Borgs WWW 07] Dynamics of bid optimization in online advertisement auctions
  • Data mining for marketing: [Kleinberg DMKD 98] A microeconomic view of data mining
  • Finding top-k attributes: [Das SIGMOD 06] Ordering the attributes of query results; [Miah ICDE 08] Standing out in a crowd: selecting attributes for maximum visibility
  • Skyline queries
• Future work
  • Applications: social networks, recommender systems, ...
  • Data models: links, textual data, numerical data, ...

SLIDE 35

Thank you!

Any questions?

35

Promotion Analysis in Multi-Dimensional Space
Presenter: Tianyi Wu, University of Illinois at Urbana-Champaign