Efficient Rank Join with Aggregation Constraints

Efficient Rank Join with Aggregation Constraints
Min Xie†, Laks V.S. Lakshmanan†, Peter Wood‡
† University of British Columbia
‡ Birkbeck, University of London
Wednesday, 31 August 2011

Outline
• Introduction
• Aggregation Constraints
• Deterministic Optimization
• Probabilistic Optimization
• Empirical Results

Top-k Query Processing
• Top-k query [Ilyas et al., CSUR’11]
  • Information retrieval, recommender systems, etc.
  • Extremely fruitful area with lots of interesting work
• Rank join [Ilyas et al., VLDB’03; Natsev et al., VLDB’01]
  • Well-studied top-k operator in the DB community with many applications
    • Multi-criteria selection
    • Information retrieval
    • Data mining

Rank Join Operator
• Rank join
  • Extremely useful for building preferred packages of items
  • Travel Planning: a package of one museum & one restaurant

Museum               Restaurant
Location  Rating     Location  Rating
a         5          c         4.5
a         5          b         4.5
b         4.5        b         4.5
a         4.5        a         3
b         3.5        a         3

Query: Museum ⨝ Restaurant
  • Join condition: Museum.Location = Restaurant.Location
  • Order By Museum.Rating + Restaurant.Rating
  • Keep top-k
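The query on this slide can be written out as a brute-force baseline. This is only a sketch to pin down the semantics (join, score, keep top-k); it is not one of the rank join algorithms discussed later, and the function name `naive_rank_join` is made up for illustration.

```python
from itertools import product

# (location, rating) pairs from the slide's example tables.
museums = [("a", 5.0), ("a", 5.0), ("b", 4.5), ("a", 4.5), ("b", 3.5)]
restaurants = [("c", 4.5), ("b", 4.5), ("b", 4.5), ("a", 3.0), ("a", 3.0)]


def naive_rank_join(left, right, k):
    """Join on location, score by the sum of ratings, keep the k best."""
    joined = [
        (m_rating + r_rating, m_loc)
        for (m_loc, m_rating), (r_loc, r_rating) in product(left, right)
        if m_loc == r_loc
    ]
    # Highest score first; Python's sort is stable, so ties keep their order.
    joined.sort(key=lambda pair: pair[0], reverse=True)
    return joined[:k]


top2 = naive_rank_join(museums, restaurants, k=2)
print(top2)
```

The baseline materialises every matching pair before ranking, which is exactly the work a threshold-based rank join tries to avoid.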

Limitation of Rank Join Operator
• Aggregation constraints
  • Constraints on attribute values of each join result
  • Extremely common for applications such as travel packages, course recommendations, etc.

Museum                       Restaurant
Location  Cost  Rating       Location  Cost  Rating
a         13.5  5            c         50    4.5
a         15    5            b         20    4.5
b         10    4.5          b         10    4.5
a         15    4.5          a         5     3
b         5     3.5          a         10    3

Query: Museum ⨝ Restaurant
  • Join condition: Museum.Location = Restaurant.Location
  • Order By Museum.Rating + Restaurant.Rating
  • Keep top-k
  • Constrained by Museum.Cost + Restaurant.Cost ≤ 50

Review of Existing Rank Join Algorithms
• Existing algorithms [Ilyas et al., VLDB’03] [Schnaitter and Polyzotis, PODS’08]
  • Setting: tuples in each table are pre-sorted on the score attribute(s)
  • Threshold-based algorithm
    • Access tuples iteratively from each table
    • Determine an upper bound after each new tuple is accessed
    • Stop if the current top-k results of accessed tuples are better than the upper bound
• Cruxes of the rank join algorithms
  • Item accessing strategy (Round Robin/Adaptive)
  • Bounding scheme (Corner Bound/FR(*) Bound)
  • Both significantly affect the performance of the underlying rank join algorithm
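To make the "upper bound" step concrete, here is a small sketch of the corner bound for a two-way join scored by a sum. The slide only names the bounding scheme; the formula below is the commonly used definition and is stated here as an assumption, and `corner_bound` is a hypothetical helper name. The ratings are those of the Museum and Restaurant example tables.

```python
def corner_bound(left_seen, right_seen):
    """Upper bound on the score of any join result not yet produced.

    Each argument holds the scores read so far from one input, in
    descending order. An unseen result must use an unread tuple from
    at least one side, and that tuple scores no more than the last
    score read from that side.
    """
    if not left_seen or not right_seen:
        return float("inf")  # nothing can be ruled out yet
    return max(left_seen[0] + right_seen[-1],
               left_seen[-1] + right_seen[0])


museum_ratings = [5.0, 5.0, 4.5, 4.5, 3.5]
restaurant_ratings = [4.5, 4.5, 4.5, 3.0, 3.0]

# Bound after reading the first d tuples from each input.
bounds = [corner_bound(museum_ratings[:d], restaurant_ratings[:d])
          for d in range(1, 6)]
print(bounds)
```

The bound only drops once lower-rated tuples are read, which is why the choice of access strategy and bounding scheme determines how early the algorithm can stop.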

Review of Existing Rank Join Algorithms
• Performance of a rank join algorithm
  • Number of items accessed
  • In-memory computation cost
• Rank join algorithms with the FR(*) bounding scheme are instance optimal [Schnaitter and Polyzotis, PODS’08]
  • Within a broad class of algorithms, the number of items accessed is always within a constant factor of that of any other algorithm
• Instance optimality alone doesn’t guarantee good overall performance! [Finger and Polyzotis, SIGMOD’09]
  • In-memory computation cost may dominate the overall cost

Leveraging Existing Rank Join Algorithms
• How to support aggregation constraints?
  • A naive solution: post-filtering
  • Threshold-based algorithm
    • Access tuples iteratively from each table
    • Determine an upper bound after each new tuple is accessed
    • Stop if the top-k results seen so far that satisfy all aggregation constraints are better than the upper bound
• How good is this naive algorithm?
  • Instance optimal! (Proof in the paper)
  • Yet bad empirical performance
    • In-memory processing cost is high

Optimization Opportunity (i)
Constraint: SUM(Cost) ≤ 20

Museum                             Restaurant
     Location  Cost  Rating             Location  Cost  Rating
t1:  a         13.5  5             t6:  c         50    4.5
t2:  a         15    5             t7:  b         20    4.5
t3:  b         10    4.5           t8:  b         10    4.5
t4:  a         15    4.5           t9:  a         5     3
t5:  b         5     3.5           t10: a         10    3

Top-2 results
  {t3, t8}: 9
  {t1, t9}: 8
Upper bound: 8

• Number of tuples kept for each relation
  • Museum: 5
  • Restaurant: 4
• Number of join probes performed (Round Robin)
  • 20
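The counts on this slide can be reproduced with a small simulation of the post-filtering approach. The sketch below rests on several assumptions that the slide does not spell out: round-robin access starting with Museum, the corner bound as the bounding scheme, one join probe counted for every comparison of a newly read tuple with an already-read tuple of the other relation, and stopping as soon as the k-th best valid score is at least the bound (ties stop). The function name is hypothetical.

```python
# (id, location, cost, rating); each list is sorted by rating, best first.
MUSEUMS = [("t1", "a", 13.5, 5.0), ("t2", "a", 15.0, 5.0),
           ("t3", "b", 10.0, 4.5), ("t4", "a", 15.0, 4.5),
           ("t5", "b", 5.0, 3.5)]
RESTAURANTS = [("t6", "c", 50.0, 4.5), ("t7", "b", 20.0, 4.5),
               ("t8", "b", 10.0, 4.5), ("t9", "a", 5.0, 3.0),
               ("t10", "a", 10.0, 3.0)]


def post_filter_rank_join(left, right, k, max_cost):
    inputs = (left, right)
    seen = ([], [])   # tuples read so far from each input
    results = []      # (score, left_id, right_id) that pass SUM(Cost) <= max_cost
    top = []
    probes = 0
    side = 0          # 0 = read from left next, 1 = read from right next
    while len(seen[0]) < len(left) or len(seen[1]) < len(right):
        if len(seen[side]) == len(inputs[side]):
            side = 1 - side  # this input is exhausted, read the other one
        new = inputs[side][len(seen[side])]
        seen[side].append(new)
        for old in seen[1 - side]:
            probes += 1
            m, r = (new, old) if side == 0 else (old, new)
            # Join condition first, then the aggregation constraint as a filter.
            if m[1] == r[1] and m[2] + r[2] <= max_cost:
                results.append((m[3] + r[3], m[0], r[0]))
        side = 1 - side
        if seen[0] and seen[1]:
            bound = max(seen[0][0][3] + seen[1][-1][3],
                        seen[0][-1][3] + seen[1][0][3])
            # Stable sort: equal scores stay in discovery order.
            top = sorted(results, key=lambda res: -res[0])[:k]
            if len(top) == k and top[-1][0] >= bound:
                break
    return top, len(seen[0]), len(seen[1]), probes


top, n_museum, n_restaurant, n_probes = post_filter_rank_join(
    MUSEUMS, RESTAURANTS, k=2, max_cost=20)
print(top, n_museum, n_restaurant, n_probes)
```

Under these assumptions the run reads 5 Museum and 4 Restaurant tuples and performs 20 probes, most of which involve tuples that either fail the join condition or cannot meet the cost limit.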

Optimization Opportunity (ii)
• Deterministic optimization
Constraint: SUM(Cost) ≤ 20

Museum                             Restaurant
     Location  Cost  Rating             Location  Cost  Rating
t1:  a         13.5  5             t6:  c         50    4.5
t2:  a         15    5             t7:  b         20    4.5
t3:  b         10    4.5           t8:  b         10    4.5
t4:  a         15    4.5           t9:  a         5     3
t5:  b         5     3.5           t10: a         10    3

Deterministic tuple pruning can save many unnecessary join probes during query processing.

Outline
• Aggregation Constraints
• Deterministic Optimization
• Probabilistic Optimization
• Empirical Results

Aggregation Constraints
• Aggregation constraint definition
  • Let A be an attribute, λ be a constant value, θ be a comparison operator and AGG be an aggregation function in {MIN, MAX, SUM}
  • Primitive aggregation constraint (PAC)
      pac ::= AGG(A) θ λ
  • Aggregation constraint (AC)
      ac ::= pac | pac ∧ ac

Constraint: SUM(Cost) ≤ 20 / SUM(Cost, true) ≤ 20

Museum                             Restaurant
     Location  Cost  Rating             Location  Cost  Rating
t1:  a         13.5  5             t6:  c         50    4.5
t2:  a         15    5             t7:  b         20    4.5
t3:  b         10    4.5           t8:  b         10    4.5
t4:  a         15    4.5           t9:  a         5     3
t5:  b         5     3.5           t10: a         10    3

Top-2 results
  {t3, t8}
  {t1, t9}
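The grammar above maps directly onto a small data structure. The following is a sketch under stated assumptions: the class name `PAC`, the function `satisfies`, the dictionary-based tuple encoding, and the restriction of θ to ≤ and ≥ (the only operators that appear on these slides) are all illustrative choices, not the authors' implementation.

```python
import operator
from dataclasses import dataclass
from typing import Sequence

AGGREGATES = {"MIN": min, "MAX": max, "SUM": sum}
COMPARATORS = {"<=": operator.le, ">=": operator.ge}  # assumed set of θ


@dataclass(frozen=True)
class PAC:
    agg: str        # AGG: one of MIN, MAX, SUM
    attribute: str  # A
    theta: str      # θ
    bound: float    # λ

    def holds(self, join_result: Sequence[dict]) -> bool:
        """Aggregate A over the tuples of one join result and compare with λ."""
        values = [t[self.attribute] for t in join_result]
        return COMPARATORS[self.theta](AGGREGATES[self.agg](values), self.bound)


def satisfies(ac: Sequence[PAC], join_result: Sequence[dict]) -> bool:
    """An AC is a conjunction of PACs, so every PAC must hold."""
    return all(pac.holds(join_result) for pac in ac)


# ac = SUM(Cost) <= 20 AND MIN(Rating) >= 4
ac = [PAC("SUM", "Cost", "<=", 20), PAC("MIN", "Rating", ">=", 4)]
t3 = {"Cost": 10, "Rating": 4.5}
t7 = {"Cost": 20, "Rating": 4.5}
t8 = {"Cost": 10, "Rating": 4.5}

print(satisfies(ac, [t3, t8]), satisfies(ac, [t3, t7]))
```

Representing an AC as a plain list of PACs mirrors the recursive rule `ac ::= pac | pac ∧ ac` without needing a separate tree type.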

Problem Definition
• Rank Join with Aggregation Constraints
  • Given a set of relations R, a join condition jc, a monotonic score function S and an aggregation constraint ac
  • Find the top-k join results which satisfy ac

Outline
• Aggregation Constraints
• Deterministic Optimization
• Probabilistic Optimization
• Empirical Results

Deterministic Optimization (i)
• Basic properties of aggregation constraints
  • When AGG is MIN and θ is ≥, the corresponding PAC supports direct pruning.
    • If a tuple t doesn’t satisfy the PAC, t can be pruned directly

Example (i)
Constraint: MIN(Rating) ≥ 4

Museum                             Restaurant
     Location  Cost  Rating             Location  Cost  Rating
t1:  a         13.5  5             t6:  c         50    4.5
t2:  a         15    5             t7:  b         20    4.5
t3:  b         10    4.5           t8:  b         10    4.5
t4:  a         15    4.5           t9:  a         5     3
t5:  b         5     3.5           t10: a         10    3
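Direct pruning for this example can be sketched in a few lines. The helper name `direct_prune` is hypothetical, and the final join is a brute-force check over the surviving tuples rather than a rank join algorithm; it is only there to show what remains after pruning.

```python
# (id, location, cost, rating) from the example tables.
MUSEUMS = [("t1", "a", 13.5, 5.0), ("t2", "a", 15.0, 5.0),
           ("t3", "b", 10.0, 4.5), ("t4", "a", 15.0, 4.5),
           ("t5", "b", 5.0, 3.5)]
RESTAURANTS = [("t6", "c", 50.0, 4.5), ("t7", "b", 20.0, 4.5),
               ("t8", "b", 10.0, 4.5), ("t9", "a", 5.0, 3.0),
               ("t10", "a", 10.0, 3.0)]


def direct_prune(tuples, min_rating):
    """Drop tuples that already break MIN(Rating) >= min_rating on their own."""
    return [t for t in tuples if t[3] >= min_rating]


museums_kept = direct_prune(MUSEUMS, 4)
restaurants_kept = direct_prune(RESTAURANTS, 4)

# Join only the survivors; every result satisfies the constraint by construction.
results = sorted(
    ((m[3] + r[3], m[0], r[0])
     for m in museums_kept for r in restaurants_kept if m[1] == r[1]),
    key=lambda res: -res[0])

print([t[0] for t in museums_kept], [t[0] for t in restaurants_kept], results)
```

Because a tuple with a rating below 4 drags the minimum of every result it appears in below 4, it never needs to be kept or probed at all.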

Deterministic Optimization (i)
• Basic properties of aggregation constraints
  • When AGG is MAX and θ is ≥, the corresponding PAC is monotone.
    • If a tuple t satisfies the PAC, join results of t with any tuple also satisfy the PAC
  • When AGG is SUM and θ is ≤, the corresponding PAC is anti-monotone.
    • If a tuple t doesn’t satisfy the PAC, join results of t with any tuple also don’t satisfy the PAC
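Both properties can be checked on the running example. This sketch assumes attribute values are non-negative, which the anti-monotone claim for SUM needs and the slide does not state explicitly; the thresholds (SUM(Cost) ≤ 20, MAX(Rating) ≥ 5) and helper names are chosen here for illustration, and the brute-force confirmation ignores the join condition.

```python
# (cost, rating) per tuple id, taken from the example tables.
MUSEUMS = {"t1": (13.5, 5.0), "t2": (15.0, 5.0), "t3": (10.0, 4.5),
           "t4": (15.0, 4.5), "t5": (5.0, 3.5)}
RESTAURANTS = {"t6": (50.0, 4.5), "t7": (20.0, 4.5), "t8": (10.0, 4.5),
               "t9": (5.0, 3.0), "t10": (10.0, 3.0)}


def fails_sum_le(tuples, limit):
    """Ids whose own cost already breaks SUM(Cost) <= limit (anti-monotone)."""
    return [tid for tid, (cost, _) in tuples.items() if cost > limit]


def passes_max_ge(tuples, threshold):
    """Ids whose own rating already meets MAX(Rating) >= threshold (monotone)."""
    return [tid for tid, (_, rating) in tuples.items() if rating >= threshold]


doomed = fails_sum_le(RESTAURANTS, 20)  # can be dropped outright
safe = passes_max_ge(MUSEUMS, 5)        # need no further check for this PAC

# Brute-force confirmation against every possible partner.
doomed_never_recover = all(
    RESTAURANTS[r][0] + cost > 20
    for r in doomed for cost, _ in MUSEUMS.values())
safe_never_break = all(
    max(MUSEUMS[m][1], rating) >= 5
    for m in safe for _, rating in RESTAURANTS.values())

print(doomed, safe, doomed_never_recover, safe_never_break)
```

Anti-monotone constraints give a pruning rule (drop the tuple), while monotone ones give a shortcut (skip the check), so the two properties are used differently even though both are decided from a single tuple.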

Deterministic Optimization (i)
• Basic properties of aggregation constraints
Pruning based on investigating each individual tuple

Deterministic Optimization (ii)
• Subsumption-based Pruning (Motivation)
Constraint: SUM(Cost) ≤ 20

Museum                             Restaurant
     Location  Cost  Rating             Location  Cost  Rating
t1:  a         13.5  5             t6:  c         50    4.5
t2:  a         15    5             t7:  b         20    4.5
t3:  b         10    4.5           t8:  b         10    4.5
t4:  a         15    4.5           t9:  a         5     3
t5:  b         5     3.5           t10: a         10    3

Pruning based on comparing tuples

Deterministic Optimization (ii)
• pac-Dominance Relationship
  • Comparing two tuples w.r.t. a single PAC
  • Given two tuples t, t' from the same relation R
  • t pac-dominates t' (or t ≽pac t') if
    • for any tuple t'' which can join with t' without violating pac,
    • t'' can also join with t without violating pac
• For the common scenario where we have one aggregation constraint per attribute
  • Sufficient and necessary conditions for determining the pac-dominance relationship for each possible aggregation constraint
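The definition can be tested literally against a finite set of candidate partners. The sketch below is an illustration only: it checks dominance over the Restaurant tuples of the running example rather than over every conceivable tuple, it ignores the join condition, and the names `pac_dominates` and `sum_cost_le_20` are hypothetical. For SUM(Cost) ≤ λ, having t.Cost ≤ t'.Cost is enough for t to dominate t'; the exact conditions for each constraint type are those given in the paper and are not reproduced here.

```python
def pac_dominates(t, t_prime, partners, pac_holds):
    """True if every partner that works with t_prime also works with t."""
    return all(pac_holds(t, p) for p in partners if pac_holds(t_prime, p))


def sum_cost_le_20(x, y):
    """The PAC SUM(Cost) <= 20 evaluated on a two-tuple join result."""
    return x["cost"] + y["cost"] <= 20


restaurants = [{"id": "t6", "cost": 50.0}, {"id": "t7", "cost": 20.0},
               {"id": "t8", "cost": 10.0}, {"id": "t9", "cost": 5.0},
               {"id": "t10", "cost": 10.0}]
t4 = {"id": "t4", "cost": 15.0}
t5 = {"id": "t5", "cost": 5.0}

# t5 is cheaper, so any restaurant affordable with t4 is affordable with t5.
t5_over_t4 = pac_dominates(t5, t4, restaurants, sum_cost_le_20)
# The converse fails: t8 fits with t5 (cost 15) but not with t4 (cost 25).
t4_over_t5 = pac_dominates(t4, t5, restaurants, sum_cost_le_20)

print(t5_over_t4, t4_over_t5)
```

A closed-form test such as the cost comparison avoids enumerating partners altogether, which is what makes dominance checks cheap enough to use during query processing.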
