Ranking Distributed Probabilistic Data
Jeffrey Jestes, Feifei Li, Ke Yi

Introduction: Ranking queries are important tools used to return only the most significant results.


  1. Expected Ranks Example

     tuple   score pdf
     t_1     {(120, 0.8), (62, 0.2)}
     t_2     {(103, 0.7), (70, 0.3)}
     t_3     {(98, 1)}

     world W                              Pr[W]
     {t_1 = 120, t_2 = 103, t_3 = 98}     0.8 × 0.7 × 1 = 0.56
     {t_1 = 120, t_3 = 98, t_2 = 70}      0.8 × 0.3 × 1 = 0.24
     {t_2 = 103, t_3 = 98, t_1 = 62}      0.2 × 0.7 × 1 = 0.14
     {t_3 = 98, t_2 = 70, t_1 = 62}       0.2 × 0.3 × 1 = 0.06

     tuple   r(tuple)
     t_1     0.56 × 0 + 0.24 × 0 + 0.14 × 2 + 0.06 × 2 = 0.4
     t_2     0.56 × 1 + 0.24 × 2 + 0.14 × 0 + 0.06 × 1 = 1.1
     t_3     0.56 × 2 + 0.24 × 1 + 0.14 × 1 + 0.06 × 0 = 1.5
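The expected ranks in the example can be reproduced by brute force. The sketch below enumerates every possible world, weights each tuple's rank by the world's probability, and sums. It assumes independent tuples and takes the rank of a tuple to be the number of tuples with a strictly higher score (the top tuple has rank 0), which is how the example counts. The function and variable names are illustrative only.

```python
from itertools import product

# Score pdfs from the example: each tuple has a list of (value, probability) choices.
tuples = {
    "t1": [(120, 0.8), (62, 0.2)],
    "t2": [(103, 0.7), (70, 0.3)],
    "t3": [(98, 1.0)],
}


def expected_ranks_by_worlds(pdfs):
    """Enumerate all possible worlds and average each tuple's rank over them."""
    names = list(pdfs)
    ranks = {name: 0.0 for name in names}
    for choice in product(*(pdfs[name] for name in names)):
        # Probability of this world is the product of the chosen probabilities.
        prob = 1.0
        for _, p in choice:
            prob *= p
        scores = {name: v for name, (v, _) in zip(names, choice)}
        for name in names:
            # Rank = number of tuples with a strictly higher score in this world.
            rank = sum(1 for other in names if scores[other] > scores[name])
            ranks[name] += prob * rank
    return ranks


ranks_by_worlds = expected_ranks_by_worlds(tuples)
print({name: round(r, 2) for name, r in ranks_by_worlds.items()})
```

Enumeration is exponential in the number of tuples, so it is only useful as a check on small inputs.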

  10. Expected Ranks

      It has been shown that r(t_i) may be written as

      $$ r(t_i) = \sum_{l=1}^{b_i} p_{i,l} \bigl( q(v_{i,l}) - \Pr[X_i > v_{i,l}] \bigr) \qquad (2) $$

      where
        b_i                 = number of choices in the pdf of t_i
        p_{i,l}             = probability of choice l in tuple t_i
        q(v_{i,l})          = \sum_j \Pr[X_j > v_{i,l}]
        X_i                 = pdf of tuple t_i
        Pr[X_i > v_{i,l}]   = contribution of t_i to q(v_{i,l})

      q(v_{i,l}) is the sum of the probabilities that a tuple will outrank a tuple with score v_{i,l}.

      X_i may contain value-probability pairs (v, p) s.t. v > v_{i,l}. Since the existence of t_i = v_{i,l} precludes t_i = v, we must subtract the corresponding p's from q(v_{i,l}).

      Efficient algorithms exist to compute the expected ranks in O(N log N) time for a database of N tuples.

  14. Computing Expected Ranks by q(v)'s

      tuple   score pdf
      t_1     {(120, 0.8), (62, 0.2)}
      t_2     {(103, 0.7), (70, 0.3)}
      t_3     {(98, 1)}

      [Figure: staircase curve of q(v)]

      v       -∞     62     70     98     103    120
      q(v)    3.0    2.8    2.5    1.5    0.8    0

      r(t_1) = 0.8 × 0 + 0.2 × (2.8 − 0.8) = 0.4
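The q(v)-based computation can be sketched as follows. The code sorts all value-probability pairs once, builds suffix sums so that q(v) is a binary search, and then applies equation (2) to each tuple. This is a minimal illustration under the assumption of independent tuples with discrete pdfs, not necessarily the algorithm the authors have in mind; the names are hypothetical.

```python
from bisect import bisect_right

# Score pdfs from the example: each tuple has a list of (value, probability) choices.
tuples = {
    "t1": [(120, 0.8), (62, 0.2)],
    "t2": [(103, 0.7), (70, 0.3)],
    "t3": [(98, 1.0)],
}


def build_q(pdfs):
    """Return q(v) = sum over all tuples of Pr[X_j > v], answered by binary search."""
    pairs = sorted((v, p) for choices in pdfs.values() for v, p in choices)
    values = [v for v, _ in pairs]
    # suffix[k] = total probability of the pairs at sorted positions k, k+1, ...
    suffix = [0.0] * (len(pairs) + 1)
    for idx in range(len(pairs) - 1, -1, -1):
        suffix[idx] = suffix[idx + 1] + pairs[idx][1]
    return lambda v: suffix[bisect_right(values, v)]


def expected_ranks_by_q(pdfs):
    """Apply equation (2): r(t_i) = sum_l p_il * (q(v_il) - Pr[X_i > v_il])."""
    q = build_q(pdfs)
    ranks = {}
    for name, choices in pdfs.items():
        total = 0.0
        for v, p in choices:
            # The tuple's own contribution to q(v), which must be subtracted.
            own = sum(p2 for v2, p2 in choices if v2 > v)
            total += p * (q(v) - own)
        ranks[name] = total
    return ranks


q = build_q(tuples)
ranks_by_q = expected_ranks_by_q(tuples)
print([q(v) for v in (62, 70, 98, 103, 120)])
print({name: round(r, 2) for name, r in ranks_by_q.items()})
```

The sort dominates the cost when each pdf has only a few choices; the inner `own` sum is quadratic in the number of choices of a single tuple and could be replaced by a per-tuple suffix sum.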

  18. Distributed Probabilistic Data Model

      site 1                  ...    site m
      tuples     score               tuples     score
      t_{1,1}    X_{1,1}             t_{m,1}    X_{m,1}
      t_{1,2}    X_{1,2}             t_{m,2}    X_{m,2}
      ...        ...                 ...        ...

      Conceptual database D: tuples t_1, t_2, ..., t_N

      We can think of the union of the individual databases D_i at each site s_i as a conceptual database D.

  20. Ranking Queries for Distributed Probabilistic Data

      We introduce two frameworks for ranking queries over distributed probabilistic data:
        - Sorted Access on Local Ranks
        - Sorted Access on Expected Scores

  21. Sorted Access on Local Ranks Framework

      [Figure: a server connected to sites 1, ..., m; site i holds the tuples t_{i,1}, t_{i,2}, ..., t_{i,n_i}]

      Every site calculates the local ranks of its tuples and stores the tuples in ascending order of local rank.

      The server accesses tuples in ascending order of local rank and combines the local ranks to get the global ranks.

  23. Local and Global Ranks

      The local rank of a tuple t_{i,j} at its own site s_i with database D_i is

      $$ r(t_{i,j}, D_i) = \sum_{l=1}^{b_{i,j}} p_{i,j,l} \bigl( q_i(v_{i,j,l}) - \Pr[X_{i,j} > v_{i,j,l}] \bigr) \qquad (3) $$

      The local rank of a tuple t_{i,j} at a site s_y with database D_y, s.t. y ≠ i, is

      $$ r(t_{i,j}, D_y) = \sum_{l=1}^{b_{i,j}} p_{i,j,l} \, q_y(v_{i,j,l}) \qquad (4) $$

      The global rank of a tuple t_{i,j} is

      $$ r(t_{i,j}) = \sum_{y=1}^{m} r(t_{i,j}, D_y) \qquad (5) $$
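Equations (3) to (5) can be checked on the earlier three-tuple example. The sketch below splits those tuples across two sites (the split is an assumption made here for illustration, not something the slides specify), computes each local rank, and sums them into a global rank. If the decomposition holds, the sums should match the expected ranks 0.4, 1.1 and 1.5 of the single-database example.

```python
from bisect import bisect_right

# Hypothetical split of the example tuples across two sites (not from the slides).
sites = {
    "s1": {"t1": [(120, 0.8), (62, 0.2)], "t3": [(98, 1.0)]},
    "s2": {"t2": [(103, 0.7), (70, 0.3)]},
}


def make_q(db):
    """Return q_i(v): total probability mass in one site's database strictly above v."""
    pairs = sorted((v, p) for choices in db.values() for v, p in choices)
    values = [v for v, _ in pairs]
    suffix = [0.0] * (len(pairs) + 1)
    for idx in range(len(pairs) - 1, -1, -1):
        suffix[idx] = suffix[idx + 1] + pairs[idx][1]
    return lambda v: suffix[bisect_right(values, v)]


def local_rank(choices, db, is_home):
    """Equation (3) if db is the tuple's own site, equation (4) otherwise."""
    q = make_q(db)
    total = 0.0
    for v, p in choices:
        # Only at the home site does q include the tuple's own probability mass.
        own = sum(p2 for v2, p2 in choices if v2 > v) if is_home else 0.0
        total += p * (q(v) - own)
    return total


def global_rank(name, home, all_sites):
    """Equation (5): sum of the local ranks over every site."""
    choices = all_sites[home][name]
    return sum(
        local_rank(choices, db, site == home) for site, db in all_sites.items()
    )


granks = {
    name: global_rank(name, home, sites)
    for home, db in sites.items()
    for name in db
}
print({name: round(r, 2) for name, r in granks.items()})
```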

  26. Sorted Access on Local Ranks: Initialization

      The Rep. Queue holds the head of each site's sorted list.

      Rep. Queue
      tuple      lrank
      t_{3,1}    0.8
      t_{1,1}    1.2
      t_{2,1}    2.3

      site 1                 site 2                 site 3
      tuple        lrank     tuple        lrank     tuple        lrank
      t_{1,1}      1.2       t_{2,1}      2.3       t_{3,1}      0.8
      t_{1,2}      5.9       t_{2,2}      3.4       t_{3,2}      4.1
      ...          ...       ...          ...       ...          ...
      t_{1,n_1}    34.2      t_{2,n_2}    29.1      t_{3,n_3}    40.4

  28. Sorted Access on Local Ranks: a Round

      State at the start of the round:

      Rep. Queue             top-2 Queue
      tuple      lrank       tuple      grank
      t_{2,2}    3.4         t_{2,1}    5.4
      t_{3,2}    4.1         t_{1,1}    7.9
      t_{1,2}    5.9

      Steps shown in the figure:
        1. The server takes the head of the Rep. Queue, t_{2,2} (lrank 3.4).
        2. Site 2 sends its next tuple, t_{2,3} (lrank 4.8), to the Rep. Queue.
        3. The server sends X_{2,2} to the other two sites.
        4. The other two sites return the local ranks 0.7 and 1.5.
        5. The server obtains grank 5.6 for t_{2,2} (3.4 + 0.7 + 1.5).
        6. t_{2,2} enters the top-2 Queue and t_{1,1} (grank 7.9) drops out.

      State at the end of the round:

      Rep. Queue             top-2 Queue
      tuple      lrank       tuple      grank
      t_{3,2}    4.1         t_{2,1}    5.4
      t_{2,3}    4.8         t_{2,2}    5.6
      t_{1,2}    5.9

      We can safely terminate whenever the largest grank in the top-k queue is ≤ the smallest lrank in the Rep. Queue.

      Algorithm: A-LR
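The round structure and the termination rule can be sketched in a single process. The code below simulates the sites and the server in memory: a min-heap plays the Representative queue, a short sorted list plays the top-k queue, and the loop stops once the largest grank in the top-k queue is no larger than the smallest lrank still queued. The data set, the function names and the in-memory "communication" are all assumptions made for illustration; a real deployment would exchange messages between sites.

```python
import heapq
from bisect import bisect_right

# Hypothetical data (not from the slides): site -> tuple name -> (value, probability) choices.
SITES = {
    "s1": {"a": [(90, 0.5), (40, 0.5)], "b": [(60, 1.0)]},
    "s2": {"c": [(80, 0.6), (20, 0.4)], "d": [(30, 1.0)]},
    "s3": {"e": [(70, 0.9), (10, 0.1)], "f": [(50, 1.0)]},
}


def make_q(db):
    """Return q(v): total probability mass in db lying strictly above v."""
    pairs = sorted((v, p) for choices in db.values() for v, p in choices)
    values = [v for v, _ in pairs]
    suffix = [0.0] * (len(pairs) + 1)
    for idx in range(len(pairs) - 1, -1, -1):
        suffix[idx] = suffix[idx + 1] + pairs[idx][1]
    return lambda v: suffix[bisect_right(values, v)]


def local_rank(choices, db, is_home):
    """Equation (3) when db is the tuple's own site, equation (4) otherwise."""
    q = make_q(db)
    total = 0.0
    for v, p in choices:
        own = sum(p2 for v2, p2 in choices if v2 > v) if is_home else 0.0
        total += p * (q(v) - own)
    return total


def a_lr(sites, k):
    """Sketch of sorted access on local ranks; returns the top-k as (grank, name)."""
    # Each site sorts its own tuples by ascending local rank.
    lists = {
        site: sorted((local_rank(c, db, True), name) for name, c in db.items())
        for site, db in sites.items()
    }
    cursor = {site: 0 for site in sites}
    rep = []  # Representative queue: min-heap of (lrank, site, name)

    def advance(site):
        # Move this site's next unsent tuple into the Representative queue.
        if cursor[site] < len(lists[site]):
            lrank, name = lists[site][cursor[site]]
            cursor[site] += 1
            heapq.heappush(rep, (lrank, site, name))

    for site in sites:
        advance(site)

    top = []  # top-k queue: ascending list of (grank, name)
    while rep:
        # Stop once the k-th best grank is <= the smallest outstanding lrank.
        if len(top) == k and top[-1][0] <= rep[0][0]:
            break
        lrank, home, name = heapq.heappop(rep)
        advance(home)
        choices = sites[home][name]
        # Global rank = home local rank + local ranks reported by the other sites.
        grank = lrank + sum(
            local_rank(choices, db, False) for s, db in sites.items() if s != home
        )
        top = sorted(top + [(grank, name)])[:k]
    return top


top2 = a_lr(SITES, 2)
print([(name, round(grank, 2)) for grank, name in top2])
```

On an input this small the termination test rarely fires before every tuple has been accessed, because local ranks are small compared with global ranks; the early exit matters when each site holds many tuples.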

  39. Sorted Access on Expected Scores Framework

      [Figure: a server connected to sites 1, ..., m; site i holds the tuples t_{i,1}, t_{i,2}, ..., t_{i,n_i}]

      Every site calculates the local ranks and the expected scores of its tuples and stores the tuples in descending order of expected score.

      The server accesses tuples in descending order of expected score and calculates the global ranks.

  41. Sorted Access on Expected Scores: Initialization

      The Rep. Queue holds the head of each site's sorted list.

      Rep. Queue
      tuple      E[X]
      t_{3,1}    500
      t_{1,1}    489
      t_{2,1}    476

      site 1                 site 2                 site 3
      tuple        E[X]      tuple        E[X]      tuple        E[X]
      t_{1,1}      489       t_{2,1}      476       t_{3,1}      500
      t_{1,2}      421       t_{2,2}      464       t_{3,2}      432
      ...          ...       ...          ...       ...          ...
      t_{1,n_1}    5         t_{2,n_2}    11        t_{3,n_3}    1

  43. Sorted Access on Expected Scores: a Round

      State at the start of the round:

      Rep. Queue             top-2 Queue
      tuple      E[X]        tuple      grank
      t_{2,2}    464         t_{2,1}    5.4
      t_{3,2}    432         t_{1,1}    7.9
      t_{1,2}    421

      Steps shown in the figure:
        1. The server takes the head of the Rep. Queue, t_{2,2} (E[X] = 464, lrank 3.4).
        2. Site 2 sends its next tuple, t_{2,3} (E[X] = 429), to the Rep. Queue.
        3. The server sends X_{2,2} to the other two sites.
        4. The other two sites return the local ranks 0.7 and 1.5.
        5. The server obtains grank 5.6 for t_{2,2} (3.4 + 0.7 + 1.5).
        6. t_{2,2} enters the top-2 Queue and t_{1,1} (grank 7.9) drops out.

      State at the end of the round:

      Rep. Queue             top-2 Queue
      tuple      E[X]        tuple      grank
      t_{3,2}    432         t_{2,1}    5.4
      t_{2,3}    429         t_{2,2}    5.6
      t_{1,2}    421

      Now the only question is when we may safely terminate and be certain we have the global top-k.

  53. Sorted Access on Expected Scores: Termination

      top-2 Queue            Rep. Queue
      tuple      grank       tuple      E[X]
      t_{2,1}    5.4         t_{3,2}    432
      t_{2,2}    5.6         t_{2,3}    429
                             t_{1,2}    421

      The largest grank in the top-k queue is clearly an upper bound r^+_λ on the global rank that any seen tuple t with pdf X may have and still be in the top-k at round λ.

      The head of the Representative queue, with expectation τ, gives an upper bound on the expectation of any unseen t: E[X] ≤ τ.

      How can we derive a lower bound r^-_λ for the global rank of any unseen tuple t, s.t. when r^+_λ ≤ r^-_λ it is safe to terminate at round λ?

  57. Sorted Access on Expected Scores: a Lower Bound?

      We introduce two methods to find a lower bound r^-_λ for any unseen tuple t at round λ:
        - Markov Inequality
        - Linear Programming

  60. Markov Inequality Lower Bound

      We know that the pdf X of any unseen t must satisfy E[X] ≤ τ.

      We can use the Markov inequality, Pr[X ≥ v] ≤ E[X]/v for nonnegative X and v > 0, to lower-bound the local rank of t at any site s_i with database D_i:

      $$
      \begin{aligned}
      r(t, D_i) &= \sum_{j=1}^{n_i} \Pr[X_j > X] = n_i - \sum_{j=1}^{n_i} \Pr[X \ge X_j] \\
                &\ge n_i - \sum_{j=1}^{n_i} \sum_{\ell=1}^{b_{i,j}} p_{i,j,\ell} \frac{E[X]}{v_{i,j,\ell}} \qquad \text{(Markov Ineq.)} \\
                &\ge n_i - \sum_{j=1}^{n_i} \sum_{\ell=1}^{b_{i,j}} p_{i,j,\ell} \frac{\tau}{v_{i,j,\ell}} = r^-(t, D_i). \qquad (6)
      \end{aligned}
      $$

      Now the global rank r(t) must satisfy

      $$ r(t) \ge \sum_{i=1}^{m} r^-(t, D_i) = r^-_\lambda \qquad (7) $$

      The Markov step makes this bound loose.
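Equations (6) and (7) translate directly into a few lines of code. The sketch below computes the Markov bound for one site and compares it with the true local rank of a hypothetical unseen tuple whose expectation equals τ. The site contents, the unseen tuple and the function names are assumptions for illustration, and the bound as written requires every stored score value to be strictly positive.

```python
# Hypothetical site database (not from the slides): tuple name -> (value, probability) choices.
SITE_DB = {"a": [(90, 0.5), (40, 0.5)], "b": [(60, 1.0)]}


def markov_local_lower_bound(db, tau):
    """r^-(t, D_i) from equation (6); assumes every score value is > 0."""
    n_i = len(db)
    return n_i - sum(p * tau / v for choices in db.values() for v, p in choices)


def markov_global_lower_bound(sites, tau):
    """r^-_lambda from equation (7): the sum of the per-site bounds."""
    return sum(markov_local_lower_bound(db, tau) for db in sites.values())


def true_local_rank(pdf, db):
    """r(t, D_i) = sum_j Pr[X_j > X] for a tuple with this pdf that is not stored in db."""
    return sum(
        px * p
        for vx, px in pdf
        for choices in db.values()
        for v, p in choices
        if v > vx
    )


tau = 50
unseen = [(50, 1.0)]  # hypothetical unseen tuple with E[X] = 50 <= tau
bound = markov_local_lower_bound(SITE_DB, tau)
actual = true_local_rank(unseen, SITE_DB)
print(round(bound, 4), actual)
```

The gap between the two printed numbers illustrates the looseness noted above. For larger τ the expression can even become negative, in which case it carries no information, since ranks are never negative.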

  64. Linear Programming Lower Bound

      Any unseen tuple t must have E[X] ≤ τ.

      We've seen how to derive a lower bound r^-_λ on the global rank of any unseen tuple t using Markov's inequality.

      We want to find as tight an r^-_λ as possible by finding the smallest possible r^-(t, D_i) at each site.

      We can use linear programming to derive the r^-(t, D_i) at each site and so obtain a tight r^-_λ.

  68. Linear Programming

      The idea is to construct, for an unseen tuple t, the best possible X at each site s_i, i.e. the X that obtains the smallest possible local rank at s_i.

      X could take on arbitrary v_ℓ's as its possible score values, some of which do not exist in the value universe U_i at a site s_i.

      We can show this is not a problem after studying the semantics of the r(t, D_i)'s and the q(v)'s.

  71. Linear Programming: a Note on q(v)'s

      [Figure: staircase curve of q(v)]

      Recall that r(t_{i,j}, D_y) = \sum_{l=1}^{b_{i,j}} p_{i,j,l} q_y(v_{i,j,l}) and that q(v) is essentially a staircase curve.

      X may take a value v_ℓ not in U_i, with v_2 as its nearest left neighbor.

      Even if X takes a value v_ℓ not in U_i, we can decrease v_ℓ until we hit v_2 in U_i, and E[X] ≤ τ clearly still holds, as we are only decreasing the value of one of the choices in X.

      Also note that during this transformation q(v_ℓ) = q(v_2), and so the local rank of t remains the same.

  75. Linear Programming Formulation

      Now we can assume X draws values from U_i.

      Then we can define a linear program with the constraints

      $$ 0 \le p_\ell \le 1, \qquad \ell = 1, \ldots, \gamma = |U_i| $$
      $$ p_1 + \ldots + p_\gamma = 1 $$
      $$ p_1 v_1 + \ldots + p_\gamma v_\gamma \le \tau $$

      and minimize the local rank, which is

      $$ r(X, D_i) = \sum_{\ell=1}^{\gamma} p_\ell \, q_i(v_\ell) $$
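Because this program has only two constraints besides the bounds on the p_ℓ, an optimal solution can be found with at most two nonzero p_ℓ: either a single value v ≤ τ, or a pair v_lo ≤ τ < v_hi mixed so that the expectation is exactly τ. The sketch below uses that observation to enumerate candidates instead of calling an LP solver, which is a substitution made here to keep the example dependency-free; a general solver would work on the same formulation. Two further assumptions are labeled in the code: scores are nonnegative, and a sentinel value 0 is added to U_i so that a feasible X exists even when τ is below every stored value. The data and names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical site database (not from the slides): tuple name -> (value, probability) choices.
SITE_DB = {"a": [(90, 0.5), (40, 0.5)], "b": [(60, 1.0)]}


def make_q(db):
    """Return q_i(v): total probability mass in db lying strictly above v."""
    pairs = sorted((v, p) for choices in db.values() for v, p in choices)
    values = [v for v, _ in pairs]
    suffix = [0.0] * (len(pairs) + 1)
    for idx in range(len(pairs) - 1, -1, -1):
        suffix[idx] = suffix[idx + 1] + pairs[idx][1]
    return lambda v: suffix[bisect_right(values, v)]


def lp_local_lower_bound(db, tau, sentinel=0.0):
    """Minimize sum_l p_l * q_i(v_l) s.t. sum p_l = 1, sum p_l * v_l <= tau, p_l >= 0.

    Assumption: scores are nonnegative and `sentinel` is added to the universe.
    """
    q = make_q(db)
    universe = sorted({sentinel} | {v for choices in db.values() for v, _ in choices})
    low = [v for v in universe if v <= tau]
    high = [v for v in universe if v > tau]
    if not low:
        raise ValueError("no value <= tau in the universe; the program is infeasible")
    # Candidates with one nonzero p: all mass on a single value v <= tau.
    best = min(q(v) for v in low)
    # Candidates with two nonzero p's: mix lo <= tau < hi so that E[X] = tau.
    for lo in low:
        for hi in high:
            p_hi = (tau - lo) / (hi - lo)
            best = min(best, (1.0 - p_hi) * q(lo) + p_hi * q(hi))
    return best


lp_bound = lp_local_lower_bound(SITE_DB, 50)
print(lp_bound)
```

For this site and τ = 50 the minimizing X puts probability 1/6 on the sentinel 0 and 5/6 on the value 60. The pairwise enumeration is quadratic in |U_i|; it is meant to make the structure of the optimum visible, not to be efficient.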
