SLIDE 1

LUDWIG-MAXIMILIANS-UNIVERSITÄT MÜNCHEN, DEPARTMENT INSTITUTE FOR INFORMATICS, DATABASE SYSTEMS GROUP

Continuous Inverse Ranking Queries in Uncertain Streams

Thomas Bernecker*, Hans-Peter Kriegel*, Nikos Mamoulis**, Matthias Renz* and Andreas Zuefle*

*) Ludwig-Maximilians-Universität München (LMU), Munich, Germany, http://www.dbs.ifi.lmu.de, {bernecker, kriegel, renz, zuefle}@dbs.ifi.lmu.de
**) University of Hong Kong (HKU), Hong Kong, http://www.cs.hku.hk, nikos@cs.hku.hk

SLIDE 2

Outline

  • 1. Motivation: Probabilistic Inverse Ranking
  • 2. Continuous Inverse Ranking Queries
      – Initial Computation
      – Incremental Processing
  • 3. Experimental Evaluation
  • 4. Summary

SLIDE 3

Probabilistic Inverse Ranking

  • Identification of the significance of objects among peers
      – Inverse Ranking: Return the position of the query object q w.r.t. the score function S
      – Probabilistic Inverse Ranking: Find all possible positions of q
  • Example: Stock rating system
      – Score function: S = Chances − Risk
      – q on rank 1? → 0 %
      – q on rank 2? → 50 %
      – q on rank 3? → 50 %
      – q on rank 4? → 0 %

[Figure: q, Stock I, Stock II and Stock III plotted over the axes Chances and Risk]

SLIDE 4

Probabilistic Inverse Ranking

  • Probabilistic Inverse Ranking (PIR) Query
      – Probabilistic database DB where |DB| = n
      – Uncertain object o: m alternative locations (discrete uncertainty) or pdf (continuous uncertainty)
      – Query object q
      – Score function $S : DB \to \mathbb{R}_0^+$
      – Definition: $\forall i = 1, \dots, k$: P(q is on rank i w.r.t. S) = $P_q^t(i)$
      – q is on rank i if there exist exactly i − 1 objects $o \in DB$ with S(o) > S(q)
  • Challenge: Application to dynamic data
      – General stream model with location updates retrieved at a time t
      – P(q is on rank i at time t) = $P_q^t(i)$
      – Initial computation
      – Incremental processing
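The rank definition can be checked by brute force on tiny inputs. The following is a minimal sketch, assuming that every object o outscores q independently with a known probability. The function name `rank_distribution_bruteforce` and the list `ps` are hypothetical; the three values were picked so that the result matches the 0 % / 50 % / 50 % / 0 % distribution of the stock example and are not taken from the slides.

```python
from itertools import product


def rank_distribution_bruteforce(ps):
    """Return [P(q on rank 1), ..., P(q on rank n+1)] by enumerating all worlds.

    ps[j] is the assumed (independent) probability that object j has a
    higher score than q. The cost is O(2^n), so this is for checking only.
    """
    n = len(ps)
    dist = [0.0] * (n + 1)
    for world in product([False, True], repeat=n):
        prob = 1.0
        for beats_q, p in zip(world, ps):
            prob *= p if beats_q else 1.0 - p
        # q is on rank i iff exactly i - 1 objects beat q; rank i is stored at index i - 1
        dist[sum(world)] += prob
    return dist


print(rank_distribution_bruteforce([1.0, 0.5, 0.0]))  # [0.0, 0.5, 0.5, 0.0]
```

The exponential enumeration is only a reference point; it makes the O(k·n) recurrence of the initial computation easier to validate.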

SLIDE 5

Initial Computation (1)

  • Initial time t: Compute $P_q^t(i)$ $\forall i = 1, \dots, k$
      – Object $o \in DB$: $p_o^t$ = P(S(o) > S(q) at time t)
      – j objects have been processed so far ($o_j$ is the latest)
      – Successive processing by the Poisson Binomial Recurrence (PBR):

$$P_{i,j}^t = \begin{cases} 1 & \text{if } i = 0 \land j = 0 \\ 0 & \text{if } i < 0 \lor i > j \\ P_{i-1,j-1}^t \cdot p_{o_j}^t + P_{i,j-1}^t \cdot (1 - p_{o_j}^t) & \text{else} \end{cases}$$

      – $P_{i,j}^t$: i out of j objects satisfy S(o) > S(q)
      – $P_{i-1,j-1}^t \cdot p_{o_j}^t$: i − 1 out of j − 1 objects satisfy S(o) > S(q), and $S(o_j) > S(q)$
      – $P_{i,j-1}^t \cdot (1 - p_{o_j}^t)$: i out of j − 1 objects satisfy S(o) > S(q), and $S(o_j) \le S(q)$
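The recurrence can be evaluated with a single array of k entries that is overwritten once per object. The following is a minimal sketch, assuming the probabilities $p_{o_j}^t$ are already available as a plain list; the function name and the input values in the usage line are hypothetical.

```python
def poisson_binomial_recurrence(ps, k):
    """Return [P_{0,n}, ..., P_{k-1,n}] for the n probabilities in ps.

    Entry i is the probability that exactly i of the n objects have a
    higher score than q. Runtime is O(k * n), memory is O(k).
    """
    # Column j = 0: P_{0,0} = 1 and P_{i,0} = 0 for i > 0.
    column = [1.0] + [0.0] * (k - 1)
    for p in ps:
        # Walk i downwards so column[i - 1] still holds P_{i-1,j-1}.
        for i in range(k - 1, -1, -1):
            lower = column[i - 1] if i > 0 else 0.0  # P_{-1,j-1} = 0
            column[i] = lower * p + column[i] * (1.0 - p)
    return column


# Hypothetical input: two objects with p = 0.1 and p = 0.6.
probabilities = poisson_binomial_recurrence([0.1, 0.6], k=2)
print([round(v, 2) for v in probabilities])  # [0.36, 0.58]
```

Index i of the returned list corresponds to rank i + 1 of q, since rank i requires exactly i − 1 higher-scored objects.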

SLIDE 6

Initial Computation (2)

  • j = n ($\forall i = 0, \dots, k-1$): $P_{i,j}^t = P_{i,n}^t = P_q^t(i+1)$
      ⇒ PIR result for q
      ⇒ runtime: O(k·n)
  • Optimizations:
      – $p_o^t = 0$ ⇒ o has no effect on the rank of q
      – $p_o^t = 1$ ⇒ increment counter $C^t$
  • General case ($0 < p_o^t < 1$) ⇒ process o by PBR: $\forall i = 0, \dots, k-1$:
    P(i objects processed by PBR have a higher score than q) = $P_{PBR}^t(i)$
  • Initial PIR result:

$$P_q^t(i) = \begin{cases} P_{PBR}^t(i - 1 - C^t) & \text{if } C^t + 1 \le i \le C^t + 1 + k \\ 0 & \text{else} \end{cases}$$

SLIDE 7

Initial Computation (3)

  • Example: n = 4, k = 2
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^t = 0.6$, $p_{o_4}^t = 1$
      – $C^t = 0$

[Figure: query object q and the uncertain objects $o_1$, $o_2$, $o_3$, $o_4$]

SLIDE 8

Initial Computation (3)

  • Example: n = 4, k = 2, j = 1
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^t = 0.6$, $p_{o_4}^t = 1$
      – $C^t = 0$
  • j = 1:

$$P_{0,1}^t = P_{-1,0}^t \cdot p_{o_1}^t + P_{0,0}^t \cdot (1 - p_{o_1}^t) = 0 \cdot 0.1 + 1 \cdot 0.9 = 0.9$$

$$P_{1,1}^t = P_{0,0}^t \cdot p_{o_1}^t + P_{1,0}^t \cdot (1 - p_{o_1}^t) = 1 \cdot 0.1 + 0 \cdot 0.9 = 0.1$$

SLIDE 9

Initial Computation (3)

  • Example: n = 4, k = 2, j = 2
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^t = 0.6$, $p_{o_4}^t = 1$
      – $C^t = 0$
  • j = 1: $P_{0,1}^t = 0.9$, $P_{1,1}^t = 0.1$
  • j = 2: $p_{o_2}^t = 0$, so $o_2$ has no effect on the rank of q and the j = 1 results stay unchanged

SLIDE 10

Initial Computation (3)

  • Example: n = 4, k = 2, j = 3
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^t = 0.6$, $p_{o_4}^t = 1$
      – $C^t = 0$
  • j = 1: $P_{0,1}^t = 0.9$, $P_{1,1}^t = 0.1$
  • j = 3:

$$P_{0,2}^t = P_{-1,1}^t \cdot p_{o_3}^t + P_{0,1}^t \cdot (1 - p_{o_3}^t) = 0 \cdot 0.6 + 0.9 \cdot 0.4 = 0.36$$

$$P_{1,2}^t = P_{0,1}^t \cdot p_{o_3}^t + P_{1,1}^t \cdot (1 - p_{o_3}^t) = 0.9 \cdot 0.6 + 0.1 \cdot 0.4 = 0.58$$

SLIDE 11

Initial Computation (3)

  • Example: n = 4, k = 2, j = 4
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^t = 0.6$, $p_{o_4}^t = 1$
  • j = 1: $P_{0,1}^t = 0.9$, $P_{1,1}^t = 0.1$
  • j = 3: $P_{0,2}^t = 0.36$, $P_{1,2}^t = 0.58$
  • j = 4: $p_{o_4}^t = 1$ ⇒ $C^t = 1$

SLIDE 12

Initial Computation (3)

  • Example: n = 4, k = 2
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^t = 0.6$, $p_{o_4}^t = 1$
      – $C^t = 1$
  • j = 1: $P_{0,1}^t = 0.9$, $P_{1,1}^t = 0.1$
  • j = 3:

$$P_{0,2}^t = P_{-1,1}^t \cdot p_{o_3}^t + P_{0,1}^t \cdot (1 - p_{o_3}^t) = 0 \cdot 0.6 + 0.9 \cdot 0.4 = 0.36 = P_{PBR}^t(0)$$

$$P_{1,2}^t = P_{0,1}^t \cdot p_{o_3}^t + P_{1,1}^t \cdot (1 - p_{o_3}^t) = 0.9 \cdot 0.6 + 0.1 \cdot 0.4 = 0.58 = P_{PBR}^t(1)$$

  • Initial PIR result:

$$P_q^t(1) = P_{PBR}^t(1 - 1 - 1) = P_{PBR}^t(-1) = 0$$

$$P_q^t(2) = P_{PBR}^t(2 - 1 - 1) = P_{PBR}^t(0) = 0.36$$
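The initial computation with both optimizations (skipping objects with probability 0, counting objects with probability 1) can be replayed on this example. This is a minimal sketch; the function names `initial_state` and `rank_probability` are hypothetical, and the exact floating-point comparisons against 0 and 1 assume that certain outcomes are stored as exactly 0.0 and 1.0.

```python
def initial_state(ps, k):
    """Return (pbr, counter): P_PBR(0..k-1) and the counter C at the initial time."""
    pbr = [1.0] + [0.0] * (k - 1)
    counter = 0
    for p in ps:
        if p == 0.0:
            continue            # object can never outscore q: no effect on the rank
        if p == 1.0:
            counter += 1        # object always outscores q: shifts every rank by one
            continue
        for i in range(k - 1, -1, -1):   # general case: one PBR step
            lower = pbr[i - 1] if i > 0 else 0.0
            pbr[i] = lower * p + pbr[i] * (1.0 - p)
    return pbr, counter


def rank_probability(pbr, counter, i):
    """P(q is on rank i) = P_PBR(i - 1 - C), and 0 outside the stored range."""
    index = i - 1 - counter
    return pbr[index] if 0 <= index < len(pbr) else 0.0


pbr, counter = initial_state([0.1, 0.0, 0.6, 1.0], k=2)
print(counter)                                                          # 1
print([round(rank_probability(pbr, counter, i), 2) for i in (1, 2)])    # [0.0, 0.36]
```

Only $o_1$ and $o_3$ go through the recurrence here, which is why the second index of $P_{i,j}^t$ stops at 2 in the example.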

SLIDE 13

Incremental Processing (1)

  • Location update of one alternative location of object o: compute $P_q^{t+1}(i)$ $\forall i = 1, \dots, k$
  • Naive solution: Apply PBR ⇒ O(n) $\forall i = 1, \dots, k$
  • Enhanced solution: just consider update of o ⇒ O(1) $\forall i = 1, \dots, k$
      – Phase 1
          • Remove effect of old value $p_o^t$ from $P_{PBR}^t(i)$ $\forall i = 0, \dots, k-1$
          • Obtain intermediate result $\hat{P}_{PBR}^t(i)$
      – Phase 2
          • Incorporate effect of new value $p_o^{t+1}$ in $\hat{P}_{PBR}^t(i)$
          • Obtain new PIR result $P_q^{t+1}(i)$

SLIDE 14

Incremental Processing (2)

  • Phase 1: Three cases
      1. $p_o^t = 0$ ⇒ $\hat{P}_{PBR}^t(i) = P_{PBR}^t(i)$
      2. $p_o^t = 1$ ⇒ $\hat{P}_{PBR}^t(i) = P_{PBR}^t(i)$ and $C^{t+1} = C^t - 1$
      3. $0 < p_o^t < 1$ ⇒ remove $p_o^t$ from $P_{PBR}^t(i)$:

$$P_{PBR}^t(i) = \hat{P}_{PBR}^t(i-1) \cdot p_o^t + \hat{P}_{PBR}^t(i) \cdot (1 - p_o^t)$$

$$\hat{P}_{PBR}^t(i) = \frac{P_{PBR}^t(i) - \hat{P}_{PBR}^t(i-1) \cdot p_o^t}{1 - p_o^t}, \qquad \hat{P}_{PBR}^t(0) = \frac{P_{PBR}^t(0)}{1 - p_o^t}$$
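Phase 1 inverts one PBR step by solving the recurrence for the intermediate values in ascending order of i. The following is a minimal sketch of the three cases; the function name `remove_effect` and its return convention (a new list plus the updated counter) are assumptions made for illustration.

```python
def remove_effect(pbr, counter, p_old):
    """Undo the contribution of an object with old probability p_old.

    pbr holds P_PBR(0..k-1); returns (intermediate values, updated counter).
    """
    if p_old == 0.0:
        return list(pbr), counter        # case 1: object never entered the PBR
    if p_old == 1.0:
        return list(pbr), counter - 1    # case 2: object was only counted in C
    # case 3: hat(i) = (P(i) - hat(i - 1) * p_old) / (1 - p_old), with hat(-1) = 0
    intermediate = []
    previous = 0.0
    for value in pbr:
        previous = (value - previous * p_old) / (1.0 - p_old)
        intermediate.append(previous)
    return intermediate, counter


hat, counter = remove_effect([0.36, 0.58], 1, 0.6)
print([round(v, 2) for v in hat], counter)  # [0.9, 0.1] 1
```

Each entry depends only on the entry below it, so the removal touches k values regardless of the database size n.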

SLIDE 15

Incremental Processing (3)

  • Phase 2: Three cases
      1. $p_o^{t+1} = 0$ ⇒ $P_{PBR}^{t+1}(i) = \hat{P}_{PBR}^t(i)$
      2. $p_o^{t+1} = 1$ ⇒ $P_{PBR}^{t+1}(i) = \hat{P}_{PBR}^t(i)$ and $C^{t+1} = C^t + 1$
      3. $0 < p_o^{t+1} < 1$ ⇒ compute $P_{PBR}^{t+1}(i)$ applying PBR:

$$P_{PBR}^{t+1}(i) = \hat{P}_{PBR}^t(i-1) \cdot p_o^{t+1} + \hat{P}_{PBR}^t(i) \cdot (1 - p_o^{t+1})$$

  • New PIR result:

$$P_q^{t+1}(i) = \begin{cases} P_{PBR}^{t+1}(i - 1 - C^{t+1}) & \text{if } C^{t+1} + 1 \le i \le C^{t+1} + 1 + k \\ 0 & \text{else} \end{cases}$$
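Phase 2 is a single PBR step applied to the intermediate values. The following is a minimal sketch of the three cases; the function name `add_effect` and the input values in the usage line are hypothetical.

```python
def add_effect(intermediate, counter, p_new):
    """Incorporate an object with new probability p_new.

    intermediate holds the Phase 1 result for i = 0..k-1; returns
    (new P_PBR values, updated counter).
    """
    if p_new == 0.0:
        return list(intermediate), counter        # case 1: no effect on the rank
    if p_new == 1.0:
        return list(intermediate), counter + 1    # case 2: only the counter changes
    # case 3: P(i) = hat(i - 1) * p_new + hat(i) * (1 - p_new), with hat(-1) = 0
    updated = []
    for i, value in enumerate(intermediate):
        lower = intermediate[i - 1] if i > 0 else 0.0
        updated.append(lower * p_new + value * (1.0 - p_new))
    return updated, counter


pbr, counter = add_effect([0.9, 0.1], 1, 0.2)
print([round(v, 2) for v in pbr], counter)  # [0.72, 0.26] 1
```

Because the input list is not modified in place, the loop can run in ascending order of i without overwriting values it still needs.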

SLIDE 16

Incremental Processing (4)

  • Example: n = 4, k = 2
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$
      – Update of $o_3$: $p_{o_3}^t = 0.6 \to p_{o_3}^{t+1} = 0.2$
      – Update of $o_4$: $p_{o_4}^t = 1 \to p_{o_4}^{t+2} = 0$
      – $C^t = 1$

[Figure: query object q and the objects $o_1$, $o_2$, $o_3$, $o_4$ before and after the updates]

SLIDE 17

Incremental Processing (4)

  • Example: n = 4, k = 2
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^t = 0.6 \to p_{o_3}^{t+1} = 0.2$, $p_{o_4}^t = 1$, $C^t = 1$
      – Phase 1 (Case 3):

$$\hat{P}_{PBR}^t(0) = \frac{P_{PBR}^t(0)}{1 - p_{o_3}^t} = \frac{0.36}{0.4} = 0.9$$

$$\hat{P}_{PBR}^t(1) = \frac{P_{PBR}^t(1) - \hat{P}_{PBR}^t(0) \cdot p_{o_3}^t}{1 - p_{o_3}^t} = \frac{0.58 - 0.9 \cdot 0.6}{0.4} = 0.1$$

      – Phase 2 (Case 3):

$$P_{PBR}^{t+1}(0) = \hat{P}_{PBR}^t(-1) \cdot p_{o_3}^{t+1} + \hat{P}_{PBR}^t(0) \cdot (1 - p_{o_3}^{t+1}) = 0 \cdot 0.2 + 0.9 \cdot 0.8 = 0.72$$

$$P_{PBR}^{t+1}(1) = \hat{P}_{PBR}^t(0) \cdot p_{o_3}^{t+1} + \hat{P}_{PBR}^t(1) \cdot (1 - p_{o_3}^{t+1}) = 0.9 \cdot 0.2 + 0.1 \cdot 0.8 = 0.26$$

      – PIR result:

$$P_q^t(1) = 0 \to P_q^{t+1}(1) = 0$$

$$P_q^t(2) = 0.36 \to P_q^{t+1}(2) = 0.72$$

SLIDE 18

Incremental Processing (4)

  • Example: n = 4, k = 2
      – $p_{o_1}^t = 0.1$, $p_{o_2}^t = 0$, $p_{o_3}^{t+1} = 0.2$, $p_{o_4}^t = 1 \to p_{o_4}^{t+2} = 0$
      – Phase 1 (Case 2, old value 1): the counter drops from 1 to 0 and the PBR values are kept

$$\hat{P}_{PBR}^{t+1}(0) = P_{PBR}^{t+1}(0) = 0.72$$

$$\hat{P}_{PBR}^{t+1}(1) = P_{PBR}^{t+1}(1) = 0.26$$

      – Phase 2 (Case 1, new value 0):

$$P_{PBR}^{t+2}(0) = \hat{P}_{PBR}^{t+1}(0) = 0.72$$

$$P_{PBR}^{t+2}(1) = \hat{P}_{PBR}^{t+1}(1) = 0.26$$

      – PIR result:

$$P_q^{t+1}(1) = 0 \to P_q^{t+2}(1) = P_{PBR}^{t+2}(1 - 1 - 0) = P_{PBR}^{t+2}(0) = 0.72$$

$$P_q^{t+1}(2) = 0.72 \to P_q^{t+2}(2) = P_{PBR}^{t+2}(2 - 1 - 0) = P_{PBR}^{t+2}(1) = 0.26$$
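Both updates of the example can be replayed end to end. The following is a minimal sketch that keeps $P_{PBR}(0), \dots, P_{PBR}(k-1)$ and the counter C in one object; the class `PirState` and its method names are hypothetical, and the division in the removal step is assumed to be numerically safe (old probabilities not too close to 1).

```python
class PirState:
    """Hypothetical holder of P_PBR(0..k-1) and the counter C for one query q."""

    def __init__(self, ps, k):
        self.pbr = [1.0] + [0.0] * (k - 1)
        self.counter = 0
        for p in ps:                       # initial computation, O(k * n)
            self._add(p)

    def _add(self, p):
        if p == 1.0:
            self.counter += 1
        elif p > 0.0:                      # general case: one PBR step, descending i
            for i in range(len(self.pbr) - 1, -1, -1):
                lower = self.pbr[i - 1] if i > 0 else 0.0
                self.pbr[i] = lower * p + self.pbr[i] * (1.0 - p)

    def _remove(self, p):
        if p == 1.0:
            self.counter -= 1
        elif p > 0.0:                      # general case: invert one PBR step, ascending i
            previous = 0.0
            for i, value in enumerate(self.pbr):
                previous = (value - previous * p) / (1.0 - p)
                self.pbr[i] = previous

    def update(self, p_old, p_new):
        """Phase 1 followed by Phase 2, O(k) in total."""
        self._remove(p_old)
        self._add(p_new)

    def rank_probability(self, i):
        index = i - 1 - self.counter
        return self.pbr[index] if 0 <= index < len(self.pbr) else 0.0


state = PirState([0.1, 0.0, 0.6, 1.0], k=2)
state.update(0.6, 0.2)   # update of o_3
state.update(1.0, 0.0)   # update of o_4
print([round(state.rank_probability(i), 2) for i in (1, 2)])  # [0.72, 0.26]
```

The caller has to supply the old probability of the updated object, so a real implementation would presumably store the current $p_o$ per object; that bookkeeping is left out here.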

SLIDE 19

Experiments (1)

[Figure: time per update [ms] vs. database size n (1,000 to 5,000); series: enhanced, naive]

  • dimensions = 2, m = 10, σ = 5, k = n, buffer = 3

SLIDE 20

Experiments (2)

[Figure: time to process the full stream [ms] vs. database size n (1,000 to 5,000); series: enhanced, naive]

  • dimensions = 2, m = 10, σ = 5, k = n, buffer = 3

SLIDE 21

Experiments (3)

[Figure: time to process the full stream [ms] vs. database size n (1,000 to 6,000); series: enhanced, naive]

  • IIP dataset, dimensions = 2, m = 10, k = n, buffer = 3

SLIDE 22

Experiments (4)

[Figure: time to process the full stream [ms] vs. standard deviation σ (2 to 10); series: enhanced, naive]

  • n = 10,000, dimensions = 2, m = 10, k = n, buffer = 3

SLIDE 23

Summary

  • Efficient solution for PIR queries on continuous data yielding update costs of O(k) instead of O(k·n)
  • The framework can be adapted to other query types, e.g. the probabilistic threshold inverse ranking query
  • Future work: approximate approach using lower and upper bounds for the probabilities and applying the concept of Generating Functions