Query Processing over Incomplete Autonomous Databases
Select Top K Rewritten Queries
AFD: Model ~> Body style
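The rewriting step can be sketched as follows. This is only an illustration under stated assumptions: the tuples, the `rewrite_queries` helper, and the idea of issuing one rewritten query per distinct determining-set value found in the certain answers are hypothetical and not taken verbatim from the slides.

```python
from collections import Counter

def rewrite_queries(certain_answers, determining_attrs, target_attr):
    """Build one rewritten query per distinct determining-set value.

    Assumption: for an AFD X ~> A and a query on A, each distinct X value seen
    in the certain answers becomes a query on X; only returned tuples whose A
    value is missing would be kept as uncertain answers.
    """
    seen = Counter(
        tuple(t[a] for a in determining_attrs) for t in certain_answers
    )
    queries = []
    for values, support in seen.most_common():
        queries.append({
            "where": dict(zip(determining_attrs, values)),
            "missing": target_attr,   # keep tuples where this attribute is null
            "support": support,       # how many certain answers share the value
        })
    return queries

# Hypothetical certain answers to the query "Body style = Convt".
certain = [
    {"Make": "Audi", "Model": "A4", "Body style": "Convt"},
    {"Make": "BMW", "Model": "Z4", "Body style": "Convt"},
    {"Make": "Audi", "Model": "A4", "Body style": "Convt"},
]

for q in rewrite_queries(certain, ["Model"], "Body style"):
    print(q)
```

Each generated query constrains only the determining attribute (Model), so it can be sent to a source that does not allow querying for missing values directly.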
Ranked Relevant Uncertain Answers
AFD: Make, Body ~> Model
P(Model=Accord | Make=Honda, Body=Coupe)
AFD: Model ~> Body
All tuples returned for a single query are ranked equally
The AFD Make, Body ~> Model yields an explanation such as: "This car is 83% likely to have Model=Accord given that its Make=Honda and Body=Sedan."
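A probability of the form P(Model=Accord | Make=Honda, Body=Sedan) can be approximated from a sample of complete tuples. The sketch below uses a plain relative-frequency estimate restricted to the AFD's determining set; it is a stand-in for illustration, not the classifier used in the slides, and the sample data and the `value_distribution` helper are hypothetical.

```python
from collections import Counter

def value_distribution(sample, determining_attrs, target_attr, evidence):
    """Relative frequency of each target value among sample tuples that
    agree with `evidence` on the determining attributes and have a
    non-missing target value."""
    counts = Counter(
        t[target_attr]
        for t in sample
        if t.get(target_attr) is not None
        and all(t.get(a) == evidence[a] for a in determining_attrs)
    )
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()} if total else {}

# Hypothetical sample; the numbers are illustrative only.
sample = [
    {"Make": "Honda", "Body": "Sedan", "Model": "Accord"},
    {"Make": "Honda", "Body": "Sedan", "Model": "Accord"},
    {"Make": "Honda", "Body": "Sedan", "Model": "Accord"},
    {"Make": "Honda", "Body": "Sedan", "Model": "Civic"},
    {"Make": "Honda", "Body": "Sedan", "Model": None},   # incomplete tuple, skipped
    {"Make": "Toyota", "Body": "Sedan", "Model": "Camry"},
]

dist = value_distribution(
    sample, ["Make", "Body"], "Model", {"Make": "Honda", "Body": "Sedan"}
)
print(dist)
```

The resulting distribution can both rank an uncertain tuple and supply the percentage quoted in its explanation.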
AFDs learned from Cars.com
Make=Honda
– Include a portion of each tuple relative to the probability its missing value matches the query constraint: t1 + t3 + .9(t2) + .4(t4) = 3.3
– Only include tuples whose most likely missing value matches the query constraint: t1 + t2 + t3 = 3
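The two counting rules can be reproduced with a few lines of Python. This is a minimal sketch: t1 and t3 are treated as certain matches, and the value distributions attached to t2 and t4 are hypothetical, chosen only so that their match probabilities are 0.9 and 0.4 as in the figures above.

```python
def fractional_count(certain, uncertain, value):
    """Each uncertain tuple contributes the probability that its missing
    value equals the queried value."""
    return len(certain) + sum(dist.get(value, 0.0) for dist in uncertain.values())

def most_likely_count(certain, uncertain, value):
    """An uncertain tuple counts fully, but only if its most likely
    missing value equals the queried value."""
    return len(certain) + sum(
        1 for dist in uncertain.values() if max(dist, key=dist.get) == value
    )

certain = ["t1", "t3"]            # tuples that match the query for certain
uncertain = {                     # hypothetical distributions over the missing value
    "t2": {"Convt": 0.9, "Sedan": 0.1},
    "t4": {"Convt": 0.4, "Sedan": 0.6},
}

print(f"{fractional_count(certain, uncertain, 'Convt'):.1f}")
print(most_likely_count(certain, uncertain, "Convt"))
```

Under the first rule t4 still adds 0.4 to the total; under the second it is dropped entirely because its most likely value does not match the constraint.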
[Figure: Accuracy (0.2 to 1) per attribute (Year, Make, Model, Price, Mileage, Body, Certified) for NBC, AFD-Enhanced NBC, BayesNet 3, BayesNet 2, and Decision Tree]
F-measure
$$\mathrm{FMeasure} = \frac{(1+\alpha)\,P\,R}{\alpha P + R}$$
P – estimated precision
R – estimated recall, based on P and estimated selectivity
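A small sketch of ranking rewritten queries by this measure and keeping the top K. The slides state only that R is based on P and estimated selectivity; the specific normalization below (precision times selectivity, divided by the sum over all candidates) is an assumption, and the query names and numbers are hypothetical.

```python
def f_measure(precision, recall, alpha):
    """FMeasure = (1 + alpha) * P * R / (alpha * P + R); 0.0 if undefined."""
    denom = alpha * precision + recall
    return (1 + alpha) * precision * recall / denom if denom else 0.0

def top_k(queries, alpha, k):
    """Return the names of the k rewritten queries with the highest F-measure.

    Assumption: estimated recall of a query is its expected number of
    relevant tuples (precision * selectivity) normalized over all candidates.
    """
    total = sum(q["precision"] * q["selectivity"] for q in queries)
    scored = []
    for q in queries:
        recall = q["precision"] * q["selectivity"] / total if total else 0.0
        scored.append((f_measure(q["precision"], recall, alpha), q["name"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical candidate rewritten queries.
candidates = [
    {"name": "Q1", "precision": 0.9, "selectivity": 10},
    {"name": "Q2", "precision": 0.6, "selectivity": 100},
    {"name": "Q3", "precision": 0.3, "selectivity": 20},
]

print(top_k(candidates, alpha=0, k=2))  # alpha = 0 ranks by precision alone
print(top_k(candidates, alpha=1, k=2))  # alpha = 1 weights P and R equally
```

With α = 0 the formula reduces to P whenever R > 0, so the ordering follows precision; larger α lets a high-selectivity query overtake a more precise but narrower one.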
α: sets the tradeoff between precision & recall
K: resource limitation on # of rewritten queries