End of Query Optimization
Data Integration
May 24, 2004

Agenda
• Questions?
• Finish last bits of query optimization
• Data integration: the last frontier

Query Execution
[Figure: query processing pipeline, top to bottom]
User/Application
  (query/update)
Query compiler
  (query execution plan)
Execution engine
  (record, index requests)
Index/record mgr.
  (page commands)
Buffer manager
  (read/write pages)
Storage manager
  (storage)

Query Execution Plans
SELECT P.buyer
FROM Purchase P, Person Q
WHERE P.buyer=Q.name AND
      Q.city='seattle' AND
      Q.phone > '5430000'

Query Plan:
• logical tree
• implementation choice at every node
• scheduling of operations.

Plan tree (root first):
π buyer
  σ City='seattle' AND phone>'5430000'
    ⋈ Buyer=name (Simple Nested Loops)
      Purchase (Table scan)
      Person (Index scan)

Some operators are from relational algebra, and others (e.g., scan, group) are not.

We've Seen So Far
• Transformation rules
• The cost module:
  – Given a candidate plan: what is its expected cost and size of the result?
• Now: putting it all together.

Plans for Single-Relation Queries (Prep for Join ordering)
• Task: create a query execution plan for a single Select-project-group-by block.
• Key idea: consider each possible access path to the relevant tuples of the relation. Choose the cheapest one.
• The different operations are essentially carried out together (e.g., if an index is used for a selection, projection is done for each retrieved tuple, and the resulting tuples are pipelined into the aggregate computation).
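As a rough illustration of pipelining, the Purchase/Person plan from the Query Execution Plans slide (table scan, simple nested loops join, selection on city and phone, projection on buyer) can be sketched with Python generators. This is only a sketch under stated assumptions: the operator names (table_scan, nested_loops_join, select, project) and the sample rows are hypothetical, and the index scan on Person is simplified to an in-memory list.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]


def table_scan(rows: Iterable[Row]) -> Iterator[Row]:
    """Yield the tuples of a relation one at a time."""
    for row in rows:
        yield row


def nested_loops_join(outer: Iterable[Row], inner: List[Row],
                      outer_col: str, inner_col: str) -> Iterator[Row]:
    """Simple nested loops: rescan the inner relation for each outer tuple."""
    for o in outer:
        for i in inner:
            if o[outer_col] == i[inner_col]:
                yield {**o, **i}


def select(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Pass through only the tuples that satisfy the predicate."""
    for row in rows:
        if predicate(row):
            yield row


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Row]:
    """Keep only the requested columns of each tuple."""
    for row in rows:
        yield {c: row[c] for c in columns}


# Hypothetical sample data (not from the slides).
purchase = [
    {"buyer": "ann", "product": "tv"},
    {"buyer": "bob", "product": "pc"},
    {"buyer": "cat", "product": "car"},
]
person = [
    {"name": "ann", "city": "seattle", "phone": "5551234"},
    {"name": "bob", "city": "portland", "phone": "5559999"},
    {"name": "cat", "city": "seattle", "phone": "5000000"},
]

# Build the plan bottom-up; nothing runs until the result is consumed.
joined = nested_loops_join(table_scan(purchase), person, "buyer", "name")
filtered = select(
    joined,
    lambda r: r["city"] == "seattle" and r["phone"] > "5430000",
)
result = list(project(filtered, ["buyer"]))
print(result)
```

Each operator pulls one tuple at a time from its input, so no intermediate result is materialized; that is the sense in which the operations are "carried out together".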
Example
SELECT S.sid
FROM Sailors S
WHERE S.rating=8

• If we have an index on rating:
  – (1/NKeys(I)) * NTuples(R) = (1/10) * 40000 tuples retrieved.
  – Clustered index: (1/NKeys(I)) * (NPages(I)+NPages(R)) = (1/10) * (50+500) pages are retrieved (= 55).
  – Unclustered index: (1/NKeys(I)) * (NPages(I)+NTuples(R)) = (1/10) * (50+40000) pages are retrieved.
• If we have an index on sid:
  – Would have to retrieve all tuples/pages. With a clustered index, the cost is 50+500.
• Doing a file scan: we retrieve all file pages (500).

Determining Join Ordering
• R1 ⋈ R2 ⋈ … ⋈ Rn
• Join tree:
  [Figure: a join tree with leaves R3, R1, R2, R4]
• A join tree represents a plan. An optimizer needs to inspect many (all?) join trees

Types of Join Trees
• Left deep:
  [Figure: left-deep join tree over R1, …, R5]
• Bushy:
  [Figure: bushy join tree over R1, …, R5]
• Right deep:
  [Figure: right-deep join tree over R1, …, R5]

Problem
• Given: a query R1 ⋈ R2 ⋈ … ⋈ Rn
• Assume we have a function cost() that gives us the cost of every join tree
• Find the best join tree for the query
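Returning to the Sailors example above, the access-path estimates can be checked with a few lines of Python. The function name and parameter names below are hypothetical; the formulas and the numbers (10 distinct rating keys, 40000 tuples, 50 index pages, 500 relation pages) are the ones used on the Example slide.

```python
def access_path_estimates(n_keys, n_tuples, index_pages, relation_pages):
    """Estimate retrieval costs for an equality selection on an indexed column.

    Uses the slide's formulas with reduction factor 1/NKeys(I).
    """
    return {
        # (1/NKeys(I)) * NTuples(R)
        "tuples_retrieved": n_tuples / n_keys,
        # (1/NKeys(I)) * (NPages(I) + NPages(R))
        "clustered_index_pages": (index_pages + relation_pages) / n_keys,
        # (1/NKeys(I)) * (NPages(I) + NTuples(R))
        "unclustered_index_pages": (index_pages + n_tuples) / n_keys,
        # A file scan reads every page of the relation.
        "file_scan_pages": relation_pages,
    }


estimates = access_path_estimates(
    n_keys=10, n_tuples=40000, index_pages=50, relation_pages=500
)
for name, value in estimates.items():
    print(f"{name}: {value:g}")
```

With these numbers the unclustered index costs about 4005 page retrievals, which is worse than the 500-page file scan, while the clustered index needs only 55. This is why the optimizer has to cost every access path rather than always preferring an index.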
Join Ordering by Dynamic Programming
• Idea: for each subset of {R1, …, Rn}, compute the best plan for that subset
• In increasing order of set cardinality:
  – Step 1: for {R1}, {R2}, …, {Rn}
  – Step 2: for {R1,R2}, {R1,R3}, …, {Rn-1, Rn}
  – …
  – Step n: for {R1, …, Rn}
• A subset of {R1, …, Rn} is also called a subquery

Dynamic Programming: step 1
• Step 1: For each {Ri} do:
  – Size({Ri}) = B(Ri)
  – Plan({Ri}) = Ri
  – Cost({Ri}) = (cost of scanning Ri)

Dynamic Programming: step i
• Step i: For each subset Q of {R1, …, Rn} of cardinality i do:
  – Compute Size(Q)
  – For every pair of subqueries Q', Q'' s.t. Q = Q' U Q'', compute cost(Plan(Q') ⋈ Plan(Q''))
  – Cost(Q) = the smallest such cost
  – Plan(Q) = the corresponding plan

A few practical considerations
• Heuristics for reducing the search space
  – Restrict to left linear trees
  – Restrict to trees "without cartesian product"
• Need more than just one plan for each subquery:
  – "interesting orders": save a single plan for every possible ordering of the result.
  – Why?

Query Optimization Summary
• Create initial (naïve) query execution plan.
• Apply transformation rules:
  – Try to un-nest blocks
  – Move predicates and grouping operators.
• Consider each block at a time:
  – Determine join order
  – Push selections, projections if possible.

Data Integration
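Looking back at the join-ordering slides, the dynamic programming steps (step 1 through step n) can be sketched roughly as follows. The slides leave Size(Q) and cost() abstract, so this sketch makes loud assumptions: Size(Q) is taken to be the product of the relation sizes times the selectivities of the join predicates inside Q, scanning Ri costs B(Ri), and the cost of a join is the cost of its two inputs plus the size of its result. The function name best_join_plan and both dictionaries are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def best_join_plan(sizes, selectivity):
    """Return (plan, cost) for joining all relations in `sizes`.

    sizes:       {relation name: B(Ri)}
    selectivity: {(Ra, Rb): fraction}, keys in sorted name order;
                 a missing pair means a cartesian product (factor 1).
    """
    relations = sorted(sizes)

    def size(q):
        # Assumed size model: product of sizes times pairwise selectivities.
        s = prod(sizes[r] for r in q)
        for pair in combinations(sorted(q), 2):
            s *= selectivity.get(pair, 1)
        return s

    cost, plan = {}, {}

    # Step 1: single relations.
    for r in relations:
        q = frozenset([r])
        cost[q] = sizes[r]  # assumed cost of scanning Ri
        plan[q] = r

    # Step i: subsets of cardinality i, in increasing order of i.
    for i in range(2, len(relations) + 1):
        for combo in combinations(relations, i):
            q = frozenset(combo)
            best = None
            # Every split Q = Q' U Q'' into two non-empty disjoint parts.
            for k in range(1, i):
                for left in combinations(combo, k):
                    q1 = frozenset(left)
                    q2 = q - q1
                    c = cost[q1] + cost[q2] + size(q)  # assumed join cost
                    if best is None or c < best[0]:
                        best = (c, (plan[q1], plan[q2]))
            cost[q], plan[q] = best

    full = frozenset(relations)
    return plan[full], cost[full]


# Hypothetical statistics; exact fractions avoid floating-point noise.
sizes = {"R1": 1000, "R2": 100, "R3": 10}
selectivity = {("R1", "R2"): Fraction(1, 100), ("R2", "R3"): Fraction(1, 10)}
plan, cost = best_join_plan(sizes, selectivity)
print(plan, cost)
```

Under these assumed statistics the sketch joins R2 with R3 first and then joins R1 with that result, at a total cost of 2210. It considers bushy trees and cartesian products; the heuristics listed under "A few practical considerations" would prune those candidates, and supporting interesting orders would mean keeping more than one plan per subset.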