Age nda • Que s t i ons on pha s e 2 of t he pr oj e c t • Today: DBM S i nt e r na l s pa r t 2 - - DBM S I nt e r na l s – Que r y e xe c ut i on Exe c ut i on a nd Opt i m i z a t i on – Que r y opt i m i z a t i on M ay 10t h, 2004 • Ne xt we e k: – Thur s day, not M onday. – M os t l y Phi l Be r ns t e i n on m e t a - da t a m anage m e nt . Que r y Exe c ut i on Que r y Exe c ut i on Pl a ns Que r y Us e r / upda t e Appl i c a t i on Que r y c om pi l e r SELECT S.sname Que r y e xe c ut i on buyer FROM Purchase P, Person Q pl a n Exe c ut i on e ngi ne WHERE P.buyer=Q.name AND s Re c or d, i nde x Q.city=‘seattle’ AND Ci t y=‘ seat t l e’ phone>’ 5430000’ r e que s t s Q.phone > ‘5430000’ I nde x/ r e c or d m gr . Pa ge Que r y Pl a n: c om m a nds Buyer =nam e ( Si m pl e Nest ed Loops) Buf f e r m a na ge r •l ogi c a l t r e e Re a d/ wr i t e •i m pl e m e nt a t i on Pur chase Per son pa ge s ( Tabl e scan) ( I ndex scan) St or a ge m a na ge r c hoi c e a t e ve r y node Som e oper at or s ar e f r om r el at i onal •s c he dul i ng of al gebr a, and ot her s ( e. g. , scan, gr oup) s t or a ge ar e not . ope r a t i ons . How do we c om bi ne Ope r a t i ons ? The Le a ve s of t he Pl a n: Sc a ns • The i t e r a t or m ode l .Ea c h ope r a t i on i s i m pl em e nt e d by 3 f unc t i ons : – Ope n: s e t s up t he da t a s t r uc t ur e s a nd pe r f or m s i ni t i a l i z a t i ons • Ta bl e s c a n:i t e r a t e t hr ough t he r e c or ds of – Ge t Ne xt : r e t ur ns t he t he ne xt t upl e of t he r e s ul t . t he r e l a t i on. – Cl os e : e nds t he ope r a t i ons . Cl e a ns up t he da t a s t r uc t ur e s . • Ena bl e s pi pe l i ni ng! • I nde x s c a n:go t o t he i nde x, f r om t he r e ge t • Cont r a s t wi t h da t a - dr i ve n m a t e r i a l i z e m ode l . t he r e c or ds i n t he f i l e ( whe n woul d t hi s be • Som e t i m e s i t ’ s t he s a m e ( e . g. , s or t e d s c a n) . be t t e r ?) • Sor t e d s c a n:pr oduc e t he r e l a t i on i n or de r . I m pl e m e nt a t i on de pe nds on r e l a t i on s i z e . 1
I m pl e m e nt i ng Re l a t i ona l Ope r a t i ons Sc he m a f or Exa m pl e s • W e wi l l c ons i de r how t o i m pl e m e nt : Purchase ( buyer :string, seller : string, product : integer), s – Se l e c t i on ( ) Se l e c t s a s ubs e t of r ows f r om r e l a t i on. Person ( name :string, city :string, phone : integer) p – Pr oj e c t i on ( ) De l e t e s unwant e d c ol um ns f r om r e l a t i on. • Pur c ha s e : > < – Joi n ( ) Al l ows us t o com bi ne t wo r e l a t i ons . – Ea c h t upl e i s 40 byt e s l ong, 100 t upl e s pe r page , 1000 – Se t - di f f e r enc e Tupl e s i n r e l n. 1, but not i nr e l n. 2. page s ( i . e . , 100, 000 t upl e s , 4M B f or t he e nt i r e r e l a t i on) . – Uni on Tupl e s i n r e l n. 1 a nd i nr e l n. 2. • Pe r s on: – Aggr e gat i on ( N , e t c . ) a nd SUM , M I GROUP BY – Ea c h t upl e i s 50 byt e s l ong, 80 t upl e s pe r page , 500 page s ( i . e , 40, 000 t upl e s , 2M B f or t he e nt i r e r e l a t i on) . SELECT * Us i ng a n I nde x f or Se l e c t i ons Si m pl e Se l e c t i ons FROM Person R WHERE R.phone < ‘543%’ • Cos t de pe nds on #qual i f yi ng t upl e s , a nd cl us t e r i ng. – Cos t of f i ndi ng qua l i f yi ng da t a e nt r i e s ( t ypi c a l l y s m a l l ) pl us c os t • Of t he f or m s R at ( ) R op . t r val ue of r e t r i e vi ng r e c or ds . – I n e xa m pl e , a s s um i ng uni f or m di s t r i but i on of phone s , a bout 54% • W i t h no i nde x, uns or t e d: M us t e s s e nt i a l l y s c a n t he whol e r e l a t i on; of t upl e s qua l i f y ( 500 pa ge s , 50000 t upl e s ) . W i t h a c l us t e r e d c os t i s M ( #pa ge s i n R) . i nde x, c os t i s l i t t l e m or e t ha n 500 I / Os ; i func l us t e r e d, up t o 50000 • W i t h a n i nde x on s e l e c t i on a t t r i but e : Us e i nde x t o f i nd qua l i f yi ng I / Os ! da t a e nt r i e s , t he n r e t r i e ve c or r e s pondi ng da t a r e c or ds . ( Ha s h i nde x • I mpor t ant r e f i neme nt f orunc l us t e r ed i nde x es : us e f ul onl y f or e qua l i t y s el e c t i ons . ) 1. Fi nd s or t t he r i d’ s of t he qua l i f yi ng da t a e nt r i e s . • Re s ul t s i z e e s t i m a t i on: 2. Fe t c h r i ds i n or de r . Thi s e ns ur e s t ha t e a c h da t a pa ge i s l ooke d a t ( Si z e of R) * r e duc t i on f a c t or . j us t onc e ( t hough # of s uc h pa ge s l i ke l y t o be hi ghe r t ha n wi t h c l us t e r i ng) . M or e on t hi s l a t e r . Two Appr oa c he s t o Ge ne r a l I nt e r s e c t i on of Ri ds Se l e c t i ons • Se c ond a ppr oa c h • Fi r s t a ppr oa c h:Fi nd t he mos t s el e c t i v e ac c e s s pat h, r e t r i e ve t upl e s us i ng i t , a nd a ppl y any r em a i ni ng – Ge t s e t s of r i ds of da t a r e cor ds us i ng e ac h m a t c hi ng t e r m s t ha t don’ t m a t c h t he i nde x: i nde x. – M os t s e l e c t i ve ac ce s s pat h: An i nde x or f i l e s ca n t ha t – The n i nt er s ec tt he s e s e t s of r i ds . we e s t i m a t e wi l l r e qui r e t he f e we s t pa ge I / Os . – Re t r i e ve t he r e cor ds a nd appl y any r e m a i ni ng t e r m s . – Cons i de r c i t y=“s e at t l e AND phone <“543% ” : • A ha s h i nde x on c i t y c an be us e d; t he n, phone<“ 543% ” m us t be c he c ke d f or e ac h r e t r i e ve d t upl e . • Si m i l a r l y, a b- t r e e i nde x on phonecoul d be us e d; c i t y =“s e at t l e ” m us t t he n be c he cke d. 2
Recommend
More recommend