1

Lecture 5: Relational Algebra and XML

Monday, April 26th, 2004

2

Course Agenda

  • Today: XML and relational algebra
  • Next two weeks: the internals of DBMS.
    – Covered in gory detail in the book, but stay tuned for reading assignments.
  • May 20th (not 17th!): Phil Bernstein on meta-data management.
  • May 24th: data integration.
  • May 27th: final exam.

3

Agenda

  • Relational algebra
  • XML:
    – What is it and why do we care?
    – Data model
    – Query language: XPath
    – Real query language: XQuery.
    – General ruminations about XML.

4

Relational Algebra

  • Formalism for creating new relations from existing ones
  • Its place in the big picture:

Declarative query language  →  Algebra  →  Implementation
SQL, relational calculus    →  Relational algebra, relational bag algebra

5

Relational Algebra

  • Five operators:
    – Union: ∪
    – Difference: –
    – Selection: σ
    – Projection: Π
    – Cartesian Product: ×
  • Derived or auxiliary operators:
    – Intersection, complement
    – Joins (natural, equi-join, theta join, semi-join)
    – Renaming: ρ

6

1. Union and 2. Difference

  • R1 ∪ R2
  • Example:
    – ActiveEmployees ∪ RetiredEmployees
  • R1 – R2
  • Example:
    – AllEmployees – RetiredEmployees

7

What about Intersection?

  • It is a derived operator
  • R1 ∩ R2 = R1 – (R1 – R2)
  • Also expressed as a join (will see later)
  • Example
    – UnionizedEmployees ∩ RetiredEmployees
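The identity R1 ∩ R2 = R1 – (R1 – R2) can be checked on a small instance. The sketch below models relations as Python sets; the employee names are made up for illustration and are not from the lecture.

```python
# Hypothetical relations, modelled as Python sets of employee names.
unionized = {"Alice", "Bob", "Carol"}
retired = {"Bob", "Carol", "Dave"}

# Intersection built from difference only: R1 ∩ R2 = R1 − (R1 − R2)
derived = unionized - (unionized - retired)

# Compare with Python's built-in set intersection.
print(derived == (unionized & retired))
```

The inner difference removes everything that is also retired, so subtracting it again leaves exactly the employees that appear in both relations.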

8

3. Selection

  • Returns all tuples which satisfy a condition
  • Notation: σ_c(R)
  • Examples
    – σ_{Salary > 40000}(Employee)
    – σ_{name = “Smith”}(Employee)
  • The condition c can be =, <, ≤, >, ≥, <>

9

Selection Example

Employee
SSN        Name   DepartmentID  Salary
999999999  John   1             30,000
777777777  Tony   1             32,000
888888888  Alice  2             45,000

Find all employees with salary more than $40,000.

σ_{Salary > 40000}(Employee)

SSN        Name   DepartmentID  Salary
888888888  Alice  2             45,000
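As a rough illustration, selection behaves like a filter over the tuples of a relation. The sketch below uses the Employee instance from this slide; the `Employee` class and the `select` helper are names chosen here, not part of the lecture.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    ssn: str
    name: str
    department_id: int
    salary: int


EMPLOYEES = [
    Employee("999999999", "John", 1, 30000),
    Employee("777777777", "Tony", 1, 32000),
    Employee("888888888", "Alice", 2, 45000),
]


def select(condition, relation):
    """σ_condition(relation): keep the tuples for which the condition holds."""
    return [t for t in relation if condition(t)]


# σ_{Salary > 40000}(Employee)
high_paid = select(lambda e: e.salary > 40000, EMPLOYEES)
print(high_paid)
```

The schema of the result is the same as the schema of the input; only the set of tuples shrinks.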

10

4. Projection

  • Eliminates columns, then removes duplicates
  • Notation: Π_{A1,…,An}(R)
  • Example: project social-security number and names:
    – Π_{SSN, Name}(Employee)
    – Output schema: Answer(SSN, Name)

11

Projection Example

Employee
SSN        Name   DepartmentID  Salary
999999999  John   1             30,000
777777777  Tony   1             32,000
888888888  Alice  2             45,000

Π_{SSN, Name}(Employee)

SSN        Name
999999999  John
777777777  Tony
888888888  Alice
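The "then removes duplicates" step only becomes visible when the projected columns do not contain a key. A minimal sketch under set semantics, with relations as lists of dicts (the `project` helper is a name chosen here):

```python
def project(attributes, relation):
    """Π_attributes(relation) under set semantics: drop columns, then drop duplicates."""
    seen = set()
    result = []
    for row in relation:
        reduced = tuple(row[a] for a in attributes)
        if reduced not in seen:  # duplicate elimination
            seen.add(reduced)
            result.append(dict(zip(attributes, reduced)))
    return result


EMPLOYEE = [
    {"SSN": "999999999", "Name": "John", "DepartmentID": 1, "Salary": 30000},
    {"SSN": "777777777", "Name": "Tony", "DepartmentID": 1, "Salary": 32000},
    {"SSN": "888888888", "Alice": None, "Name": "Alice", "DepartmentID": 2, "Salary": 45000},
]
del EMPLOYEE[2]["Alice"]  # keep the row shape identical to the slide's table

names = project(["SSN", "Name"], EMPLOYEE)       # three rows, SSN is a key
departments = project(["DepartmentID"], EMPLOYEE)  # John and Tony collapse into one row
print(departments)
```

Projecting on SSN and Name keeps all three rows, while projecting on DepartmentID alone returns two rows because departments 1 and 1 merge.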

12

5. Cartesian Product

  • Each tuple in R1 with each tuple in R2
  • Notation: R1 × R2
  • Example:
    – Employee × Dependents
  • Very rare in practice; mainly used to express joins

13

Cartesian Product Example

Employee
Name  SSN
John  999999999
Tony  777777777

Dependents
EmployeeSSN  Dname
999999999    Emily
777777777    Joe

Employee × Dependents
Name  SSN        EmployeeSSN  Dname
John  999999999  999999999    Emily
John  999999999  777777777    Joe
Tony  777777777  999999999    Emily
Tony  777777777  777777777    Joe

14

Renaming

  • Changes the schema, not the instance
  • Notation: ρ_{B1,…,Bn}(R)
  • Example:
    – ρ_{LastName, SocSocNo}(Employee)
    – Output schema: Answer(LastName, SocSocNo)

15

Renaming Example

Employee
Name  SSN
John  999999999
Tony  777777777

ρ_{LastName, SocSocNo}(Employee)

LastName  SocSocNo
John      999999999
Tony      777777777

16

Natural Join

  • Notation: R1 ⋈ R2
  • Meaning: R1 ⋈ R2 = Π_A(σ_C(R1 × R2))
  • Where:
    – The selection σ_C checks equality of all common attributes
    – The projection eliminates the duplicate common attributes

17

Natural Join Example

Employee
Name  SSN
John  999999999
Tony  777777777

Dependents
SSN        Dname
999999999  Emily
777777777  Joe

Employee ⋈ Dependents = Π_{Name, SSN, Dname}(σ_{SSN=SSN2}(Employee × ρ_{SSN2, Dname}(Dependents)))

Name  SSN        Dname
John  999999999  Emily
Tony  777777777  Joe
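The definition R1 ⋈ R2 = Π_A(σ_C(R1 × R2)) can be followed step by step in code. This is a sketch only, with relations as lists of dicts; `natural_join` is a name chosen here, and a real engine would not materialise the full Cartesian product.

```python
from itertools import product


def natural_join(r1, r2):
    """R1 ⋈ R2 = Π_A(σ_C(R1 × R2)) for relations given as lists of dicts."""
    if not r1 or not r2:
        return []
    common = set(r1[0]) & set(r2[0])  # attributes shared by both schemas
    result = []
    for t1, t2 in product(r1, r2):  # Cartesian product
        if all(t1[a] == t2[a] for a in common):  # selection: equality on common attributes
            result.append({**t1, **t2})  # merged dict keeps one copy of each common attribute
    return result


EMPLOYEE = [
    {"Name": "John", "SSN": "999999999"},
    {"Name": "Tony", "SSN": "777777777"},
]
DEPENDENTS = [
    {"SSN": "999999999", "Dname": "Emily"},
    {"SSN": "777777777", "Dname": "Joe"},
]

joined = natural_join(EMPLOYEE, DEPENDENTS)
print(joined)
```

When the two schemas share no attribute, `common` is empty, the selection accepts every pair, and the function degenerates to the Cartesian product.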

18

Natural Join

  • R =
    A  B
    X  Y
    X  Z
    Y  Z
    Z  V

  • S =
    B  C
    Z  U
    V  W
    Z  V

  • R ⋈ S =
    A  B  C
    X  Z  U
    X  Z  V
    Y  Z  U
    Y  Z  V
    Z  V  W

19

Natural Join

  • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ?
  • Given R(A, B, C), S(D, E), what is R ⋈ S ?
  • Given R(A, B), S(A, B), what is R ⋈ S ?

20

Theta Join

  • A join that involves a predicate
  • R1 ⋈_θ R2 = σ_θ(R1 × R2)
  • Here θ can be any condition

21

Eq-join

  • A theta join where θ is an equality
  • R1 ⋈_{A=B} R2 = σ_{A=B}(R1 × R2)
  • Example:
    – Employee ⋈_{SSN=SSN} Dependents
  • Most useful join in practice

22

Semijoin

  • R ⋉ S = Π_{A1,…,An}(R ⋈ S)
  • Where A1, …, An are the attributes in R
  • Example:
    – Employee ⋉ Dependents

23

Semijoins in Distributed Databases

  • Semijoins are used in distributed databases

Employee(SSN, Name, ...)  <-- network -->  Dependents(SSN, Dname, Age, ...)

Employee ⋈_{ssn=ssn} (σ_{age>71}(Dependents))

T = Π_{SSN} σ_{age>71}(Dependents)
R = Employee ⋉ T
Answer = R ⋈ Dependents

24

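Complex RA Expressions

[Figure: a relational algebra expression tree over Person, Purchase, Person and Product, combining the selections σ_{name=fred} and σ_{name=gizmo}, the projections Π_{pid} and Π_{ssn}, joins on seller-ssn=ssn, pid=pid and buyer-ssn=ssn, and a final projection Π_{name}]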

25

Operations on Bags

A bag = a set with repeated elements
All operations need to be defined carefully on bags

  • {a,b,b,c} ∪ {a,b,b,b,e,f,f} = {a,a,b,b,b,b,b,c,e,f,f}
  • {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b}
  • σ_C(R): preserve the number of occurrences
  • Π_A(R): no duplicate elimination
  • Cartesian product, join: no duplicate elimination

Important! Relational Engines work on bags, not sets!
Reading assignment: 5.3 – 5.4
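Python's `collections.Counter` behaves like a bag, which makes it easy to try the two examples. Under the usual definition, multiplicities add for union and subtract (never going below zero) for difference, so the difference example evaluates to {a, b, b}: d does not occur in the left bag and therefore cannot occur in the result.

```python
from collections import Counter

# Bag union adds multiplicities.
bag_union = Counter("abbc") + Counter("abbbeff")
print(sorted(bag_union.elements()))

# Bag difference subtracts multiplicities and drops anything at zero or below.
bag_difference = Counter("abbbcc") - Counter("bcccd")
print(sorted(bag_difference.elements()))
```

Here each character stands for one tuple; `elements()` expands the counts back into a list with repetitions.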

26

Finally: RA has Limitations!

  • How do we compute “transitive closure”?
  • Find all direct and indirect relatives of Fred

Name1  Name2  Relationship
Fred   Mary   Father
Mary   Joe    Cousin
Mary   Bill   Spouse
Nancy  Lou    Sister
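A fixed RA expression can only follow a fixed number of join steps, whereas transitive closure needs iteration until nothing new is found. The sketch below does that in a general-purpose language; it assumes the relationship pairs are symmetric, and `relatives` is a name chosen here.

```python
def relatives(person, pairs):
    """All people reachable from `person` through one or more relationship pairs."""
    neighbours = {}
    for a, b in pairs:  # treat each pair as symmetric (an assumption)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    found = set()
    frontier = {person}
    while frontier:  # repeat the "join" until no new person appears
        reached = set()
        for p in frontier:
            reached |= neighbours.get(p, set())
        frontier = reached - found - {person}
        found |= frontier
    return found


PAIRS = [("Fred", "Mary"), ("Mary", "Joe"), ("Mary", "Bill"), ("Nancy", "Lou")]
print(sorted(relatives("Fred", PAIRS)))
```

The number of loop iterations depends on the data, which is exactly what a single relational algebra expression cannot express.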

27

XML

28

XML

  • eXtensible Markup Language
  • XML 1.0 – a recommendation from W3C, 1998
  • Roots: SGML (a very nasty language).
  • After the roots: a format for sharing data

29

Why XML is of Interest to Us

  • XML is just syntax for data
    – Note: we have no syntax for relational data
    – But XML is not relational: semistructured
  • This is exciting because:
    – Can translate any data to XML
    – Can ship XML over the Web (HTTP)
    – Can input XML into any application
    – Thus: data sharing and exchange on the Web

30

XML Data Sharing and Exchange

[Figure: XML data exchanged over the Web (HTTP) between applications and data sources (relational data, object-relational, legacy data). Specific data management tasks: Transform, Integrate, Warehouse]

31

From HTML to XML

HTML describes the presentation

32

HTML

    <h1> Bibliography </h1>
    <p> <i> Foundations of Databases </i>
        Abiteboul, Hull, Vianu
        <br> Addison Wesley, 1995
    <p> <i> Data on the Web </i>
        Abiteboul, Buneman, Suciu
        <br> Morgan Kaufmann, 1999

33

XML

    <bibliography>
      <book> <title> Foundations… </title>
             <author> Abiteboul </author>
             <author> Hull </author>
             <author> Vianu </author>
             <publisher> Addison Wesley </publisher>
             <year> 1995 </year>
      </book>
      …
    </bibliography>

XML describes the content

34

Web Services

  • A new paradigm for creating distributed applications?
  • Systems communicate via messages, contracts.
  • Example: order processing system.
  • MS .NET, J2EE – some of the platforms
  • XML – a part of the story; the data format.

35

XML Terminology

  • tags: book, title, author, …
  • start tag: <book>, end tag: </book>
  • elements: <book>…</book>, <author>…</author>
  • elements are nested
  • empty element: <red></red> abbrv. <red/>
  • an XML document: single root element
  • well formed XML document: if it has matching tags

36

More XML: Attributes

    <book price="55" currency="USD">
      <title> Foundations of Databases </title>
      <author> Abiteboul </author>
      …
      <year> 1995 </year>
    </book>

attributes are alternative ways to represent data

37

More XML: Oids and References

    <person id="o555"> <name> Jane </name> </person>

    <person id="o456"> <name> Mary </name>
                       <children idref="o123 o555"/>
    </person>

    <person id="o123" mother="o456"> <name>John</name> </person>

oids and references in XML are just syntax

38

XML Semantics: a Tree!

    <data>
      <person id="o555">
        <name> Mary </name>
        <address>
          <street> Maple </street>
          <no> 345 </no>
          <city> Seattle </city>
        </address>
      </person>
      <person>
        <name> John </name>
        <address> Thailand </address>
        <phone> 23456 </phone>
      </person>
    </data>

[Figure: the document drawn as a tree. The root element node data has two person children. The first carries the attribute node id = o555 and the children name (Mary) and address (street Maple, no 345, city Seattle); the second has name (John), address (Thai) and phone (23456). Legend: element node, text node, attribute node]

Order matters !!!
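To see the tree view concretely, the document can be parsed with Python's standard `xml.etree.ElementTree` and walked in document order. This is only a sketch: the `outline` helper is a name chosen here, and whitespace inside elements has been trimmed in the embedded copy of the document.

```python
import xml.etree.ElementTree as ET

DOC = """<data>
  <person id="o555">
    <name>Mary</name>
    <address><street>Maple</street><no>345</no><city>Seattle</city></address>
  </person>
  <person>
    <name>John</name>
    <address>Thailand</address>
    <phone>23456</phone>
  </person>
</data>"""


def outline(node, depth=0):
    """Return one line per element node, in document order."""
    text = (node.text or "").strip()
    attrs = "".join(f" @{k}={v}" for k, v in node.attrib.items())
    lines = ["  " * depth + node.tag + attrs + (f": {text}" if text else "")]
    for child in node:  # children are visited in the order they appear
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
print("\n".join(outline(root)))
```

Element nodes become lines, attribute nodes are shown with an `@` prefix, and text nodes follow the colon. Because children are stored as an ordered list, swapping two elements in the input changes the output.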

39

XML Data

  • XML is self-describing
  • Schema elements become part of the data
    – Relational schema: persons(name, phone)
    – In XML <persons>, <name>, <phone> are part of the data, and are repeated many times
  • Consequence: XML is much more flexible
  • XML = semistructured data

40

Relational Data as XML

person
name  phone
John  3634
Sue   6343
Dick  6363

XML:

    <person>
      <row> <name>John</name> <phone> 3634</phone></row>
      <row> <name>Sue</name>  <phone> 6343</phone></row>
      <row> <name>Dick</name> <phone> 6363</phone></row>
    </person>

[Figure: the same data as a tree. The person node has three row children, each with a name child (“John”, “Sue”, “Dick”) and a phone child (3634, 6343, 6363)]

41

XML is Semi-structured Data

  • Missing attributes:

    <person> <name> John</name>
             <phone>1234</phone>
    </person>
    <person> <name>Joe</name>
    </person>                          ← no phone !

  • Could represent in a table with nulls

name  phone
John  1234
Joe   -

42

XML is Semi-structured Data

  • Repeated attributes

    <person> <name> Mary</name>
             <phone>2345</phone>
             <phone>3456</phone>
    </person>                          ← two phones !

  • Impossible in tables:

name  phone
Mary  2345  3456   ???

43

XML is Semi-structured Data

  • Attributes with different types in different objects

    <person> <name> <first> John </first>
                    <last> Smith </last>
             </name>
             <phone>1234</phone>
    </person>                          ← structured name !

  • Nested collections (no 1NF)
  • Heterogeneous collections:
    – <db> contains both <book>s and <publisher>s

44

Document Type Definitions DTD

  • part of the original XML specification
  • an XML document may have a DTD
  • XML document:
    – well-formed = if tags are correctly closed
    – Valid = if it has a DTD and conforms to it
  • validation is useful in data exchange

45

Very Simple DTD

    <!DOCTYPE company [
      <!ELEMENT company ((person|product)*)>
      <!ELEMENT person (ssn, name, office, phone?)>
      <!ELEMENT ssn (#PCDATA)>
      <!ELEMENT name (#PCDATA)>
      <!ELEMENT office (#PCDATA)>
      <!ELEMENT phone (#PCDATA)>
      <!ELEMENT product (pid, name, description?)>
      <!ELEMENT pid (#PCDATA)>
      <!ELEMENT description (#PCDATA)>
    ]>

46

Very Simple DTD

Example of valid XML document:

    <company>
      <person> <ssn> 123456789 </ssn>
               <name> John </name>
               <office> B432 </office>
               <phone> 1234 </phone>
      </person>
      <person> <ssn> 987654321 </ssn>
               <name> Jim </name>
               <office> B123 </office>
      </person>
      <product> ... </product>
      ...
    </company>

47

DTD: The Content Model

    <!ELEMENT tag (CONTENT)>           ← content model

  • Content model:
    – Complex = a regular expression over other elements
    – Text-only = #PCDATA
    – Empty = EMPTY
    – Any = ANY
    – Mixed content = (#PCDATA | A | B | C)*

48

DTD: Regular Expressions

sequence
    DTD:  <!ELEMENT name (firstName, lastName)>
    XML:  <name> <firstName> . . . . . </firstName>
                 <lastName> . . . . . </lastName>
          </name>

optional
    DTD:  <!ELEMENT name (firstName?, lastName)>

Kleene star
    DTD:  <!ELEMENT person (name, phone*)>
    XML:  <person> <name> . . . . . </name>
                   <phone> . . . . . </phone>
                   <phone> . . . . . </phone>
                   <phone> . . . . . </phone>
                   . . . . . .
          </person>

alternation
    DTD:  <!ELEMENT person (name, (phone|email))>
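Since a complex content model is a regular expression over child element names, it can be checked with an ordinary regex engine. The sketch below is not a DTD validator (Python's standard library does not include one); the two content models are translated into regexes by hand, and `children_match` is a name chosen here.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the slide, hand-translated into regexes over "tag " tokens.
CONTENT_MODELS = {
    "name": r"(firstName )?lastName ",  # (firstName?, lastName)
    "person": r"name (phone )*",        # (name, phone*)
}


def children_match(element):
    """True if the sequence of child tags matches the element's content model."""
    sequence = "".join(child.tag + " " for child in element)
    return re.fullmatch(CONTENT_MODELS[element.tag], sequence) is not None


good = ET.fromstring("<person><name/><phone/><phone/></person>")
bad = ET.fromstring("<person><phone/><name/></person>")
short_name = ET.fromstring("<name><lastName/></name>")

print(children_match(good), children_match(bad), children_match(short_name))
```

The check only looks at one level of children; a full validator would apply it recursively and also handle #PCDATA, EMPTY, ANY and mixed content.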

49

Querying XML Data

  • XPath = simple navigation through the tree
  • XQuery = the SQL of XML
  • XSLT = recursive traversal
    – will not discuss in class

50

Sample Data for Queries

    <bib>
      <book> <publisher> Addison-Wesley </publisher>
             <author> Serge Abiteboul </author>
             <author> <first-name> Rick </first-name>
                      <last-name> Hull </last-name>
             </author>
             <author> Victor Vianu </author>
             <title> Foundations of Databases </title>
             <year> 1995 </year>
      </book>
      <book price="55">
             <publisher> Freeman </publisher>
             <author> Jeffrey D. Ullman </author>
             <title> Principles of Database and Knowledge Base Systems</title>
             <year> 1998 </year>
      </book>
    </bib>

51

Data Model for XPath

[Figure: the sample data as a tree. “The root” sits above “the root element” bib, whose two book children contain publisher, author, . . . nodes with text such as Addison-Wesley and Serge Abiteboul]

52

XPath: Simple Expressions

    /bib/book/year

Result: <year> 1995 </year>
        <year> 1998 </year>

    /bib/paper/year

Result: empty (there were no papers)

53

XPath: Restricted Kleene Closure

    //author

Result: <author> Serge Abiteboul </author>
        <author> <first-name> Rick </first-name>
                 <last-name> Hull </last-name>
        </author>
        <author> Victor Vianu </author>
        <author> Jeffrey D. Ullman </author>

    /bib//first-name

Result: <first-name> Rick </first-name>

54

XPath: Text Nodes

    /bib/book/author/text()

Result: Serge Abiteboul
        Victor Vianu
        Jeffrey D. Ullman

Rick Hull doesn’t appear because he has first-name, last-name

Functions in XPath:

    – text() = matches the text value
    – node() = matches any node (= * or @* or text())
    – name() = returns the name of the current tag

55

XPath: Wildcard

    //author/*

Result: <first-name> Rick </first-name>
        <last-name> Hull </last-name>

* Matches any element

56

XPath: Attribute Nodes

    /bib/book/@price

Result: “55”

@price means that price has to be an attribute

57

XPath: Predicates

    /bib/book/author[first-name]

Result: <author> <first-name> Rick </first-name>
                 <last-name> Hull </last-name>
        </author>
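Several of the path expressions on the preceding slides can be tried with Python's standard `xml.etree.ElementTree`, which implements a limited subset of XPath. In this sketch paths are written relative to the parsed root element (so `/bib/book` becomes `./book`), and attribute values are fetched with `.get()` because the library's path subset selects elements only; `text()` steps are not supported.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)  # bib is the root element

years = [y.text for y in bib.findall("./book/year")]               # /bib/book/year
papers = bib.findall("./paper/year")                               # /bib/paper/year
authors = bib.findall(".//author")                                 # //author
parts = [e.tag for e in bib.findall(".//author/*")]                # //author/*
prices = [b.get("price") for b in bib.findall("./book[@price]")]   # /bib/book/@price
structured = bib.findall("./book/author[first-name]")              # /bib/book/author[first-name]

print(years, len(authors), parts, prices)
```

Each result is a list in document order, which mirrors how the slides present the answers.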

58

XPath: More Predicates

    /bib/book/author[firstname][address[//zip][city]]/lastname

Result: <lastname> … </lastname>
        <lastname> … </lastname>

59

XPath: More Predicates

    /bib/book[@price < "60"]
    /bib/book[author/@age < "25"]
    /bib/book[author/text()]

60

XPath: Summary

bib                                      matches a bib element
*                                        matches any element
/                                        matches the root element
/bib                                     matches a bib element under root
bib/paper                                matches a paper in bib
bib//paper                               matches a paper in bib, at any depth
//paper                                  matches a paper at any depth
paper | book                             matches a paper or a book
@price                                   matches a price attribute
bib/book/@price                          matches price attribute in book, in bib
bib/book[@price<"55"]/author/lastname    matches …

61

Comments on XPath?

  • What’s good about it?
  • What can’t it do that you want it to do?
  • How does it compare, say, to SQL?

62

XQuery

  • Based on Quilt, which is based on XML-QL
  • Uses XPath to express more complex queries

63

FLWR (“Flower”) Expressions

    FOR ...
    LET ...
    WHERE ...
    RETURN ...

64

XQuery

Find all book titles published after 1995:

    FOR $x IN document("bib.xml")/bib/book
    WHERE $x/year > 1995
    RETURN { $x/title }

Result: <title> abc </title>
        <title> def </title>
        <title> ghi </title>
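For readers more familiar with general-purpose languages, the FOR / WHERE / RETURN shape of this query maps closely onto a list comprehension. The sketch below runs the same logic over a trimmed copy of the sample bibliography; it is an analogy in Python, not XQuery itself.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book><title>Foundations of Databases</title><year>1995</year></book>
  <book><title>Principles of Database and Knowledge Base Systems</title><year>1998</year></book>
</bib>"""

bib = ET.fromstring(BIB)

titles = [
    x.findtext("title")                  # RETURN { $x/title }
    for x in bib.findall("./book")       # FOR $x IN document("bib.xml")/bib/book
    if int(x.findtext("year")) > 1995    # WHERE $x/year > 1995
]
print(titles)
```

The FOR clause generates one binding per book, the WHERE clause filters the bindings, and the RETURN clause builds one result per surviving binding.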

65

XQuery

Find book titles by the co-authors of “Database Theory”:

    FOR $x IN bib/book[title/text() = "Database Theory"]/author
        $y IN bib/book[author/text() = $x/text()]/title
    RETURN <answer> { $y/text() } </answer>

Result: <answer> abc </answer>
        <answer> def </answer>
        <answer> ghi </answer>

The answer will contain duplicates!

66

XQuery

Same as before, but eliminate duplicates:

    FOR $x IN bib/book[title/text() = "Database Theory"]/author
        $y IN distinct(bib/book[author/text() = $x/text()]/title)
    RETURN <answer> { $y/text() } </answer>

Result: <answer> abc </answer>
        <answer> def </answer>
        <answer> ghi </answer>

distinct = a function that eliminates duplicates

67

XQuery: Nesting

For each author of a book by Morgan Kaufmann, list all books she published:

    FOR $a IN distinct(document("bib.xml")/bib/book[publisher="Morgan Kaufmann"]/author)
    RETURN <result>
             { $a,
               FOR $t IN /bib/book[author=$a]/title
               RETURN $t }
           </result>

68

XQuery

Result:

    <result>
      <author>Jones</author>
      <title> abc </title>
      <title> def </title>
    </result>
    <result>
      <author> Smith </author>
      <title> ghi </title>
    </result>

69

XQuery

  • FOR $x in expr -- binds $x to each value in the list expr
  • LET $x = expr -- binds $x to the entire list expr
    – Useful for common subexpressions and for aggregations

70

XQuery

    <big_publishers>
      FOR $p IN distinct(document("bib.xml")//publisher)
      LET $b := document("bib.xml")/book[publisher = $p]
      WHERE count($b) > 100
      RETURN { $p }
    </big_publishers>

count = a (aggregate) function that returns the number of elms

71

XQuery

Find books whose price is larger than average:

    LET $a = avg(document("bib.xml")/bib/book/price)
    FOR $b in document("bib.xml")/bib/book
    WHERE $b/price > $a
    RETURN { $b }

Let’s try to write this in SQL…

72

XQuery

Summary:

  • FOR-LET-WHERE-RETURN = FLWR

FOR/LET Clauses  →  List of tuples  →  WHERE Clause  →  List of tuples  →  RETURN Clause  →  Instance of XQuery data model

73

FOR v.s. LET

FOR
  • Binds node variables → iteration

LET
  • Binds collection variables → one value

74

FOR v.s. LET

    FOR $x IN document("bib.xml")/bib/book
    RETURN <result> { $x } </result>

Returns:

    <result> <book>...</book></result>
    <result> <book>...</book></result>
    <result> <book>...</book></result>
    ...

    LET $x := document("bib.xml")/bib/book
    RETURN <result> { $x } </result>

Returns:

    <result> <book>...</book>
             <book>...</book>
             <book>...</book>
             ...
    </result>

75

Collections in XQuery

  • Ordered and unordered collections
    – /bib/book/author = an ordered collection
    – Distinct(/bib/book/author) = an unordered collection
  • LET $a = /bib/book → $a is a collection
  • $b/author → a collection (several authors...)

    RETURN <result> { $b/author } </result>

Returns:

    <result> <author>...</author>
             <author>...</author>
             <author>...</author>
             ...
    </result>

76

Collections in XQuery

What about collections in expressions?

  • $b/price → list of n prices
  • $b/price * 0.7 → list of n numbers
  • $b/price * $b/quantity → list of n x m numbers ??
  • $b/price * ($b/quant1 + $b/quant2) ≠ $b/price * $b/quant1 + $b/price * $b/quant2 !!

77

Sorting in XQuery

    <publisher_list>
      FOR $p IN distinct(document("bib.xml")//publisher)
      RETURN <publisher> <name> { $p/text() } </name> ,
               FOR $b IN document("bib.xml")//book[publisher = $p]
               RETURN <book>
                        { $b/title , $b/price }
                      </book> SORTBY(price DESCENDING)
             </publisher> SORTBY(name)
    </publisher_list>

78

If-Then-Else

    FOR $h IN //holding
    RETURN <holding>
             { $h/title,
               IF $h/@type = "Journal"
                 THEN $h/editor
                 ELSE $h/author }
           </holding> SORTBY (title)

79

Existential Quantifiers

    FOR $b IN //book
    WHERE SOME $p IN $b//para SATISFIES
            contains($p, "sailing")
            AND contains($p, "windsurfing")
    RETURN { $b/title }

80

Universal Quantifiers

    FOR $b IN //book
    WHERE EVERY $p IN $b//para SATISFIES
            contains($p, "sailing")
    RETURN { $b/title }

81

Other Stuff in XQuery

  • BEFORE and AFTER
    – for dealing with order in the input
  • FILTER
    – deletes some edges in the result tree
  • Recursive functions
    – Currently: arbitrary recursion
    – Perhaps more restrictions in the future?

82

Final Comments on XML

  • How are we going to process XML efficiently?
    – Special purpose XML engines, or
    – Add functionality to relational engines?
  • Need to manage XML streams.
  • Here, data management is much closer to other programming tasks.