http://www.xerial.org/ I DECIDED TO EVERYBODY MUST START MASTERING - - PowerPoint PPT Presentation



slide-1
SLIDE 1

http://www.xerial.org/

slide-2
SLIDE 2

• It's a kind of tragedy…

"I DECIDED TO START A NEW XML PROJECT. MASTERING XML IS CRUCIAL TO OUR COMPANY BECAUSE IT IS A COMPLETELY NEW DATA MODEL. EVERYBODY MUST START LEARNING SAX, DOM, XPATH, XQUERY, DTD, XML SCHEMA, RELAX NG…"

slide-3
SLIDE 3

• Benefits of using XML:
  › XML is a portable text-data format
  › Tree-structured XML can reduce redundancy of relational data.

Relational Data

  Company  Employee  Office
  1        e1        NY
  1        e2        NY

XML Data

  <Company value="1">
    <Emp value="e1">
      <Office>NY</Office>
    </Emp>
    <Emp value="e2">
      <Office>NY</Office>
    </Emp>
  </Company>

[Figure: tree view of the XML data. Co has two Emp children (e1, e2), and each Emp has one Office child (NY).]

slide-4
SLIDE 4

• Querying relational data translated into XML
• Q: Retrieve a node tuple (Co, Emp, Office) from the XML data
  › e.g. XPath, a path expression query:

    /Co/Emp/Office

Relational Data

  Co  Emp  Office
  1   e1   NY
  1   e2   NY

[Figure: XML Data as a tree. Co has two Emp children (e1, e2), and each Emp has one Office child (NY).]
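As a minimal sketch of what the path query /Co/Emp/Office does, the snippet below walks the same Company/Emp/Office path with Python's standard `xml.etree.ElementTree`. It assumes the XML data from the earlier slide (where the Co node is spelled `<Company>`); the helper name `query_tuples` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The XML data from the slides, written with straight quotes.
XML_DATA = """
<Company value="1">
  <Emp value="e1"><Office>NY</Office></Emp>
  <Emp value="e2"><Office>NY</Office></Emp>
</Company>
"""

def query_tuples(xml_text):
    """Return (Co, Emp, Office) value tuples by following Company/Emp/Office."""
    company = ET.fromstring(xml_text)
    rows = []
    # "Emp" and "Office" are evaluated relative to their parent element,
    # which mirrors the steps of the path expression.
    for emp in company.findall("Emp"):
        for office in emp.findall("Office"):
            rows.append((company.get("value"), emp.get("value"), office.text))
    return rows

rows = query_tuples(XML_DATA)
print(rows)
```

The query only works because the code hard-wires the nesting order Company, Emp, Office.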

slide-5
SLIDE 5

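• Tree-representation of relational data is not unique.

Relational Data

  Co  Emp  Office
  1   e1   NY
  1   e2   NY

[Figure: three XML trees that encode the same relational data. In the first, Co has two Emp children (e1, e2), each with an Office child (NY). In the second, Office (NY) is the root, with Co, Emp (e1) and Emp (e2) below it. In the third, Co has an Office child (NY), which has two Emp children (e1, e2).]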

slide-6
SLIDE 6

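• Users must know the entire XML structure to produce correct path queries.

  /Office[Co]/Emp
  /Co/Emp[Office]
  /Co/Office/Emp

  [X] : twig node to test

[Figure: each path query is paired with the tree it fits. /Office[Co]/Emp fits the tree rooted at Office (NY) with Co, Emp (e1) and Emp (e2) below it; /Co/Emp[Office] fits the tree where Co has two Emp children, each with an Office child; /Co/Office/Emp fits the tree where Co has an Office child holding both Emp nodes.]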
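To make the point concrete, here is a small sketch that runs the three path queries against three encodings of the same relation. The encodings are hypothetical reconstructions of the slide's trees (storing every value in a `value` attribute is an assumption), and the names `TREES`, `PATHS`, and `count_matches` exist only for this example. It uses the limited XPath subset of Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Three hypothetical encodings of the relation {(1, e1, NY), (1, e2, NY)}.
TREES = {
    "emp_over_office": '<Co value="1"><Emp value="e1"><Office value="NY"/></Emp>'
                       '<Emp value="e2"><Office value="NY"/></Emp></Co>',
    "office_root": '<Office value="NY"><Co value="1"/>'
                   '<Emp value="e1"/><Emp value="e2"/></Office>',
    "office_over_emp": '<Co value="1"><Office value="NY">'
                       '<Emp value="e1"/><Emp value="e2"/></Office></Co>',
}

# The slide's path queries, written relative to a synthetic parent node.
PATHS = ["Office[Co]/Emp", "Co/Emp[Office]", "Co/Office/Emp"]

def count_matches(xml_text, path):
    # Wrap the document in a synthetic parent so the path can start at the
    # document's root element, mimicking an absolute path such as /Co/Emp[Office].
    wrapper = ET.Element("doc")
    wrapper.append(ET.fromstring(xml_text))
    return len(wrapper.findall(path))

matches = {name: [count_matches(xml, path) for path in PATHS]
           for name, xml in TREES.items()}
for name, counts in matches.items():
    print(name, counts)
```

Each path finds the two Emp nodes in exactly one of the encodings and nothing in the other two, so a query written for one structure silently returns an empty result on the others.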

slide-7
SLIDE 7

• A key observation:
  › A relation is simply embedded in XML

[Figure: the three differently structured XML trees over Co, Emp and Office all map to the same Relational Data: (Co, Emp, Office) with rows (1, e1, NY) and (1, e2, NY).]

slide-8
SLIDE 8

WHY DO WE HAVE TO USE XPATH?

slide-9
SLIDE 9

• Query relations in XML
  › with an SQL-like syntax

    SELECT Co, Emp, Office from (XML Data)

• The query statement is stable for variously structured XML data

SQL over XML!

[Figure: Input XML Data (the three differently structured trees over Co, Emp and Office) all produce the same Result: the relation (Co, Emp, Office) with rows (1, e1, NY) and (1, e2, NY).]

slide-10
SLIDE 10

• Convert an SQL query, SELECT A, B, C, into an XML structure query.
  › There can be many structural variations of (A, B, C)

[Figure: several example tree shapes over the nodes A, B and C, and so on.]

• For N nodes, there exist N^(N-1) structural variations.
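The N^(N-1) figure matches the number of rooted trees over N distinct nodes when sibling order is ignored. The brute-force check below is only an illustration of that count under this reading of "structural variation"; the function names are hypothetical.

```python
from itertools import product

def reaches_root(v, parent, root, n):
    """Follow parent links from v; succeed if the root is reached within n steps."""
    for _ in range(n):
        if v == root:
            return True
        v = parent[v]
    return v == root

def count_rooted_labeled_trees(n):
    """Count the ways to arrange n distinct nodes into one rooted tree."""
    nodes = range(n)
    total = 0
    for root in nodes:
        others = [v for v in nodes if v != root]
        # Every non-root node picks a parent; keep only assignments in which
        # each node eventually reaches the root (no cycles, no self-loops).
        for parents in product(nodes, repeat=len(others)):
            parent = dict(zip(others, parents))
            if all(reaches_root(v, parent, root, n) for v in others):
                total += 1
    return total

counts = [count_rooted_labeled_trees(n) for n in (1, 2, 3, 4)]
print(counts)
```

For the three nodes A, B, C this gives 9 variations, which is why enumerating path queries by hand quickly becomes impractical.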

slide-11
SLIDE 11

• A node tuple (A, B, C) is an amoeba iff one of A, B and C is a common ancestor of the others.

[Figure: several example amoeba structures over the nodes A, B and C, and so on.]

• Amoeba join retrieves all amoeba structures in the XML data.
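The definition can be sketched as a naive, brute-force join: try every combination of nodes with the requested tags and keep those in which one node is an ancestor of all the others. This is an illustration of the definition, not the algorithm from the paper. The three sample documents and the `value` attribute convention are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from itertools import product

def is_ancestor(a, d, parent):
    """True if node a is a proper ancestor of node d."""
    node = parent.get(d)
    while node is not None:
        if node is a:
            return True
        node = parent.get(node)
    return False

def amoeba_join(root, tags):
    """Return every node tuple (one node per tag) in which one node is a
    common ancestor of all the others."""
    parent = {child: node for node in root.iter() for child in node}
    candidates = [list(root.iter(tag)) for tag in tags]
    result = []
    for combo in product(*candidates):
        if any(all(is_ancestor(top, other, parent)
                   for other in combo if other is not top)
               for top in combo):
            result.append(combo)
    return result

# Hypothetical encodings of the relation {(1, e1, NY), (1, e2, NY)}.
TREES = {
    "emp_over_office": '<Co value="1"><Emp value="e1"><Office value="NY"/></Emp>'
                       '<Emp value="e2"><Office value="NY"/></Emp></Co>',
    "office_root": '<Office value="NY"><Co value="1"/>'
                   '<Emp value="e1"/><Emp value="e2"/></Office>',
    "office_over_emp": '<Co value="1"><Office value="NY">'
                       '<Emp value="e1"/><Emp value="e2"/></Office></Co>',
}

sizes = {name: len(amoeba_join(ET.fromstring(xml), ["Co", "Emp", "Office"]))
         for name, xml in TREES.items()}
print(sizes)
```

The same call works on all three structures. In the first encoding it returns four node tuples rather than two, because Co is an ancestor of both Office nodes and the plain definition also accepts an Emp paired with the other employee's Office.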

slide-12
SLIDE 12

• Some amoeba structures may not form a relation.
  › Why is this structure not allowed?
• Because there are functional dependencies (FDs) implied in the XML structure.

[Figure: XML tree with one Company node, two Office nodes and four Emp nodes.]

ER-diagram (Data Model): company 1 : M office, office 1 : N employee

slide-13
SLIDE 13

INVALID STRUCTURE!

[Figure: XML tree with one Company node, two Office nodes and four Emp nodes.]

ER-diagram (Data Model): company 1 : M office, office 1 : N employee

• FD: X -> Y (From a given X, Y is uniquely determined)
  › employee -> office (Each employee belongs to an office)
  › office -> company (Each office belongs to a company)
• A relation in XML must have an amoeba structure corresponding to each FD.

slide-14
SLIDE 14

• The company has M offices, and each office has N employees.
• # of (company, office, employee) amoeba tuples: M x (M x N)
  › When M = 100, N = 5: 100 x (100 x 5) = 50,000
• The # of correct answers, however, is only M x N = 500

[Figure: XML tree with one Company node, three Office nodes and nine Emp nodes.]

ER-diagram: company 1 : M office, office 1 : N employee
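The arithmetic can be reproduced on a synthetic document. The sketch below assumes the layout company > office > employee with lowercase tag names and generated `value` attributes; `build_company` and `count_tuples` are hypothetical helpers for this illustration.

```python
import xml.etree.ElementTree as ET

def build_company(m, n):
    """Build <company> with m <office> children, each holding n <employee> children."""
    company = ET.Element("company")
    for i in range(m):
        office = ET.SubElement(company, "office", value=f"o{i}")
        for j in range(n):
            ET.SubElement(office, "employee", value=f"e{i}_{j}")
    return company

def count_tuples(company):
    offices = company.findall("office")
    employees = company.findall("office/employee")
    # The company is an ancestor of every office and every employee, so any
    # (office, employee) pair completes an amoeba under the plain definition.
    amoeba = len(offices) * len(employees)
    # Only pairs in which the employee sits under that office respect the FDs.
    correct = sum(len(office.findall("employee")) for office in offices)
    return amoeba, correct

amoeba, correct = count_tuples(build_company(100, 5))
print(amoeba, correct)
```

With M = 100 and N = 5 the plain amoeba count is 100 times larger than the number of valid rows, and the gap grows linearly with M.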

slide-15
SLIDE 15

• FDs: Emp -> Office, Office -> Company
• Bottom-up construction of query results
  1. Amoeba Join (Employee, Office)
  2. Amoeba Join (Office, Company)
• FD-aware amoeba join avoids invalid XML structures.

[Figure: XML tree with one Company node, three Office nodes and nine Emp nodes.]

ER-diagram: company 1 : M office, office 1 : N employee
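The two bottom-up steps can be sketched as two pairwise amoeba joins whose results are then combined on the shared office node. This is a naive illustration under assumed tag names (company, office, employee), not the implementation described in the paper. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def build_company(m, n):
    """Synthetic document: one company, m offices, n employees per office."""
    company = ET.Element("company")
    for i in range(m):
        office = ET.SubElement(company, "office", value=f"o{i}")
        for j in range(n):
            ET.SubElement(office, "employee", value=f"e{i}_{j}")
    return company

def related(a, b, parent):
    """True if one of the two nodes is a proper ancestor of the other."""
    def is_ancestor(x, y):
        node = parent.get(y)
        while node is not None:
            if node is x:
                return True
            node = parent.get(node)
        return False
    return is_ancestor(a, b) or is_ancestor(b, a)

def amoeba_pairs(root, tag_x, tag_y, parent):
    """Two-node amoeba join: pairs (x, y) in an ancestor-descendant relation."""
    return [(x, y) for x in root.iter(tag_x) for y in root.iter(tag_y)
            if related(x, y, parent)]

def fd_aware_join(root):
    parent = {child: node for node in root.iter() for child in node}
    # Step 1, for the FD Emp -> Office.
    emp_office = amoeba_pairs(root, "employee", "office", parent)
    # Step 2, for the FD Office -> Company.
    office_company = amoeba_pairs(root, "office", "company", parent)
    # Combine both steps on the shared office node (by identity, not by value).
    return [(c, o, e) for (e, o) in emp_office
            for (o2, c) in office_company if o2 is o]

result = fd_aware_join(build_company(100, 5))
print(len(result))
```

Because each employee is paired only with an office it is actually related to, the crossed (office, employee) combinations never enter the result.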

slide-16
SLIDE 16

• FD-aware amoeba join scales well
  › For various sizes of XML data

slide-17
SLIDE 17

• Relational query into XML query
  › SELECT Co, Office, Emp
    (with FDs: Emp -> Office, Office -> Co)

[Figure: the relation (Co, Office, Emp) is mapped to several XML tree structures over Co, Office and Emp nodes, and so on.]

• XML structures of interest are automatically determined from a relation and functional dependencies
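A rough sketch of such a query interface: the caller supplies only the column names and the FDs, and the function accepts any node combination in which every FD pair is in an ancestor-descendant relation. Treating "an amoeba structure per FD" as this pairwise check is an assumption of the sketch. The `select` function, the sample encodings, and the `value` attribute are likewise invented for illustration.

```python
import xml.etree.ElementTree as ET
from itertools import product

def select(root, columns, fds):
    """Return sorted value tuples, one node per column, such that for every
    FD (x, y) the chosen x and y nodes are in an ancestor-descendant relation."""
    parent = {child: node for node in root.iter() for child in node}

    def ancestors(node):
        found = []
        node = parent.get(node)
        while node is not None:
            found.append(node)
            node = parent.get(node)
        return found

    def related(a, b):
        return (any(n is a for n in ancestors(b))
                or any(n is b for n in ancestors(a)))

    rows = []
    for combo in product(*(list(root.iter(tag)) for tag in columns)):
        chosen = dict(zip(columns, combo))
        if all(related(chosen[x], chosen[y]) for x, y in fds):
            rows.append(tuple(node.get("value") for node in combo))
    return sorted(rows)

# Hypothetical encodings of the same relation with different nesting.
TREES = [
    '<Co value="1"><Emp value="e1"><Office value="NY"/></Emp>'
    '<Emp value="e2"><Office value="NY"/></Emp></Co>',
    '<Office value="NY"><Co value="1"/><Emp value="e1"/><Emp value="e2"/></Office>',
    '<Co value="1"><Office value="NY"><Emp value="e1"/><Emp value="e2"/></Office></Co>',
]

FDS = [("Emp", "Office"), ("Office", "Co")]
results = [select(ET.fromstring(xml), ["Co", "Office", "Emp"], FDS) for xml in TREES]
for rows in results:
    print(rows)
```

The query statement and the FDs stay the same for all three inputs; only the documents differ.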

slide-18
SLIDE 18

• The type of FD required to determine the XML structures to query is a one-to-many (or one-to-one) relationship:
  › FD: Emp -> Office
    • Each employee belongs to an office
    • An office may have several employees (one-to-many)
• We can observe these relationships by counting node occurrences or directly from the ER-diagram.

[Figure: XML tree with one Company node, two Office nodes and four Emp nodes.]

ER-diagram: company 1 : M office, office 1 : N employee
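One possible reading of "counting node occurrences" is sketched below: an FD x -> y is consistent with a document if every x node has at most one y node among its ancestors and descendants. This only tests a single instance, so it can suggest an FD but not prove one. The `holds_fd` helper and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def holds_fd(root, x, y):
    """Check, on this document only, whether every x node is related
    (as ancestor or descendant) to at most one y node."""
    parent = {child: node for node in root.iter() for child in node}

    def ancestors(node):
        node = parent.get(node)
        while node is not None:
            yield node
            node = parent.get(node)

    for x_node in root.iter(x):
        above = sum(1 for a in ancestors(x_node) if a.tag == y)
        below = sum(1 for d in x_node.iter(y) if d is not x_node)
        if above + below > 1:
            return False
    return True

# Hypothetical document: one company, two offices, two employees per office.
DOC = ET.fromstring(
    "<company>"
    "<office><employee/><employee/></office>"
    "<office><employee/><employee/></office>"
    "</company>"
)

checks = {
    "employee -> office": holds_fd(DOC, "employee", "office"),
    "office -> employee": holds_fd(DOC, "office", "employee"),
    "office -> company": holds_fd(DOC, "office", "company"),
    "company -> office": holds_fd(DOC, "company", "office"),
}
print(checks)
```

The "many" side of each relationship fails the check, which gives the direction of the FD.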

slide-19
SLIDE 19

• First, consider
  › XML := Relations + their annotations
• Steps
  › 1. Detect relational part from XML data
  › 2. Detect one-to-many (or one-to-one) relationships (FDs)
  › 3. Write relational queries
    • SELECT Co, Emp, Office

[Figure: XML tree with a company node (c1), two employee nodes (e1, e2) and two office nodes (NY, NY); further labels in the figure: "absent", "annotation".]

• Note: It is also possible to include annotations in query statements.

slide-20
SLIDE 20

• Relation in XML
  › Defined using amoeba structure and FDs
• Relational-Style XML Query
  › Retrieves relations in XML with an SQL-like query syntax (SQL over XML)
  › Allows structural variations of XML data
• Departure from path expression queries
  › Target XML structures are automatically determined.

slide-21
SLIDE 21

• (see the paper for details)
• XML Algebra
  › Based on relational semantics
    • selection, projection, etc.
• Keys for XML
  › A key is a special case of FDs
• Database integration
• Schema evolution
• Managing relational data enhanced with XML syntax
• A lot more…

slide-22
SLIDE 22

• Before going deep into the XML world, Think in Relational-Style!!!

"I DECIDED TO START A NEW XML PROJECT. MASTERING XML IS CRUCIAL TO OUR COMPANY. BUT XML IS QUITE A FAMILIAR DATA MODEL TO US. TAKE IT EASY!"

• "It's Just SQL"
• A large amount of XML data and a large number of queries are still relational.