OrientX: an Integrated, Schema-Based Native XML Database System

SLIDE 1

OrientX: an Integrated, Schema-Based Native XML Database System

Meng Xiaofeng, Wang Xiaofeng, Xie Min, Zhang Xin, Zhou Junfeng
School of Information, Renmin University of China
WISA 2006

SLIDE 2

Introduction

OrientX means:

Original RUC IDKE Native XML Database

– RUC: Renmin University of China
– IDKE: Institute of Data and Knowledge Engineering
– Native XML Database: exposes a logical model of storing and retrieving XML documents. (A non-native XML database is, for example, one built on a relational database.)

SLIDE 3

Outline

Architecture and Features
Storage and data management
Indexing Schema
Query processing
Conclusion and Future Work

SLIDE 4

Architecture

SLIDE 5

Features

Full support for XML Schema
Supporting the XQuery 1.0 and XPath 2.0 Data Model
Various native storage techniques
Path index and value index
Multiple query processing strategies based on native storage

SLIDE 6

Outline

Architecture and Features
Storage and data management
Indexing Schema
Query processing
Conclusion and Future Work

SLIDE 7

Different storage granularities

Document:

– Do not decompose the document; build an index on it to navigate its structure.
– Query complexity and efficiency are limited by the power of the index.

Sub tree:

– Decompose the document into sub-trees according to the storage space partition.
– Persist the structure within the tree.
– Saves space.

Node:

– Decompose the document into a sequence of nodes, each node corresponding to a type (element, attribute, …).
– May need too many links to persist the relationships between nodes.

SLIDE 8

Storage Techniques in OrientX

                Document-based   SubTree-based   Element-based
Depth-first     DB               DSB             DEB
Breadth-first                    BSB             BEB
Clustered                        CSB             CEB

Implemented techniques are marked in red on the slide.

DEB: one node is one record; records follow a preorder traversal of the tree.
DSB: like DEB, but each record is a sub-tree whose size is close to the physical page size.
CEB: one element is one record, but all nodes with the same tag name are stored clustered together.
CSB: akin to DSB, each record is a sub-tree, but all sub-trees with the same structure are stored clustered together.
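The difference between the depth-first (DEB) and clustered (CEB) element-based layouts can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Node` class, the tag names, and the sample tree are hypothetical, and the sketch says nothing about how OrientX lays out records on disk.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str                      # element name, e.g. "author" (hypothetical)
    label: str                    # instance label, e.g. "a1"
    children: list = field(default_factory=list)


def preorder(node):
    """Yield a node, then each of its sub-trees, left to right."""
    yield node
    for child in node.children:
        yield from preorder(child)


def deb_records(root):
    """DEB-style order: one record per node, in preorder."""
    return [n.label for n in preorder(root)]


def ceb_records(root):
    """CEB-style order: one record per node, grouped by tag name."""
    clusters = {}
    for n in preorder(root):
        clusters.setdefault(n.tag, []).append(n.label)
    return clusters


# Hypothetical source document: one title and two authors.
doc = Node("article", "r1", [
    Node("title", "t1"),
    Node("author", "a1", [Node("first", "f1"), Node("last", "l1")]),
    Node("author", "a2", [Node("first", "f2"), Node("last", "l2")]),
])

print(deb_records(doc))
print(ceb_records(doc))
```

In the preorder layout, the records of one author sit next to each other, which favors reconstructing a sub-tree. In the clustered layout, all records with the same tag sit together, which favors scanning every `author` or every `first` element.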

SLIDE 9

Example: Element-based

[Figure: a source document tree with nodes t1, a1, a2, f1, f2, l1, l2, shown next to its record layout under DEB and under CEB.]

SLIDE 10

Example: Subtree-based

[Figure: the same document (nodes t1, a1, a2, f1, f2, l1, l2) stored under DSB (depth-first sub-tree based) and under CSB (clustered sub-tree based). The DSB layout uses proxy nodes (virtual nodes); the CSB layout also has proxy nodes.]

SLIDE 11

Outline

Architecture and Features
Storage and data management
Indexing Schema
Query processing
Conclusion and Future Work

SLIDE 12

Path index SUPEX: Index Architecture

SLIDE 13

Features of SUPEX

Constructed from the DTD or XML Schema
Integrating the path index with value indexes
Supporting twig queries efficiently
Supporting label path expressions (bib//author)
Supporting the evaluation of value-based condition predicates (//author[firstname = “jone”])
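To make the combination of a path index and a value index concrete, here is a minimal Python sketch. It is not SUPEX's actual structure, which the slides do not describe. The class `PathValueIndex`, its methods, the node ids, the `book` element, and the value "mary" are all assumptions made for illustration.

```python
from collections import defaultdict


class PathValueIndex:
    """Toy index keyed by label path, with a value index on top (hypothetical)."""

    def __init__(self):
        self.by_path = defaultdict(list)    # label path -> node ids
        self.by_value = defaultdict(list)   # (label path, text) -> node ids
        self.parent = {}                    # node id -> parent node id

    def add(self, node_id, path, parent_id=None, text=None):
        self.by_path[path].append(node_id)
        self.parent[node_id] = parent_id
        if text is not None:
            self.by_value[(path, text)].append(node_id)

    def descendants(self, first, last):
        """Evaluate first//last: paths ending in `last` with `first` above it."""
        hits = []
        for path, ids in self.by_path.items():
            if path[-1] == last and first in path[:-1]:
                hits.extend(ids)
        return hits

    def with_child_value(self, tag, child, text):
        """Evaluate //tag[child = text] through the value index."""
        hits = []
        for (path, value), ids in self.by_value.items():
            if value == text and path[-2:] == (tag, child):
                hits.extend(self.parent[i] for i in ids)
        return hits


idx = PathValueIndex()
idx.add(1, ("bib",))
idx.add(2, ("bib", "book"), parent_id=1)
idx.add(3, ("bib", "book", "author"), parent_id=2)
idx.add(4, ("bib", "book", "author", "firstname"), parent_id=3, text="jone")
idx.add(5, ("bib", "book", "author"), parent_id=2)
idx.add(6, ("bib", "book", "author", "firstname"), parent_id=5, text="mary")

print(idx.descendants("bib", "author"))                        # bib//author
print(idx.with_child_value("author", "firstname", "jone"))     # //author[firstname = "jone"]
```

Because the set of label paths comes from the schema, it is small compared with the data, so both lookups touch a handful of path entries rather than walking the document.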

SLIDE 14

Outline

Architecture and Features
Storage and data management
Indexing Schema
Query processing
Conclusion and Future Work

SLIDE 15

Query processing

Navigation strategy

– Supports XPath 2.0 and XQuery 1.0.
– Combines consecutive steps of one XPath expression into a single path.
– Rewrites the syntax tree into a reduced execution plan.
– Introduces pipelined operators into XQuery processing.
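The idea of folding a run of steps into one path operator can be sketched as a small plan rewrite. The names `Step` and `Path` echo operators listed in this deck, but the classes below, their fields, and the `Other` placeholder are assumptions for illustration, not OrientX's internal representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    axis: str    # "child" or "descendant"
    name: str    # element name to match


@dataclass(frozen=True)
class Path:
    steps: tuple  # two or more consecutive Step operators


@dataclass(frozen=True)
class Other:
    kind: str    # stand-in for any non-step operator (hypothetical)


def merge_steps(plan):
    """Replace each run of two or more consecutive Steps with one Path."""
    merged, run = [], []

    def flush():
        if len(run) > 1:
            merged.append(Path(tuple(run)))
        elif run:
            merged.append(run[0])
        run.clear()

    for op in plan:
        if isinstance(op, Step):
            run.append(op)
        else:
            flush()
            merged.append(op)
    flush()
    return merged


plan = [
    Step("child", "bib"),
    Step("descendant", "author"),
    Other("SortBy"),
    Step("child", "title"),
]
merged_plan = merge_steps(plan)
print(merged_plan)
```

A merged path can be handed to a path index or evaluated in a single traversal, instead of materializing an intermediate node set after every step.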

SLIDE 16

Operators in Navigation

Currently, Navigation contains 13 operators:

1. Step
2. CondTreeNode
3. Path
4. ForVarBind
5. LetVarBind
6. FLWR
7. EleConstructor
8. AttrConstructor
9. BuiltInFun
10. IfThenElse
11. Quantify
12. SetOpt
13. SortBy

SLIDE 17

General Steps to process XQuery

XQuery query → Parser and Translator → initial query plan → Optimizer → optimized query plan → Evaluator Engine
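The three stages named on this slide (Parser and Translator, Optimizer, Evaluator Engine) can be mimicked end to end with a toy example. Everything below is a sketch under assumptions: it handles only child-axis paths such as `bib/book/title`, and the plan tuples, the dictionary-based document, and the function names are hypothetical. The evaluator uses Python generators so that results are produced one at a time, in the spirit of a pipelined operator.

```python
def parse(query):
    """Parser and Translator: 'a/b/c' becomes an initial plan of single steps."""
    return [("step", name) for name in query.strip("/").split("/")]


def optimize(plan):
    """Optimizer: collapse the run of single steps into one path operator."""
    return [("path", tuple(name for _, name in plan))]


def children_named(nodes, name):
    """Lazily yield the children of `nodes` whose tag equals `name`."""
    for node in nodes:
        for child in node["children"]:
            if child["tag"] == name:
                yield child


def evaluate(plan, doc):
    """Evaluator Engine: chain generators so matches are produced on demand."""
    nodes = [doc]
    for kind, arg in plan:
        names = (arg,) if kind == "step" else arg
        for name in names:
            nodes = children_named(nodes, name)
    yield from nodes


# Hypothetical document with a virtual document node above the root element.
doc = {"tag": "#document", "children": [
    {"tag": "bib", "children": [
        {"tag": "book", "children": [
            {"tag": "title", "text": "XML", "children": []}]},
        {"tag": "book", "children": [
            {"tag": "title", "text": "Databases", "children": []}]},
    ]},
]}

initial_plan = parse("bib/book/title")
optimized_plan = optimize(initial_plan)
titles = [node["text"] for node in evaluate(optimized_plan, doc)]
print(initial_plan, optimized_plan, titles)
```

Both plans return the same nodes; the optimized one simply has fewer operators to schedule, which is the point of the rewrite step in this toy setting.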

SLIDE 18

The query plan

SLIDE 19

Outline

Architecture and Features
Storage and data management
Indexing Schema
Query processing
Conclusion and Future Work

SLIDE 20

Conclusion and Future Work

Conclusion:

– OrientX is an integrated, schema-based native XML database system.
– It implements storage and querying of XML data.

Future work:

– XQuery optimization.
– XML update and other XQuery processing engines.

SLIDE 21

Thanks

Q&A☺

Welcome to our website http://idke.ruc.edu.cn to obtain more information about OrientX