OraGIST How to Make User-Defined Indexing Become Usable and Useful - - PowerPoint PPT Presentation
OraGIST How to Make User-Defined Indexing Become Usable and Useful - - PowerPoint PPT Presentation
1 OraGIST How to Make User-Defined Indexing Become Usable and Useful Carsten Kleiner, Udo Lipeck Universit at Hannover BTW 2003, 27.02.2003 2 The Situation (+) Object-relational DBMSs are extensible by: user-defined datatypes
2
The Situation (+)
Object-relational DBMSs are extensible by:
- user-defined datatypes (UDT)
- together with corresponding query operators
- user-defined indexing
- together with corresponding user-defined
- ptimizer tuning
⇒ ”data blades”, ”cartridges”, ”extenders”, ...
3
The Situation (−)
But user-defined indexing
- needs expensive implementation
- needs experimental selection for new query scenarios
- is restricted:
a user-defined index supports only single operators, no combinations
- assumes that an operator is supported by at most
- ne index
4
Our Proposal
- Use combined datatypes and operators to allow
index support for attribute combinations
- Provide medium-dimensional index structures
- Apply a generic indexing framework
⋆ based on generalized search trees (GiST) ⋆ easily specializable into indexes for particular UDTs and operators ⋆ coupled with DB storage and index definition ⇒ prototypical tool OraGiST (GiST for Oracle)
5
Generalized Search Trees (GiST) — Class Hierarchy
GiST
#insert(e:GiSTEntry,level:int) #chooseSubtree(e:GiSTEntry,level:int) #split(n:GiSTNode,e:GiSTEntry) #adjustKeys(n:GiSTNode) #delete(e:GiSTEntry)
OrderedGiST
#findMin(q:predicate) #next(q:predicate,e:GiSTEntry)
UnorderedGiST
#search(q:predicate)
BTreeGiST
#consistent(e:GiSTEntry,q:predicate) #union(l:ListOfGiSTEntry) #penalty(e1:GiSTEntry,e2:GiSTEntry) #pickSplit(l:ListOfGiSTEntry) #compare(e1:GiSTEntry,e2:GiSTEntry)
RTreeGiST
#consistent(e:GiSTEntry,q:predicate) #union(l:ListOfGiSTEntry) #penalty(e1:GiSTEntry,e2:GiSTEntry) #pickSplit(l:ListOfGiSTEntry)
RStarTreeGiST
#penalty(e1:GiSTEntry,e2:GiSTEntry) #pickSplit(l:ListOfGiSTEntry)
SSTreeGiST
#penalty(e1:GiSTEntry,e2:GiSTEntry)
RSSTreeGiST
#penalty(e1:GiSTEntry,e2:GiSTEntry)
6
Index Usage
CREATE INDEXTYPE polygonRTree FOR overlaps(polygon,polygon) USING overlaps fct; CREATE TABLE county (id NUMBER, population NUMBER, ... shape polygon, ...); CREATE INDEX idx geoCounty ON county(shape) INDEXTYPE IS polygonRTree PARAMETERS(...); SELECT * FROM county WHERE overlaps(shape,polygon(rectangle(0,0,20,10)))) = true;
7
Extensible Indexing in ORDBS
- Sequence of index operations over index lifetime (e.g.):
ODCIIndexCreate ODCIIndexInsert ODCIIndexDrop for each tuple
- Sequence of index operations in query execution:
ODCIIndexFetch ODCIIndexClose
delete structures set up structures
ODCIIndexStart Scan Context
read and modify
until all tuples fetched
8
OraGiST — Architecture and Functionality
GiST GiSTExtension GiSTEntry
1 n
GiSTIndexFile DBSExtensibleIndexing UserDefinedIndexStructure DBSUserDefinedObject
1 n
DBSIndexTable
Oracle ORDBS
1 1
libgist library
1 1
OraGiST
OraGiST Library OraGiST Toolbox
initiates calls is stored in 1 n
OraGiSTExtension
+getExtension() +getQuery() +needVerify()
TypeMap
+approximate()
9
OraGiST — Architecture and Functionality
GiST RTree RTreeEntry
1 n
GiSTIndexFile OracleExtensibleIndexing UserDefinedRTree GeometryObject
1 n
OracleIndexTable
Oracle ORDBS
1 1
libgist library
1 1
OraGiST
getExtension getQuery approximate
10
Case Study — Spatial/Thematic Analysis
- Typical query: Find all counties overlapping a given
window where the median rent is below ... and the population is higher than ...
- Definition of combined datatypes and operators:
CREATE TYPE integerGeometry AS OBJECT (shape OGCGeometry, theme INTEGER); CREATE TYPE twoIntegerGeometry AS OBJECT (shape OGCGeometry, theme1,theme2 INTEGER); CREATE OPERATOR [two]BetweenOverlaps ... ; CREATE TABLE county OF [two]IntegerGeometry;
11
Performance Evaluation on 2D Spatial Data
10 20 30 40 50 60 5 10 15 20 25 30 35 40 response time selectivity in % Oracle R-Tree Oracle Quadtree User-Def-R*-Tree
12
... on 2D spatial data + one thematic dimension
10 20 30 40 50 60 70 80 2 4 6 8 10 12 14 response time selectivity in % Oracle R-Tree User-Def-3D-RSS-Tree User-Def-3D-R*-Tree Oracle Quadtree
13
... on 2D spatial data + two thematic dimensions
10 20 30 40 50 60 70 80 5 10 15 20 response time selectivity in % User-Def-4D-RSS-Tree User-Def-4D-R*-Tree User-Def-3D-RSS-Tree User-Def-2D-RSS-Tree
14
Conclusions
- R∗- and RSS-trees can be recommended as (medium-
dimensional) index types for combined query scenarios.
- Combined scenarios occur often in spatio-temporal-
thematic DBS; think of, e.g., ”validtimeInteger” !
- For new UDTs/operators,
for combined UDTs/operators, and for experimental index selection, a framework for adaptable indexing in ORDBS is required.
15
Conclusions (cont.)
- OraGiST is such an indexing framework:
⋆ extensible library + toolbox ⋆ coupling the GiST-family with an ORDBMS ⋆ for adoption of existing index types ⋆ for fast development of new index types by object-oriented specialization
16
Future Questions
- How to improve behaviour on ’complicated’ objects ?
- How to include user-defined cost/selectivity estimation ?
- How to generate and hide the combined operators ?