EXODUS Extensible DBMS: EXtensible Object-oriented Database System (slide presentation)

SLIDE 1

EXODUS Extensible DBMS

  • EXtensible Object-oriented Database System
  • University of Wisconsin
  • Efficient support of non-traditional applications

– Engineering applications (CAD/CAM)
– Scientific and statistical applications
– Image and voice (e.g. satellite images)

  • Need new data types and operations to support new application domains efficiently

SLIDE 2

New vs. Conventional Apps

  • Each requires a different set of data modeling tools

– E.g. VLSI circuit designs require different entities and relationships from a banking application

  • Each requires a special set of operations

– E.g. satellite images can't be joined together
– Must be efficiently supported

  • This could mean new structures and access methods
  • Some might require multiple versions of entities

– E.g. PROBE

SLIDE 3

DBMS for New Applications

  • POSTGRES (Berkeley)

– Predefined way of supporting complex objects

  • Using POSTQUEL and procedures as data types

– Make as few changes to the relational model as possible

  • PROBE (CCA)

– Mechanism for directly representing complex objects
– Rule-based approach to optimization

  • Optimizers extended to handle new operators

– New methods for existing operators

  • Both “complete” DBMS systems
SLIDE 4

DBMS for New Applications - 2

  • GENESIS (U. Texas)

– A modular (and modifiable) system
– Extended for new applications

  • No “complete” support designed in advance
  • EXODUS takes the same approach

– A collection of kernel DBMS facilities
– Software tools for semi-automatic creation of high-performance application-specific DBMSs for new areas

SLIDE 5

EXODUS

  • A toolbox, not a complete DBMS

– Can be easily adapted for new applications
– A third group of users: database implementors (DBIs)

  • Start with a generic solution applicable anywhere

– E.g. support arbitrary-size storage objects

  • Provide a generator or a library to aid in generating application-specific portions
  • Better to see it from the viewpoint of the application-specific system that is built using it

SLIDE 6

EXODUS Architecture

  • 1. The Storage Object Manager
  • 2. The E programming language and compiler
  • 3. A generalized Type Manager
  • 4. A library of type-independent access methods
  • 5. A lock manager and recovery protocol stubs
  • 6. A rule-based query optimizer and compiler
  • 7. Tools for constructing user front ends

An implemented DBMS

SLIDE 7

Storage Objects

  • Basic unit of data in the Storage Object Manager

– A byte sequence of arbitrary size
– Untyped, uninterpreted, variable-length
– Object Identifier (OID): (page #, slot #)

  • Two types of storage objects (internally)

– Small storage objects

  • Single page, OID points to the object
  • Automatically converted into large objects

– Large storage objects

  • Multiple pages, OID points to the large object header
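
The two internal representations can be pictured with a small sketch. It is illustrative Python, not E or EXODUS code: `ToyStore`, `LargeHeader`, and the 64-byte `PAGE_SIZE` are hypothetical, and keeping the same OID across the small-to-large conversion is an assumption that follows from the OID pointing at the header.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Union

PAGE_SIZE = 64  # hypothetical; tiny so the example converts quickly


class OID(NamedTuple):
    page: int
    slot: int


@dataclass
class LargeHeader:
    """Stand-in for a large-object header that points at leaf blocks."""
    leaf_blocks: List[bytes]


class ToyStore:
    def __init__(self) -> None:
        # pages[page][slot] holds either the object bytes or a header
        self.pages: List[List[Union[bytes, LargeHeader]]] = [[]]

    def create(self, data: bytes) -> OID:
        page = len(self.pages) - 1
        self.pages[page].append(b"")
        oid = OID(page, len(self.pages[page]) - 1)
        self.write(oid, data)
        return oid

    def write(self, oid: OID, data: bytes) -> None:
        if len(data) <= PAGE_SIZE:
            self.pages[oid.page][oid.slot] = data  # small object, stored in place
        else:
            # Too big for one page: the slot now holds a header instead.
            blocks = [data[i:i + PAGE_SIZE] for i in range(0, len(data), PAGE_SIZE)]
            self.pages[oid.page][oid.slot] = LargeHeader(blocks)

    def read(self, oid: OID) -> bytes:
        entry = self.pages[oid.page][oid.slot]
        if isinstance(entry, LargeHeader):
            return b"".join(entry.leaf_blocks)
        return entry

    def append(self, oid: OID, more: bytes) -> None:
        self.write(oid, self.read(oid) + more)

    def is_large(self, oid: OID) -> bool:
        return isinstance(self.pages[oid.page][oid.slot], LargeHeader)


store = ToyStore()
oid = store.create(b"abc")          # starts as a small object
store.append(oid, b"x" * 100)       # grows past one page, converted in place
```

The caller keeps using the same `oid` before and after the conversion; only the content of the slot changes.
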
SLIDE 8

Large Storage Objects

  • Representation

– Conceptually an uninterpreted byte sequence
– Physically a B+ tree-like index on byte position within the object and a collection of leaf blocks

  • Disk location

– Headers can reside on a slotted page with other headers / small objects
– Other pages are private (but can be shared with other versions, if used)

  • Primitive versioning included (must support different versioning schemes)
  • Only pages that differ are copied
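
A positional index of this kind can be sketched as follows. This is a minimal Python illustration, assuming each internal node stores cumulative byte counts for its children; `Leaf`, `Node`, `build`, and `read` are hypothetical names, and the fanout of 2 is chosen only to force several levels.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    data: bytes


@dataclass
class Node:
    counts: List[int]                       # counts[i] = bytes held by children[0..i]
    children: List[Union["Node", Leaf]]


def build(blocks: List[bytes], fanout: int = 2) -> Union[Node, Leaf]:
    """Build the index bottom-up over the given leaf blocks."""
    level: List[Union[Node, Leaf]] = [Leaf(b) for b in blocks]
    sizes = [len(b) for b in blocks]
    while len(level) > 1:
        next_level, next_sizes = [], []
        for i in range(0, len(level), fanout):
            cumulative, total = [], 0
            for size in sizes[i:i + fanout]:
                total += size
                cumulative.append(total)
            next_level.append(Node(cumulative, level[i:i + fanout]))
            next_sizes.append(total)
        level, sizes = next_level, next_sizes
    return level[0]


def read(node: Union[Node, Leaf], start: int, length: int) -> bytes:
    """Return bytes [start, start + length) of the object rooted at node."""
    if isinstance(node, Leaf):
        return node.data[start:start + length]
    out = []
    i = bisect_right(node.counts, start)    # first child that contains byte `start`
    while length > 0 and i < len(node.children):
        base = node.counts[i - 1] if i > 0 else 0
        take = min(length, node.counts[i] - start)
        out.append(read(node.children[i], start - base, take))
        start += take
        length -= take
        i += 1
    return b"".join(out)


root = build([b"abc", b"defgh", b"ij", b"klmn"])
middle = read(root, 2, 7)   # spans three leaf blocks
```

Because the keys are byte positions rather than values, a range read only descends into the subtrees whose counts cover the requested range.
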
SLIDE 9

Storage Object Manager

  • Read, write and update storage objects

– Built-in search, insert, append, delete algorithms
– Automatically converts small storage objects to large objects when they can't fit on a page
– Can implement application-specific versioning

  • Provides locking, buffering and recovery protocols

– E.g. read non-empty portions of the leaf blocks of the desired byte range into a variable-length buffer block

SLIDE 10

File Objects

  • Collections of storage objects
  • Used to group objects together

– Read related objects in sequence (in physical order)
– Related objects can be co-located on disk

  • Have an OID like large storage objects

– Objects can be accessed directly as well

  • A B+ tree-like index structure

– Use disk page number as the key
– Leaf pages contain page numbers

  • Standard disk allocation for pages themselves
SLIDE 11

E Programming Language

  • Used for all components that a DBI deals with
  • Extends C to support “persistent objects”

– Correspond to storage objects
– References are similar to those of C structures

  • The DBI can work with an array of key-pointer pairs

– The E translator, not the DBI, deals with the internal structure of persistent objects (e.g. lock/unlock, log)

  • DBI can deliberately exercise control

– E supports statements to associate locking, buffering, recovery with references to persistent objects

SLIDE 12

E Programming Language (cont'd)

  • Other additions to C

– OID data type (for storage object IDs)
– Parameterized types
– Addition of “type” as a valid parameter data type for E procedures (only!)

  • To allow access methods to use multiple data types

– Type constructors to define fields of persistent objects
– E allows the DBI to manipulate the internal structure of storage objects

  • Not a database programming language!

– E is for developing internal system software

SLIDE 13

Access Methods

  • Associative access to file of storage objects

– Further support for versioning if needed

  • A library of type-independent index structures

– B+ tree, Grid files, Linear hashing, etc.
– Implemented using the “type parameter” property in E

  • Use existing access methods with DBI-defined abstract data types without modifications

– As long as access method requirements are satisfied

  • Can easily implement new access methods

– Don't have to deal with main memory data structures
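
The idea of a type-independent access method can be illustrated with a short sketch. It uses Python generics rather than E's type parameters, and a sorted array stands in for a B+ tree; `OrderedIndex`, `Box`, and `box_less` are hypothetical names. The only requirement the index places on the key type is an ordering function, which mirrors the condition that the access method's requirements must be satisfied.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, Tuple, TypeVar

K = TypeVar("K")
OID = Tuple[int, int]  # (page #, slot #)


class OrderedIndex(Generic[K]):
    """Sorted list of (key, OID) pairs; a stand-in for a B+ tree."""

    def __init__(self, less: Callable[[K, K], bool]) -> None:
        self._less = less                  # the only thing the key type must supply
        self._entries: List[Tuple[K, OID]] = []

    def _lower_bound(self, key: K) -> int:
        lo, hi = 0, len(self._entries)
        while lo < hi:
            mid = (lo + hi) // 2
            if self._less(self._entries[mid][0], key):
                lo = mid + 1
            else:
                hi = mid
        return lo

    def insert(self, key: K, oid: OID) -> None:
        self._entries.insert(self._lower_bound(key), (key, oid))

    def lookup(self, key: K) -> List[OID]:
        i = self._lower_bound(key)
        found = []
        # Keys are equal when neither is less than the other.
        while i < len(self._entries) and not self._less(key, self._entries[i][0]):
            found.append(self._entries[i][1])
            i += 1
        return found


@dataclass(frozen=True)
class Box:
    """A DBI-defined abstract data type, ordered by area."""
    w: int
    h: int


def box_less(a: Box, b: Box) -> bool:
    return a.w * a.h < b.w * b.h


ages = OrderedIndex[int](lambda a, b: a < b)   # built-in key type
ages.insert(30, (0, 0))
boxes = OrderedIndex[Box](box_less)            # same code, new key type
boxes.insert(Box(2, 3), (1, 0))
```

The index code is written once and reused unchanged for both key types.
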

SLIDE 14

Operator Methods

  • A collection of methods and their combinations (as E procedures) that operate on storage objects

– Schema-independent (necessary schema information requested at run-time or compiled in by the optimizer)

  • Contains code by both the DBI and EXODUS

– EXODUS provides code for operators that operate on a single type of storage object (e.g. selection)

  • Does not provide application (or data model) specific methods (e.g. relational join, examining images)

– The DBI may implement one or more methods for each operator in the target query language
  • Can be schema-dependent (Hire-employee, change-job)
SLIDE 15

The Type Manager

  • Schema support for application-specific systems

– Handle a wide range of applications efficiently

  • Class hierarchy with multiple inheritance

– Base types (integer, char, object ID, etc.)
– Constructed types (record, array, set and bag)
– DBI can define new base types and operations

  • Using abstract data types
  • One-to-one mapping between class instances (typed objects) and storage objects

– A class of typed objects can include fields with large multidimensional arrays of real numbers

SLIDE 16

Type Manager Class Hierarchy

  • Loose hierarchy

– Classes can inherit from one or more classes

  • If inherited field names are the same, choose one or rename
  • Meta-class Class contains inheritance information
  • All classes are subclasses of class Object

– Including “Class”

  • Files can contain objects of only one class

– But this can be the Object class!
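
The choose-or-rename rule for clashing field names can be sketched as below. This is a hypothetical Python model, not the Type Manager's interface: `inherit_fields`, the `choose` map (field name to the parent whose field wins), and the `rename` map are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Fields = Dict[str, str]  # field name -> type name


def inherit_fields(
    parents: List[Tuple[str, Fields]],
    choose: Optional[Dict[str, str]] = None,
    rename: Optional[Dict[Tuple[str, str], str]] = None,
) -> Fields:
    """Merge the fields of several parent classes into one subclass."""
    choose = choose or {}
    rename = rename or {}
    merged: Fields = {}
    origin: Dict[str, str] = {}
    for parent, fields in parents:
        for name, type_name in fields.items():
            new_name = rename.get((parent, name), name)
            if new_name == name and name in choose and choose[name] != parent:
                continue  # another parent's field was chosen for this name
            if new_name in merged:
                raise ValueError(
                    f"field {new_name!r} inherited from both "
                    f"{origin[new_name]} and {parent}"
                )
            merged[new_name] = type_name
            origin[new_name] = parent
    return merged


employee = ("Employee", {"name": "char[]", "id": "integer"})
student = ("Student", {"name": "char[]", "id": "char[]"})

# Keep Employee.name, and keep both ids by renaming Student.id.
teaching_assistant = inherit_fields(
    [employee, student],
    choose={"name": "Employee"},
    rename={("Student", "id"): "student_id"},
)
```

An unresolved clash raises an error instead of silently picking a field, which forces the class designer to make the choice explicit.
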

SLIDE 17

Query optimizer and compiler

  • Query execution in EXODUS (similar to System R)
  • The parser transforms the query into an initial tree

– Logical operators as internal nodes, relations as leaves

  • The optimizer creates an access plan

– A rearranged tree of operator methods (particular instances of operators)

  • Methods as internal nodes, files/indices as leaves
  • The Type Manager is invoked during parsing and optimization

Parse → Optimize → Compile as executable

SLIDE 18

Query optimizer (the generator)

  • A generator that produces an optimizer for an application-specific database system
  • The DBI must supply

– A description of the operators of the target query language
– A list of methods to implement each operator
– A cost formula for each operator method
– A collection of transformation rules

  • The generator transforms these description files into C code for the target query language optimizer
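
The four kinds of input the DBI supplies can be pictured as plain data. The sketch below is Python, not the generator's description-file syntax, which is not shown here; the operator names, method names, cost formulas, and the `cheapest_method` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Tree = Tuple  # ("join", left, right) or ("select", input); leaves are relation names


@dataclass
class Method:
    name: str
    cost: Callable[[int, int], float]  # cost formula over the two input sizes


# 1. Operators of the target query language
OPERATORS = ["select", "join"]

# 2 and 3. Methods implementing each operator, each with a cost formula
METHODS: Dict[str, List[Method]] = {
    "select": [Method("file_scan", lambda n, m: n)],
    "join": [
        Method("nested_loops", lambda n, m: n * m),
        Method("hash_join", lambda n, m: 3 * (n + m)),
    ],
}


# 4. Transformation rules: each maps a tree to an equivalent tree, or None
def join_commutativity(tree: Tree) -> Optional[Tree]:
    if tree[0] == "join":
        return ("join", tree[2], tree[1])
    return None


RULES = [join_commutativity]


def cheapest_method(operator: str, n: int, m: int = 0) -> Method:
    """Pick the lowest-cost method for one operator given its input sizes."""
    return min(METHODS[operator], key=lambda method: method.cost(n, m))


big_join = cheapest_method("join", 1000, 1000)
tiny_join = cheapest_method("join", 2, 2)
```

With these made-up formulas the choice of join method flips with input size, which is why the cost formula is attached to the method rather than to the operator.
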

SLIDE 19

The optimization procedure

  • Uses two principal data structures

– MESH: A directed graph of alternative operator trees

  • Initially the tree of the original query

– OPEN: A priority queue of the applicable transformations

  • Ordered by the expected cost decrease of the transformation

  • Select the lowest-cost method for each node in MESH
  • Find possible transformations and insert them into OPEN
  • Apply the most promising transformation from OPEN to MESH; repeat until OPEN is empty
  • Reuse equal nodes (same operator, argument and inputs)
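
The search loop can be sketched in a few lines. This is a simplified Python model, not the generated optimizer: MESH is reduced to a set of whole trees (so reuse of equal nodes becomes a set-membership test), the expected cost decrease is computed exactly instead of estimated, and the relation sizes, the 1% join selectivity, and the cost formula are made-up assumptions.

```python
import heapq
from itertools import count

CARD = {"A": 1000, "B": 1000, "C": 10}  # hypothetical relation sizes


def card(tree):
    if isinstance(tree, str):
        return CARD[tree]
    _, left, right = tree
    return card(left) * card(right) // 100   # assumed join selectivity of 1%


def cost(tree):
    if isinstance(tree, str):
        return 0
    _, left, right = tree
    return cost(left) + cost(right) + card(left) * card(right)


def rewrites(tree):
    """Yield every tree reachable by applying one rule at one node."""
    if isinstance(tree, str):
        return
    _, left, right = tree
    yield ("join", right, left)                    # commutativity
    if not isinstance(left, str):                  # associativity
        _, a, b = left
        yield ("join", a, ("join", b, right))
    for new_left in rewrites(left):
        yield ("join", new_left, right)
    for new_right in rewrites(right):
        yield ("join", left, new_right)


def optimize(query):
    mesh = {query}          # alternatives explored so far
    best = query
    open_queue = []         # OPEN: largest expected cost decrease first
    tie_breaker = count()

    def enqueue(source):
        for new in rewrites(source):
            gain = cost(source) - cost(new)
            heapq.heappush(open_queue, (-gain, next(tie_breaker), new))

    enqueue(query)
    while open_queue:
        _, _, new = heapq.heappop(open_queue)      # most promising transformation
        if new in mesh:
            continue                               # equal tree already present
        mesh.add(new)
        if cost(new) < cost(best):
            best = new
        enqueue(new)                               # newly enabled transformations
    return best, len(mesh)


query = ("join", ("join", "A", "B"), "C")
best_plan, explored = optimize(query)
```

This toy version searches exhaustively, which is only feasible because three relations give a handful of trees; ordering OPEN by expected gain matters once the search has to be cut short.
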

SLIDE 20

CC and Recovery

  • Based on Weikum's layered transaction model:

– Each layer presents a set of objects and associated operations (aka mini-transactions) to client layers

  • Each transaction in a layer is a series of mini-transactions in one or more of its servant layers

– Two-phase locking on objects within a given layer

  • Objects in the servant layer are locked on behalf of the transaction in the client layer (held until it completes)

– Level-specific recovery information is logged

  • When the mini-transaction completes, its log is replaced with a simpler client-level representation of the entire operation
  • To undo a transaction, first undo the last incomplete mini-transaction, then run the inverse of each completed mini-transaction
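
The undo order described above can be sketched with a toy two-level log. This is an illustrative Python model only: `LayeredLog`, the `pages` list, and the `index` dictionary are hypothetical stand-ins for the servant-level and client-level structures.

```python
from typing import Callable, List


class LayeredLog:
    def __init__(self) -> None:
        self.completed: List[Callable[[], None]] = []  # client-level inverses
        self.current: List[Callable[[], None]] = []    # low-level undo actions

    def log_low_level(self, undo: Callable[[], None]) -> None:
        self.current.append(undo)

    def complete_mini(self, inverse: Callable[[], None]) -> None:
        # The detailed entries are replaced by one client-level entry.
        self.current.clear()
        self.completed.append(inverse)

    def abort(self) -> None:
        while self.current:        # undo the incomplete mini-transaction first
            self.current.pop()()
        while self.completed:      # then run inverses, newest first
            self.completed.pop()()


pages: list = []   # servant level: physical slots
index: dict = {}   # client level: key -> slot number
log = LayeredLog()


def delete(key: str) -> None:
    slot = index.pop(key)
    pages[slot] = None             # logical delete; the slot itself remains


def insert(key: str, value: int, fail_midway: bool = False) -> None:
    pages.append((key, value))                     # low-level step 1
    log.log_low_level(lambda: pages.pop())
    if fail_midway:
        return                                     # mini-transaction left incomplete
    index[key] = len(pages) - 1                    # low-level step 2
    log.log_low_level(lambda: index.pop(key))
    log.complete_mini(lambda: delete(key))         # inverse of the whole insert


insert("a", 1)
insert("b", 2)
insert("c", 3, fail_midway=True)
log.abort()
```

After the abort the index is empty again, but the pages are not byte-identical to their starting state: completed mini-transactions are reversed logically, by running their inverse operation, rather than physically.
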

SLIDE 21

CC and Recovery (cont'd)

  • EXODUS is not strictly hierarchical
  • Two-phase locking is unacceptable for access methods: EXODUS provides more general locking

– Locks set by a servant can be explicitly released or passed to its client
– Lock passing is needed to prevent consistency problems

  • Prevents phantoms when multiple clients use an index and a new key-pointer pair is inserted

  • EXODUS employs more efficient circular log management techniques (instead of stacks)

– Ignores entries of completed mini-transactions during the recovery process

SLIDE 22

User interface construction

  • UI tools were at the brainstorming stage at the time of writing
  • Cannot design a generic user interface

– Interactive (ad-hoc) interfaces

  • Use a generator to facilitate interface creation

– Embedded query interfaces

  • If the program only calls operator methods (parser and optimizer bypassed): use a linker to bind the methods
  • Otherwise use a generalized tool to handle such programs

– Some applications may call for completely different forms of UI

SLIDE 23

EXODUS 22 Years Later...

  • EXODUS is dead. Latest paper was dated 1990.
  • It was succeeded by the SHORE project

– SHORE also died in 1997

  • But...

– The Shore Storage Manager remains more or less active today

SLIDE 24

THE END

  • Reference

– Michael J. Carey, David J. DeWitt, Daniel Frank, Goetz Graefe, Joel E. Richardson, Eugene J. Shekita, M. Muralikrishna: The Architecture of the EXODUS Extensible DBMS. On Object-Oriented Database System