AMAχOS—Abstract Machine for Xcerpt
➊ Principles ➋ Architecture
François Bry, Tim Furche, Benedikt Linse
PPSWR ’06, Budva, Montenegro, June 11th, 2006
Abstract Machine(s): Definition and Variants
abstract machine :=
ECMA-335, 3rd Edition / June 2005
Common Language Infrastructure (CLI), Partitions I to VI
Cover Feature
Once confined to specialized server and mainframe systems, virtualization is now supported in off-the-shelf systems based on Intel architecture [...] for processor virtualization, enabling simplifications of virtual machine monitor software. Resulting VMMs can support a wider range of legacy and future operating systems while maintaining high performance.
Carbon and Mac OS X
As shown in the following figure, Carbon is one of several application environments available on Mac OS X.
[Figure: example data graph with nodes d1 to d14. Element labels: bib, conference, paper, posters, title, author, pc, name, member. Text values: ‘Cicero’, ‘Wax Tablets’, ‘Storage Media’, ‘Hirtius’. Edges carry numeric labels.]
[Figure: memoization matrix shown as nested tables with headers Variable, Node, Sub-Matrix; entries include (v5, d2), (v4, d3), (v3, d11), (v2, d13), (v4, d5), (v1, d6), (v1, d7).]
[Figure: query graph over variables v1 to v5 with a Root node, Child and Child+ edges, and labels conference, paper, name, member, author.]
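The pairing of query variables with data nodes can be illustrated with a small sketch. It is not the AMAχOS algorithm; it only shows the general idea of caching one result per (variable, node) pair so that no pair is evaluated twice. The data tree, the query, and all names (`DATA`, `QUERY`, `make_matcher`) are hypothetical and do not reproduce the graph from the slides.

```python
from typing import Dict, List, Tuple

# Hypothetical data tree: node id -> (label, child ids).
DATA: Dict[str, Tuple[str, List[str]]] = {
    "d1": ("bib", ["d2"]),
    "d2": ("conference", ["d3", "d5"]),
    "d3": ("paper", ["d4"]),
    "d4": ("author", []),
    "d5": ("pc", ["d6"]),
    "d6": ("member", []),
}

# Hypothetical tree query: variable -> (label test, [(axis, child variable)]).
# "child" is the Child axis, "child+" the Child+ (descendant) axis.
QUERY: Dict[str, Tuple[str, List[Tuple[str, str]]]] = {
    "v1": ("conference", [("child", "v2"), ("child+", "v3")]),
    "v2": ("paper", [("child", "v4")]),
    "v3": ("member", []),
    "v4": ("author", []),
}


def descendants(node: str) -> List[str]:
    """All nodes reachable from `node` via one or more child edges."""
    found: List[str] = []
    stack = list(DATA[node][1])
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(DATA[current][1])
    return found


def make_matcher():
    """Return a matcher plus its memo table, keyed by (variable, node)."""
    memo: Dict[Tuple[str, str], bool] = {}

    def matches(var: str, node: str) -> bool:
        key = (var, node)
        if key in memo:  # already decided: reuse the stored result
            return memo[key]
        label, edges = QUERY[var]
        ok = DATA[node][0] == label
        for axis, child_var in edges:
            if not ok:
                break
            candidates = DATA[node][1] if axis == "child" else descendants(node)
            ok = any(matches(child_var, c) for c in candidates)
        memo[key] = ok
        return ok

    return matches, memo


matches, memo = make_matcher()
```

Calling `matches("v1", "d2")` fills `memo` with one entry per (variable, node) pair that was actually visited; later calls that reach the same pair read the stored value instead of descending again.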
[Plot: time (msec, logarithmic) against query size (5 to 40 variables), without memoization vs. with memoization; data size fixed]
[Plot: time (msec) against data size (5 to 25 MB), top-down; query size fixed (~ 20 nodes)]
AMAχOS Node
Local Data Source (e.g. document, database)
Remote Data Source (e.g. Web service)
Application: control API (Java)
Application: Web Service API
Application: command-line interface
Xcerpt Node Query Compiler
Xcerpt Program
rule 1: c1 ← q1,1 ∧ q1,2 ∧ … ∧ q1,k1
rule 2: c2 ← q2,1 ∧ q2,2 ∧ … ∧ q2,k2
rule 3: c3 ← q3,1 ∨ q3,2 ∨ … ∧ q3,k3
…
rule 1 rule 2 rule n
AMAχOS Code
Hint Segment Dependency Segment Code Segment
…
rule 1 rule 2 rule n
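As a rough illustration of the three segments named above, the following sketch packages each rule into a unit with a hint, a dependency, and a code part. The slides only give the segment names; everything else here is an assumption made for illustration: the `Rule` and `CodeUnit` types, the idea that a rule depends on every rule whose head it queries, and the placeholder instruction names `MATCH` and `CONSTRUCT`. It is not the actual AMAχOS code format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Rule:
    name: str
    head: str        # hypothetical: label of the data the rule constructs
    body: List[str]  # hypothetical: labels queried by the rule's conjuncts


@dataclass
class CodeUnit:
    hints: Dict[str, str] = field(default_factory=dict)  # hint segment (left empty here)
    dependencies: Set[str] = field(default_factory=set)  # dependency segment
    code: List[str] = field(default_factory=list)        # code segment


def compile_rule(rule: Rule, program: List[Rule]) -> CodeUnit:
    """Build one code unit for `rule` (assumed layout, for illustration only)."""
    unit = CodeUnit()
    # Assumption: a rule depends on each rule whose head appears in its body.
    unit.dependencies = {r.name for r in program if r.head in rule.body}
    # Placeholder instructions, not real AMAχOS opcodes.
    unit.code = [f"MATCH {q}" for q in rule.body] + [f"CONSTRUCT {rule.head}"]
    return unit


program = [
    Rule("rule 1", head="authors", body=["paper"]),
    Rule("rule 2", head="report", body=["authors", "pc"]),
]
units = {r.name: compile_rule(r, program) for r in program}
```

In this toy program, the unit for rule 2 records rule 1 as a dependency because rule 2 queries the data that rule 1 constructs.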
[Figure: several AMAχOS nodes, each attached to a local data source (e.g. document, database) or a remote data source (e.g. Web service); rule 1 with query conjuncts q1,1 and q1,2; rule 2 with query conjuncts q2,1 and q2,2]
Compilation API: simple observation and control API; compilation strategies
Execution & Answer API: control, observation, parameterization; OO & Web Service API
Parsing & Validation Layer: program parsing and validation; multi-parser, normalization, modules
Compilation Layer: unsatisfiable, tautological parts; extensive query optimization
Execution Layer (AMAχOS): pattern matching engine; rule dispatcher and engine
Schema Access Layer: provides access to schema of data; type checking for compilation
Data Access Layer: incremental data access; storage and indexing engine
Serialization Layer: incremental answer creation; versatile Web format support
Planes: Data Plane, Program Plane, Control Plane
[Figure: architecture of the abstract machine AMAχOS. Input: AM Code for rule 1, rule 2, … (Hint Segment, Dependency Segment, Code Segment). Components: Rule Engine, Storage Manager, Pattern Matching Engine, Construction Engine, Code Scheduler, Static Function Library, Memoization Matrix (nested Variable / Node / Sub-Matrix tables with entries such as (v5, d2), (v4, d3), (v3, d11), (v2, d13), (v4, d5), (v1, d6), (v1, d7)). Further labels: Storage & Index Hints, Rule Dispatch, Dependency Hints, Function Call, Substitution Sets, Answer Construction, In-Memory Answer, Abstract Machine Code, Rule Call (Recursion).]
Query Compilation
Stages: Logical Optimization (Algebraic Optimization), Physical Plan Generation, Code Generation
Intermediate representations: Typed AST, Canonic Logical Query Plan, Optimized Logical QP, Physical Query Plan, AM Code
Translator: translation to logical algebra. Patterns: annotated conjunctive queries over semi-structured graphs. Rules: unfolding into complex value or object algebra where possible.
Rewriting System: elimination of dead and tautological query parts; join placement optimization; query compaction (common subexpressions)
Query Classification: determines class of query, e.g., to choose efficient algorithms for sub-languages
Operator Algorithm Selection: determines realization of operators
Index and Storage Model Selection: selects in-memory representation and indices for data access
Code Generator: generates AM code; direct representation of physical query plan; platform-independent; motion of invariant code; dead-code elimination
[Figure: example query graph over variables v, w, x, y, z with Root, Child and Child+ edges, and a query plan with projections πw and πx and products × over the relations r(y,z) and s(x,w)]
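Two of the rewriting steps listed for this slide, removing tautological or dead query parts and compacting repeated subexpressions, can be shown on a flat conjunction. This is a deliberately small sketch under stated assumptions: conjuncts are plain strings, the literals `"true"` and `"false"` stand for tautological and unsatisfiable parts, and a repeated conjunct is the simplest case of a common subexpression. The function name `simplify_conjunction` is hypothetical and the real rewriting system is certainly richer.

```python
from typing import List, Optional

TRUE, FALSE = "true", "false"  # assumed markers for tautological / unsatisfiable parts


def simplify_conjunction(conjuncts: List[str]) -> Optional[List[str]]:
    """Simplify q1 ∧ q2 ∧ … ∧ qk.

    Returns the remaining conjuncts in their original order, or None when
    the whole conjunction is dead (it contains an unsatisfiable part).
    """
    seen = set()
    kept: List[str] = []
    for q in conjuncts:
        if q == FALSE:
            return None   # one unsatisfiable conjunct kills the whole body
        if q == TRUE:
            continue      # tautological part: contributes nothing
        if q in seen:
            continue      # repeated subexpression: keep a single copy
        seen.add(q)
        kept.append(q)
    return kept


simplified = simplify_conjunction(["r(y,z)", "true", "s(x,w)", "r(y,z)"])
dead = simplify_conjunction(["r(y,z)", "false"])
```

Here `simplified` keeps one copy each of `r(y,z)` and `s(x,w)`, while `dead` is `None`, signalling that the rule body can be dropped entirely.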
➊ “Compile once” ➋ “Execute anywhere” ➌ “Optimize all the time”