Tutorial on Tree Transducers
Hendrik Jan Hoogeboom
LIACS, Leiden
CSL/GAMES, Lausanne, Sept 07
from tree to tree: parse, val
symbolic / syntax-directed translation
compiler theory
1960 Irons: syntax-directed translation
1968 Knuth: attribute grammar
1968 Thatcher & Rounds: top-down, bottom-up
1980 Aho & Ullman: tree-walking tr.
1985 Engelfriet & Vogler: macro tree tr.
2000 Milo & Suciu & Vianu: pebble tree tr. (XML)
Fülöp & Vogler: book on tree transducers
Maneth: Tarragona lectures
ranked trees ~ terms [nested strings]
f ∈ Σ_k: f(x1, x2, …, xk), also written f x1 x2 … xk
ranked alphabet (Σ, rank), rank: Σ → ℕ, Σ_k = symbols of rank k
example: Σ0 = {a, b}, Σ2 = {δ}, Σ3 = {σ}
[figure: tree over σ, δ, a, b with numbered children]
bottom-up: evaluation
top-down: grammatical
tree-walking: navigation
‘parallel’
[figure: step-by-step bottom-up evaluation of an and/or tree with leaves T, F and inner nodes ∨, ∧]
evaluation rules:
σ(q1 … qk) → q, rank(σ) = k
σ → q, rank(σ) = 0
example: F → 0, T → 1, ∨(1, 1) → 1, with F, T ∈ Σ0 and ∨, ∧ ∈ Σ2
acceptance by final state (at root)
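The evaluation rules above can be sketched as a small deterministic bottom-up tree automaton. This is a minimal illustration, assuming trees are encoded as nested tuples `(label, child, ...)` and that state 1 is the only final state; the names `RULES`, `run` and `accepts` are hypothetical.

```python
# Trees are nested tuples: (label, child_1, ..., child_k); a leaf is (label,).
OR, AND = "∨", "∧"

# Transition table: (label, state_1, ..., state_k) -> state.
RULES = {("F",): 0, ("T",): 1}
for p in (0, 1):
    for q in (0, 1):
        RULES[(OR, p, q)] = p | q
        RULES[(AND, p, q)] = p & q

FINAL = {1}  # assumption: accept trees that evaluate to 1


def run(tree):
    """Compute the state at the root, children first (bottom-up)."""
    label, *children = tree
    states = tuple(run(c) for c in children)
    return RULES[(label, *states)]


def accepts(tree):
    """Acceptance by final state at the root."""
    return run(tree) in FINAL


example = (AND, (OR, ("F",), ("T",)), ("T",))
```

The table lookup mirrors rules of the form σ(q1 … qk) → q: the state of a node depends only on its label and the states of its children.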
p1: A → A a B
p2: A → a
p3: B → B b
p4: B → A
[figure: derivation tree with nodes p1, p2, p3, p4; derivation steps from A, rewriting at leaves]
regular tree grammar
► closed under intersection, complementation
► decidable emptiness (equivalence)
► natural!
[figure: a tree-walking automaton evaluating an and/or tree, with numbered edges]
cf. two-way finite state automaton
evaluates and/or trees!
example: tree traversal
walk along edges, moves based on the label (lab) and the child number (ch) (= incoming edge)
[figure: traversal automaton with moves down1, down2, up; tests lab ∨∧, lab TF, ch1, ch2, root]
when label is ∨ or ∧: move to first child
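The traversal described above can be sketched as follows. This is a minimal illustration for binary and/or trees, assuming nodes carry a parent pointer and a child number (0 for the root); the class `Node` and the function `walk` are hypothetical names. The walker keeps only a finite state and its current position, with no stack.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        self.child_no = 0  # 0 marks the root
        for i, c in enumerate(self.children, start=1):
            c.parent, c.child_no = self, i


def walk(root):
    """Depth-first traversal using only a state ('down' or 'up') and the current node.

    Returns the labels in the order in which they are read in state 'down'.
    """
    node, state, visited = root, "down", []
    while True:
        if state == "down":
            visited.append(node.label)
            if node.label in ("∨", "∧"):
                node = node.children[0]          # move to first child (down1)
            else:
                state = "up"                     # leaf T or F: turn around
        else:
            if node.child_no == 0:               # back at the root: halt
                return visited
            j, node = node.child_no, node.parent  # move up, remember incoming edge
            if j == 1:
                node, state = node.children[1], "down"  # down2
            # j == 2: subtree finished, keep moving up


tree = Node("∧", [Node("∨", [Node("F"), Node("T")]), Node("T")])
```

Each move depends only on the state, the label and the child number, which is what distinguishes a tree-walking automaton from a recursive procedure.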
bottom-up (evaluation), top-down (grammatical) ⊇ tree-walking (navigation)
“TWA easily lose their way”
Bojańczyk & Colcombet: TWA ⊂ REG
(aa)*: not by TWA, but FO (!)
not by TWA, but by PTWA using a pebble
[figure: example trees over {a, b}]
pebble: marks a node
nested lifetimes (LIFO): ‘regular’ extension
drop / retrieve
fixed number for automaton; can be distinguished & reused
[figure: pebbles 1, 2, 3 dropped and retrieved in nested order]
J. Engelfriet, H.J. Hoogeboom. Tree-walking pebble automata. Jewels are forever, 1999.
M. Bojańczyk, T. Colcombet. Tree-walking automata do not recognize all regular languages. STOC'05.
J. Engelfriet, H.J. Hoogeboom. Nested pebbles and transitive closure. LMCS, 2007.
M. Bojańczyk, M. Samuelides, T. Schwentick, L. Segoufin. On the expressive power of pebble automata. ICALP'06.
[diagram: logics and automata; FO(+1), FO(<), dTWA, TWA; some inclusions marked ‘?’]
FO+posTC¹ = PTWA ⊂ MSO = REG (⊂ REG strict: BojSamSchSeg’06)
physical vs. abstract
FO+dTC¹ = dPTWA
TC(FO+mod): NevSch
nested pebbles & transitive closure: EngHoo’06
MSO = REG: Doner; ThaWri’68
[figure: a rule rewrites state q at σ(x1, x2, x3) into an output tree over δ, ε, a that contains q′(x1) and q″(x2)]
state + subtree ≡ node (input)
q(σ(x1 … xk)) → t, t ∈ T_Δ[Q(X_k)], rank(σ) = k
q(σ) → t, t ∈ T_Δ, rank(σ) = 0
A = (Σ, Δ, Q, Q_d, R) realizes {(t, s) ∈ T_Σ × T_Δ | q(t) ⇒* s, q ∈ Q_d}
example: Σ0 = {e}, Σ1 = {a}; Δ0 = {e}, Δ2 = {d}
q(a(x)) → d(q(x), q(x))
q(e) → e
q(a(a(a(e)))) ⇒ d(q(a(a(e))), q(a(a(e)))) ⇒* d(d(d(e, e), d(e, e)), d(d(e, e), d(e, e)))
exponential size increase
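The two rules q(a(x)) → d(q(x), q(x)) and q(e) → e can be run directly as a recursive function. This is a minimal sketch, assuming nested tuples as the tree encoding; `monadic`, `size` and `height` are hypothetical helpers added to make the size increase measurable.

```python
def q(t):
    """State q applied to an input tree t, following the two rules of the example."""
    label, *kids = t
    if label == "a":              # q(a(x)) -> d(q(x), q(x))
        (x,) = kids
        return ("d", q(x), q(x))
    if label == "e":              # q(e) -> e
        return ("e",)
    raise ValueError(f"no rule for label {label!r}")


def monadic(n):
    """The input tree a(a(...a(e)...)) with n symbols a."""
    t = ("e",)
    for _ in range(n):
        t = ("a", t)
    return t


def size(t):
    return 1 + sum(size(c) for c in t[1:])


def height(t):
    return 1 + max((height(c) for c in t[1:]), default=0)
```

For the input a(a(a(e))) of size 4 the output is a full binary tree with 15 nodes, while the height stays at 4: the height grows linearly and the size exponentially.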
q(a(x1)) → d(q(x1), q(x1))   copy
q(c(x1, x2, x3)) → d(q(x1), q(x2))   delete
rules are confluent
linear height increase
exponential size increase
yield
yield, linear input → ET0L
more Lindenmayer connections
‘T’: copying input, processing copies differently
Σ0 = {e}, Σ1 = {a, σ}; Δ0 = {e}, Δ1 = {a, b}, Δ2 = {σ}
q(σ(x)) → σ(q(x), q(x))
q(a(x)) → a(q(x)) | b(q(x))
q(e) → e
[figure: q(a(a(σ(a(e))))) ⇒* an output in which the two copies below σ are relabelled differently]
‘B1’: copying output after nondeterministic processing of the input
e → q(e)
a(q(x)) → q(a(x)) | q(b(x))
σ(q(x)) → q(σ(x, x))
[figure: a(a(σ(a(e)))) ⇒* an output in which both copies below σ are equal]
[figure: both derivations side by side on the input a(a(σ(a(e))))]
‘B1’: copying output after nondeterministic processing of the input
‘T’: copying input, processing copies differently
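The difference between ‘T’ and ‘B1’ can be made concrete by enumerating all outputs on one input. This is a minimal sketch, assuming the top-down rules q(σ(x)) → σ(q(x), q(x)), q(a(x)) → a(q(x)) | b(q(x)), q(e) → e and the bottom-up rules e → q(e), a(q(x)) → q(a(x)) | q(b(x)), σ(q(x)) → q(σ(x, x)); the function names are hypothetical.

```python
from itertools import product


def top_down(t):
    """All outputs of 'T': the input is copied first, each copy is relabelled on its own."""
    label, *kids = t
    if label == "e":
        return {("e",)}
    sub = top_down(kids[0])
    if label == "a":
        return {(out, s) for out in "ab" for s in sub}
    if label == "σ":
        return {("σ", left, right) for left, right in product(sub, repeat=2)}
    raise ValueError(label)


def bottom_up(t):
    """All outputs of 'B1': the subtree is relabelled first, the result is then copied."""
    label, *kids = t
    if label == "e":
        return {("e",)}
    sub = bottom_up(kids[0])
    if label == "a":
        return {(out, s) for out in "ab" for s in sub}
    if label == "σ":
        return {("σ", s, s) for s in sub}   # both copies are identical
    raise ValueError(label)


t = ("a", ("a", ("σ", ("a", ("e",)))))
```

On this input the bottom-up version yields only outputs whose two subtrees below σ are equal, while the top-down version also yields outputs where they differ.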
[figure: top-down and bottom-up, each drawn as a combination of copy and relabel; top-down², bottom-up²]
TDT and BUT
… have REG domains
… REG closed under inverse
… are incomparable
… are not closed under composition
linear: no copy
lin-BUT = lin-TDT* = lin-TD + regular look-ahead = (FTA ∪ RELAB ∪ LHOM)*
(copying rule, excluded: q(a(x1)) → d(q(x1), q(x1)))
single state, no copy; intersection with REG
… is closed under composition
… REG closed under lin-BUT
[Knuth68] attribute grammar
s: S → N    p0: N → N0    p1: N → N1    e: N → 1
[figure: derivation tree with nodes s, p0, p1, e and attribute values exp, 1 + exp, 1 + 1 + exp, …; labels 11010 and 2^1 + 2^2 + 2^4]
handle context!
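The attribute evaluation for binary numerals can be sketched as follows. This is a minimal illustration, assuming the productions p0: N → N0, p1: N → N1 and e: N → 1, an inherited attribute `exp` that grows by one per step down the tree, and a synthesized value; the encoding of a derivation as a top-down list of production names is an assumption made for this sketch.

```python
def val(chain, exp=0):
    """Value of a derivation, given as the list of productions below S, top-down.

    `exp` is the inherited attribute: the exponent of the digit produced here.
    """
    head, *rest = chain
    if head == "e":                       # N -> 1
        return 2 ** exp
    digit = 1 if head == "p1" else 0      # N -> N1 or N -> N0
    return val(rest, exp + 1) + digit * 2 ** exp


def numeral(chain):
    """The binary string derived by the chain, e.g. ['p0', 'p1', 'e'] derives '110'."""
    return "1" + "".join("1" if p == "p1" else "0" for p in reversed(chain[:-1]))
```

The exponent depends on the position of a digit within the whole numeral, which is the context that a purely bottom-up evaluation does not see.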
STRINGS: B → bA, B → BbA; aabaB ⇒ aababA, aaBbaA ⇒ aaBbAbaA
TREES: tree substitution
[figure: a rule for B applied at a B-labelled node of a tree]
how to handle subtrees? parameters ← actual subtrees
regular: B → t, t ∈ T_Σ[N], N nonterminals, B ∈ N
context-free: B(y1, …, ym) → t, t ∈ T_{Σ∪N}[Y_m], N ranked nonterminals, B ∈ N_m, Y_m = {y1, …, ym} parameters
[figure: right-hand side with parameters y1, y2, y3 replaced by actual subtrees t1, …, t3]
S → F(a) | F(b)
F(y) → F(y + a) | F(y + b) | y + y
S ⇒ F(a) ⇒ F(a + b) ⇒ F((a + b) + b) ⇒ ((a + b) + b) + ((a + b) + b)
yield: {ww | w ∈ {a,b}*}, not context-free
S ∈ N0, F ∈ N1; a, b ∈ Σ0, + ∈ Σ2
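The copying effect of the parameter can be checked by generating a few trees. This is a minimal sketch, assuming the rules read S → F(a) | F(b) and F(y) → F(+(y, a)) | F(+(y, b)) | +(y, y); the function names and the depth bound are hypothetical.

```python
def derive(depth):
    """Terminal trees derivable with at most `depth` recursive F-steps."""
    results = []
    frontier = [("a",), ("b",)]        # actual parameter after S -> F(a) | F(b)
    for _ in range(depth + 1):
        # F(y) -> +(y, y): the parameter is used twice, so the subtree is copied.
        results += [("+", y, y) for y in frontier]
        # F(y) -> F(+(y, a)) | F(+(y, b)): extend the parameter by one letter.
        frontier = [("+", y, (c,)) for y in frontier for c in "ab"]
    return results


def tree_yield(t):
    """Concatenate the leaf labels from left to right."""
    return t[0] if len(t) == 1 else "".join(tree_yield(c) for c in t[1:])
```

Every generated tree has two identical subtrees below the root, so its yield has the form ww.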
S → F(A)
F(y) → F(y + y) | y
A → a + A | A + a | b
S, A ∈ N0, F ∈ N1; a, b ∈ Σ0, + ∈ Σ2
unrestricted: S ⇒ F(A) ⇒ F(A + A) ⇒ ⇒* …
yield: {w ∈ {a,b}* | w has 2^n b’s}
cfg: leftmost vs. unrestricted derivations
OI, outside-in: top-down, ‘lazy’, unrestricted
IO, inside-out: bottom-up, ‘eager’
IO: S ⇒ F(A) ⇒* F(t) ⇒* …, where A is rewritten to a terminal tree t before F is applied
yield: {w^(2^n) | w ∈ a*ba*}
IO-CFT and OI-CFT: incomparable
IO generates more equal copies
OI is lazy: unsuccessful subtrees
OI ≡ unrestricted: postpone ‘inner’ steps (context-free property)
yield REG = CFL
yield OI-CFT = Indexed
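The difference between the two derivation modes can be sketched on the level of yields. This is a minimal illustration, assuming the grammar S → F(A), F(y) → F(+(y, y)) | y, A → +(a, A) | +(A, a) | b; the function names and the bound `max_a` on the number of letters a are hypothetical.

```python
from itertools import product


def a_words(max_a):
    """Yields of A with at most max_a letters a: the words a^i b a^j."""
    return {"a" * i + "b" + "a" * j
            for i in range(max_a + 1) for j in range(max_a + 1 - i)}


def oi_yields(n, max_a):
    """OI: F copies the unexpanded A n times; the 2**n copies are expanded independently."""
    return {"".join(ws) for ws in product(a_words(max_a), repeat=2 ** n)}


def io_yields(n, max_a):
    """IO: A is expanded to a word w first; F then copies it, giving w repeated 2**n times."""
    return {w * 2 ** n for w in a_words(max_a)}
```

For this particular grammar every IO word is also an OI word, but not the other way around: "bab" needs two different expansions of A.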
J. Engelfriet, H. Vogler. Macro tree transducers. JCSS, 1985.
top-down tree transducers (input) & context-free tree grammars (output)
regular:
state + subtree ≡ node (input)
q(σ(x1 … xk)) → t, t ∈ T_Δ[Q(X_k)], rank(σ) = k
context-free:
state + subtree ≡ node (input) + parameters (output)
q(σ(x1 … xk), y1 … ym) → t, t ∈ T_{Δ∪Q(X_k)}[Y_m], rank(σ) = k, rank(q) = m
[figure: rule for q(y1, y2) at σ(x1, x2, x3), with a right-hand side over δ, a, y2, q′(x1), q″(x2)]
q0(a(x1)) → q(x1, q(x1, e))
q(a(x1), y1) → q(x1, q(x1, y1))
q(e, y1) → a(y1)
q0(aae) ⇒ q(ae, q(ae, e)) ⇒ q(e, q(e, q(ae, e))) ⇒ a q(e, q(ae, e)) ⇒ aa q(ae, e) ⇒ aa q(e, q(e, e)) ⇒ aaa q(e, e) ⇒ aaaae
exponential size-to-height
double exponential size-to-size
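The three macro rules can be executed as mutually recursive functions in which the parameter y1 is an ordinary argument. This is a minimal sketch, assuming nested tuples as the tree encoding; Python evaluates arguments eagerly, which gives the same result here because the rules are deterministic and total on monadic inputs. The helpers `monadic` and `count_a` are hypothetical.

```python
def q0(t):
    """q0(a(x1)) -> q(x1, q(x1, e))"""
    label, *kids = t
    if label != "a":
        raise ValueError(f"no rule for q0 at {label!r}")
    (x1,) = kids
    return q(x1, q(x1, ("e",)))


def q(t, y1):
    label, *kids = t
    if label == "a":              # q(a(x1), y1) -> q(x1, q(x1, y1))
        (x1,) = kids
        return q(x1, q(x1, y1))
    if label == "e":              # q(e, y1) -> a(y1)
        return ("a", y1)
    raise ValueError(f"no rule for q at {label!r}")


def monadic(n):
    """The input tree a(a(...a(e)...)) with n symbols a."""
    t = ("e",)
    for _ in range(n):
        t = ("a", t)
    return t


def count_a(t):
    """Number of a's in a monadic output tree."""
    n = 0
    while t[0] == "a":
        n, t = n + 1, t[1]
    return n
```

An input with n symbols a produces a monadic output with 2^n symbols a, so the output height is exponential in the input size.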
bottom-up inspection
T^-1(R) ∈ REG
T. Milo, D. Suciu, V. Vianu. Typechecking for XML transformers. JCSS, 2003.
J. Engelfriet, S. Maneth. A comparison of pebble tree transducers with macro tree transducers. Acta Inf., 2003.
Milo et al. 2000: ‘all XML query languages can be modeled by k-pebble tree transducers’
great for navigation, but: k-PTT cannot test all regular domains
comparison pebbles vs. macro: … ⊆ 0-dPTT³ ⇒ same composition closure
tree-walking automata: Aho & Ullman 71
[figure: pebbles x and y on a tree; x ≤ y; ∀x φ(x)]
always halting
free variables ~ fixed pebbles
logic: lab_a(x), edg_i(x, y), x ≤ y, x = y, ¬, ∧, ∨, ∀x, ∃x, φ*(x, y)
pebbles are nice (in theory)! they implement …
pebbles are nice (in practice)! they can be ‘programmed’
► ‘classic’ pebbles: comparison pebbles vs. macro: … ⊆ 0-dPTT³
► introducing invisible pebbles
issues: decomposition, complexity per pebble
XML transformation by tree-walking transducers with invisible pebbles
Joost Engelfriet, Hendrik Jan Hoogeboom, Bart Samwel (Leiden University, NL)
PODS, Beijing, June 2007
<A> <B>… </B> <C>… </C> <B>… </B> </A>
[figure: the tree A(B, C, B) with children numbered 1, 2, 3; rank(A) = 3]
ranked trees: node labels with rank
unbounded number of children (forests) are to be coded
[usually] this is no problem
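One common way to code forests with unbounded branching as ranked trees is the first-child/next-sibling encoding. The slides do not say which coding is meant, so this is only an illustrative assumption; the symbol `NIL` and the function `encode` are hypothetical names.

```python
import xml.etree.ElementTree as ET

NIL = ("#",)  # hypothetical rank-0 symbol for "no child" / "no further sibling"


def encode(elems):
    """Encode a forest (list of XML elements) as a binary tree of nested tuples.

    Every element label gets rank 2: first child on the left, next sibling on the right.
    """
    if not elems:
        return NIL
    first, *rest = elems
    return (first.tag, encode(list(first)), encode(rest))


doc = ET.fromstring("<A><B/><C/><B/></A>")
coded = encode([doc])
```

In the coded tree every label has rank 2 regardless of how many children the original element had.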
Milo, Suciu, Vianu, PODS 2000:
typechecking for XML transformers is decidable
transformers with ‘visible’ pebbles: finite number of coloured markers on tree
typechecking: decide whether the tree (document) generated by a transformation satisfies a description (G_out)
1. automata with pebbles
2. decomposition
3. typechecking
4. regular trees
5. document navigation
6. pattern matching
7. conclusion
[figure: node with children 1 … n, child number j, automaton in state q]
local configuration: q state, σ node label, j child number (j = 0: root), b pebble colours, b ⊆ C
instructions: (q, σ, b, j) → (halt) | (q′, stay) | (q′, up) | (q′, down_i) | (q′, drop_c) | (q′, lift_c)
with pebbles, e.g. b = {c}:
- finite set C of pebbles
- nested lifetimes: stack behaviour
- all observable
with visible pebbles: ‘colours’ used once, always observable
we add invisible pebbles: colours used many times
stack behaviour of pebbles! (avoid ‘counting’)
visible pebbles only: do not recognize all regular tree languages (≡ MSO properties)
☺ with invisible pebbles: recognize regular & decidable typechecking & better complexity
(q, σ, b, j) → (q′, stay), where b contains
- all visible pebbles
- invisible when topmost
[figure: pebble stack with visible pebbles c1, c2 and invisible pebbles c on nodes u1, …, u5; top of stack marked]
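The observation rule for pebbles can be sketched with an explicit stack. This is a minimal illustration, assuming that b collects the pebbles lying on the current node, that visible pebbles are always observed there, and that an invisible pebble is observed only while it is the topmost pebble; `Pebble`, `observable` and `lift` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pebble:
    colour: str
    node: str       # identifier of the node the pebble lies on
    visible: bool


def observable(stack, node):
    """Colours observed at `node`; `stack` lists the pebbles from bottom to top."""
    seen = {p.colour for p in stack if p.visible and p.node == node}
    if stack and not stack[-1].visible and stack[-1].node == node:
        seen.add(stack[-1].colour)   # invisible pebble: seen only when topmost
    return seen


def lift(stack, colour, node):
    """Lift the topmost pebble, which must have this colour and lie on `node`."""
    top = stack[-1] if stack else None
    if top is None or top.colour != colour or top.node != node:
        raise ValueError("only the topmost pebble at the current node can be lifted")
    return stack[:-1]


stack = (Pebble("c1", "u1", True), Pebble("c", "u2", False), Pebble("c", "u3", False))
```

Because lifetimes are nested, an invisible pebble becomes observable again exactly when everything dropped after it has been lifted.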
validation, navigation, pattern matching
recursively generate output:
(q, σ, b, j) → δ(q1, q2, …, qn)
[figure: at an input node labelled σ, state q produces output δ with n branches, in states q1, …, qn]
each q_i works on a separate copy of the input tree t
NOTE: q’s may move up ↑ and down ↓ in between
example without pebbles
[figure: input tree over {a, b} with numbered edges, and the output tree]
j = 0, 1, 2; i = 1, 2
walk down:
(↓, b, -, j) → (↓, down1)
(↓, b, -, j) → (↓, down2)
copy up:
(↑, b, -, 1) → b(↑1, c2)
(↑, b, -, 2) → b(c1, ↑2)
(↑i, b, -, i) → (↑, up)
copy down:
(copy, a, -, j) → a()
(copy, b, -, j) → b(c1, c2)
(ci, b, -, j) → (copy, downi)
the power of composition
[figure: two transducers applied in sequence; annotations n+1, 2^n and n, n]
together: exponential size-to-height
n-PTT: polynomial size increase
Pebble Tree Transducers / Pebble Tree Automata
VkI-PTT / VkI-PTA: visible + invisible
Vk-PTT / Vk-PTA: k visible pebbles (Milo et al.)
I-PTT / I-PTA: invisible only
TT: tree-walking (no pebbles)
2. decomposition
VkI-dPTT ⊆ dTT ◦ V(k-1)I-dPTT
simulation with k-1 visible pebbles, after deterministic preprocessing
iterate: VkI-dPTT ⊆ dTT^k ◦ I-dPTT
copying can be done without pebbles (preprocessing)
[figure: tree t with nodes u, v; copies t↑u and t↑v, each with its own ‘root’]
first visible pebble: move up / down into subtree
THEOREM (deterministic):
VkI-dPTT ⊆ dTT ◦ V(k-1)I-dPTT
I-dPTT ⊆ TT ◦ dTT
3. typechecking
Bartha 1982: regular tree grammar G for the domain, in exponential time
inverse type inference:
given transducer M and regular G_out, construct regular G_in such that L(G_in) = M^-1(L(G_out))
inverse type inference is solvable
⇒ for TT in exponential time
⇒ for TT^k in k-fold exponential time
typechecking:
given transducer M and regular G_in, G_out, decide whether M(L(G_in)) ⊆ L(G_out)
we can typecheck
⇒ TT^k in (k+1)-fold exponential time
⇒ Vk-PTT in (k+2)-fold exponential time
⇒ VkI-PTT in (k+3)-fold exponential time
Vk-PTT ⊆ TT^(k+1)
VkI-PTT ⊆ TT^(k+2)
M(A) ⊆ B iff A ∩ M^-1(B^c) = ∅
(left-hand side: ‘typechecking’; right-hand side: ‘inverse type inference’)
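The equivalence M(A) ⊆ B iff A ∩ M^-1(B^c) = ∅ can be checked on a toy example. This is a minimal sketch that replaces regular tree languages by finite sets and the transducer by a finite relation of (input, output) pairs; all names are hypothetical, and `outputs` is assumed to contain every output that M can produce.

```python
def image(M, A):
    """M(A): all outputs of inputs in A."""
    return {s for (t, s) in M if t in A}


def preimage(M, S):
    """M^-1(S): all inputs that have some output in S."""
    return {t for (t, s) in M if s in S}


def typechecks_direct(M, A, B):
    return image(M, A) <= B


def typechecks_inverse(M, A, B, outputs):
    """Same question, answered through inverse type inference on the complement of B."""
    complement_B = outputs - B
    return not (A & preimage(M, complement_B))


M = {("t1", "s1"), ("t2", "s2"), ("t2", "s3")}
outputs = {"s1", "s2", "s3"}
```

The inverse formulation matters because the image of a regular tree language under a transducer need not be regular, whereas the slides state that the inverse image is.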
invisible pebbles are almost for free!
4. regular trees
regular tree language ≡ bottom-up tree evaluation ≡ post-order evaluation with stack
[figure: tree with nodes numbered 1 to 5 in post-order, next to the stack]
post-order evaluation: pop children, evaluate & push
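The step "pop children, evaluate & push" can be sketched for and/or trees as follows. This is a minimal illustration, assuming the input is the post-order sequence of node labels and that ranks are known from the alphabet; `RANK`, `postorder` and `evaluate` are hypothetical names.

```python
RANK = {"T": 0, "F": 0, "∨": 2, "∧": 2}


def postorder(t):
    """Labels of a nested-tuple tree in post-order."""
    for c in t[1:]:
        yield from postorder(c)
    yield t[0]


def evaluate(labels):
    """Evaluate a post-order label sequence with a stack of states."""
    stack = []
    for label in labels:
        k = RANK[label]
        args = [stack.pop() for _ in range(k)][::-1]   # pop children
        if label == "T":
            value = 1
        elif label == "F":
            value = 0
        elif label == "∨":
            value = args[0] | args[1]
        else:
            value = args[0] & args[1]
        stack.append(value)                            # evaluate & push
    (result,) = stack
    return result


tree = ("∧", ("∨", ("F",), ("T",)), ("T",))
```

The stack never holds more than one state per unfinished sibling on the path to the current node, which is what a walking device with nested pebbles can mimic.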
VkI-PTT ⊆ TT^(k+2)
Bojańczyk et al.
I-PTA can
- evaluate marked trees
- test their visible configuration
5. document navigation / 6. pattern matching
head + invisible pebbles
VI-PTA can test φ(x1, …, xn) with n-2 visible pebbles (using head)
general test φ(x1, …, xn)
XQuery: for x1, …, xn with φ1 ∧ … ∧ φn return t (each φi binary)
example: φ1(x1, x2) ∧ φ2(x3, x6) ∧ φ3(x4, x3) ∧ φ4(x5, x6) ∧ φ5(x1, x4)
[figure: nodes u1, …, u6 connected by φ1, …, φ5]
7. conclusion
V-PTT; I-PTT = TL
DTL: document transformation language (Maneth et al., PODS’05)
Milo, Suciu, Vianu