

SLIDE 1

Learning Tree to Word Transducers

LATA 2014, Aurélien Lemay

Joint work with: Grégoire Laurence, Joachim Niehren, Slawek Staworko, Marc Tommasi

March 11, 2014

Aurélien Lemay (INRIA Lille), Learning Tree to Word Transducers, March 11, 2014, 1 / 32

SLIDE 2

Learning Tree Transductions

Transforming structured data

Example of an XSLT transformation: from XML to XHTML

Many applications, many formalisms... all requiring some expertise. One solution: infer the transformation from examples

SLIDE 3

Learning Subsequential Transducers [OncinaGarciaVidal93]

Subsequential transducers are learnable from examples with polynomial time and data (Gold model [Gold78]). Two main ideas:

Onward normal form [Choffrut79]: produce the output as soon as possible

[Diagram: two subsequential transducers for τ(a^(2n)) = b^n, one onward (edges a/b, a/ε) and one not (edges a/ε, a/b)]

State-merging algorithm: OSTIA [OncinaGarciaVidal93]
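The onwardness difference is easy to check concretely. Below is a small sketch (an encoding chosen for illustration, not from the talk) of the two subsequential transducers for τ(a^(2n)) = b^n: they compute the same function, but only the first is onward, emitting each b as early as possible.

```python
def run(trans, final, state, word):
    """Run a subsequential transducer on `word`; return its output, or None
    if the run blocks or ends in a non-final state."""
    out = []
    for c in word:
        if (state, c) not in trans:
            return None
        state, o = trans[(state, c)]
        out.append(o)
    if state not in final:
        return None
    out.append(final[state])  # final output attached to the last state
    return "".join(out)

# Onward version: emits 'b' on the first 'a' of each pair.
onward = {(0, "a"): (1, "b"), (1, "a"): (0, "")}
# Non-onward version: emits 'b' only on the second 'a' of each pair.
late = {(0, "a"): (1, ""), (1, "a"): (0, "b")}
final = {0: ""}  # only even-length inputs are accepted

for n in range(5):
    w = "a" * (2 * n)
    assert run(onward, final, 0, w) == run(late, final, 0, w) == "b" * n
```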

SLIDE 4

Extensions of OSTIA - two learnable classes

Rational Functions [BoiretLemayNiehren12]

◮ Represented by subsequential transducers with deterministic look-ahead
◮ Normal form (inspired by bimachines [ReutenauerSchutzenberger92])
◮ Learning algorithm ≃ learn the look-ahead, then apply OSTIA

Top-Down Tree-to-Tree Transducers [LemayManethNiehren11]

◮ Earliest normal form [EngelfrietManethSeidl09]: earliest production (produce as 'up' as possible)
◮ A Myhill-Nerode-like theorem in [LemayManethNiehren11]
◮ Learning based on a state-merging algorithm

SLIDE 5

Toward learning MSO tree transformations ?

MSO tree transformations [Courcelle92]: an interesting target for learning tree transformations!

The big picture

MSO tree transformations ≃ Macro Tree Transducers with regular look-ahead (MTT^R) [EngelfrietManeth03] ≃ Top-Down + Concatenation + Look-ahead

Top-down tree transducers: learnable
Look-ahead: learnable

◮ not extended to trees yet

Concatenation in the output: ?

SLIDE 6

Outline

1. Tree to Word Transducers
2. Normal Form
3. A Myhill-Nerode Theorem
4. Learning Algorithm

SLIDE 7

Outline

1. Tree to Word Transducers
2. Normal Form
3. A Myhill-Nerode Theorem
4. Learning Algorithm

SLIDE 8

Tree to Word Transducers - An example

XML-like Serialization

Axiom: q(x0)
q(f(x1, x2)) → <f> · q(x1) · q(x2) · </f>
q(g(x1, x2)) → <g> · q(x1) · q(x2) · </g>
q(a) → <a/>
q(b) → <b/>

Input tree: f(g(a, b), b)

SLIDE 9

Tree to Word Transducers - An example

XML-like Serialization

Rules as above. Derivation: q(f(g(a, b), b))

SLIDE 10

Tree to Word Transducers - An example

XML-like Serialization

Rules as above. Derivation: <f> · q(g(a, b)) · q(b) · </f>

SLIDE 11

Tree to Word Transducers - An example

XML-like Serialization

Rules as above. Derivation: <f> · q(g(a, b)) · <b/> · </f>

SLIDE 12

Tree to Word Transducers - An example

XML-like Serialization

Rules as above. Derivation: <f> · <g> · q(a) · q(b) · </g> · <b/> · </f>

SLIDE 13

Tree to Word Transducers - An example

XML-like Serialization

Rules as above. Final output: <f><g><a/><b/></g><b/></f>
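The whole derivation above can be replayed in a few lines. This sketch encodes trees as nested tuples (a representation chosen here for convenience) and applies the serialization rules directly:

```python
def serialize(t):
    """Apply the XML-like serialization rules: leaves 'a'/'b' become
    self-closing tags, inner nodes wrap their serialized children."""
    if t in ("a", "b"):
        return f"<{t}/>"
    label, left, right = t
    return f"<{label}>" + serialize(left) + serialize(right) + f"</{label}>"

# The input tree f(g(a, b), b) from the slides.
tree = ("f", ("g", "a", "b"), "b")
assert serialize(tree) == "<f><g><a/><b/></g><b/></f>"
```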

SLIDE 14

Tree to Word Transducers - Presentation

Axiom: u0 · q(x0) · u1
Rules: q(f(x1, x2)) → u0 · q1(x1) · u1 · q2(x2) · u2

Three Restrictions

Deterministic
Linear (no copy)
Ordered (no swap)
Together these define deterministic Sequential Tree-to-Word transducers (STW)

SLIDE 15

Outline

1. Tree to Word Transducers
2. Normal Form
3. A Myhill-Nerode Theorem
4. Learning Algorithm

SLIDE 16

Normal Form

Earliest STW: produce as soon as possible.
Example transformation: count the number of symbols, e.g. τcount(f(a, f(a, b))) = #####

An STW for τcount

Axiom: q
q(f(x1, x2)) → # · q(x1) · q(x2)
q(a) → #
q(b) → #

Not earliest! At least one '#' could be output from the beginning.

SLIDE 17

Normal Form

Another STW for τcount:
Axiom: # · q
q(f(x1, x2)) → # · q(x1) · # · q(x2)
q(a) → ε
q(b) → ε

Earliest (Rule 1)

Produce as ’up’ as possible

SLIDE 18

Normal Form

We want a unique normal form. Do we want
q(f(x1, x2)) → # · q(x1) · # · q(x2)
or
q(f(x1, x2)) → ## · q(x1) · q(x2)
(or another choice?)

Earliest - Rule 2

Produce as 'left' as possible

SLIDE 19

Normal Form

Earliest STW (eSTW): produce as 'up' and as 'left' as possible

Theorem [LaurenceLemayNiehrenStaworkoTommasi11]

For any STW, there exists a unique equivalent minimal eSTW, possibly of exponential size

The minimal eSTW of τcount

Axiom: # · q
q(f(x1, x2)) → ## · q(x1) · q(x2)
q(a) → ε
q(b) → ε
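As a sanity check, the three STWs for τcount seen on the last slides (the non-earliest one, the 'up'-earliest one, and the minimal eSTW) can be run side by side. The tuple encoding of rules below is an assumption of this sketch, not notation from the talk:

```python
# Trees are ("f", left, right) or the leaf labels "a"/"b".
# A transducer is (axiom prefix, rules); rules map "f" to (u0, u1, u2)
# and each leaf symbol to a constant output word.

def run_stw(axiom, rules, t):
    def q(t):
        if isinstance(t, str):          # leaf rule
            return rules[t]
        u0, u1, u2 = rules["f"]         # q(f(x1,x2)) -> u0 q(x1) u1 q(x2) u2
        return u0 + q(t[1]) + u1 + q(t[2]) + u2
    return axiom + q(t)

not_earliest   = ("",  {"f": ("#", "", ""),  "a": "#", "b": "#"})
rule1_earliest = ("#", {"f": ("#", "#", ""), "a": "",  "b": ""})
minimal_estw   = ("#", {"f": ("##", "", ""), "a": "",  "b": ""})

def size(t):  # number of symbols in a tree
    return 1 if isinstance(t, str) else 1 + size(t[1]) + size(t[2])

t = ("f", "a", ("f", "a", "b"))
for ax, rules in (not_earliest, rule1_earliest, minimal_estw):
    assert run_stw(ax, rules, t) == "#" * size(t)  # all produce '#####'
```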

SLIDE 20

Outline

1. Tree to Word Transducers
2. Normal Form
3. A Myhill-Nerode Theorem
4. Learning Algorithm

SLIDE 21

A Myhill-Nerode Theorem for STW

A constructive algorithm for can(τ), the minimal eSTW for τ: build for each input path p a residual transformation τp, and define p ≃ p′ iff τp = τp′

Myhill-Nerode Theorem for STW

τ is represented by an STW ⇔ ≃ has finite index ⇔ can(τ) is the minimal eSTW of τ

SLIDE 22

Building Axiom

Axiom: lcp(range(τ)) · qε · lcs′(range(τ))
lcp: longest common prefix
lcs′: longest common suffix (minus what is already in the lcp)
For τcount: lcp(range(τcount)) = # and lcs′(range(τcount)) = ε
Axiom: # · qε
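The lcp and lcs′ computations can be sketched directly; here range(τ) is approximated by a finite set of output words, and Python's os.path.commonprefix plays the role of lcp:

```python
import os

def lcp(words):
    """Longest common prefix of a list of words."""
    return os.path.commonprefix(list(words))

def lcs_prime(words):
    """Longest common suffix of the words after removing the lcp."""
    p = lcp(words)
    rest = [w[len(p):] for w in words]
    rev = os.path.commonprefix([w[::-1] for w in rest])
    return rev[::-1]

# A few outputs of tau_count: '#' * |t| for trees of sizes 1, 3, 5.
sample = ["#", "###", "#####"]
assert lcp(sample) == "#"        # the axiom prefix
assert lcs_prime(sample) == ""   # nothing left for a common suffix
```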

SLIDE 23

Building τε

Axiom: # · qε. We define τε by: for any t, τε(t) = #^(-1) · τcount(t), i.e. τcount(t) with the leading # removed

Defining τε

a → ε
b → ε
f(a, a) → #^2
...
f(f(a, b), a) → #^4
...
In general: τε(t) = #^(|t|−1)

SLIDE 24

Building Rules for Leaf Symbols

Rules from state qε. For the leaf symbols: τε(a) = ε and τε(b) = ε

Rules

qε(a) → ε
qε(b) → ε

SLIDE 25

Building Other Rules (1)

Build the rule qε(f(x1, x2)) → u0 · q(f,1)(x1) · u1 · q(f,2)(x2) · u2.
First, u0 = lcp({τε(f(?, ?))})

Compute u0 from τε(f (?, ?))

f(a, a) → #^2
...
f(f(a, b), a) → #^4
...
u0 = #^2
qε(f(x1, x2)) → #^2 · q(f,1)(x1) · u1 · q(f,2)(x2) · u2

SLIDE 26

Building Other Rules (2)

qε(f(x1, x2)) → #^2 · q(f,1)(x1) · u1 · q(f,2)(x2) · u2
To obtain τ(f,1)(t), take the lcp of τε(f(t, ?))

Compute τ(f ,1)(a) from τε(f (a, ?))

τε(f(a, a)) = #^2
τε(f(a, b)) = #^2
τε(f(a, f(a, a))) = #^4
...
τε(f(a, f(a, f(a, a)))) = #^6
lcp(τε(f(a, ?))) = #^2 = u0 · τ(f,1)(a) · u1 (guaranteed by earliestness)
As u0 = #^2, we get τ(f,1)(a) = u1 = ε
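The residual computation on this slide can be reproduced: fix the first child to a, vary the second child, and take the lcp of the resulting outputs of τε. The helper names below (size, tau_eps) are mine:

```python
import os

def size(t):  # number of symbols; trees are tuples or leaf labels
    return 1 if isinstance(t, str) else 1 + size(t[1]) + size(t[2])

def tau_eps(t):
    """Residual of tau_count after the axiom's '#': '#' repeated |t| - 1 times."""
    return "#" * (size(t) - 1)

# Fix the first child to 'a' and vary the second child.
contexts = ["a", "b", ("f", "a", "a"), ("f", "a", ("f", "a", "a"))]
common = os.path.commonprefix([tau_eps(("f", "a", s)) for s in contexts])
assert common == "##"          # lcp = u0 . tau_(f,1)(a) . u1
u0 = "##"                      # computed from tau_eps(f(?, ?)) on the previous slide
assert common[len(u0):] == ""  # hence tau_(f,1)(a) = u1 = epsilon
```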

SLIDE 27

Building Other Rules (3)

Compute τ(f ,1)

τ(f,1)(a) = ε
τ(f,1)(b) = ε
τ(f,1)(f(a, a)) = #^2
...
τ(f,1)(f(a, f(a, a))) = #^4
...
In general: τ(f,1) : t → #^(|t|−1)

SLIDE 28

State Equivalence

τε : t → #^(|t|−1) and τ(f,1) : t → #^(|t|−1), so ε ≃ (f, 1) and thus qε = q(f,1).
Similarly, u1 = u2 = ε and (f, 2) ≃ ε.

eSTW Can(τ)

Axiom: # · qε
qε(a) → ε
qε(b) → ε
qε(f(x1, x2)) → ## · qε(x1) · qε(x2)
This is the minimal eSTW for τcount.

SLIDE 29

Outline

1. Tree to Word Transducers
2. Normal Form
3. A Myhill-Nerode Theorem
4. Learning Algorithm

SLIDE 30

Learning Algorithm

Learning algorithm LearnSTW (S)

Essentially the same as the construction algorithm

◮ Input: a finite sample S ⊆ τ
◮ From each path p, compute Sp (an approximation of τp)
◮ p ≃ p′ if Sp does not contradict Sp′

LearnSTW(S) answers in polynomial time. For any STW τ, there exists a sample CSτ of polynomial cardinality such that LearnSTW(S) = Can(τ) whenever CSτ ⊆ S
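The 'does not contradict' test at the heart of LearnSTW can be sketched as a compatibility check between finite partial functions, here plain dicts from input trees to output words (the function name compatible is an assumption of this sketch):

```python
def compatible(Sp, Sq):
    """Two finite samples (dicts: input tree -> output word) do not
    contradict each other if they agree wherever both are defined."""
    return all(Sq[t] == w for t, w in Sp.items() if t in Sq)

# Two approximations of the same residual of tau_count agree...
Sp = {"a": "", "b": ""}
Sq = {"b": "", ("f", "a", "a"): "##"}
assert compatible(Sp, Sq)
# ...while samples disagreeing on 'a' are incompatible.
assert not compatible({"a": "#"}, {"a": ""})
```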

SLIDE 31

Consistency Problem

Consistency Issues

Checking whether there exists an STW consistent with a given sample is NP-complete.
Idea: encode the 1-in-3 SAT problem.
1-in-3 SAT: a variant of 3-SAT where exactly one literal per clause is satisfied; NP-complete [Schaefer78]

SLIDE 32

Consistency Problem

Example with a formula over 4 variables: (v1 ∨ ¬v2 ∨ v3) ∧ (v2 ∨ ¬v3 ∨ v4)

[Tree diagram: the clause v1 ∨ ¬v2 ∨ v3 is encoded as a tree with root c and children v1, v2, v3, v4, whose image is T]

SLIDE 33

Consistency Problem

The encoding of (v1 ∨ ¬v2 ∨ v3) ∧ (v2 ∨ ¬v3 ∨ v4):
(v1 ∨ ¬v2 ∨ v3): c(v1(•, ◦), v2(◦, •), v3(•, ◦), v4(◦, ◦)) → T
(v2 ∨ ¬v3 ∨ v4): c(v1(◦, ◦), v2(•, ◦), v3(◦, •), v4(•, ◦)) → T
(v1 ∨ ¬v1): c(v1(•, •), v2(◦, ◦), v3(◦, ◦), v4(◦, ◦)) → T
(v2 ∨ ¬v2): c(v1(◦, ◦), v2(•, •), v3(◦, ◦), v4(◦, ◦)) → T
(v3 ∨ ¬v3): c(v1(◦, ◦), v2(◦, ◦), v3(•, •), v4(◦, ◦)) → T
(v4 ∨ ¬v4): c(v1(◦, ◦), v2(◦, ◦), v3(◦, ◦), v4(•, •)) → T
c(v1(◦, ◦), v2(◦, ◦), v3(◦, ◦), v4(◦, ◦)) → ε
Idea: an eSTW that recognizes this sample encodes a solution of one-in-three SAT.

SLIDE 34

Consistency Problem

Unique STW solution

Axiom: q
q(c(x1, . . . , xn)) → q1(x1) · . . . · qn(xn)
qi(vi(x1, x2)) → qL_i(x1) · qR_i(x2)
qL_i(◦) → ε and qR_i(◦) → ε
If vi is true: qL_i(•) → T and qR_i(•) → ε
If vi is false: qL_i(•) → ε and qR_i(•) → T

Idea: one state qi per variable vi; vi(•, ◦) produces T iff vi is true, and vi(◦, •) produces T iff vi is false.

SLIDE 35

Learning Theorem

Learning Theorem

The class of STWs, represented by eSTWs, is learnable from examples with polynomial time and data, with abstain.
With abstain: the algorithm may decline to answer (when the sample is not characteristic)

SLIDE 36

Conclusion - Future Work

Results on Sequential Tree-to-Word Transducers:

◮ A Myhill-Nerode theorem
◮ Learnable in a Gold-like model

Future work:
◮ Extension to non-ordered transducers: good hopes!
◮ Extension to non-linear transducers: little hope...
◮ Toward a learning algorithm for MSO tree transductions?

SLIDE 37

OSTIA [OncinaGarciaVidal93]

Example input: S = {(ε, ε), (aa, b), (aaaa, bb)}

1. Align input/output in an onward way, and build the initial prefix-tree transducer

[Diagram: a chain of states 1, 2, 3, 4 with transitions a/b, a/ε, a/b, a/ε]

2. Perform state merging in an ordered way

[Diagram: merged states {0, 2, 4} and {1, 3} with transitions a/b and a/ε]
If S is characteristic for τ, then OSTIA(S) is the canonical subsequential transducer of τ.
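Step 1, the onward alignment, can be reproduced for this sample: attach to each input prefix the lcp of all sample outputs reachable through it, and read off each edge output as the difference between consecutive prefixes. This sketch recovers the a/b, a/ε, a/b, a/ε chain shown above:

```python
import os

# Sample: pairs (input word, output word) from the slide.
S = [("", ""), ("aa", "b"), ("aaaa", "bb")]

def f(p):
    """Onward output attached to the prefix-state p: the lcp of the outputs
    of all sample pairs whose input starts with p."""
    outs = [v for u, v in S if u.startswith(p)]
    return os.path.commonprefix(outs)

prefixes = ["", "a", "aa", "aaa", "aaaa"]
# Output on the edge p --a--> pa: whatever f(pa) adds beyond f(p).
edges = {p: f(p + "a")[len(f(p)):] for p in prefixes[:-1]}
assert [edges[p] for p in prefixes[:-1]] == ["b", "", "b", ""]
```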
