LEARNING AND INFERENCE WITH CONSTRAINTS
Marco Gori, University of Siena (Italy)
ILP 2018
Outline
• Environment and constraints
• Bridging logic and real-valued constraints
• Representational issues
• Learning, reasoning, and inference with constraints (the LYRICS software environment)
ENVIRONMENTS AND CONSTRAINTS
Supervised Learning
Learning XOR: the training set is
L = { ((0,0),0), ((0,1),1), ((1,0),1), ((1,1),0) },
i.e. y holds on (0,1) and (1,0), and ¬y on (0,0) and (1,1).

“Hard” architectural constraints (Lagrangian framework), for κ = 1, 2, 3, 4:
x_{κ3} − σ(w_{31} x_{κ1} + w_{32} x_{κ2} + b_3) = 0
x_{κ4} − σ(w_{41} x_{κ1} + w_{42} x_{κ2} + b_4) = 0
x_{κ5} − σ(w_{53} x_{κ3} + w_{54} x_{κ4} + b_5) = 0

Training set constraints (with the positive patterns indexed first):
x_{15} = 1, x_{25} = 1, x_{35} = 0, x_{45} = 0
Enforcing Consistencies
Consider three linear maps:
f_ωh : H → W : h ↦ ω(h),   f_ah : H → A : h ↦ a(h),   f_ωa : A → W : a ↦ ω(a),
with the requirement
f_ωh(h) = (f_ωa ∘ f_ah)(h).
This functional equation imposes the circulation of coherence. Since the functions are linear, the constraint becomes
w_ωh h + b_ωh = w_ωa w_ah h + (w_ωa b_ah + b_ωa).
Requiring equality for all h ∈ ℝ₊ yields
w_ωa w_ah − w_ωh = 0,   w_ωa b_ah + b_ωa − b_ωh = 0.
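The algebra above can be checked numerically; the sketch below (with illustrative parameter values, not taken from the slides) verifies that weights satisfying the two equalities make the direct map agree with the composition:

```python
# Sketch: checking the "circulation of coherence" for linear maps.
# Parameter values are arbitrary illustrations.
w_ah, b_ah = 2.0, 0.5    # f_ah : h -> a
w_wa, b_wa = 3.0, -1.0   # f_wa : a -> w (omega)

# Choose the direct map so the two equality constraints hold:
#   w_wa * w_ah - w_wh = 0   and   w_wa * b_ah + b_wa - b_wh = 0
w_wh = w_wa * w_ah
b_wh = w_wa * b_ah + b_wa

f_ah = lambda h: w_ah * h + b_ah
f_wa = lambda a: w_wa * a + b_wa
f_wh = lambda h: w_wh * h + b_wh

# The direct map and the composition coincide everywhere:
for h in [0.0, 1.0, 2.5, 10.0]:
    assert abs(f_wh(h) - f_wa(f_ah(h))) < 1e-12
```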
Diagnosis and Prognosis in Medicine
Pima Indian Diabetes dataset (MASS = body mass index, PLASMA = blood glucose):
(MASS ≥ 30) ∧ (PLASMA ≥ 126) ⇒ positive
(MASS ≤ 25) ∧ (PLASMA ≤ 100) ⇒ negative

Wisconsin Breast Cancer Prognosis (SIZE = diameter of the tumor, NODES = number of metastasized lymph nodes):
(SIZE ≥ 4) ∧ (NODES ≥ 5) ⇒ recurrent
(SIZE ≤ 1.9) ∧ (NODES = 0) ⇒ non-recurrent
Reconstruction of overwritten chars (MNIST)
Recognize the foreground and background numbers. I was told that the foreground char is less than or equal to the background char.
Reconstruction of overwritten chars (MNIST)
Patterns, labels, and individuals
An individual (X, x) pairs a label X (e.g. “Giuseppe”) with a pattern x (e.g. 178, 70, 45).
What about learning and inference with individuals?
Inference in formal logic
Only labels are involved!

Domain(label="People")
Individual(label="Marco", "People")
Individual(label="Giuseppe", "People")
Individual(label="Michelangelo", "People")
Individual(label="Francesco", "People")
Individual(label="Franco", "People")
Individual(label="Andrea", "People")
Predicate(label="fatherOf", ("People", "People"))
Predicate(label="grandFatherOf", ("People", "People"))
Predicate(label="eq", ("People", "People"), function=eq)
Constraint("fatherOf(Marco, Giuseppe)")
Constraint("fatherOf(Giuseppe, Michelangelo)")
Constraint("fatherOf(Giuseppe, Francesco)")
Constraint("fatherOf(Franco, Andrea)")
Constraint("forall x: not fatherOf(x,x)")
Constraint("forall x: not grandFatherOf(x,x)")
Inference in formal logic
Constraint("forall x: forall y: fatherOf(x,y) -> not fatherOf(y,x)")
Constraint("forall x: forall y: grandFatherOf(x,y) -> not grandFatherOf(y,x)")
Constraint("forall x: forall y: fatherOf(x,y) -> not grandFatherOf(x,y)")
Constraint("forall x: forall y: grandFatherOf(x,y) -> not fatherOf(x,y)")
Constraint("forall x: forall y: forall z: fatherOf(x,z) and fatherOf(z,y) -> grandFatherOf(x,y)")
Constraint("forall x: forall y: forall z: (fatherOf(x,y) and not eq(x,z)) -> not fatherOf(z,y)")
Inference in formal logic
From the constraints above, the following are inferred as true:
grandFatherOf("Marco", "Michelangelo"), grandFatherOf("Marco", "Francesco")
One may also add:
Constraint("forall x: forall y: forall z: grandFatherOf(x,z) and fatherOf(y,z) -> fatherOf(x,y)")
Full inference on individuals
For individuals (X, x), constraints coming from formal logic and predictions coming from neural nets are tied together by consistency constraints over the pattern features (age_x, weight_x, height_x, age_y, weight_y, height_y).
Complexity issues: inference in the environment avoids massive exploration of the Boolean hypercube.
BRIDGING LOGIC AND REAL-VALUED CONSTRAINTS
Learning relations and logic
“There are finer fish in the sea than have ever been caught.” (Irish proverb)
Two Schools of Thought
(Formal) logic vs. optimization and statistics
Any break through the wall?
Logic by Real Numbers
A logic formula is converted into a real-valued constraint of the general form
Φ(x, f(x)) = 0  ∀x,
enforced, e.g., through a p-norm penalty.
Logic by Real Numbers (con’t)
Gödel t-norm
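For reference, the Gödel, product, and Łukasiewicz t-norms and their residual implications can be sketched as follows (a minimal illustration, not the actual LYRICS implementation):

```python
# Three standard t-norms on truth values in [0, 1], with the residual
# implication a => b defined as sup{ c : T(a, c) <= b }.
def godel_and(a, b):        return min(a, b)
def godel_imp(a, b):        return 1.0 if a <= b else b

def product_and(a, b):      return a * b
def product_imp(a, b):      return 1.0 if a <= b else b / a

def lukasiewicz_and(a, b):  return max(a + b - 1.0, 0.0)
def lukasiewicz_imp(a, b):  return min(1.0 - a + b, 1.0)

# All three collapse to classical conjunction on Boolean truth values:
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        assert godel_and(a, b) == product_and(a, b) == lukasiewicz_and(a, b)
```

They differ only in the interior of the unit square, which is exactly where the choice of t-norm shapes the resulting loss.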
Tricky Issues
With the product t-norm:
e1 ⇒ e2 :  f₁(x₁)(1 − f₂(x₂)) = 0 holds true, and also
e2 ⇒ e1 :  f₂(x₂)(1 − f₁(x₁)) = 0,
so the biconditional e2 ⇔ e1 gives
f₁(x₁) + f₂(x₂) − 2 f₁(x₁) f₂(x₂) = 0.
Compare with
f₁²(x₁) + f₂²(x₂) − 2 f₁(x₁) f₂(x₂) = (f₁(x₁) − f₂(x₂))² = 0  ⇔  f₁(x₁) = f₂(x₂) ?
Petr Hájek on Mathematical Fuzzy Logic, Springer 2016
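A quick numeric check of the question above (an illustrative sketch): the sum form f₁ + f₂ − 2f₁f₂ and the squared form (f₁ − f₂)² agree on Boolean truth values but behave differently on fractional ones:

```python
# Two real-valued renderings of f1 <=> f2 under the product t-norm.
def sum_form(f1, f2):     return f1 + f2 - 2.0 * f1 * f2
def squared_form(f1, f2): return (f1 - f2) ** 2   # = f1^2 + f2^2 - 2 f1 f2

# On Boolean truth values the two forms coincide:
for f1 in (0.0, 1.0):
    for f2 in (0.0, 1.0):
        assert sum_form(f1, f2) == squared_form(f1, f2)

# But at fractional truth values the constraints are not equivalent:
assert squared_form(0.5, 0.5) == 0.0   # satisfied whenever f1 = f2
assert sum_form(0.5, 0.5) == 0.5       # still violated: penalizes uncertainty
```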
Supervised Learning
The discovery of loss by t-norms: we want f(x_κ) ⇔ y_κ, κ = 1, …, ℓ. With the Łukasiewicz t-norm:
f(x_κ) ⇒ y_κ : min{1 − f(x_κ) + y_κ, 1}
y_κ ⇒ f(x_κ) : min{1 − y_κ + f(x_κ), 1}
(f(x_κ) ⇒ y_κ) ∧ (y_κ ⇒ f(x_κ)) : max{min{1 − f(x_κ) + y_κ, 1} + min{1 − y_κ + f(x_κ), 1} − 1, 0} = 1 − |y_κ − f(x_κ)|,
so the induced constraint Φ(x, f(x)) = 0 reads |y_κ − f(x_κ)| = 0.
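The closed form 1 − |y − f(x)| can be verified numerically; the sketch below checks the identity over random truth values:

```python
import random

# Lukasiewicz rendering of (f => y) and (y => f), and their conjunction
# T(a, b) = max(a + b - 1, 0); the slide's closed form is 1 - |y - f|.
def luk_imp(a, b):  return min(1.0 - a + b, 1.0)
def luk_and(a, b):  return max(a + b - 1.0, 0.0)

random.seed(0)
for _ in range(1000):
    f, y = random.random(), random.random()
    both = luk_and(luk_imp(f, y), luk_imp(y, f))
    assert abs(both - (1.0 - abs(y - f))) < 1e-12
```

For y_κ ∈ {0, 1} the resulting penalty |y_κ − f(x_κ)| is exactly the L1 supervised loss, which is the point of the slide: classical losses are recovered from logic.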
Unsupervised Learning
Two groups with exclusive properties, all data in a certain domain:
∀x (A(x) ⊕ B(x)) ∧ D(x)
Inclusive properties:
∀x (A(x) ∨ B(x)) ∧ D(x)
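One possible product-t-norm rendering of these two constraints as penalties is sketched below (the particular translation is an illustrative assumption, not necessarily the slides' choice):

```python
# Penalty versions of the exclusive and inclusive grouping constraints,
# using the product t-norm and the probabilistic sum as disjunction.
def s(a, b):
    return a + b - a * b   # strong disjunction on [0, 1]

def penalty_exclusive(fa, fb, fd):
    # truth of (A xor B) and D, with A xor B = (A and not B) or (B and not A)
    truth = s(fa * (1.0 - fb), fb * (1.0 - fa)) * fd
    return 1.0 - truth     # zero iff the constraint is fully satisfied

def penalty_inclusive(fa, fb, fd):
    return 1.0 - s(fa, fb) * fd

# A point firmly in group A, inside the domain, satisfies both:
assert penalty_exclusive(1.0, 0.0, 1.0) == 0.0
assert penalty_inclusive(1.0, 0.0, 1.0) == 0.0
# A point claimed to be in both groups violates only the exclusive version:
assert penalty_exclusive(1.0, 1.0, 1.0) == 1.0
assert penalty_inclusive(1.0, 1.0, 1.0) == 0.0
```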
REPRESENTATIONAL ISSUES
We look for “the simplest solution” compatible with the constraints.
We use the Lagrangian optimization framework.
A New Communication Protocol
Data + constraints: ∀x Φ(x, f(x)) = 0.
From constraints to loss functions:
Σ_{κ ∈ U} φ²(x_κ, f(x_κ))
A New Communication Protocol
Data + constraints, and learning of constraints: cognitive laws define the learning problem φᵢ(x, f(x)) = 0 over the perceptual space, covering:
• Supervised tasks
• Unsupervised tasks
• Semi-supervised tasks
The New Role of Learning
Cognitive laws:
hair(x) ⇒ mammal(x)
mammal(x) ∧ hoofs(x) ⇒ ungulate(x)
ungulate(x) ∧ white(x) ∧ blackstripes(x) ⇒ zebra(x)

These become penalty functions φᵢ(x, f(x)) = 0 over the perceptual space:
f_hair(x)(1 − f_mammal(x)) = 0
f_mammal(x) f_hoofs(x)(1 − f_ungulate(x)) = 0
f_ungulate(x) f_white(x) f_blackstripes(x)(1 − f_zebra(x)) = 0
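The three penalty functions might be coded as follows (the dict-of-predictors interface is an illustrative assumption):

```python
# Penalty functions for the three cognitive laws, given per-class
# predictors f_p(x) in [0, 1]. Names mirror the slide.
def penalties(f, x):
    return [
        f["hair"](x) * (1 - f["mammal"](x)),
        f["mammal"](x) * f["hoofs"](x) * (1 - f["ungulate"](x)),
        f["ungulate"](x) * f["white"](x) * f["blackstripes"](x)
            * (1 - f["zebra"](x)),
    ]

# A prediction consistent with the rules yields zero penalty everywhere:
zebra_like = {p: (lambda x: 1.0) for p in
              ["hair", "mammal", "hoofs", "ungulate",
               "white", "blackstripes", "zebra"]}
assert penalties(zebra_like, x=None) == [0.0, 0.0, 0.0]
```

During training these terms would simply be added to the data-fitting loss, so the predictors are pushed toward configurations that respect the laws.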
The Marriage of Parsimony Principle and Constraints
Parsimony principle: keep ‖f‖_P small. Constraints turn out to be loss functions: keep the penalty functions
f_hair(x)(1 − f_mammal(x)) = 0
f_mammal(x) f_hoofs(x)(1 − f_ungulate(x)) = 0
f_ungulate(x) f_white(x) f_blackstripes(x)(1 − f_zebra(x)) = 0
as small as possible over the perceptual space.
How to represent the tasks f?
Primal space or dual space (kernel machines) …
Semi-norm in Sobolev Spaces
‖f‖_P is a semi-norm induced by a differential operator on C^∞ functions, under proper boundary conditions …
Parsimony Principle
The solution must be admissible w.r.t. the collection of constraints, either strictly (hard) or partially (soft).
Checking a “new” constraint is inference in the environment!
Inference
Check of a new constraint: C ⊨ Φ.
This faces the intractability coming from formal logic.
Representer Theorem (single constraint; Gnecco et al., 2015)
For a single constraint ψ̃(x, f(x)) = 0, the Euler–Lagrange equation reads
L f⋆ + μ p ∇_f ψ̃ = 0,
so the solution is the convolution of the Green function g of L with the constraint reaction
ω_ψ̃(x) = −μ p(x) ∇_f ψ̃(x, f⋆(x)):
f⋆ = g ∗ ω_ψ̃,
or, in the frequency domain, f̂⋆(ξ) = ĝ(ξ) · ω̂_ψ̃(ξ).
Representation of the solution
Hard constraints: ∀x ∈ X : φᵢ(x, f(x)) = 0, i ∈ ℕ_m, collected as D(f₁, …, f_m) = 0.

Lagrangian approach:
L(f) = ‖f‖²_{P,γ} + Σ_{i=1}^m ∫_X λᵢ(x) · φᵢ(x, f(x)) dx

Euler–Lagrange equations:
L f(x) + Σ_{i=1}^m λᵢ(x) · ∇_f φᵢ(x, f(x)) = 0

Green function, reaction of the constraints, support constraints, a Fredholm equation of the second kind: “the merging of two ideas …”
Lagrange Multipliers and Probability Density
Hard constraints: ∀x ∈ X : φᵢ(x, f(x)) = 0, i ∈ ℕ_m
Soft constraints: …
Parsimony and architectural constraints

minimize  (1/2) Σ_{i∈O} Σ_{j∈H} w²_{ij} + Σ_{κ=1}^ℓ Σ_{j∈H} γ_j |x_{κj}|
subject to  x_{κi} − Σ_{j∈pa(i)} w_{ij} x_{κj} = 0,   i ∈ H ∪ O,  κ = 1, …, ℓ
            1 − x_{κi} y_{κi} ≤ 0,   i ∈ O,  κ = 1, …, ℓ

Lagrangian:
L(w, x, α, λ) = (1/2) Σ_{i∈O} Σ_{j∈H} w²_{ij}
  + Σ_{κ=1}^ℓ ( Σ_m γ_m |x_{κm}| [m ∈ H]
  + Σ_m α_{κm} ( x_{κm} − Σ_{r∈pa(m)} w_{mr} x_{κr} ) [m ∈ H ∪ O]
  + Σ_{i∈O} λ_{κi} (1 − x_{κi} y_{κi}) )