SLIDE 1

Learning theorem proving through self-play

Stanisław Purgał

SLIDE 2

Overview

  • AlphaZero
  • Proving game
  • adjusting MCTS for proving game
  • some results

2019-10 1

SLIDE 3

Neural black box

game state S  →  ( move policy π ∈ R^n , expected outcome v ∈ R )
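
Concretely, the black box is just a function from a game state to a (policy, value) pair. A minimal Python sketch; the state type and the fixed move count are illustrative assumptions, not the talk's actual interface:

```python
from typing import List, Sequence, Tuple

def evaluate(state: Sequence[int]) -> Tuple[List[float], float]:
    """Stand-in for the neural network: maps a game state S to a
    move policy pi (a distribution over n moves) and an expected
    outcome v in [-1, 1]. The uniform output is a placeholder."""
    n = 3  # hypothetical: assume every state has three legal moves
    pi = [1.0 / n] * n
    v = 0.0
    return pi, v
```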

SLIDE 4

Neural black box

(S1, π1, v1), …, (Sn, πn, vn)

SLIDE 5

Monte-Carlo Tree Search

game state S  →  ( move policy π ∈ R^n , expected outcome v ∈ R )

SLIDE 6

Monte-Carlo Tree Search

S → (π, v);  children S1, S2, S3;  v is a weighted average over the children

choose a child according to the formula:

    c · πi · √n / (1 + ni) + vi

where

    c = log((n + cbase + 1) / cbase) + cinit

with cbase = 19652 and cinit = 1.25.
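
The selection rule can be sketched in Python. The score below assumes the standard AlphaZero PUCT form c · πi · √n / (1 + ni) + vi, which matches the cbase/cinit constants on the slide:

```python
import math

def exploration_coeff(n, c_base=19652, c_init=1.25):
    # the coefficient c grows slowly with the total visit count n
    return math.log((n + c_base + 1) / c_base) + c_init

def puct_score(pi_i, v_i, n_i, n):
    # prior-weighted exploration bonus plus the child's value estimate
    return exploration_coeff(n) * pi_i * math.sqrt(n) / (1 + n_i) + v_i

def select_child(children, n):
    # children: list of (prior pi_i, value v_i, visit count n_i) tuples
    return max(range(len(children)),
               key=lambda i: puct_score(*children[i], n))
```

Unvisited children get a large exploration bonus, so the search spreads out before committing to the highest-value move.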

SLIDE 7

Monte-Carlo Tree Search

SLIDE 8

Monte-Carlo Tree Search

SLIDE 9

Why not maximum?

game state S  →  ( move policy π ∈ R^n , expected outcome v ∈ R )

v = t + error

SLIDE 10

Why not maximum?

v1 = t1 + error,  v2 = t2 + error,  v3 = t3 + ERROR

min / max:  v = t + ERROR

SLIDE 11

Why not maximum?

v1 = t1 + error,  v2 = t2 + error,  v3 = t3 + ERROR

average:  v = t + (Σ error) / n
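
A quick numerical illustration (not from the talk): averaging many noisy estimates of the same true value t lets the errors cancel, while taking the max keeps the largest one:

```python
import random
import statistics

random.seed(0)
t = 0.0  # true value
# noisy estimates v_i = t + error_i
vs = [t + random.uniform(-1, 1) for _ in range(1000)]

avg_error = abs(statistics.mean(vs) - t)  # errors mostly cancel
max_error = abs(max(vs) - t)              # dominated by the largest error
```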

SLIDE 12

Closing the loop

  • play lots of games
  • choose moves randomly, according to the MCTS policy
  • use finished games for training:
      • desired value is the result of the game
      • desired policy is the MCTS policy
  • also add noise to the neural network output to increase exploration
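
The steps above can be sketched as a single training game. The game interface (finished, result, play, mcts_policy) is a hypothetical stand-in, not the talk's actual code:

```python
import random

def play_training_game(state, finished, result, play, mcts_policy):
    """Play one game, sampling moves from the MCTS policy, and label
    every recorded (state, policy) pair with the final game result."""
    records = []
    while not finished(state):
        pi = mcts_policy(state)  # improved policy from the tree search
        records.append((state, pi))
        move = random.choices(range(len(pi)), weights=pi)[0]  # sample, don't argmax
        state = play(state, move)
    z = result(state)
    # desired value is the result of the game; desired policy is the MCTS policy
    return [(s, pi, z) for s, pi in records]
```

Every state of a finished game becomes one training sample (S, π, v), feeding the network that the next round of MCTS will query.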

SLIDE 13

Proving game

theorem  →  Prove the theorem  →  win / lose

SLIDE 14

Proving game

Construct a theorem  →  Prove the theorem  →  Adversary wins / Prover wins

SLIDE 15

Prolog-like proving

A ⊢ X    A ⊢ Y
--------------   (1)
  A ⊢ X ∧ Y

holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)   (2)

SLIDE 16

Prolog-like proving

[ X:A ⊢ X ∧ ¬¬X , ... ]

A ⊢ X ∧ Y :- A ⊢ X, A ⊢ Y
X:A ⊢ X ∧ ¬¬X :- X:A ⊢ X, X:A ⊢ ¬¬X

[ X:A ⊢ X , X:A ⊢ ¬¬X , ... ]

SLIDE 17

Prolog-like proving

[ holds(X:A, and(X, not(not(X)))) , ... ]

holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)
holds(X:A, and(X, not(not(X)))) :- holds(X:A, X), holds(X:A, not(not(X)))

[ holds(X:A, X), holds(X:A, not(not(X))) , ... ]

SLIDE 18

Prolog-like theorem constructing

[ holds(X:A, and(X, not(not(X)))) , ... ]

holds(X:A, and(X, not(not(X)))) :- holds(X:A, X), holds(X:A, not(not(X)))
holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)

[ holds(X:A, X), holds(X:A, not(not(X))) , ... ]

bad idea

SLIDE 19

Prolog-like theorem constructing

[ holds(A, ♣) , ... ]

holds(A, ♣) :- holds(A, or(♦, ♥)), holds(A, implies(♦, ♣)), holds(A, implies(♥, ♣))
holds(A, Z) :- holds(A, or(X, Y)), holds(A, implies(X, Z)), holds(A, implies(Y, Z))

[ holds(A, or(♦, ♥)) , holds(A, implies(♦, ♣)) , holds(A, implies(♥, ♣)) , ... ]

bad idea

SLIDE 20

Prolog-like theorem constructing

[ T ]

holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)
holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)

[ holds(A, X), holds(A, Y) ]

SLIDE 21

Prolog-like theorem constructing

T

holds(X:A, and(X, not(not(X))))

holds(x:a, and(x, not(not(x))))

SLIDE 22

Forcing termination of the game

Step limit:

  • ugly extension of the game state
  • strategy may depend on the number of steps left
  • even if we hide it, there is a correlation:
    large term constructed ∼ few steps left ∼ will likely lose

SLIDE 23

Forcing termination of the game

Sudden death chance:

  • game states stay nicely uniform
  • no hard limit on the length of a theorem

During training playout, randomly terminate game with chance pd. In MCTS, adjust value v′ = (−1) · pd + v · (1 − pd).
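
The adjustment is a straightforward mixture of the two outcomes; a minimal sketch:

```python
def sudden_death_value(v, p_d):
    """Adjusted value v' = (-1) * p_d + v * (1 - p_d): with probability p_d
    the game terminates now as a loss, otherwise the estimate v applies."""
    return (-1.0) * p_d + v * (1.0 - p_d)
```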

SLIDE 24

Disadvantages of this game

  • two different players - if one player starts winning every game, we can't learn much
  • proofs use single inference steps - inefficient
  • players don't take turns - MCTS is not designed for that situation

SLIDE 25

Not using maximum

SLIDE 26

Not using maximum

SLIDE 27

Not using maximum

SLIDE 28

Not using maximum

SLIDE 29

Certainty propagation

SLIDE 30

Certainty propagation

SLIDE 31

Certainty propagation

SLIDE 32

Certainty propagation

for uncertain leaves:  v = a = l = −1,  u = 1
for certain leaves:    v = a = l = u = result

recursively:
  v = min(u, max(l, a))
  a = (v_leaf + Σ vi·ni) / (n + 1)    (v_leaf: the node's own leaf estimate)
  l = maxi li
  u = maxi ui

when the player changes:
  • values and bounds flip (change sign)
  • lower and upper bound switch places
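
These rules can be sketched as follows; the field names and the v_leaf term for the node's own estimate are assumptions, not the talk's actual code:

```python
def combine(v_leaf, children):
    """Back up value, average, and certainty bounds from children that are
    already expressed from the current player's perspective."""
    n = sum(c["n"] for c in children)
    a = (v_leaf + sum(c["v"] * c["n"] for c in children)) / (n + 1)
    l = max(c["l"] for c in children)   # best proven lower bound
    u = max(c["u"] for c in children)   # best achievable upper bound
    v = min(u, max(l, a))               # clamp the average to the bounds
    return {"v": v, "a": a, "l": l, "u": u, "n": n + 1}

def flip(c):
    """When the player changes: values and bounds flip sign,
    and the lower and upper bounds switch places."""
    return {"v": -c["v"], "a": -c["a"], "l": -c["u"], "u": -c["l"], "n": c["n"]}
```

The clamp is what propagates certainty: once some child is proven won, the parent's lower bound reaches 1 and its value is pinned there regardless of the noisy average.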

SLIDE 33

Toy problem

ablist([]).
ablist([a|L]) :- ablist(L).
ablist([b|L]) :- ablist(L).
ablist([c|L]) :- ablist(L).
ablist([d|L]) :- ablist(L).

rev3([], L, L).
rev3([H|T], L, Acc) :- rev3(T, L, [H|Acc]).

revablist(L) :- ablist(T), rev3(L, T, []).

SLIDE 34

Toy problem evaluation

ablist([a,b,a,b,a,b,b]),
revablist([]),
revablist([a]),
revablist([b]),
revablist([c,d]),
revablist([c,a,b]),
revablist([a,d,c,b]),
revablist([a,d,c,a,a]),
revablist([a,b,c,d,b,d]),
revablist([d,b,c,a,d,a,b]),
revablist([a,c,b,a,c,a,d,d])

SLIDE 35

Certainty propagation effect

SLIDE 36

Learning the proving game

Like AlphaZero, with a few differences:

  • using a Graph Attention Network
  • for theorems that the prover failed to prove, show the proper path with additional policy training samples
  • during evaluation, use a greedy policy and a step limit instead of sudden death

SLIDE 37

Proving game evaluation

Construct a theorem / evaluation theorem  →  Prove the theorem  →  Adversary wins / Prover wins

SLIDE 38

Learning toy problem

SLIDE 39

Intuitionistic propositional logic

holds([A|T], A).
holds(T, A) :- holds([B|T], A), holds(T, B).
holds([H|T], A) :- holds(T, A).
holds(T, impl(A, B)) :- holds([A|T], B).
holds(T, B) :- holds(T, A), holds(T, impl(A, B)).
holds(T, or(A, B)) :- holds(T, A).
holds(T, or(A, B)) :- holds(T, B).
holds(T, C) :- holds(T, or(A, B)), holds([A|T], C), holds([B|T], C).
holds(T, and(A, B)) :- holds(T, A), holds(T, B).
holds(T, A) :- holds(T, and(A, B)).
holds(T, B) :- holds(T, and(A, B)).
holds([false|T], A).

SLIDE 40

Classical propositional logic

holds([A|T], A).
holds(T, A) :- holds([B|T], A), holds(T, B).
holds([H|T], A) :- holds(T, A).
holds(T, impl(A, B)) :- holds([A|T], B).
holds(T, B) :- holds(T, A), holds(T, impl(A, B)).
holds(T, or(A, B)) :- holds(T, A).
holds(T, or(A, B)) :- holds(T, B).
holds(T, C) :- holds(T, or(A, B)), holds([A|T], C), holds([B|T], C).
holds(T, and(A, B)) :- holds(T, A), holds(T, B).
holds(T, A) :- holds(T, and(A, B)).
holds(T, B) :- holds(T, and(A, B)).
holds([false|T], A).
holds(T, A) :- holds([impl(A, false)|T], false).

SLIDE 41

Learning classical propositional logic

SLIDE 42

Constructed theorem example

(term graph of a constructed theorem: RootNode, holds/2, and/2, impl/2, r/2, constant nodes)

⊢ (((d ∧ b ∧ c) ∨ (b ∧ c ∧ d)) ⇒ b) ∨ e
⊥ ⊢ a ∨ b ∨ c
⊢ ((((a ∧ ⊥ ∧ b) ⇒ c) ⇒ d) ⇒ d)
((a ∧ b) ⇒ a) ⇒ (⊥ ∧ c) ⊢ d
((a ⇒ ⊥) ⇒ b) , c , (a ⇒ b) ⊢ b

SLIDE 43

First-order logic

% some classical logic

neq(var([a|_]), var([b|_])).
neq(var([b|_]), var([a|_])).
neq(var([_|A]), var([_|B])) :- neq(var(A), var(B)).

repl(var(A), R, var(A), R).
repl(var(A), R, var(B), var(B)) :- neq(var(A), var(B)).
repl(var(A), R, op(O, X1, Y1), op(O, X2, Y2)) :- repl(var(A), R, X1, X2), repl(var(A), R, Y1, Y2).
repl(var(A), R, q(O, var(A), P), q(O, var(A), P)).
repl(var(A), R, q(O, var(B), P1), q(O, var(B), P2)) :- neq(var(A), var(B)), repl(var(A), R, P1, P2).
repl(var(A), R, false, false).
repl(var(A), R, [], []).
repl(var(A), R, [H1|T1], [H2|T2]) :- repl(var(A), R, H1, H2), repl(var(A), R, T1, T2).

holds(T, q(forall, var(A), Phi)) :- repl(var(A), var(B), Phi, PhiBA), repl(var(B), false, [Phi|T], [Phi|T]), holds(T, PhiBA).
holds(T, Phi) :- holds(T, q(forall, var(A), PhiA)), repl(var(A), B, PhiA, Phi).
holds(T, q(exists, var(A), Phi)) :- repl(var(A), R, Phi, PhiR), holds(T, PhiR).
holds(T, P) :- holds(T, q(exists, var(A), Phi)), repl(var(B), false, Phi, Phi), repl(var(A), var(B), Phi, PhiB), holds([PhiB|T], P).

SLIDE 44

Future work

  • better rule representation?
  • proper prover with a different construction mechanism?
  • different use cases?
  • more computational power?

SLIDE 45

Thank you for your attention!

Stanisław Purgał