PRISM: an overview
LP connections
- Semantics
- Tabling
- Program synthesis
ML example
[Diagram: PRISM at the intersection of Probability, Logic and Learning]
Major framework in machine learning
- clustering, classification, prediction, smoothing, …
- in bioinformatics, speech/pattern recognition, text processing, robotics, Web analysis, marketing, …
Define p(x,y|θ) or p(x|y,θ)  (x: hidden cause, y: observed effect, θ: parameters)
- by graphs (Bayesian networks, Markov random fields, conditional random fields, …)
- by rules (hidden Markov models, probabilistic context-free grammars, …)
Basic tasks:
- probability computation (NP-hard)
- parameter/structure learning
Graphical models for probabilistic modeling
- Intuitive and popular, but numbers only: no structured data, no variables, no relations ⇒ complex modeling is difficult
More expressive formalisms (1990s~)
- PLL (probabilistic logic learning)
{ILP, MRDM}+probability, probabilistic abduction
- SRL (statistical relational learning)
{BNs, MRFs} + relations
Many proposals (alphabet soup)
- Generative: p(x,y|θ), hidden x generates observation y
- Discriminative: p(x|y,θ)
A generative model defines a generation process for an output in a sample space
- Bayesian approach such as LDA
  prior distribution p(θ|α), likelihood p(D|θ), data D
  Given D, predict x by p(x|D,α) = ∫ p(x|θ) p(θ|D,α) dθ
- Probabilistic grammars such as PCFGs
  Rules are chosen probabilistically in the derivation
  Prob. of sentence s: P(s) = Σ_{τ: parse tree of s} p(τ), where p(τ) is the product of the probabilities of the rules used in τ
Defining distributions by (logic) programs (in PLL)
- PHA [Poole '93], PRISM [Sato et al. '95, '97], SLPs [Muggleton '96, Cussens '01], P-log [Baral et al. '04], LPAD [Vennekens et al. '04], ProbLog [De Raedt et al. '07], …
A probabilistic extension of Prolog
- A Turing machine with statistically learnable state transitions
Syntax: Prolog + msw/2 (random choice)
- Variables, terms, predicates, etc. available for probabilistic modeling
Semantics: distribution semantics
- A program DB defines a probability measure P_DB(·) on least Herbrand models
Pragmatics: (very) high-level modeling language
- Just describe probabilistic models declaratively
Implementation:
- B-Prolog (tabled search) + parameter learning (EM, VB-EM)
- Single data structure: explanation graphs, dynamic programming
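To make the syntax concrete, here is a minimal sketch (a hypothetical two-coin program, not from the talk; values/2, set_sw/2 and prob/2 are standard PRISM built-ins):

   values(coin(_),[head,tail]).      % outcome space of each coin switch

   two_heads :- msw(coin(a),head), msw(coin(b),head).   % msw/2 makes the random choice

   % ?- set_sw(coin(a),[0.6,0.4]).   % set the parameters θ (illustrative values)
   % ?- set_sw(coin(b),[0.5,0.5]).
   % ?- prob(two_heads,P).           % P = 0.6 * 0.5 = 0.3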
1995  Distribution semantics (formal semantics)
1997  PRISM (EM learning)
2003  Prism 1.6 (tabled search, linear tabling)
2004  Prism 1.8 (negation, negative goals)
2006  Prism 1.9 (belief propagation, BNs subsumed)
2007  Prism 1.11 (variational Bayes, Bayesian approach)
2009  Prism 1.12 (modeling environment, ease of modeling; Gaussian, log-linear, BDD, …)
PRISM subsumes three representative generative models, PCFGs, HMMs and BNs (and their Bayesian versions), and computes/learns them uniformly by one generic algorithm:
- HMMs: FB (forward-backward) algorithm
- PCFGs: IO (inside-outside) algorithm
- BNs: BP (belief propagation)
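As an illustration of the HMM case, a sketch of a two-state HMM in PRISM (switch names and the fixed string length 3 are illustrative, loosely following the style of the PRISM manual):

   values(init,[s0,s1]).     % initial state
   values(tr(_),[s0,s1]).    % state transition from each state
   values(out(_),[a,b]).     % emission from each state

   hmm(Os) :- msw(init,S), hmm(1,S,Os).
   hmm(T,_,[]) :- T>3,!.                 % strings of fixed length 3
   hmm(T,S,[O|Os]) :-
      T=<3,
      msw(out(S),O), msw(tr(S),Next),
      T1 is T+1, hmm(T1,Next,Os).

   % ?- prob(hmm([a,b,a]),P).   % tabled search gives forward-backward behavior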
[Figure: blood type pedigree. The child inherits one ABO gene from the father and one from the mother; e.g. genes a and b give blood type AB.]
P_msw(msw(abo,a)=1) = θ_(abo,a) = 0.3, …   (parameters)
P_DB(msw(abo,a)=x1, msw(abo,b)=x2, msw(abo,o)=x3, btype(a)=y1, btype(b)=y2, btype(ab)=y3, btype(o)=y4)
P_DB(btype(a)=1) = 0.4   (parameter learning goes in the inverse direction)
values(abo,[a,b,o]).    % outcome space of the probabilistic switch (needed to run; elided on the slide)

btype(X) :- gtype(Gf,Gm), pg_table(X,[Gf,Gm]).
pg_table(X,GT) :-
   ( (X=a ; X=b), (GT=[X,o] ; GT=[o,X] ; GT=[X,X])
   ; X=o,  GT=[o,o]
   ; X=ab, (GT=[a,b] ; GT=[b,a]) ).
gtype(Gf,Gm) :- msw(abo,Gf), msw(abo,Gm).   % msw/2: probabilistic choice of each gene
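A hedged usage sketch for this program (parameter values are illustrative; set_sw/2 and prob/2 are standard PRISM built-ins):

   ?- set_sw(abo,[0.3,0.2,0.5]).   % θ for genes a, b, o
   ?- prob(btype(a),P).            % P = 0.3*0.5 + 0.5*0.3 + 0.3*0.3 = 0.39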
Distribution semantics
Possible world semantics:
For a closed formula α, p(α) is the sum of the probabilities of the possible worlds M that make α true
- p_M(α) = 1 if M |= α, and 0 otherwise
When α has a free variable x, p_M(α) is the ratio of individuals in M satisfying α
DB = F ∪ R
- F: set of ground msw/2 atoms
  = { msw(abo,a), msw(abo,o), … }
- R: set of definite clauses, with msw/2 allowed only in the body
  = { btype(X) :- gtype(Gf,Gm), pg_table(X,[Gf,Gm]), … }
- P_F(·): infinite product of finite distributions on the msws
We extend P_F(·) to P_DB(·), a probability measure over Herbrand interpretations for DB, using the least model semantics and Kolmogorov's extension theorem:
- F' ~ P_F: ground msw atoms sampled from P_F(·)
- M(R ∪ F'): the least Herbrand model for R ∪ F' (it always exists), an (infinite) random vector taking Herbrand interpretations as values
- P_DB(·): the probability measure over such Herbrand interpretations induced by M(R ∪ F')
Example: DB = R ∪ F where R = { a :- b, a :- c } and F = { b, c }, with P_F(b,c) given.
Sample (b,c) ~ P_F(·,·) | Sampled DB' = R ∪ F' | Least Herbrand model | P_DB(a,b,c)
(0,0)                   | a:-b, a:-c           | {}                   | P_DB(0,0,0) = P_F(0,0)
(0,1)                   | a:-b, a:-c, c        | {c,a}                | P_DB(1,0,1) = P_F(0,1)
(1,0)                   | a:-b, a:-c, b        | {b,a}                | P_DB(1,1,0) = P_F(1,0)
(1,1)                   | a:-b, a:-c, b, c     | {b,c,a}              | P_DB(1,1,1) = P_F(1,1)
P_DB of anything else = 0
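For instance, P_DB(a=1) = P_F(0,1) + P_F(1,0) + P_F(1,1); if b and c are sampled independently with probabilities θ_b and θ_c, this equals θ_b + θ_c − θ_b·θ_c.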
Unconditionally definable
- Arbitrary definite program allowed (even a:- a)
- No syntactic restriction (such as acyclic, range-restricted)
Infinite domain
- Countably many constant/function/predicate symbols
- Infinite Herbrand universe allowed
Infinite joint distribution (prob. measure)
- Not a distribution on infinitely many ground atoms
- Countably many i.i.d. ground atoms available ⇒ recursion and PCFGs possible
Parameterized with LP semantics
- Currently the least model semantics is used
- The greatest model semantics, three-valued semantics, … are possible alternatives
Tabling
P_DB(iff(DB)) = 1 holds in our semantics
We rewrite a goal G by SLD derivation into an equivalent random boolean formula
   G ⇔ E1 ∨ … ∨ EN, where each Ei = msw1 ∧ … ∧ mswk
Assume the exclusiveness of the Ei's; then
   P_DB(G) = P_DB(E1) + … + P_DB(EN) and P_DB(Ei) = P_DB(msw1) ⋯ P_DB(mswk)
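For example, in the blood type program, ?- btype(a) has the three exclusive explanations GT = [a,o], [o,a] and [a,a], so P_DB(btype(a)) = θ_(abo,a)·θ_(abo,o) + θ_(abo,o)·θ_(abo,a) + θ_(abo,a)².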
Simple, but exponential in the number of explanations ⇒ tabling
[Figure: explanation graph for P_DB(btype(a))]
All-solution search for ?- btype(a) with tabling of btype/1 and gtype/2 yields AND/OR boolean formulas (the explanation graph).
PRISM uses linear tabling (Zhou et al. '08)
- single-threaded (not a suspend/resume scheme)
- iteratively computes all answers by backtracking for each top-most looping subgoal
Looping subgoals
- If … :- A,B and … :- A',C occur in a derivation and A, A' are variants, they are looping subgoals
- If A has no ancestor in any loop containing A, it is the top-most looping subgoal
[Figure: SLD tree with a chain of looping subgoals :-p, :-q, :-r, :-q, :-p]
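A minimal sketch of tabling in B-Prolog (an illustrative transitive-closure program, not from the talk):

   :- table path/2.

   path(X,Y) :- path(X,Z), edge(Z,Y).   % calls a variant of itself: a looping subgoal
   path(X,Y) :- edge(X,Y).

   edge(a,b). edge(b,c).

   % ?- path(a,X).   % linear tabling iterates to the fixpoint: X = b ; X = c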
Thanks to tabling, PRISM's prob. computation is as
efficient as the existing model-specific algorithms
Model family                        | EM algorithm                 | Time complexity
Hidden Markov models                | Baum-Welch algorithm         | O(N^2 L)  (N: number of states, L: max. length of sequences)
Probabilistic context-free grammars | Inside-outside algorithm     | O(N^3 L^3)  (N: number of nonterminals, L: max. length of sentences)
Singly-connected Bayesian networks  | EM based on π-λ computation  | O(N)  (N: number of nodes)
BP (belief propagation) is an instance of PRISM's general probability computation scheme (Sato '07)
s(X,[]) :- np(X,Y), vp(Y,[]).

np(X,Z) :-
   msw(np,RHS),
   ( RHS=[np,pp], np(X,Y), pp(Y,Z)
   ; RHS=[ears],  X=[ears|Z]
   ; … ).

pp(X,Z) :- p(X,Y), np(Y,Z).

vp(X,Z) :-
   msw(vp,RHS),                      % switch for the VP rules
   ( RHS=[vp,pp], vp(X,Y), pp(Y,Z)
   ; RHS=[v,np],  v(X,Y), np(Y,Z) ).

v(X,Y) :-
   msw(v,RHS),
   ( RHS=[see], X=[see|Y] ; RHS=[saw], X=[saw|Y] ).

p(X,Y) :-
   msw(p,RHS),
   ( RHS=[in], X=[in|Y] ; RHS=[at], X=[at|Y] ; RHS=[with], X=[with|Y] ).

values_x(np, [[np,pp],[ears],…], [0.1,0.2,…]).
values_x(vp, [[vp,pp],[v,np]],   […]).          % assumed declaration, elided on the slide
values_x(v,  [[see],[saw]],      [0.5,0.5]).
values_x(p,  [[in],[at],[with]], [0.3,0.4,0.3]).
S  → NP VP  (1.0)
NP → NP PP  (0.2) | cars (0.1) | stars (0.2) | telescopes (0.3) | astronomers (0.2)
PP → P NP   (1.0)
V  → see (0.5) | saw (0.5)
P  → in (0.3) | at (0.4) | with (0.3)
- compact
- readable
Parsing with 20,000 CFG rules extracted from 49,000 (POS-tagged) sentences in the WSJ portion of the Penn treebank, with uniform probabilities. 20 randomly selected sentences are used for the average probability computation (on the left) and Viterbi parsing (on the right).
Program synthesis
Agreement of number (A = singular or plural)
- The observable distribution is a conditional one
- Parameters are learnable by FAM (Cussens '01), but it requires a failure program

   agree(A) :- msw(subj,A), msw(verb,B), A=B.

A and B are randomly chosen; agree(A) succeeds only when A=B, otherwise it fails:
   P(agree(A) | ∃X agree(X)) = P(msw(subj,A)) P(msw(verb,A)) / P(∃X agree(X))
   P(∃X agree(X)) = Σ_{A=sg,pl} P(msw(subj,A)) P(msw(verb,A))
A failure program for agree/1, "failure ⇔ not(∃X agree(X))", expresses how ?- agree(X) probabilistically fails.
PRISM uses FOC (first-order compiler) to automatically synthesize failure programs (negation elimination).
failure :- msw(subj,A), msw(verb,B), \+ A=B.
agree(A) :- msw(subj,A), msw(verb,B), A=B.
FOC automatically eliminates negation from the source program using continuations (Sato '89)
The compiled program DBc positively computes the finite failure set of DB
[Figure: the Herbrand base HB partitioned into M(DB) and M(DBc)]
If DBc is terminating, failure = negation and M(DBc) = HB − M(DB)
Source program DBeven:
   even(0).
   even(s(X)) :- not(even(X)).

Compiled program DBc:
   even(0).
   even(s(A)) :- evenc(A,f0).
   evenc(s(A),_) :- even(A).
Automated construction/evaluation of probabilistic classifiers
- Modeling is part of machine learning; post-processing is as troublesome as modeling
- PRISM 1.12 provides facilities that ease model evaluation and make your code much shorter
'votes' dataset from the UCI ML repository; classifier: naive Bayes
- Many missing values in the dataset ⇒ we use (VB-)EM
- From known vote records, we classify unknown votes as republican or democrat using 16 yes/no features
- We perform 5-fold cross-validation
Basic:
- Sampling: for a given goal G, return an answer substitution s with probability P_θ(Gs)
- Probability computation: for a given goal G, compute P_θ(G)
- Viterbi computation: for a given goal G, find the most probable explanation E* = argmax_{E ∈ ψ(G)} P_θ(E), where ψ(G) is the set of possible explanations for G
- Hindsight computation: for a given goal G, compute P_θ(G') or P_θ(G'|G), where G' is a subgoal of G
- EM learning: given a bag {G1, G2, ..., GT} of goals, estimate the parameters θ that maximize the likelihood Π_t P_θ(Gt)
(query sketches for these tasks follow the Advanced list below)
Advanced:
- Handling failures in the generation process (version 1.8)
- Model selection (version 1.10)
- Variational Bayesian learning (version 1.11)
- Top-N Viterbi computation (version 1.11)
- Data-parallel EM learning (version 1.11)
- Deterministic annealing EM algorithm (version 1.11)
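The basic tasks map to queries roughly as follows (a sketch; sample/1, prob/2, viterbig/1 and learn/1 are PRISM built-ins, and hmm/1 is the illustrative model sketched earlier):

   ?- sample(hmm(Os)).                       % sampling: draw Os from P_θ
   ?- prob(hmm([a,b,a]),P).                  % probability computation
   ?- viterbig(hmm([a,b,a])).                % Viterbi: most probable explanation
   ?- learn([hmm([a,b,a]),hmm([b,b,a])]).    % EM learning from a bag of goals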
republican, n,y,n,y,y,y,n,n,n,y,?,y,y,y,n,y
republican, n,y,n,y,y,y,n,n,n,n,n,y,y,y,n,?
democrat, ?,y,y,?,y,y,n,n,n,n,y,n,y,y,n,n
democrat, n,y,y,n,?,y,n,n,n,n,y,n,y,n,n,y
democrat, y,y,y,n,y,y,n,n,n,n,y,?,y,y,y,y
democrat, n,y,y,n,y,y,n,n,n,n,n,n,y,y,y,y
democrat, n,y,n,y,y,y,n,n,n,n,n,n,?,y,y,y
republican, n,y,n,y,y,y,n,n,n,n,n,n,y,y,?,y
… …
republican, n,y,n,y,y,y,n,n,n,y,n,y,y,y,?,n
[Figure: naive Bayes network. Class node C points to feature nodes V1, …, V16 (one vote record); 435 records in total, and the task is to predict C from the 16 features.]

Naive Bayes: P(C,V1,…,V16) = P(C) P(V1|C) ⋯ P(V16|C), where C ∈ {republican, democrat} and Vi ∈ {y, n}
- Learn P(V1|C), …, P(V16|C) from data
- Predict C for an unknown V1,…,V16 by C = argmax_c P(c|V1,…,V16)
- Estimate precision by cross-validation
values(class,[democrat,republican]).   % class labels
values(attr(_,_),[y,n]).               % all attributes have two values: y or n

nbayes(C,Vals):- msw(class,C), nbayes(1,C,Vals).
nbayes(_,_,[]):- !.
nbayes(J,C,[V|Vals]):-
   choose(J,C,V),
   J1 is J+1,!,                        % cut is ok
   nbayes(J1,C,Vals).
choose(J,C,V):-
   ( nonvar(V) -> msw(attr(J,C),V) ; msw(attr(J,C),_) ).

%%%% Utilities
vote_learn:- load_data_file(Gs), learn(Gs).

%% Batch routine for N-fold cross validation
vote_cv(N):-
   random_set_seed(81729),
   load_data_file(Gs0),                % Load the entire data
   random_shuffle(Gs0,Gs),             % Randomly reorder the data
   numlist(1,N,Js),                    % Get Js = [1,...,N] (B-Prolog built-in)
   maplist(J,Rate,vote_cv(Gs,J,N,Rate),Js,Rates),
   avglist(Rates,AvgRate),             % Get the avg. of the precisions
   maplist(J,Rate,format("Test #~d: ~2f%~n",[J,Rate*100]),Js,Rates),
   format("Average: ~2f%~n",[AvgRate*100]).

%% Subroutine for learning and testing for the J-th split data (J = 1...N)
vote_cv(Gs,J,N,Rate):-
   format("<<<< Test #~d >>>>~n",[J]),
   separate_data(Gs,J,N,Gs0,Gs1),
   learn(Gs0),
   maplist(nbayes(C,Vs),R,
           (viterbig(nbayes(C0,Vs)),(C0==C->R=1;R=0)),Gs1,Rs),
   avglist(Rs,Rate),
   format("Done (~2f%).~n~n",[Rate*100]).

separate_data(Data,J,N,Learn,Test):-
   length(Data,L),
   L0 is L*(J-1)//N,                   % L0: offset of the test data (// - integer division)
   L1 is L*(J-0)//N-L0,                % L1: size of the test data
   splitlist(Learn0,Rest,Data,L0),     % Length of Learn0 = L0
   splitlist(Test,Learn1,Rest,L1),     % Length of Test = L1
   append(Learn0,Learn1,Learn).

load_data_file(Gs):-
   load_csv('UCI/house-votes-84.data',Gs0,[missing('?')]),
   % '?' in the data will be converted into an anonymous variable (_)
   maplist(csvrow([C|Vs]),nbayes(C,Vs),true,Gs0,Gs).
The first block (values/2, nbayes/2,3 and choose/3) is the modeling part; everything from the utilities on is the utility part.
Let PRISM automatically estimate the precision of the model by cross-validation, and paste it into the submitted paper!
Logic and probability have been cross-fertilizing each other, in particular in PLL/SRL
Their integration can make a powerful probabilistic
modeling language with rigorous semantics
In PRISM
- the user encodes a probabilistic model as a program DB at the predicate level, using variables and relations
- DB uniquely defines a prob. measure
- The remaining tasks (probability computation, parameter learning, etc.) are automatically carried out by the PRISM system