Unifying Logic, Dynamics and Probability: Foundations, Algorithms and Challenges (PowerPoint PPT Presentation)


SLIDE 1

Unifying Logic, Dynamics and Probability: Foundations, Algorithms and Challenges

Vaishak Belle, University of Edinburgh

SLIDE 2

About this tutorial

pages.vaishakbelle.com/logprob

  • On unifying logic and probability
  • Slides online at end of tutorial

SLIDE 3

Scope

  • Symbolic analogues for graphical models*
  • First-order logics for probability
  • Overview of formal constructions and some algorithmic ideas

* Discussion limited to inference (i.e., no learning)

SLIDE 4

Structure of tutorial

  • Motivation
  • Logical formulations of graphical models
  • Algorithms
  • Logics for reasoning about probability
  • Reduction theorems
  • Outlook and Open Questions

SLIDE 5

Motivation

SLIDE 6

2 disciplines in AI

Symbolic AI: propositional logic, first-order logic; SAT, theorem proving
Statistical AI: Bayesian networks, Markov networks; sampling, filtering

SLIDE 7

Complementary views of AI

Logic and probability are sometimes seen as complementary ways of representing and reasoning about the world:

  • Symbolic vs numeric
  • Qualitative vs quantitative
  • Relations, objects vs random variables

Clearly, we often need features of both worlds!

SLIDE 8

The expressivity argument

Encoding of rules of chess:

  • 1 page in FOL
  • pages in PL (~ graphical models)

Encoding of Alice's knowledge: Hand a card each to A ( ) and B

  • in PL
  • in FOL

SLIDE 9

The database argument

Data is often stored in (relational) databases, in industry and scientific fields alike. This data is almost always probabilistic (NELL, Probase, Knowledge Vault):

  • Relations extracted by mining and learning from unstructured texts
  • Probabilities on tuples approximate confidence

SLIDE 10

The implicit beliefs argument

It is impossible to store the implicit beliefs of an explicit representation in a logically complete manner

  • E.g., from , we get , etc.
  • Verifying behavioural properties of the system's design (safety, liveness)

SLIDE 11

Top-down vs bottom-up

Gary Marcus: "To get computers to think like humans, we need a new A.I. paradigm, one that places 'top down' and 'bottom up' knowledge on equal footing. Bottom-up knowledge is the kind of raw information we get directly from our senses, like patterns of light falling on our retina. Top-down knowledge comprises cognitive models of the world and how it works."

SLIDE 12

So, useful to combine, but how?

  • Theorists such as Boole and de Finetti discussed the connections between logic and probability, and considered probability to be part of logic
  • Heavy development in the last 70 years, starting with Gaifman, and lots of exciting work in AI
  • Key idea: for , accord weights to

SLIDE 13

A spectrum (roughly speaking)

  • Relational BNs, MLNs, WMC: symbolic (propositional) languages
  • BLOG, probabilistic logic programming: some first-order features
  • Languages for logic, probability, action: generalised measures, undecidable in general

SLIDE 14

Logical formulations of graphical models

SLIDE 15

Bayesian network

Construct a propositional theory, e.g.:
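A minimal Python sketch of this construction, assuming a hypothetical two-node network A -> B with P(a) = 0.3, P(b|a) = 0.9, P(b|~a) = 0.2 (the slide's own example is not reproduced in this transcript); the parameter atoms p1, p2 stand for the CPT entries of B:

from itertools import product

# Theory: a -> (b <-> p1), ~a -> (b <-> p2); p1, p2 are parameter atoms.
def theory(m):
    a, b, p1, p2 = m["a"], m["b"], m["p1"], m["p2"]
    return (not a or (b == p1)) and (a or (b == p2))

# Literal weights: (weight if atom is False, weight if atom is True).
w = {"a": (0.7, 0.3), "b": (1.0, 1.0), "p1": (0.1, 0.9), "p2": (0.8, 0.2)}

# Sum the weights of satisfying assignments (weighted model counting,
# introduced later in the tutorial), optionally under a constraint.
def wmc(constraint=lambda m: True):
    total = 0.0
    for vals in product([False, True], repeat=4):
        m = dict(zip(["a", "b", "p1", "p2"], vals))
        if theory(m) and constraint(m):
            weight = 1.0
            for atom, val in m.items():
                weight *= w[atom][val]
            total += weight
    return total

print(wmc())                  # 1.0 (up to floating point): encoding is normalised
print(wmc(lambda m: m["b"]))  # 0.41 = P(b) = 0.3*0.9 + 0.7*0.2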

SLIDE 16

Probabilistic relational languages

Finite-domain FOL + graphical models for compact codification; these have natural logical encodings:

  • Atoms = random variables
  • Weights on formulas = potentials on cliques (sketched below)
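As a hedged illustration of atoms-as-random-variables and weights-as-potentials, here is a Markov-logic-style sketch in Python; the rule, its weight 1.5 and the domain {A, B} are invented for illustration, not taken from the slides:

from itertools import product
from math import exp

domain = ["A", "B"]
w_rule = 1.5  # weight of the rule: Friends(x,y) & Smokes(x) -> Smokes(y)

# Count satisfied groundings of the rule in a world (smokes, friends).
def rule_groundings(world):
    smokes, friends = world
    n = 0
    for x, y in product(domain, repeat=2):
        if not (friends[(x, y)] and smokes[x]) or smokes[y]:
            n += 1
    return n

# Unnormalised potential of a world: exp(weight * #satisfied groundings).
def weight(world):
    return exp(w_rule * rule_groundings(world))

# Enumerate all worlds: 2 Smokes atoms and 4 Friends atoms, all Boolean.
worlds = []
for s_vals in product([False, True], repeat=2):
    smokes = dict(zip(domain, s_vals))
    for f_vals in product([False, True], repeat=4):
        friends = dict(zip(list(product(domain, repeat=2)), f_vals))
        worlds.append((smokes, friends))

Z = sum(weight(w) for w in worlds)  # partition function
p_smokes_A = sum(weight(w) for w in worlds if w[0]["A"]) / Z
print(p_smokes_A)  # marginal probability that A smokes in this toy model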

SLIDE 17

Algorithms

SLIDE 18

Computing probabilities

Logical constraints = zero-probability regions

How do we design effective algorithms that embody implicit beliefs and deterministic reasoning? Weighted model counting.

SLIDE 19

SAT, #SAT and WMC

Given propositional CNF :

  • SAT: find a model
  • #SAT: count models; here 5: (0,1,0), (0,1,1), (1,1,0), (1,1,1), (1,0,0)
  • WMC: sum the weights of the models (see the sketch below)
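A brute-force Python sketch of all three tasks; the slide's CNF itself is not shown, so the formula below is a hypothetical one chosen to be consistent with the five listed models over (p, q, r), and the literal weights are illustrative:

from itertools import product
from math import prod

# Hypothetical CNF: (p or q) and (~p or q or ~r), as (var, sign) literals.
clauses = [[(0, True), (1, True)], [(0, False), (1, True), (2, False)]]

def satisfies(model, clauses):
    # A clause is satisfied if some literal matches the model.
    return all(any(model[v] == sign for v, sign in clause) for clause in clauses)

models = [m for m in product([False, True], repeat=3) if satisfies(m, clauses)]

print(len(models) > 0)  # SAT: a model exists
print(len(models))      # #SAT: 5 models, matching the slide's count

# WMC: per-variable literal weights (weight if False, weight if True).
w = [(0.4, 0.6), (0.3, 0.7), (0.5, 0.5)]
print(sum(prod(w[v][val] for v, val in enumerate(m)) for m in models))  # ~0.79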

SLIDE 20

An illustration

Two models: M1 = {p, q} with weight .4, and M2 = {p, !q} with weight .6
SLIDE 21

WMC for BNs

Suppose encodes a BN. Then axioms of probability and BN conditional independence properties carry over to the encoding: correctness preserving.
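The slide's formula is not reproduced in this transcript; for reference, the standard reduction from the WMC literature (e.g., Chavira and Darwiche) reads:

\Pr(q \mid e) = \frac{\mathrm{WMC}(\Delta \wedge q \wedge e)}{\mathrm{WMC}(\Delta \wedge e)}

where \Delta is the propositional encoding of the BN, q the query and e the evidence.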

SLIDE 22

Advanced topics

  • Caching sub-formula evaluations: component caching
  • Continuous distributions: weighted model integration
  • Exploiting symmetry over predicate instances: lifted reasoning
  • Learning weights and theories
  • Modeling dynamic BNs

SLIDE 23

Is this enough? Do we need to go beyond symbolic analogues?

SLIDE 24

The John McCarthy critique

  • 1. It is not clear how to attach probabilities to statements containing quantifiers in a way that corresponds to the amount of conviction people have.
  • 2. The information necessary to assign numerical probabilities is not ordinarily available. Therefore, a formalism that required numerical probabilities would be epistemologically inadequate.

SLIDE 25

The role of probabilities

When are probabilities appropriate? Which events are to be assigned probabilities, and how to assign them? Sometimes we need to say: the probability of E is less than F, and at least twice the probability of G.

A basket contains an unknown number of fruits (bananas, oranges). We draw some fruits, observing the type of each and replacing it. We cannot tell fruits of the same type apart. How many fruits are in the basket?

SLIDE 26

Long-lived systems

When it comes to reasoners and learners that potentially run forever, we need to be able to reason about probabilistic events in a more general way. For example, we may need to compare the probabilities of hypothetical outcomes, or analyse the behaviour of non-terminating probabilistic programs.

A general first-order logical language as a mathematical framework, with "reduction theorems."

SLIDE 27

Logics for reasoning about probabilities

SLIDE 28

Semantical setup

Existing case: a finite set of finite static structures, a distribution on them

(Figure: models M1 = {p, q} and M2 = {p, !q}, with weights .4 and .6)

SLIDE 29

Semantical setup contd.

The general case: an infinite set of infinite dynamic structures, infinitely many distributions on them

(Figure: infinitely many distributions, with weights such as .2, .4, .8 and .6, over dynamic worlds like {p, q, …} and {!p, !q, …} evolving under action sequences of a and b)

SLIDE 30

Language and semantics

  • Predicates of every arity, connectives
  • Constants made of object and action terms
  • For every wff
  • Ground atoms: constants applied to predicates
  • Worlds: Interps X ActSeq
  • Belief state: set of distributions on worlds (sketched below)
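A hedged Python sketch of these semantic objects, with finitely many atoms, action sequences, worlds and distributions purely for illustration (in the logic itself, all of these may be infinite):

atoms = ["p", "q"]
action_seqs = [(), ("a",), ("a", "b")]  # finite fragment of ActSeq

# A world fixes a truth value for every ground atom after every action sequence.
world1 = {(atom, seq): True for atom in atoms for seq in action_seqs}
world2 = dict(world1)
world2[("p", ("a",))] = False  # worlds may disagree after actions

# A distribution over worlds; a belief state is a set of such distributions.
dist1 = [(world1, 0.4), (world2, 0.6)]
dist2 = [(world1, 0.8), (world2, 0.2)]
belief_state = [dist1, dist2]  # more than one distribution: imprecise belief

# Degree of belief in `atom` holding after `seq`, under one distribution.
def degree_of_belief(dist, atom, seq):
    return sum(wt for w, wt in dist if w[(atom, seq)])

probs = [degree_of_belief(d, "p", ("a",)) for d in belief_state]
print(min(probs), max(probs))  # 0.4 0.8: lower and upper degrees of belief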

SLIDE 31

Properties (i.e., validities)

  • if , then

SLIDE 32

Using the language

A probabilistic Situation Calculus, as an action theory. Example: decrementing a counter

SLIDE 33

Regression

Because

SLIDE 34

Properties of regression

  • Subsumes products of Gaussians
  • Distribution transformations
  • Works for arbitrarily complex discrete-continuous noise models

Closed-form solution!

SLIDE 35

Implementation in action+

github.com/vaishakbelle/PREGO

> (regr (<= c 7) ((see 5) (dec -2)))
'(/ (INTEGRATE (c z)
      (* (UNIFORM c 2 12) (GAUSSIAN z -2 1.0) (GAUSSIAN 5 c 4.0)
         (if (<= (max 0 (- c z)) 7) 1.0 0.0)))
    (INTEGRATE (c w)
      (* (UNIFORM c 2 12) (GAUSSIAN w -2 1.0) (GAUSSIAN 5 c 4.0))))
> (eval (<= c 7) ((see 5) (dec -2)))
0.47595449413426844

+ Continuous counter, noisy action, unique distribution assumption

SLIDE 36

Advanced topics

  • Handling non-unique distributions (e.g., via linear programming as upper and lower measures, credal networks)

  • Progressing belief states

SLIDE 37

Action-centric probabilistic programming

Primitive programs = actions

prim
(begin prog1 ... progn)
(if prog1 prog2)
(let ((var1 term1) ... (varn termn)) prog)
(until form prog)

github.com/vaishakbelle/ALLEGRO

Samples worlds and updates them
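A hedged Python sketch of that sampling idea, loosely mirroring the PREGO transcript above (uniform prior on the counter c, noisy sensing, noisy decrement); this particle filter is illustrative and is not ALLEGRO's actual implementation or API, and the sensor parameters are invented:

import random
from math import exp, pi, sqrt

def gaussian_pdf(x, mu, sigma):
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

random.seed(0)
# Sample worlds: particles (value of counter c, weight), c ~ UNIFORM(2, 12).
particles = [(random.uniform(2, 12), 1.0) for _ in range(10000)]

# Sensing 5: reweight each world by the sensor likelihood (illustrative sd 2.0).
particles = [(c, w * gaussian_pdf(5.0, c, 2.0)) for c, w in particles]

# Noisy decrement: move each world by a sampled amount z ~ GAUSSIAN(-2, 1.0),
# clamping at 0 as in the regression formula above.
particles = [(max(0.0, c - random.gauss(-2, 1.0)), w) for c, w in particles]

# Query: weighted fraction of worlds where c <= 7.
total = sum(w for _, w in particles)
print(sum(w for c, w in particles if c <= 7) / total)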

SLIDE 38


A simple program

(until (> (pr (and (>= c 2) (<= c 6))) .8)
  (until (> (conf c .4) .8) (see))
  (let ((diff (- (exp c) 4)))
    (dec diff)))

> (online-do prog)
Execute action: (see)
Enter sensed value: 4.1
Enter sensed value: 3.4
Execute action: (dec 1.0)
...
> (pr (and (>= c 2) (<= c 6)))
0.8094620133032484

SLIDE 39

Outlook

SLIDE 40

Spectrum revisited

Formal languages are increasingly becoming popular for declaratively designing ML pipelines. But logic provides access to truth-theoretic reasoning, deterministic constraints. Discussed 2 extremes:

  • Symbolic analogue to graphical models
  • General framework for reasoning about probabilities

SLIDE 41

Outlook for symbolic analogues

State-of-the-art in some regards

  • 1. How closely can we tie learning techniques to mainstream ML methods (e.g., deep nets), perhaps via a graphical model formulation?
  • 2. How can we leverage symbolic + logical aspects to address explainability + commonsensical reasoning?

SLIDE 42

Outlook for general frameworks

Often builds on and extends mainstream (logical) KR languages

  • 1. What type of semantic justification is needed to incorporate them in ML pipelines (e.g., open-world assumption)?
  • 2. Approaches to correctness often shy away from rigorous logical formulation: when are less rigorous formulations not acceptable?
  • 3. Tractability vs expressivity

SLIDE 43

Meeting midway

Begin with propositional formalisms and extend them to handle "first-order" features, e.g.:

  • BLOG, which allows Skolem constants
  • Open-world probabilistic databases
  • Model counting with infinite domains
  • Model counting with existentials and function symbols

SLIDE 44

Conclusions

Many ways of tying logic and probability together

  • Can provide a declarative backbone to non-trivial ML pipelines
  • Full-blown probabilistic logics provide the right language for expressing probabilistic information
  • expressive enough to subsume, e.g., graphical models
  • informationally rich: allows probabilistic information that is partial, imprecise, acquired incrementally

SLIDE 45

Reference

Much of the material for this tutorial can be found via the summary paper:

  • V. Belle. Logic meets Probability: Towards Explainable AI Systems for Uncertain Worlds. IJCAI 2017.

For related work, the history of the development of unifying logic and probability, and technical results from the tutorial, see the references therein.
