
slide-1
SLIDE 1

Logic or probability? An ERP study of defeasible reasoning

Michiel van Lambalgen ILLC/Dept of Philosophy University of Amsterdam

slide-2
SLIDE 2

As a modelling tool in cognitive science, logic is on the back foot:

[M]uch of our reasoning with conditionals is uncertain, and may be overturned by future information; that is, they are non-monotonic. But logic based approaches to inference are typically monotonic, and hence are unable to deal with this uncertainty. Moreover, to the extent that formal logical approaches embrace non-monotonicity, they appear to be unable to cope with the fact that it is the content of the rules, rather than their logical form, which appears to determine the inferences that people draw. We now argue that perhaps by encoding more of the content of people's knowledge, by probability theory, we may more adequately capture the nature of everyday human inference. This seems to make intuitive sense, because the problems that we have identified concern how uncertainty is handled in human inference, and probability is the calculus of uncertainty. (Oaksford & Chater, Bayesian Rationality, OUP 2007)

slide-3
SLIDE 3

The task which occasioned these remarks: the suppression effect (Byrne (1989))

(1) If Marian has an essay, she studies late in the library.
(2) Marian has an essay.
(a) Does Marian study late in the library?
(3) If the library is open, Marian studies late in the library.
(b) Does Marian study late in the library?

The percentage of `yes’ responses to (a) is around 90%; for (b) it is around 60% -- one says that `MP is suppressed’
Some argue that therefore subjects do not reason `logically’; although it is safer to say they don’t use a monotonic logic
The supposed inability of `logic’ to handle this phenomenon has given a boost to probabilistic analyses in which the conditional is represented by a conditional probability

slide-4
SLIDE 4

Topics

  • Logical and Bayesian explanations of the suppression task
  • Non-monotonicity: logical and probabilistic
  • We then address the question whether subjects’ reasoning in the suppression task is Bayesian or closed world by means of an EEG study

slide-5
SLIDE 5

Formalising the suppression task

slide-6
SLIDE 6

Formalisation in logic programming with the Closed World Assumption

  • We consider first logic programs consisting of Horn clauses p1∧...∧pn ⟶ q, such that no other clauses have the same consequent q
  • if all pi are true, so is q
  • if some pi is false, so is q [unrestricted closed world assumption]
  • we thus get p1∧...∧pn ↔ q [p1∧...∧pn is a definition of q]
  • Using Kleene 3-valued semantics some propositional variables can be released from the closed world assumption
  • If the clause contains a negation, say ¬s∧p1∧...∧pn ⟶ q, do the preceding for the clause α⟶s (where we may have α=⊥) and replace s by its definition
  • If there are several clauses pi1∧...∧pin ⟶ q with q as consequent, the definition of q is given by ⋁i (pi1∧...∧pin) ↔ q [where the disjunction is taken over all such clauses]
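The last clause, forming one definition per consequent, can be sketched in a few lines of Python (a hypothetical encoding, not from the slides: a program is a list of (body, head) pairs, where a body is a tuple of literals):

```python
from collections import defaultdict

def completion(program):
    """Form the completed definitions: collect all clause bodies per
    head q, so that q is defined by the disjunction of its bodies,
    (body_1 OR body_2 OR ...) <-> q."""
    defs = defaultdict(list)
    for body, head in program:
        defs[head].append(body)
    return dict(defs)

# Two clauses with the same consequent q: p1 ∧ p2 → q and p3 → q
prog = [(('p1', 'p2'), 'q'), (('p3',), 'q')]
# The completion defines q by (p1 ∧ p2) ∨ p3
print(completion(prog))  # {'q': [('p1', 'p2'), ('p3',)]}
```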

slide-7
SLIDE 7

Logical analysis of the suppression task

  • Represent the conditional as A ∧ ¬ab → E: `if A and nothing abnormal is the case, then E’
  • The meaning of the conditional is partially specified and depends on what abnormalities there are
  • We’ll take CWA to apply to ab only
  • Suppose we know A but nothing else; then by closed world reasoning ¬ab and we can draw the modus ponens conclusion E
  • Now suppose a possible abnormality ¬C comes to light [`the library is not open’], then ¬C → ab; but no other abnormalities
  • Then in fact C ↔ ¬ab, so that the conditional becomes A ∧ C → E
  • Now we can no longer infer anything from A
  • The logical analysis led to the prediction that subjects with autism would suppress MP and MT significantly less often; prediction verified in Pijnacker & al, Neuropsychologia (2009)
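A minimal sketch of this analysis (my encoding, not the authors’), using Kleene 3-valued evaluation with None for `unknown’: CWA applies to ab only, so with just the premiss A the inference goes through, while an unresolved C blocks it:

```python
def kleene_and(a, b):
    # Kleene 3-valued conjunction; None = unknown
    if a is False or b is False:
        return False
    if a is True and b is True:
        return True
    return None

def kleene_not(a):
    return None if a is None else (not a)

# Case 1: premisses A and A ∧ ¬ab → E only; CWA on ab gives ab = False
A = True
ab = False
E = kleene_and(A, kleene_not(ab))   # modus ponens succeeds
assert E is True

# Case 2: possible abnormality ¬C → ab, with C unknown (no CWA on C)
C = None                            # `the library is open' -- undecided
ab = kleene_not(C)                  # completion: ab ↔ ¬C
E = kleene_and(A, kleene_not(ab))   # the conditional has become A ∧ C → E
assert E is None                    # MP suppressed: E stays undecided
```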

slide-8
SLIDE 8

probability in a nutshell

  • For our purposes, a probability is a [0,1]-valued function P on a classical propositional logic, satisfying
  • the probability of a tautology is 1, that of a contradiction is 0
  • logically equivalent formulas have the same probability
  • if ⊨ φ → ¬ψ, then P(φ∨ψ) = P(φ) + P(ψ)
  • the conditional probability P(E|A) is defined as P(E ∧ A)/P(A) if P(A) > 0
  • probability is not truth functional, therefore tremendous storage requirements
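A toy finite model (the four truth assignments to A, E, with weights of my own choosing) illustrates the clauses:

```python
from fractions import Fraction

# A probability over the four truth assignments (A, E); weights sum to 1
P = {(True, True): Fraction(1, 2), (True, False): Fraction(1, 10),
     (False, True): Fraction(1, 5), (False, False): Fraction(1, 5)}

def prob(pred):
    # Probability of the event picked out by pred
    return sum(p for world, p in P.items() if pred(world))

# Additivity for incompatible events: A∧E and A∧¬E exclude each other
pA = prob(lambda w: w[0])
assert pA == prob(lambda w: w[0] and w[1]) + prob(lambda w: w[0] and not w[1])

# Conditional probability P(E|A) = P(E ∧ A)/P(A), defined since P(A) > 0
pE_given_A = prob(lambda w: w[0] and w[1]) / pA
print(pE_given_A)  # 5/6
```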

slide-9
SLIDE 9

(Non-)monotonicity in Bayesian probability

  • Bayesian probability = axioms of probability + rule of inference Bayesian conditionalisation: if E summarises all our evidence and E occurs, then for any S the a posteriori probability Pf(S) of S equals the a priori conditional probability Pi(S|E) [`probabilistic modus ponens’, but controversial]
  • In Bayesianism and some forms of formal semantics the conditional `if E then S’ is represented as a conditional probability P(S|E)
  • In theory Bayesianism holds that probabilities are defined over all variables of (possible) interest [recall: probability not truth functional]
  • In practice the sets of relevant variables grow and the challenge is to find (rational) principles which govern the transfer of a probability from a set to an expansion of that set
  • Non-monotonicity lite: in general P(X|Y) ≠ P(X|Y, Z)
  • Problematic non-monotonicity: Pi(X|Y) ≠ Pf(X|Y)
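`Non-monotonicity lite’ is just conditioning on extra evidence; any joint distribution shows it. A sketch with assumed toy weights (not from the study):

```python
# Joint distribution over (X, Y, Z): illustrative numbers summing to 1
weights = {
    (1, 1, 1): 0.05, (1, 1, 0): 0.40,
    (0, 1, 1): 0.15, (0, 1, 0): 0.05,
    (1, 0, 1): 0.05, (1, 0, 0): 0.10,
    (0, 0, 1): 0.10, (0, 0, 0): 0.10,
}

def P(pred):
    # Probability of the event picked out by pred
    return sum(w for world, w in weights.items() if pred(*world))

p_x_given_y  = P(lambda x, y, z: x and y) / P(lambda x, y, z: y)
p_x_given_yz = P(lambda x, y, z: x and y and z) / P(lambda x, y, z: y and z)
# Conditioning on the extra variable Z changes the value
assert p_x_given_y != p_x_given_yz
```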
slide-10
SLIDE 10

Prior and posterior probability

  • Upon processing the first conditional, the subject sets the prior conditional probability Pi(E|A) ≈ 1
  • The second conditional is supposed to lead to the posterior conditional probability Pf(E|A) << Pi(E|A)
  • Is there a Bayesian explanation for this transition?
  • Bayesian orthodoxy assumes there is a prior probability defined over all events; so we may assume there is a prior probability for the library being closed (C):
Pi(E|A) = Pi(E|CA) Pi(C|A) + Pi(E|¬CA) Pi(¬C|A) = [independence of A, C] = Pi(E|CA) Pi(C) + Pi(E|¬CA) Pi(¬C)
  • Hence Pf(E|A) << Pi(E|A) if Pf(C) >> Pi(C), and Pi(C) must be small to get high Pi(E|A), i.e. the fact that C becomes salient increases its probability
  • Not very Bayesian, and the assumption of a universal prior imposes impossible demands on storage, since based on knowledge not computation
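Plugging illustrative numbers (mine, not the slides’) into the decomposition makes the required shift concrete:

```python
# C = `library is closed'; a closed library blocks studying late,
# so Pi(E|CA) ≈ 0 while Pi(E|¬CA) is high. Numbers are assumed.
pE_CA, pE_notCA = 0.0, 0.9

def pE_given_A(pC):
    # Pi(E|A) = Pi(E|CA)·Pi(C) + Pi(E|¬CA)·Pi(¬C), A and C independent
    return pE_CA * pC + pE_notCA * (1 - pC)

prior = pE_given_A(0.1)      # Pi(C) small → Pi(E|A) high
posterior = pE_given_A(0.5)  # C made salient: Pf(C) >> Pi(C)
assert posterior < prior     # Pf(E|A) << Pi(E|A), as required
```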

slide-11
SLIDE 11

The trouble with novelty

  • Novel events: the validity of Bayesian conditionalisation requires that P0 be defined on `all’ events
  • Cognitively this is an implausible assumption; a better model is provided by having multiple algebras
  • However
  • if both E, S belong to two distinct algebras, there need not be a unique Pf(S); hence at each time there is a single algebra of events
  • if there is a single algebra that grows over time, then we need Renyi’s Axiom: Pi(S|AE)Pi(A|E) = Pi(AS|E) to ensure that Pi(S|E) is the same in the algebra with and without the event A
  • in which case Bayesian conditionalisation is invariant under the addition of novel events
  • We have seen that Renyi’s Axiom must be dropped if there is to be a probabilistic model of the suppression effect
  • One question is whether there exists a rational justification for Bayesianism thus modified
  • Another question is: do subjects actually engage in probabilistic reasoning in the suppression task?
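Inside one fixed algebra the Axiom is just the chain rule, so it holds automatically; its bite is in licensing the move between algebras. A quick check on an assumed toy joint distribution over (A, E, S):

```python
from fractions import Fraction

# Toy joint distribution over (A, E, S); weights sum to 1 (assumed numbers)
P = {(a, e, s): Fraction(1, 8) for a in (0, 1) for e in (0, 1) for s in (0, 1)}
P[(1, 1, 1)] = Fraction(2, 8)
P[(0, 0, 0)] = Fraction(0, 8)

def prob(pred):
    return sum(p for w, p in P.items() if pred(*w))

def cond(num, den):
    # Conditional probability P(num | den) = P(num ∧ den) / P(den)
    return prob(lambda a, e, s: num(a, e, s) and den(a, e, s)) / prob(den)

# Renyi's Axiom: Pi(S|AE)·Pi(A|E) = Pi(AS|E)
lhs = cond(lambda a, e, s: s, lambda a, e, s: a and e) * \
      cond(lambda a, e, s: a, lambda a, e, s: e)
rhs = cond(lambda a, e, s: a and s, lambda a, e, s: e)
assert lhs == rhs
```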

slide-12
SLIDE 12

The time course of defeasible reasoning

slide-13
SLIDE 13

Experimental comparison of Bayesian and closed world reasoning

  • Bayesian probability is explicitly proposed as a computational model of

higher cognitive phenomena (Oaksford & Chater (2007, 2009), Gopnik & al (2004), ..)

  • As such, they should lead both to behavioural and neuroimaging

predictions

  • As an example, we want to compare the expected EEG signatures of

Bayesian reasoning and closed world reasoning in a variant of the suppression task

  • (Pijnacker, Geurts, vL, Buitelaar, Hagoort: `Reasoning with

Exceptions: An Event-related Brain Potentials Study’, J. Cogn Neurosci 2010)

slide-14
SLIDE 14

a more explicit form of the suppression task

The difference with the standard suppression task is that the possible exception to the conditional is now given in a more explicit form. The results are as usual: in the congruent condition 90% endorse MP, in the disabling condition only 45%.
slide-15
SLIDE 15

Bayesian model (1)

  • The `inhibitory event’ E: `Lisa has lost a contact lens’ now forms part of the initial sample space and Pi(E) is high (E is said to be `probable’)
  • The conditional probability corresponding to the conditional `If Lisa is going to play hockey (H), she will wear contact lenses (W)’ must be evaluated by taking E into account:
Pi(W|H) = Pi(W|EH) Pi(E|H) + Pi(W|¬EH) Pi(¬E|H) = Pi(W|EH) Pi(E) + Pi(W|¬EH) Pi(¬E) (assuming independence of E and H)
  • If the meaning of a conditional is in part given by a conditional probability, then Pi(W|H) has to be computed while processing the 2nd premiss
  • The final probability of the conclusion Pf(W) is obtained by Bayesian conditionalisation on H

slide-16
SLIDE 16

Bayesian model (2)

  • Pi(W|H) = Pi(W|EH) Pi(E|H) + Pi(W|¬EH) Pi(¬E|H) = Pi(W|EH) Pi(E) + Pi(W|¬EH) Pi(¬E)
  • The final probability of the conclusion Pf(W) is obtained by Bayesian conditionalisation on H
  • This computation seems entirely monotonic; and the last step, conditionalisation, does not involve heavy computation
  • One expects heavy computation after the 2nd premiss, but there shouldn’t be any difference between the congruent and the disabling case because E has been given at the outset
  • Thus we do not expect the EEGs to show any difference for any of the four sentences

slide-17
SLIDE 17

Closed World model (1)

  • The congruent or disabling condition is represented as a proposition, say r
  • The first conditional premiss is represented as p∧¬ab ➝ q, together with the recognition that the condition is a possibly disabling condition: r ∧ ¬ab’ ➝ ab, or congruent: no link between r and ab
  • Note that some 45% do not represent r as effectively disabling; therefore the link between r and ab is mediated by ¬ab’
  • Concretely, the subject may consider the possibility that Lisa bought new lenses before playing hockey
  • The second premiss is represented as p
slide-18
SLIDE 18

Closed World model (2)

  • The conclusion is represented as q, together with the procedure for preparing the premisses for drawing a conclusion by closing the world: the completion
  • In the effectively disabling case the completion is ¬ab’, r ↔ ab, p∧¬ab ↔ q, whence p∧¬r ➝ q, and the conclusion q cannot be drawn
  • If r is in the end not considered disabling, ab’ holds, whence ¬ab, and q follows
  • One expects competition between these possibilities
  • In the congruent case closing the world gives ¬ab, whence p ➝ q, and q follows
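The two closures can be sketched with the same kind of 3-valued evaluation (my encoding, not the authors’; ab’ marks the `r is not effectively disabling’ escape, e.g. Lisa bought new lenses):

```python
def kleene_and(a, b):
    # Kleene 3-valued conjunction; None = unknown
    if a is False or b is False:
        return False
    if a is True and b is True:
        return True
    return None

def kleene_not(a):
    return None if a is None else (not a)

def conclusion_q(p, r, disabling, ab_prime=False):
    """Value of q under the completion. CWA makes ab' default to False;
    passing ab_prime=None models a subject entertaining the escape,
    in which case the two readings compete and q stays unknown."""
    if disabling:
        ab = kleene_and(r, kleene_not(ab_prime))  # r ∧ ¬ab' ↔ ab
    else:
        ab = False                                # no link: CWA gives ¬ab
    return kleene_and(p, kleene_not(ab))          # p ∧ ¬ab ↔ q

assert conclusion_q(p=True, r=True, disabling=False) is True   # congruent: q follows
assert conclusion_q(p=True, r=True, disabling=True) is False   # effectively disabling
assert conclusion_q(p=True, r=True, disabling=True, ab_prime=None) is None  # competition
```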

slide-19
SLIDE 19

Behavioural responses

  • Reaction times in the disabling condition were significantly longer than in the congruent condition
  • The Closed World model readily explains this
  • But what about Bayesian conditionalisation?
slide-20
SLIDE 20

Electrophysiological responses

  • 80 reasoning problems (40 disabling, 40 congruent), 80 fillers
  • At each of 29 scalp sites, measure the brain signal starting at the onset of the last word of first premiss (A), second premiss (B) and conclusion (C), and average over reasoning problems in a condition (40) and participants (18)
  • No significant difference between congruent and disabling condition in A, B -- so far in line with Bayesian prediction
  • Significant sustained negativity (SN) of the disabling condition in C -- contrary to Bayesian prediction, in line with closed world prediction
  • SN has been observed in e.g. the following circumstances
  • ambiguous anaphoric reference, which may lead to holding competing interpretations in working memory
  • overriding a default inference, as in `The girl was writing a letter when her friend spilled coffee on the tablecloth/paper’ (Baggio, vL, Hagoort, J. Mem. Lang. 2008)
slide-21
SLIDE 21
slide-22
SLIDE 22

Discussion (1)

  • Cognitive scientists favouring the Bayesian approach extol its virtues as a computational model in the sense of Marr
  • In Marr’s three-level scheme the computational level specifies an algorithm that should predict outcomes of an experiment (at least in a statistical sense)
  • (Given the high computational complexity of probabilistic reasoning, it is conceivable that the algorithm is only a heuristic (Oaksford & Chater 1999))
  • The algorithm must in turn be neurally implementable; what one picks up in the EEG is the neural correlate of the algorithm at work
  • But there exist neural network models computing logic programs: Hölldobler (1996) for stable semantics, Stenning & vL (2008) for Kleene 3-valued semantics

slide-23
SLIDE 23

Discussion (2)

  • In essence the difference between the Bayesian and logical predictions is that (according to the Bayesians) the conditional probability forms part of the meaning of the conditional, whereas closing the world is an inference principle governing the conditional
  • If in both the Bayesian and the logical model the neural computations are a faithful image of the symbolic computations sketched earlier, then the EEG data favour closed world reasoning
  • since in closed world reasoning all computation is triggered by the last step, the need to evaluate the conclusion, and all significant brain activity occurs at this last step
  • In the Bayesian case the computational load seems to be concentrated on the second premiss; we do not see this reflected in the data
  • Therefore either Bayesian reasoning is implemented in such a way that all effortful computation occurs at the last step (but then it is unclear how), or closed world reasoning is a more accurate representation of the computations involved