SLIDE 1

Evaluating Cognitive Models and Architectures

Marc Halbrügge, Dipl.-Psych.

marc.halbruegge@unibw.de

Human Factors Institute Universität der Bundeswehr München

July 2007

SLIDE 2

Cognitive architectures in general Evaluation Methods for Cognitive Models Can a Cognitive Architecture be Falsified?

Topics of this talk

Main points: Evaluation of ...
Models: Current approaches to quantifying model complexity are not sufficient.
Architectures: Cognitive architectures cannot be falsified and are therefore too weak to be considered theories.

Marc Halbrügge Evaluating Cognitive Models and Architectures

SLIDE 4

Contents

1 Cognitive architectures in general
2 Evaluation Methods for Cognitive Models
    Model Complexity and Overfitting
    The Markov Chain Example
3 Can a Cognitive Architecture be Falsified?
    The Quicksort Example
    Conclusions

SLIDE 5

Cognitive Modeling

The Goal: The main goal of cognitive modeling is to simulate human cognition.

SLIDE 11

Evaluating Cognitive Models

Goodness of Fit: Concordance with human data (a high fit) is generally thought to be necessary, but not sufficient, for model validation [Roberts and Pashler, 2000].

Complexity and Overfitting: In order to assess the fit of a model or to compare fits between models, it is important to be aware of the complexity of the models. Models with many degrees of freedom usually achieve better fits, but bear the risk of overfitting.
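
The overfitting risk can be made concrete with a small sketch. It is not taken from the talk: the data (a straight line with a fixed alternating disturbance) and both models are assumptions chosen purely for illustration. A line with two free parameters is compared with a lookup table that has one free parameter per observation.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (two free parameters)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_lookup(xs, ys):
    """Memorize every training point (one free parameter per observation)."""
    table = dict(zip(xs, ys))
    return lambda x: table[x]


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: y = 2x plus a disturbance of +1 or -1.
xs = list(range(10))
noise = [1 if i % 2 == 0 else -1 for i in xs]
y_train = [2 * x + e for x, e in zip(xs, noise)]
# A second sample from the same process, with the disturbance flipped.
y_test = [2 * x - e for x, e in zip(xs, noise)]

line = fit_line(xs, y_train)
lookup = fit_lookup(xs, y_train)

for name, model in [("line", line), ("lookup", lookup)]:
    print(name, mse(model, xs, y_train), mse(model, xs, y_test))
```

In this toy setting the lookup table reproduces the training sample exactly, yet it does worse than the line on the second sample, because it has fitted the disturbance along with the trend.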

SLIDE 19

Overfitting

Results for the three classifiers (N=269)

Columns: df; misclassifications (obs. data, true data)
too low: 9, 10
low: 6, 5
high: 11

SLIDE 20

Model Complexity and Overfitting

Complexity in Cognitive Modeling: Baker et al. (2003): Count the number of free weights of the subsymbolic part of an ACT-R model and use this as the degrees of freedom when computing the Bayesian information criterion (BIC).

BUT: Measures of complexity like the BIC are aimed at closed-form mathematical models. Cognitive models are sequences of actions!
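
For orientation, the standard definition of the BIC can be sketched as follows. This shows only the textbook formula, k * ln(n) - 2 * ln(L); it is not the procedure of Baker et al., whose details are not given here, and the log-likelihoods and parameter counts are made-up values.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical numbers: two models that fit 100 observations equally well,
# one with 3 free weights and one with 12.
bic_small = bic(log_likelihood=-150.0, n_params=3, n_obs=100)
bic_large = bic(log_likelihood=-150.0, n_params=12, n_obs=100)

print(bic_small, bic_large)
```

With equal fit, the criterion prefers the model with fewer counted weights. The penalty depends only on that count, which is why the choice of what to count as a degree of freedom matters so much.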

SLIDE 22

Memory Retrieval in ACT-R

Two cognitive models that retrieve chunks from declarative memory. Which chunk is retrieved next depends on the production that currently fires. The models stop when the memory retrieval fails.

[Figures: Model 1, Model 2]

SLIDE 23

Addressing Complexity

Under the Baker et al. approach, Model 2 (four states) should be considered the more complex one, because it has more numerical weights.

But:
Model 2 can only create sequences like abcdabcda.
Model 1 can output any sequence of a's and b's.

The difference is present even if you use model run time as the dependent variable: imagine that two of the four productions take much longer than the others.
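
The contrast between the two models can be illustrated by enumerating the sequences each one could emit. The transition structure below is an assumption, since the model diagrams are not reproduced in this transcript: Model 1 is taken to allow either chunk after either chunk, Model 2 is taken to be a fixed cycle starting at a, and both may stop after any retrieval.

```python
def possible_sequences(transitions, starts, max_len):
    """Enumerate every output sequence of length 1..max_len the chain can emit.

    The chain may stop (retrieval failure) after any symbol.
    """
    found = set()
    frontier = list(starts)
    while frontier:
        seq = frontier.pop()
        found.add(seq)
        if len(seq) < max_len:
            for nxt in transitions[seq[-1]]:
                frontier.append(seq + nxt)
    return found


# Assumed structure, for illustration only.
MODEL_1 = {"a": "ab", "b": "ab"}                      # two states
MODEL_2 = {"a": "b", "b": "c", "c": "d", "d": "a"}    # four states, fixed cycle

seqs_1 = possible_sequences(MODEL_1, starts=["a", "b"], max_len=9)
seqs_2 = possible_sequences(MODEL_2, starts=["a"], max_len=9)

print(len(seqs_1), len(seqs_2))
```

Under these assumptions the two-state model can produce far more distinct sequences than the four-state model, so a count of states or weights points in the opposite direction from the range of data each model can accommodate.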

SLIDE 26

Model 1 (two states)

[Figure: histogram; axes: time (s), Frequency]

SLIDE 27

Model 2 (four states)

[Figure: histogram; axes: time (s), Frequency]

SLIDE 28

Discussion

Very simple models (about 100 lines of code).
It is very hard to assess the model complexity when only the number of chunks and productions is known.
Knowledge of the number of influential parameters is not sufficient for the estimation of model complexity.

SLIDE 31

End of part 2

SLIDE 33

The Newell Test

Newell (1990, p. 507): “play a cooperative game of anything you can do, I can do better”

The Newell Test: Anderson and Lebiere (2003) proposed a list of 12 criteria for the evaluation of cognitive architectures, e.g. flexible behavior, natural language, learning, and evolution. An architecture is considered valid if it meets all these criteria.

SLIDE 35

The Newell Test

Although Anderson and Lebiere make some important points, they miss the problem of falsifiability. Before we ask the question “How do we validate a cognitive architecture?” we should answer the question “Can we validate a cognitive architecture at all?” Or, according to Popper, “Can we falsify a cognitive architecture?”

SLIDE 38

Can we falsify a Cognitive Architecture?

We use ACT-R as an example of an architecture. The argument should be applicable to any other architecture as well.

ACT-R has Turing power:
ACT-R is a Turing machine.
Therefore anything (that is computable) can be done with ACT-R.
ACT-R cannot be falsified.

SLIDE 42

What can be done (with cognitive architectures)

Recursive programming with ACT-R 6: In order to show that ACT-R can easily do things that are very hard or even infeasible for humans, we created a model that performs a quicksort on arbitrary lists.

SLIDE 43

The Quicksort algorithm

    function quicksort(q)
        var list below, pivotList, above
        if length(q) <= 1
            return q
        select a pivot value pivot from q
        for each x in q
            if x < pivot then add x to below
            if x = pivot then add x to pivotList
            if x > pivot then add x to above
        return concatenate(quicksort(below), pivotList, quicksort(above))

http://en.wikipedia.org/wiki/Quicksort
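
The pseudocode above translates almost line by line into a runnable sketch. This is a plain Python rendering for readers who want to try it, not the ACT-R model from the talk; picking the middle element as pivot is an arbitrary choice, since the pseudocode leaves the selection open.

```python
def quicksort(q):
    """Return a sorted copy of q using the three-way partition from the pseudocode."""
    if len(q) <= 1:
        return list(q)
    pivot = q[len(q) // 2]  # arbitrary pivot choice (assumption)
    below = [x for x in q if x < pivot]
    pivot_list = [x for x in q if x == pivot]
    above = [x for x in q if x > pivot]
    return quicksort(below) + pivot_list + quicksort(above)


print(quicksort([3, 1, 2, 3, 0]))
```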

SLIDE 44

Recursion in ACT-R

The idea: A chunk of the goal type can be an element either of a data list or of the goal list that emulates the goal stack.

    (chunk-type element data prev root)
    (chunk-type (qs-goal (:include element)) state goal-root prev-goal)
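
The general technique of replacing recursion with an explicit list of pending goals can be sketched outside ACT-R. The Python below is only an analogy: the function name, the `"sort"` and `"emit"` states, and the use of a Python list as the stack are all hypothetical and do not mirror the slots of the chunk types above.

```python
def quicksort_with_goal_stack(q):
    """Quicksort without recursive calls: pending work lives on an explicit stack."""
    result = []
    goals = [("sort", list(q))]  # explicit stack of pending goals
    while goals:
        state, items = goals.pop()
        if state == "emit" or len(items) <= 1:
            # Nothing left to partition: append to the output in order.
            result.extend(items)
            continue
        pivot = items[0]
        below = [x for x in items if x < pivot]
        pivot_list = [x for x in items if x == pivot]
        above = [x for x in items if x > pivot]
        # Push in reverse order so that 'below' is processed first.
        goals.append(("sort", above))
        goals.append(("emit", pivot_list))
        goals.append(("sort", below))
    return result


print(quicksort_with_goal_stack([5, 3, 8, 3, 1]))
```

Each popped entry plays the role of the current goal, and pushing three new entries corresponds to the two recursive calls plus the concatenation step of quicksort.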

SLIDE 45

[Figure: plot with axes n * log(n) and time (s)]

SLIDE 46

Discussion of the quicksort example

Anderson and Lebiere (2003) state that ACT-R is not a “general computational system that can be programmed to do anything” (page 597).

The quicksort example shows that ACT-R as such is not limited; it can be limited from another perspective: if ACT-R were a valid CA, then what about LISP, or C?

SLIDE 47

Conclusions

Cognitive architectures are not theories. They should be regarded as theory languages, similar to programming languages.

Architectures are useful for model creation, the communication between modelers, and the combination of models. But when it comes to validation, the focus should not lie on architectures but on models.

In order to evaluate and compare cognitive models, we need a way to assess their complexity. As long as we lack a formal methodology to do this, we should alternatively release the code of our models.

SLIDE 50

References

Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., and Qin, Y. (2004). An integrated theory of the mind. Psychological Review, 111(4):1036–1060.

Anderson, J. R. and Lebiere, C. (2003). The Newell Test for a theory of cognition. Behavioral and Brain Sciences, 26(5):587–601.

Baker, R. S., Corbett, A. T., and Koedinger, K. R. (2003). Statistical techniques for comparing ACT-R models of cognitive performance. In Proceedings of the 10th Annual ACT-R Workshop.

Newell, A. (1990). Unified Theories of Cognition – The William James Lectures, 1987. Harvard University Press, Cambridge, MA, USA.

Popper, K. R. (1989). Logik der Forschung. Mohr, Tübingen, 9th edition.

Roberts, S. and Pashler, H. (2000). How persuasive is a good fit? A comment on theory testing. Psychological Review, 107(2):358–367.
