707.009 Foundations of Knowledge Management: Topic Modeling



slide-1
SLIDE 1

Knowledge Management Institute

707.009 Foundations of Knowledge Management: "Topic Modeling"

Markus Strohmaier

  • Univ. Ass. / Assistant Professor

Knowledge Management Institute, Graz University of Technology, Austria
e-mail: markus.strohmaier@tugraz.at
web: http://www.kmi.tugraz.at/staff/markus

Markus Strohmaier 2011

slide-2
SLIDE 2

Acknowledgements

Course slides are in part based on the following slide decks and papers:

  • "Probabilistic Topic Models and Associative Memory": Mark Steyvers (UC Irvine), Tom Griffiths (Brown University), Josh Tenenbaum (MIT)
  • "Topics in Semantic Representation": Tom Griffiths (Brown University), Mark Steyvers (UC Irvine), Josh Tenenbaum (MIT)
  • "Semantic Representations with Probabilistic Topic Models": Mark Steyvers, joint work with Tom Griffiths (UC Berkeley) and Padhraic Smyth (UC Irvine)
  • "Modeling Documents": Amruta Joshi, Department of Computer Science, Stanford University
  • "Cognitive Modeling", Lecture 14: Models of Semantic Processing, University of Edinburgh

slide-3
SLIDE 3

Overview

Today's Agenda: Topic Modeling

  • Associative memory
  • The topic model
  • Applications to associative memory
  • Applications in machine learning/text mining

slide-4
SLIDE 4

Knowledge Organization: Two Approaches

Formal vs. content structure

Much information exists as unstructured free text (content structure): expressive, but hard to analyze automatically.

Two approaches:

  • Using a standardized language a priori (strongly formalized)
  • Interpreting heterogeneous language a posteriori (NLP, …)

[Figure: axis labeled free text, semantic representation, code. Labeled items: keyword extraction and folksonomies; ontologies and semantic networks; taxonomies and ontologies.]

slide-5
SLIDE 5

What are concept systems?

Concept systems are systems of distinguishable concepts that are related to one another through relations and that can be formulated in a more natural language.

Goal: developing and establishing a shared understanding

Representation systems: human language, logic, "computer languages"

[Figure: semiotic triangle with the corners object ("real world"), word / expression / symbol, and notion / concept. Further labels: language, knowledge.]

slide-6
SLIDE 6


slide-7
SLIDE 7

A third approach: Topic Modeling

slide-8
SLIDE 8

Overview

  I    Associative memory
  II   The topic model
  III  Applications to associative memory
  IV   Applications in machine learning/text mining

slide-9
SLIDE 9

Example of associative memory: word association

CUE: PLAY
RESPONSES: FUN, BALL, GAME, WORK, GROUND, MATE, CHILD, ENJOY, WIN, ACTOR

slide-10
SLIDE 10

Example of associative memory: free recall

STUDY THESE WORDS: Bed, Rest, Awake, Tired, Dream, Wake, Snooze, Blanket, Doze, Slumber, Snore, Nap, Peace, Yawn, Drowsy

RECALL WORDS .....

FALSE RECALL: "Sleep" 61%

slide-11
SLIDE 11

A theory for semantic association

  • Semantic association as probabilistic inference
  • Representation of semantic structure

slide-12
SLIDE 12

Three inference problems, given the observed words w:

  • Infer g (the gist) from w
  • Infer z (the topic assignments) from w
  • Infer w_{n+1} (the next word) from w

slide-13
SLIDE 13

GENERATIVE PROCESS

[Figure: two topics (mixture components) generate two documents according to per-document mixture weights. The digit attached to each word is the topic that generated it, e.g. bank1 vs. bank2.]

Mixture weights shown in the figure: .8 and .2 next to TOPIC 1, .3 and .7 next to TOPIC 2

DOCUMENT 1: money1 bank1 bank1 loan1 river2 stream2 bank1 money1 river2 bank1 money1 bank1 loan1 money1 stream2 bank1 money1 bank1 bank1 loan1 river2 stream2 bank1

DOCUMENT 2: river2 stream2 bank2 stream2 bank2 money1 loan1 river2 stream2 loan1 bank2 river2 bank2 bank1 stream2 river2 loan1 bank2 stream2 bank2 money1 loan1 river2 stream2 bank2 stream2 bank2 money1 river2 stream2 loan1 bank2

  • Topics are the mixture components; each document has its own mixture weights
  • Bag of words
  • No notion of mutual exclusivity
  • Capturing polysemy

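The generative process can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the word probabilities inside each topic, the document length, and the name `generate_document` are assumptions made for the example; the mixture weights .8/.2 and .3/.7 are read off the figure.

```python
import random

# Hypothetical toy topics: each topic is a distribution over words (a mixture component).
TOPICS = {
    1: {"money": 0.4, "loan": 0.3, "bank": 0.3},
    2: {"river": 0.4, "stream": 0.3, "bank": 0.3},
}


def generate_document(weights, length, rng):
    """Sample `length` (word, topic) pairs for one document.

    `weights` maps topic id -> mixture weight of that topic in this document.
    """
    topic_ids = list(weights)
    tokens = []
    for _ in range(length):
        # First pick a topic according to the document's mixture weights ...
        z = rng.choices(topic_ids, weights=[weights[t] for t in topic_ids])[0]
        # ... then pick a word from that topic's word distribution.
        words = list(TOPICS[z])
        w = rng.choices(words, weights=[TOPICS[z][x] for x in words])[0]
        tokens.append((w, z))
    return tokens


rng = random.Random(0)
doc1 = generate_document({1: 0.8, 2: 0.2}, 16, rng)
doc2 = generate_document({1: 0.3, 2: 0.7}, 16, rng)
print(" ".join(f"{w}{z}" for w, z in doc1))
print(" ".join(f"{w}{z}" for w, z in doc2))
```

Because "bank" has non-zero probability in both topics, the same word can be emitted with either topic index, which is how the model captures polysemy. Word order carries no information (bag of words).
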
slide-14
SLIDE 14

The probability of choosing a word:

$$ P(w) = \sum_{z=1}^{T} P(w \mid z)\, P(z) $$

  • P(w | z = j): word probability in topic j
  • P(z = j): probability of topic j in the document
  • T: number of topics
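As a worked example of this sum, the sketch below evaluates P(w) for a two-topic toy model. All numbers and the names `phi`, `theta`, and `word_probability` are assumptions chosen for illustration.

```python
# Hypothetical numbers for illustration only.
phi = {  # phi[j][w] = P(w | z = j), the word probability in topic j
    1: {"money": 0.4, "loan": 0.3, "bank": 0.3},
    2: {"river": 0.4, "stream": 0.3, "bank": 0.3},
}
theta = {1: 0.8, 2: 0.2}  # theta[j] = P(z = j), the probability of topic j in the document


def word_probability(word, phi, theta):
    """P(w) = sum over topics j of P(w | z = j) * P(z = j)."""
    return sum(phi[j].get(word, 0.0) * theta[j] for j in theta)


vocabulary = sorted({w for dist in phi.values() for w in dist})
p = {w: word_probability(w, phi, theta) for w in vocabulary}
for w in vocabulary:
    print(w, round(p[w], 3))
```

The resulting values form a proper distribution over the vocabulary, since each topic's word probabilities and the topic weights each sum to one.
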

slide-15
SLIDE 15

Bayes' rule: Latent Semantic Structure

[Figure: latent structure $\ell$ generates words $\mathbf{w}$]

Distribution over words:

$$ P(\mathbf{w}) = \sum_{\ell} P(\mathbf{w}, \ell) $$

Inferring latent structure:

$$ P(\ell \mid \mathbf{w}) = \frac{P(\mathbf{w} \mid \ell)\, P(\ell)}{P(\mathbf{w})} $$

Prediction:

$$ P(w_{n+1} \mid \mathbf{w}) = \dots $$

slide-16
SLIDE 16

Overview

  I    Associative memory
  II   The topic model
  III  Applications to associative memory
  IV   Applications in machine learning/text mining

slide-17
SLIDE 17

The Big Idea

Topic Model

  • Model topics as distributions over words

Document Model

  • Model documents as distributions over words

Document / Topic Model

  • Probabilistic model for both
  • Model topics as distributions over words
  • Model documents as distributions over topics

slide-18
SLIDE 18

Topic Model

Unsupervised learning of topics ("gist") of documents:

  • articles/chapters
  • conversations
  • emails
  • .... any verbal context

Topics are useful latent structures to explain semantic association

slide-19
SLIDE 19

Probabilistic Generative Model

Each topic is a probability distribution over words

From the TASA corpus, a collection of over 37,000 text passages from educational materials (e.g., language & arts, social studies, health, sciences) collected by Touchstone Applied Science Associates (see Landauer, Foltz, & Laham, 1998).

slide-20
SLIDE 20

[Figure not recovered; three of its elements are labeled "observed".]

Taken from "Topics in Semantic Representation", Thomas L. Griffiths, Mark Steyvers, Joshua B. Tenenbaum

slide-21
SLIDE 21

Inference – Constructing Topic Models

Expectation Maximization

  • But poor results (local maxima)

Gibbs Sampling

  • Parameters: φ, θ
  • Start with initial random assignment
  • Update parameter using other parameters
  • Converges after 'n' iterations
  • Burn-in time
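The Gibbs sampling steps can be sketched as follows. This is one common variant, collapsed Gibbs sampling in the style of Griffiths and Steyvers, in which the topic assignment of each word token is resampled given all other assignments and φ and θ are read off the counts afterwards. Whether the slides mean exactly this variant is an assumption, and so are the smoothing parameters `alpha` and `beta`, the function name `gibbs_lda`, and the toy corpus.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for a topic model (illustrative sketch).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (z, n_wt, n_dt): topic assignments and the two count tables.
    """
    rng = random.Random(seed)
    n_wt = [[0] * num_topics for _ in range(vocab_size)]  # word-topic counts
    n_dt = [[0] * num_topics for _ in docs]               # document-topic counts
    n_t = [0] * num_topics                                # tokens assigned to each topic
    z = []
    # Start with an initial random assignment of topics to word tokens.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(num_topics)
            z_d.append(t)
            n_wt[w][t] += 1
            n_dt[d][t] += 1
            n_t[t] += 1
        z.append(z_d)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts ...
                t = z[d][i]
                n_wt[w][t] -= 1
                n_dt[d][t] -= 1
                n_t[t] -= 1
                # ... and resample its topic given all other assignments.
                weights = [
                    (n_wt[w][k] + beta) / (n_t[k] + vocab_size * beta)
                    * (n_dt[d][k] + alpha)
                    for k in range(num_topics)
                ]
                t = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = t
                n_wt[w][t] += 1
                n_dt[d][t] += 1
                n_t[t] += 1
    return z, n_wt, n_dt


# Toy corpus over the vocabulary 0=money, 1=loan, 2=bank, 3=river, 4=stream.
docs = [[0, 1, 2, 0, 2, 1, 0, 2], [3, 4, 2, 3, 4, 2, 3, 4], [0, 2, 1, 3, 4, 2]]
z, n_wt, n_dt = gibbs_lda(docs, num_topics=2, vocab_size=5)
print(z)
```

In practice the first sweeps (the burn-in period) are discarded, and the number of iterations needed depends on the corpus; the value 200 here is arbitrary.
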

slide-22
SLIDE 22

INVERTING THE GENERATIVE PROCESS

DOCUMENT 1: A Play is written to be performed on a stage before a live audience or before motion picture or television cameras ( for later viewing by large audiences ). A Play is written because playwrights have something ...

DOCUMENT 2: He was listening to music coming from a passing riverboat. The music had already captured his heart as well as his ear. It was jazz. Bix Beiderbecke had already had music lessons. He wanted to play the cornet. And he wanted to play jazz .......

TOPIC 1: ?
TOPIC 2: ?

We estimate the assignments of topics to words

slide-23
SLIDE 23

INVERTING THE GENERATIVE PROCESS

The number attached to a word is the topic assigned to it.

DOCUMENT 1: A Play082 is written082 to be performed082 on a stage082 before a live093 audience082 or before motion270 picture004 or television004 cameras004 ( for later054 viewing004 by large202 audiences082 ). A Play082 is written082 because playwrights082 have something ...

DOCUMENT 2: He was listening077 to music077 coming009 from a passing043 riverboat. The music077 had already captured006 his heart157 as well as his ear119. It was jazz077. Bix Beiderbecke had already had music077 lessons077. He wanted268 to play077 the cornet. And he wanted268 to play077 jazz077 .......

TOPIC 1: ?
TOPIC 2: ?

We estimate the assignments of topics to words. Words without a topic number (blue in the original slide) are stopwords or words not used.

slide-24
SLIDE 24

Statistical Inference

Fix the number of topics T

We estimate the posterior over topic assignments:

$$ P(\mathbf{z} \mid \mathbf{w}) = \frac{P(\mathbf{w}, \mathbf{z})}{\sum_{\mathbf{z}} P(\mathbf{w}, \mathbf{z})} $$

Markov Chain Monte Carlo (MCMC) with Gibbs sampling

slide-25
SLIDE 25

Topic Model Inference: Procedure

INPUT: word-document counts

OUTPUT:

  • topic assignments to each word: z
  • likely words in each topic: P(w | z)
  • likely topics for a document ("gist"): P(z | w)
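Once topic assignments are available, the two output distributions can be estimated from simple count tables. The sketch below shows one way to do this; the smoothing constants `alpha` and `beta`, the function names, and the example counts are assumptions for illustration, not values from the course.

```python
def estimate_phi(n_wt, beta=0.01):
    """phi[t][w]: estimate of P(w | z = t) from word-topic counts n_wt[w][t]."""
    vocab_size = len(n_wt)
    num_topics = len(n_wt[0])
    phi = []
    for t in range(num_topics):
        total = sum(n_wt[w][t] for w in range(vocab_size)) + vocab_size * beta
        phi.append([(n_wt[w][t] + beta) / total for w in range(vocab_size)])
    return phi


def estimate_theta(n_dt, alpha=0.1):
    """theta[d][t]: estimate of P(z = t | document d) from document-topic counts."""
    theta = []
    for row in n_dt:
        total = sum(row) + len(row) * alpha
        theta.append([(c + alpha) / total for c in row])
    return theta


# Hypothetical counts: 5 vocabulary words, 2 topics, 2 documents.
n_wt = [[3, 0], [2, 0], [3, 2], [0, 3], [0, 3]]
n_dt = [[7, 1], [1, 7]]
phi = estimate_phi(n_wt)
theta = estimate_theta(n_dt)
print([round(x, 3) for x in phi[0]])
print([round(x, 3) for x in theta[0]])
```

Each row of `phi` lists the likely words of one topic, and each row of `theta` is the "gist" of one document as a distribution over topics.
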

slide-26
SLIDE 26

Example: topics from an educational corpus (TASA)

  • 37K docs, 26K words
  • 1700 topics, e.g.:

PRINTING     PLAY          TEAM         JUDGE       HYPOTHESIS     STUDY
PAPER        PLAYS         GAME         TRIAL       EXPERIMENT     TEST
PRINT        STAGE         BASKETBALL   COURT       SCIENTIFIC     STUDYING
PRINTED      AUDIENCE      PLAYERS      CASE        OBSERVATIONS   HOMEWORK
TYPE         THEATER       PLAYER       JURY        SCIENTISTS     NEED
PROCESS      ACTORS        PLAY         ACCUSED     EXPERIMENTS    CLASS
INK          DRAMA         PLAYING      GUILTY      SCIENTIST      MATH
PRESS        SHAKESPEARE   SOCCER       DEFENDANT   EXPERIMENTAL   TRY
IMAGE        ACTOR         PLAYED       JUSTICE     TEST           TEACHER
PRINTER      THEATRE       BALL         EVIDENCE    METHOD         WRITE
PRINTS       PLAYWRIGHT    TEAMS        WITNESSES   HYPOTHESES     PLAN
PRINTERS     PERFORMANCE   BASKET       CRIME       TESTED         ARITHMETIC
COPY         DRAMATIC      FOOTBALL     LAWYER      EVIDENCE       ASSIGNMENT
COPIES       COSTUMES      SCORE        WITNESS     BASED          PLACE
FORM         COMEDY        COURT        ATTORNEY    OBSERVATION    STUDIED
OFFSET       TRAGEDY       GAMES        HEARING     SCIENCE        CAREFULLY
GRAPHIC      CHARACTERS    TRY          INNOCENT    FACTS          DECIDE
SURFACE      SCENES        COACH        DEFENSE     DATA           IMPORTANT
PRODUCED     OPERA         GYM          CHARGE      RESULTS        NOTEBOOK
CHARACTERS   PERFORMED     SHOT         CRIMINAL    EXPLANATION    REVIEW

slide-27
SLIDE 27

Polysemy

[Same six TASA topics as on slide 26: PRINTING, PLAY, TEAM, JUDGE, HYPOTHESIS, STUDY.]

The same word can appear in several topics, one per sense:

  • PLAY: PLAY and TEAM topics
  • COURT: TEAM and JUDGE topics
  • CHARACTERS: PRINTING and PLAY topics
  • EVIDENCE: JUDGE and HYPOTHESIS topics
  • TEST: HYPOTHESIS and STUDY topics
  • TRY: TEAM and STUDY topics

slide-28
SLIDE 28

Overview

  I    Associative memory
  II   The topic model
  III  Applications to associative memory
  IV   Applications in machine learning/text mining

slide-29
SLIDE 29

Example associative structure

[Figure: association network over the words BASEBALL, BAT, BALL, GAME, PLAY, STAGE, THEATER]

(Association norms by Doug Nelson et al. 1998)

slide-30
SLIDE 30

Explaining structure with topics

[Figure: the association network over BASEBALL, BAT, BALL, GAME, PLAY, STAGE, THEATER, with regions labeled topic 1 and topic 2]

slide-31
SLIDE 31

TASA corpus

Need a suitable corpus to model human associations

TASA

  • an educational corpus of text
  • 37K documents
  • 26K words

slide-32
SLIDE 32

Modeling Word Association

Word association modeled as prediction

Idea: The similarity between two words w1 and w2 can be measured by the extent to which they share the same topics.

Given that a single word is observed, which other words are likely to occur next?

Under a single topic assumption:

$$ P(w_{n+1} \mid \mathbf{w}) = \sum_{z} P(w_{n+1} \mid z)\, P(z \mid \mathbf{w}) $$

For word association, the observed words $\mathbf{w}$ are the cue w1 and $w_{n+1}$ is the response w2:

  • P(z | w1): probability that the cue w1 indicates topic z
  • P(w2 | z): probability that the response w2 is generated by z

For each topic z, multiply how strongly w1 indicates z by how likely z is to generate w2, and sum over topics. Do this for all words w2.
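The prediction formula can be tried out on a toy model. The sketch below ranks candidate responses for a single cue; the two topics, their word probabilities, the uniform topic prior, and the function names are all assumptions made for illustration.

```python
# Hypothetical toy model: phi[z][w] = P(w | z), prior[z] = P(z).
phi = {
    "sports": {"play": 0.30, "ball": 0.35, "game": 0.35},
    "theater": {"play": 0.40, "stage": 0.35, "actor": 0.25},
}
prior = {"sports": 0.5, "theater": 0.5}


def topic_given_word(cue, phi, prior):
    """P(z | w1) by Bayes' rule: proportional to P(w1 | z) * P(z).

    Assumes the cue has non-zero probability in at least one topic.
    """
    joint = {z: phi[z].get(cue, 0.0) * prior[z] for z in phi}
    norm = sum(joint.values())
    return {z: p / norm for z, p in joint.items()}


def associates(cue, phi, prior):
    """P(w2 | w1) = sum_z P(w2 | z) * P(z | w1) for every word w2 other than the cue."""
    post = topic_given_word(cue, phi, prior)
    vocab = {w for dist in phi.values() for w in dist}
    scores = {
        w2: sum(phi[z].get(w2, 0.0) * post[z] for z in phi)
        for w2 in vocab
        if w2 != cue
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranking = associates("stage", phi, prior)
print(ranking[:2])
```

With these numbers the cue "stage" occurs only in the theater topic, so its strongest associates are the other theater words, led by "play".
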

slide-33
SLIDE 33

Observed associates for the cue "play"

HUMANS (list generated by human subjects)

Word     P( word )
FUN      .141
BALL     .134
GAME     .074
WORK     .067
GROUND   .060
MATE     .027
CHILD    .020
ENJOY    .020
WIN      .020
ACTOR    .013
FIGHT    .013
HORSE    .013
KID      .013
MUSIC    .013

slide-34
SLIDE 34

Model predictions

HUMANS                 TOPICS (T=500), list generated by topic models

Word     P( word )     Word        P( word )
FUN      .141          BALL        .041
BALL     .134          GAME        .039
GAME     .074          CHILDREN    .019
WORK     .067          ROLE        .014
GROUND   .060          GAMES       .014
MATE     .027          MUSIC       .009
CHILD    .020          BASEBALL    .009
ENJOY    .020          HIT         .008
WIN      .020          FUN         .008   (RANK 9)
ACTOR    .013          TEAM        .008
FIGHT    .013          IMPORTANT   .006
HORSE    .013          BAT         .006
KID      .013          RUN         .006
MUSIC    .013          STAGE       .005
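The "RANK 9" annotation is the position of the first human associate (FUN) in the model's list. A short sketch of that computation, using the first ten entries of each list above; the function name is an assumption, and the extra ranks passed to `median` are made up purely to show how per-cue ranks would be summarized.

```python
from statistics import median

# Lists for the cue "play", copied from the slide (ordered by probability).
human = ["FUN", "BALL", "GAME", "WORK", "GROUND", "MATE", "CHILD", "ENJOY", "WIN", "ACTOR"]
model = ["BALL", "GAME", "CHILDREN", "ROLE", "GAMES", "MUSIC", "BASEBALL", "HIT", "FUN", "TEAM"]


def rank_of_first_associate(human_responses, model_ranking):
    """1-based rank at which the model lists the most frequent human response."""
    return model_ranking.index(human_responses[0]) + 1


rank_play = rank_of_first_associate(human, model)
print(rank_play)  # 9, matching the annotation on the slide

# Across many cues, the per-cue ranks are summarized by their median.
# The ranks 3 and 20 are invented placeholders, not data from the course.
print(median([rank_play, 3, 20]))
```
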

slide-35
SLIDE 35

Median rank of first associate

[Figure: bar chart of median rank (axis from 1 to 40) for Best LSA cosine, Best LSA inner product, and topic models with 300, 500, 700, 900, 1100, 1300, 1500 and 1700 topics]

slide-36
SLIDE 36

Latent Semantic Analysis (Landauer & Dumais, 1997)

[Figure: word-document counts are mapped by singular value decomposition into a high dimensional space containing the words RIVER, STREAM, BANK, MONEY]

  • Each word is a single point in semantic space
  • Similarity measured by cosine of angle between word vectors
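The similarity measure itself is easy to illustrate. The sketch below computes the cosine between word vectors taken directly from a small, made-up word-document count matrix. Note the simplification: LSA would first reduce the matrix with a singular value decomposition and compare the reduced vectors, and that step is omitted here.

```python
import math

# Hypothetical word-document counts: one row per word, one column per document.
counts = {
    "river": [3, 2, 0, 0],
    "stream": [2, 3, 0, 0],
    "bank": [1, 1, 2, 2],
    "money": [0, 0, 3, 3],
}


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


print(round(cosine(counts["river"], counts["stream"]), 3))
print(round(cosine(counts["river"], counts["money"]), 3))
print(round(cosine(counts["bank"], counts["money"]), 3))
```

Because each word is a single point, "bank" ends up at one compromise position between its river and money contexts, whereas the topic model can place it in two topics.
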

slide-37
SLIDE 37

Median rank of first associate

[Figure: the same bar chart of median rank for Best LSA cosine, Best LSA inner product, and topic models with 300 to 1700 topics, now with the label "LSA"]

slide-38
SLIDE 38

Recall: example study list

STUDY: Bed, Rest, Awake, Tired, Dream, Wake, Snooze, Blanket, Doze, Slumber, Snore, Nap, Peace, Yawn, Drowsy

FALSE RECALL: "Sleep" 61%

slide-39
SLIDE 39

Recall as a reconstructive process

Reconstruct the study list based on the stored "gist"

The gist can be represented by a distribution over topics

Under a single topic assumption:

$$ P(w_{n+1} \mid \mathbf{w}) = \sum_{z} P(w_{n+1} \mid z)\, P(z \mid \mathbf{w}) $$

  • $w_{n+1}$: retrieved word
  • $\mathbf{w}$: study list
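A small sketch of how a stored gist can produce a false recall. It assumes that, under the single topic assumption, the posterior over topics is proportional to the topic prior times the product of the study words' probabilities in that topic. The two topics, their probabilities, the smoothing floor `SMOOTH`, and the function names are assumptions made for illustration.

```python
import math

# Hypothetical topics; phi[z][w] = P(w | z). Values are illustrative only.
phi = {
    "sleep": {"bed": 0.20, "rest": 0.20, "dream": 0.20, "tired": 0.15, "sleep": 0.25},
    "school": {"study": 0.40, "test": 0.30, "rest": 0.05, "tired": 0.05, "class": 0.20},
}
prior = {"sleep": 0.5, "school": 0.5}
SMOOTH = 1e-6  # assumed floor so that an unseen word does not zero out a topic


def gist(study_list, phi, prior):
    """P(z | w), proportional to P(z) * prod_i P(w_i | z), computed in log space."""
    log_post = {
        z: math.log(prior[z]) + sum(math.log(phi[z].get(w, SMOOTH)) for w in study_list)
        for z in phi
    }
    top = max(log_post.values())
    unnorm = {z: math.exp(v - top) for z, v in log_post.items()}
    norm = sum(unnorm.values())
    return {z: p / norm for z, p in unnorm.items()}


def retrieval_probability(word, study_list, phi, prior):
    """P(w_{n+1} | w) = sum_z P(w_{n+1} | z) * P(z | w)."""
    post = gist(study_list, phi, prior)
    return sum(phi[z].get(word, 0.0) * post[z] for z in phi)


study = ["bed", "rest", "tired", "dream"]
p_sleep = retrieval_probability("sleep", study, phi, prior)
p_class = retrieval_probability("class", study, phi, prior)
print(round(p_sleep, 3), round(p_class, 3))
```

The word "sleep" never appears in the study list, yet it receives a high retrieval probability because the list's gist is dominated by the sleep topic.
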

slide-40
SLIDE 40

Predictions for the "Sleep" list

slide-41
SLIDE 41

Word Sense Disambiguation

slide-42
SLIDE 42

Latent Semantic Analysis vs. Topics

The topic model and LSA use the same input, a word–document co-occurrence matrix, but they differ in how this input is analyzed and in the way that they represent the gist of documents and the meaning of words.

Quantitative differences

Qualitative differences

  • probabilistic generative models can work with more structured representations
  • Extensions of topic models:
      • hierarchies
      • syntax-semantics

slide-43
SLIDE 43

Overview

  I    Associative memory
  II   The topic model
  III  Applications to associative memory
  IV   Applications in machine learning/text mining

slide-44
SLIDE 44

Applications in Machine Learning

Automatically learn topics from large text collections

  • NSF/NIH grant proposals
  • 18th century newspapers
  • Enron email

Topics provide a quick overview of content

slide-45
SLIDE 45

Enron email data

  • 500,000 emails
  • 5000 authors
  • 1999-2002

slide-46
SLIDE 46

Enron topics

  • TEXANS, WIN, FOOTBALL, FANTASY, SPORTSLINE, PLAY, TEAM, GAME, SPORTS, GAMES
  • GOD, LIFE, MAN, PEOPLE, CHRIST, FAITH, LORD, JESUS, SPIRITUAL, VISIT
  • ENVIRONMENTAL, AIR, MTBE, EMISSIONS, CLEAN, EPA, PENDING, SAFETY, WATER, GASOLINE
  • FERC, MARKET, ISO, COMMISSION, ORDER, FILING, COMMENTS, PRICE, CALIFORNIA, FILED
  • POWER, CALIFORNIA, ELECTRICITY, UTILITIES, PRICES, MARKET, PRICE, UTILITY, CUSTOMERS, ELECTRIC
  • STATE, PLAN, CALIFORNIA, DAVIS, RATE, BANKRUPTCY, SOCAL, POWER, BONDS, MOU

[Figure: TIMELINE from 2000 to 2003 with the labels PERSON1 and PERSON2 and the annotation "May 22, 2000: Start of California energy crisis"]

slide-47
SLIDE 47

NSF & NIH grant abstracts

Analyze 22,000+ active grants during 2002

  • NIH: NIMH, NCI
  • NSF: BIO, SBE

What topics are funded?

Topic map of funding programs

slide-48
SLIDE 48

Example topics

BRAIN IMAGING        CHILD PARENT INTERACTION   HIV INTERVENTION      SCHIZOPHRENIA
brain         .101   children   .153            hiv            .121   schizophrenia   .226
fmri          .054   child      .089            intervention   .064   patients        .067
imaging       .054   parent     .038            risk           .050   deficits        .054
functional    .046   parents    .032            sexual         .043   schizophrenic   .027
mri           .033   family     .032            prevention     .037   psychosis       .024
subjects      .033   families   .022            aids           .024   subjects        .023
magnetic      .031   early      .020            interventions  .018   psychotic       .022
resonance     .029   problems   .019            reduction      .015   dysfunction     .019
neuroimaging  .028   mothers    .017            behavior       .015   abnormalities   .017
structural    .018   risk       .017            men            .013   clinical        .015

VISUAL PROCESSING    MEMORY               AGING              ALZHEIMER DISEASE
visual       .075    memory       .237    older      .083    disease          .102
processing   .048    working      .049    adults     .071    ad               .074
sensory      .035    memories     .022    age        .066    alzheimer        .043
spatial      .034    tasks        .022    elderly    .041    diabetes         .025
information  .022    retrieval    .021    geriatric  .041    cardiovascular   .016
eye          .020    encoding     .020    life       .039    insulin          .015
stimuli      .020    cognitive    .019    aging      .033    vascular         .015
object       .019    processing   .019    late       .032    blood            .013
objects      .019    recognition  .018    cognitive  .028    clinical         .012
perception   .018    performance  .016    aged       .022    individuals      .012

slide-49
SLIDE 49

[Figure: map of funding programs in three groups, NSF – BIO, NSF – SBE and NIH. Program labels carry the prefixes INT, DEB, BIR, IBN, MCB, PGR, BCS, SES, NCI and NIMH. Legible examples include "MCB Cell biology", "MCB Genetics", "IBN Neuroscience", "BCS Linguistics", "BCS Cultural anthropology", "SES Political science", "SES Economics", "NCI Cancer prevention and control" and "NIMH AIDS Research".]

slide-50
SLIDE 50

Conclusion

Semantic association as probabilistic inference

Generative models are useful

  • make modeling assumptions explicit
  • flexible

Cognitive Science / Machine Learning

slide-51
SLIDE 51

Questions? See you!