
SLIDE 1

Help System

[Naive Bayes network: help page variable H with observed word variables "able", "absent", "add", . . . , "zoom"]

The domain of H is the set of all help pages. The observations are the words in the query. What probabilities are needed? What pseudo-counts and counts are used? What data can be used to learn from?

© D. Poole and A. Mackworth 2010

Artificial Intelligence, Lecture 7.3b, Page 1

SLIDE 2

Help System

Suppose the help pages are {h1, . . . , hk} and the words are {w1, . . . , wm}. The Bayes net requires:

◮ P(hi), which sum to one: ∑i P(hi) = 1

◮ P(wj | hi), which need not sum to one over j

Maintain "counts" (pseudo-counts + observed cases):

◮ ci: the number of times hi was the correct help page

◮ s = ∑i ci

◮ uij: the number of times hi was the correct help page and word wj was used in the query

P(hi) = ci / s

P(wj | hi) = uij / ci
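The two estimates can be sketched in a few lines of Python. The counts below are invented for illustration (ci is assumed to already include pseudo-counts plus observed cases); only the formulas P(hi) = ci/s and P(wj | hi) = uij/ci come from the slide.

```python
# Hypothetical counts for 3 help pages and 4 words.
c = [10, 5, 2]          # ci: times hi was the correct help page
s = sum(c)              # s = sum_i ci
u = [                   # uij: times hi was correct AND wj appeared in the query
    [6, 2, 1, 8],
    [1, 4, 1, 2],
    [1, 1, 1, 1],
]

def p_h(i):
    """P(hi) = ci / s; these sum to 1 over i."""
    return c[i] / s

def p_w_given_h(j, i):
    """P(wj | hi) = uij / ci; these need not sum to 1 over j."""
    return u[i][j] / c[i]
```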

c

  • D. Poole and A. Mackworth 2010

Artificial Intelligence, Lecture 7.3b, Page 2

SLIDE 3

Learning + Inference

Q is the set of words in the query.

Learning: if hi is the correct page, increment s, ci, and uij for each wj ∈ Q.

Inference:

P(hi | Q) ∝ P(hi) · ∏_{wj∈Q} P(wj | hi) · ∏_{wj∉Q} (1 − P(wj | hi))    [expensive inference]

= (ci / s) · ∏_{wj∈Q} (uij / ci) · ∏_{wj∉Q} ((ci − uij) / ci)
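A direct implementation of this learning rule and the full-product inference might look like the following sketch. The page names, word list, and the initial pseudo-counts (c = 2, u = 1, so no probability is exactly 0 or 1) are assumptions for illustration, not from the slide.

```python
pages = ["h1", "h2"]
words = ["able", "absent", "add", "zoom"]

# Start from assumed pseudo-counts.
c = {h: 2 for h in pages}                      # ci
u = {h: {w: 1 for w in words} for h in pages}  # uij
s = sum(c.values())                            # s = sum_i ci

def learn(h, Q):
    """If h is the correct page for query Q: increment s, ch, and uhj for wj in Q."""
    global s
    s += 1
    c[h] += 1
    for w in Q:
        u[h][w] += 1

def posterior(Q):
    """P(h | Q) ∝ P(h) · prod_{w in Q} P(w|h) · prod_{w not in Q} (1 - P(w|h))."""
    scores = {}
    for h in pages:
        p = c[h] / s
        for w in words:                        # every word contributes: expensive inference
            pw = u[h][w] / c[h]
            p *= pw if w in Q else (1 - pw)
        scores[h] = p
    z = sum(scores.values())
    return {h: p / z for h, p in scores.items()}

learn("h1", {"able", "add"})
post = posterior({"able", "add"})
```

Note that inference touches every word, observed or not, which is what makes this form expensive.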


SLIDE 4

Learning + Inference

Q is the set of words in the query.

Learning: if hi is the correct page, increment s, ci, and uij for each wj ∈ Q.

Inference:

P(hi | Q) ∝ P(hi) · ∏_{wj∈Q} P(wj | hi) · ∏_{wj∉Q} (1 − P(wj | hi))    [expensive inference]

= (ci / s) · ∏_{wj∈Q} (uij / ci) · ∏_{wj∉Q} ((ci − uij) / ci)

= (ci / s) · ∏_{wj} ((ci − uij) / ci) · ∏_{wj∈Q} (uij / (ci − uij))    [expensive learning]

= Ψi · ∏_{wj∈Q} (uij / (ci − uij))

where Ψi = (ci / s) · ∏_{wj} ((ci − uij) / ci) depends only on the counts, not on the query.
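The point of the rearrangement is that Ψi can be precomputed once per page, so answering a query only multiplies in uij / (ci − uij) for the words actually in Q; the price is that every Ψi must be refreshed whenever the counts change, which is why the label moves from expensive inference to expensive learning. A sketch, with made-up counts:

```python
import math

c = {"h1": 3, "h2": 2}   # ci (pseudo-counts assumed included)
s = sum(c.values())      # s = sum_i ci
u = {"h1": {"able": 2, "absent": 1, "add": 2, "zoom": 1},
     "h2": {"able": 1, "absent": 1, "add": 1, "zoom": 1}}

# Ψi = (ci/s) · prod over ALL words of (ci - uij)/ci.
# Recomputed whenever counts change: expensive learning, cheap queries.
psi = {h: (c[h] / s) * math.prod((c[h] - u[h][w]) / c[h] for w in u[h])
       for h in c}

def posterior(Q):
    """P(h | Q) ∝ Ψh · prod_{w in Q} uhw / (ch - uhw): only words in Q are touched."""
    scores = {h: psi[h] * math.prod(u[h][w] / (c[h] - u[h][w]) for w in Q)
              for h in c}
    z = sum(scores.values())
    return {h: p / z for h, p in scores.items()}
```

With these counts, posterior({"able", "add"}) gives the same answer as the full product over all words, as the algebra above guarantees.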


SLIDE 5

Issues

◮ What if the most likely page isn't the correct page?

◮ What if the user can't find the correct page?

◮ What if the user mistakenly thinks they have the correct page?

◮ Can some pages never be found?

◮ What about common words?

◮ What about words that affect the meaning, e.g., "not"?

◮ What about new words?

◮ What do we do with new help pages?

◮ How can we transfer the language model to a new help system?


SLIDE 6

Simple Language Models

A sentence is a sequence of words: w1, w2, w3, . . . Modeled as:

◮ Set of words

◮ Unigram (bag of words): model P(Wi). The range of Wi is the set of all words.

◮ Bigram: model P(Wi | Wi−1)

◮ Trigram: model P(Wi | Wi−1, Wi−2)
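Counting over a toy corpus makes the unigram/bigram distinction concrete. The corpus and the simple maximum-likelihood estimates below are illustrative assumptions, not part of the slide:

```python
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()

unigram = Counter(corpus)                  # counts for P(Wi)
bigram = Counter(zip(corpus, corpus[1:]))  # counts for P(Wi | Wi-1)

def p_unigram(w):
    """P(Wi = w): bag-of-words estimate, ignores position entirely."""
    return unigram[w] / len(corpus)

def p_bigram(w, prev):
    """P(Wi = w | Wi-1 = prev): conditioned on the preceding word."""
    return bigram[(prev, w)] / unigram[prev]
```

A trigram model would condition on the two preceding words, with a Counter keyed on triples.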


SLIDE 7

Logic, Probability, Statistics, Ontology over time

From: Google Books Ngram Viewer (http://books.google.com/ngrams/)


SLIDE 8

Topic Model

[Topic model diagram: a layer of topics (e.g., "tools", "food") linked to a layer of words (e.g., "nut", "tuna", "bolt")]


SLIDE 9

Google’s rephil

[Rephil network: 900,000 topics linked to 12,000,000 words (e.g., "aardvark", "Aaron's beard", . . . , "zzz") by 350,000,000 links]
