SLIDE 1

Computational Thinking www.ugrad.cs.ubc.ca/~cs100

November 14, 2017 Administrative notes

  • Reminder: In the news call #3 individual component due November 22.
  • Reminder: In the news call #3 group sign up due November 24.
  • Reminder: project deadlines coming up, starting November 27.
  • Reminder: In the news call #3 group component due November 28.
  • Reminder: Final exam: Tuesday, December 5 at noon in CIRS 1250.

SLIDE 2

Okay, so that’s what AI is. But how did they do that?

  • There are LOTS of different parts involved
  • We’ll look at a few
  • Note that we’ll cover the general idea of how things work, but not the specific details
  • We’ll start by looking at how Watson understands language
  • Processing language with computers is the subject of Natural Language Processing (NLP)

SLIDE 3

How does Watson process language?

  • See: Building Watson: An Overview of the DeepQA Project, by David Ferrucci et al., AI Magazine, 2010
  • One thing we won’t cover: they use classification (remember decision trees?) to help group types of questions.
  • We’ll start by looking at traditional Natural Language Processing (NLP) techniques

http://www.aaai.org/ojs/index.php/aimagazine/article/view/2303

SLIDE 4

Natural Language Processing (NLP)

  • Natural Language Processing (NLP): automatic processing of language, e.g., by computers
  • Using NLP to infer meaning from natural languages is challenging!
  • NLP draws on many disciplines: linguistics, cognitive science, psychology, logic, computer science, philosophy, engineering, …

SLIDE 5

Group exercise

NLP is needed for many different things that computers do these days. List applications that you have used that need NLP and what they used it for.

  • Siri: “set a 5 minute timer”
  • Google Translate
  • Asking Alexa maps questions
  • Call systems: “please say yes or no”
  • Voice to text
  • Grammar check

SLIDE 6

Typical NLP steps

  • 1. Recognize speech (Watson skipped this)
  • 2. Syntax analysis, or parsing: inferring parts of speech and sentence structure, using a lexicon and grammar
  • 3. Semantic analysis: inferring meaning using syntax and semantic rules
  • 4. Pragmatics: inferring meaning from contextual information

SLIDE 7

Parsing: identifying parts of speech and sentence structure using a lexicon and grammar

Input:

Word | Category
cat | Noun
cheese | Noun
ate | Verb
the | Article

Lexicon

Sentence → NounPhrase, VerbPhrase
VerbPhrase → Verb, NounPhrase
NounPhrase → Article, Noun
NounPhrase → Noun

Grammar

Output: a parse tree
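The parsing step above can be sketched in code. Below is a minimal recursive-descent parser over the slide’s lexicon and grammar; the example sentence “the cat ate the cheese” is assumed from the lexicon entries, and this is an illustrative toy, not Watson’s actual parser.

```python
# A minimal recursive-descent parser over the slide's lexicon and grammar.
# Illustrative sketch only -- not Watson's actual parser.

LEXICON = {"cat": "Noun", "cheese": "Noun", "ate": "Verb", "the": "Article"}

GRAMMAR = {
    "Sentence":   [["NounPhrase", "VerbPhrase"]],
    "VerbPhrase": [["Verb", "NounPhrase"]],
    "NounPhrase": [["Article", "Noun"], ["Noun"]],
}

def parse(symbol, words, i=0):
    """Try to expand `symbol` starting at word index i.
    Returns (parse_tree, next_index), or None if no rule matches."""
    if symbol not in GRAMMAR:
        # Terminal: match one word whose lexicon category is `symbol`.
        if i < len(words) and LEXICON.get(words[i]) == symbol:
            return (symbol, words[i]), i + 1
        return None
    for rule in GRAMMAR[symbol]:
        children, j = [], i
        for part in rule:
            result = parse(part, words, j)
            if result is None:
                break
            subtree, j = result
            children.append(subtree)
        else:  # every part of the rule matched
            return (symbol, children), j
    return None

tree, end = parse("Sentence", "the cat ate the cheese".split())
```

The returned nested tuples are exactly the parse tree the slide describes: a Sentence node whose children are a NounPhrase and a VerbPhrase, bottoming out in lexicon words.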

SLIDE 8

How parsing helped Watson

The structure of some clues and certain keywords tells Watson what the form of the answer will be – without considering semantics. Consider the following clue that Watson can answer:

Category: Oooh....Chess
Clue: Invented in the 1500s to speed up the game, this maneuver involves two pieces of the same color.
Answer: Castling

Parsing is key in Watson’s ability to answer this question

SLIDE 9

How parsing helped Watson

Parsing takes the sentence and shows how the words are assigned parts of speech and build up to form a sentence. Data mining showed that, given this structure, the noun between the two verb phrases was the type of thing the answer is. In this case, the answer was a “maneuver.”

SLIDE 10

Group exercise: create a parse tree

Using the lexicon and grammar below, parse the sentence: “the large cat chased the rat”. If you have a choice of rules, pick the one that works best. You don’t have to use all the rules.

Word | Category
cat | Noun
rat | Noun
chased | Verb
large | Adjective
the | Article

Lexicon

Sentence → NounPhrase, VerbPhrase
VerbPhrase → Verb, NounPhrase
NounPhrase → Article, Noun
NounPhrase → Article, Adjective, Noun

Grammar

SLIDE 11

Parsing: identifying parts of speech and sentence structure using lexicon and grammar

Word | Category
cat | Noun
rat | Noun
chased | Verb
large | Adjective
the | Article

Lexicon

Sentence → NounPhrase, VerbPhrase
VerbPhrase → Verb, NounPhrase
NounPhrase → Article, Noun
NounPhrase → Article, Adjective, Noun

Grammar

SLIDE 12

Group exercise: parse “time flies like an arrow”

Write down your tree structure and your algorithm. Note: you don’t have to use all the rules!

Word | Category
an | Article
arrow | Noun
flies | Noun
flies | Verb
time | Noun
time | Verb
like | Adverb
like | Verb

Lexicon

Sentence → NounPhrase, VerbPhrase
NounPhrase → Article, Noun
NounPhrase → Article, Adjective, Noun
NounPhrase → Noun
NounPhrase → Noun, Noun
VerbPhrase → Verb, Adverb, NounPhrase
VerbPhrase → Verb, NounPhrase

Grammar

SLIDE 13

Group exercise: use your algorithm to parse “fruit flies like a banana”

Did the algorithm work?

  • A. Yes
  • B. No
  • C. Kind of… but “flies” wasn’t quite right.
SLIDE 14

The point: Parsing is hard!

  • Those were short, yet tricky examples – natural languages are ambiguous!
  • Imagine trying to write a parsing algorithm that works for a natural language… sentences of 20-30 words may have 10,000 possible syntactic structures!
  • Jeopardy makes the problem much easier, because the structure of Jeopardy clues is relatively simple

SLIDE 15

How good are computers at parsing?

  • A recent Google parser – Parsey McParseface – claims to have a record-setting 94% accuracy for a newspaper dataset… but only 90% for web content
  • This sounds pretty good, but it means that, assuming accuracy is measured per word, you’d expect to have ~5 words parsed incorrectly on this slide.

https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html

SLIDE 16

Final note on parsing: it’s the basis for computer programming

  • A computer has to "understand" programs in order to execute them
  • Programming languages are designed so that they can be parsed unambiguously
  • A grammar specifies all the possible programs that can be written in a language
  • Designing programming languages (and their grammars) is a fun and important part of computer science

SLIDE 17

Recall: Typical NLP steps

  • 1. Recognize speech (Watson skipped this)
  • 2. Syntax analysis, or parsing: inferring parts of speech and sentence structure, using a lexicon and grammar
  • 3. Semantic analysis: inferring meaning using syntax and semantic rules
  • 4. Pragmatics: inferring meaning from contextual information

SLIDE 18

Semantic analysis: inferring meaning using syntax and semantic rules

Syntax analysis/parsing can sometimes help determine semantics, or meaning. Examples:

  • Knowing whether “flies” is a noun or a verb (the syntax) tells us something about its meaning (the semantics)
  • Semantic rules provide additional information:
  • Word categories: e.g., a cat is a feline
  • Relationships between words, e.g., a semantic rule for the word “like” can help us interpret “the boy likes the cat”
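Word-category rules like “a cat is a feline” can be sketched as a tiny chain of “is-a” links. The chain below is invented for illustration; real semantic rule bases (e.g., WordNet-style resources) are far larger.

```python
# A toy semantic rule base: word-category ("is-a") links like the slide's
# "a cat is a feline". The chain below is invented for illustration.
IS_A = {"cat": "feline", "feline": "mammal", "mammal": "animal"}

def categories(word):
    """Follow is-a links upward to collect everything the word 'is'."""
    found = []
    while word in IS_A:
        word = IS_A[word]
        found.append(word)
    return found

categories("cat")  # → ["feline", "mammal", "animal"]
```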

SLIDE 19

Semantic analysis: inferring meaning using syntax and semantic rules

Syntax describes a sentence’s structure. Semantics adds (limited) meaning that can be figured out using simple rules that don’t require much context. Examples:

  • Word categories: e.g., a cat is a feline
  • “gave” is the past tense of “give”
SLIDE 20

Recall: Typical NLP steps

  • 1. Recognize speech (Watson skipped this)
  • 2. Syntax analysis, or parsing: inferring parts of speech and sentence structure, using a lexicon and grammar
  • 3. Semantic analysis: inferring meaning using syntax and semantic rules
  • 4. Pragmatics: inferring meaning from contextual information

SLIDE 21

Pragmatics: inferring meaning from contextual information

  • Most techniques to find the semantic meaning of words look for clues in the surrounding text to disambiguate word meaning. For example, the real estate meaning of “lot” might have the words “vacant” or “square foot” nearby.
  • Pragmatics also becomes important when sentences contain pronouns
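The “look for clues nearby” idea can be sketched as scoring each sense of a word by how many of its clue words appear in the sentence. This is a hypothetical illustration; the clue-word lists are invented, and real systems use far richer context models.

```python
# Sketch of disambiguating "lot" from nearby words, as described above.
# The sense names and clue-word lists are invented for illustration.
SENSE_CLUES = {
    "real estate lot": {"vacant", "square", "foot", "feet", "acre", "zoning"},
    "a lot (many)":    {"whole", "quite", "thanks", "much"},
}

def disambiguate(sentence):
    words = set(sentence.lower().split())
    # Pick the sense whose clue words overlap the sentence the most.
    return max(SENSE_CLUES, key=lambda sense: len(SENSE_CLUES[sense] & words))

disambiguate("the vacant lot is 9000 square feet")  # → "real estate lot"
```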

SLIDE 22

Pragmatics and Watson: an example Watson can solve

Category: Decorating
Clue: Though it sounds “harsh,” it’s just embroidery, often in a floral pattern, done with yarn on cotton cloth.
Answer: crewel

  • Syntax parses the sentence and determines the parts of speech and the parse tree. It shows that the answer is what “it’s” refers to
  • Semantics provides definitions of terms such as “harsh” and “crewel”
  • Pragmatics determines what “it’s” refers to and differentiates between the different definitions of “harsh” and “crewel/cruel”

SLIDE 23

Group Exercise

Write down a list of all the definitions of “bat” that you can think of, and words that might be nearby that would help you disambiguate the meaning.

  • Baseball bat: baseball, hit; “up to bat”
  • Bat (the animal): flies at night, fruit, vampire
  • Bat as in “you bat your eyelashes”

SLIDE 24

Clicker exercise

Would your definitions disambiguate between a baseball bat and a baseball player being AT bat?

  • A. Yes
  • B. No
  • C. My head hurts
SLIDE 25

Clicker question

Given what we’ve covered so far, would you say the computer really understands what the sentence “the cat chases the rat” means?

  • A. Yes
  • B. No
SLIDE 26

Summary: Watson, Jeopardy and NLP

  • The Jeopardy clues are, again, highly structured, making NLP techniques more effective.
  • Jeopardy also tends to use similar questions and topics over again, so studying those narrows things down a lot.
  • The categories also help, but as shown in the video, sometimes not enough!

SLIDE 27

Sooo….

Given what we’ve discussed about natural language processing, do you think that Watson can understand general language?

  • A. Yes
  • B. No
SLIDE 28

Let’s leave Watson behind for a minute

  • Next, let’s look at another application of NLP: language translation
  • To translate languages, the computer needs to be able to “learn” both languages and how to go between them.
  • We’ve covered how to learn a language. How do you go between them?

SLIDE 29

Group activity

Go to Google Translate. Pick some sentences or phrases that are tricky in a language you know. Translate them to another language. See if you can find ones that Google gets wrong.

If you only know English, try doing a round trip. Step 1: translate from English to another language. Step 2: translate the result back to English.

SLIDE 30

How many things did Google Translate get right?

  • A. It got all of them right
  • B. It got some wrong, but less than half
  • C. It got at least some right, but more than half wrong
  • D. It got all of them wrong
SLIDE 31

Things that Google Translate got incorrect

SLIDE 32

Is this better or worse than you expected? Why?

SLIDE 33

Learning to translate between languages

  • Traditional method (~20 years ago):
    1. Apply NLP to each sentence in each language
    2. Follow a set of rules that define how to translate
  • Newer methods use machine learning techniques plus the vast amounts of data on the web (and indeed Watson uses such techniques, too)

http://www.theverge.com/2016/9/27/13078138/google-translate-ai-machine-learning-gnmt

SLIDE 34

This is why Google can translate from 102 languages

  • This is only possible because of the huge amount of data on the web.
  • However, they have to make sure that the translations are good! For a while, Google Translate had to stop learning from the web because there were so many bad Google translations on the web.
  • It probably can’t find enough documents to translate directly between languages, so it may translate through some others first (think of it as hops on an airplane).
  • Note that this can lead to hilarious mistakes – at one point Google’s Ukrainian-to-Russian translation referred to “Russia” as “Mordor” (from Lord of the Rings)

http://www.bbc.com/news/technology-35251478

SLIDE 35

Limitations of traditional NLP

  • Natural language is structurally ambiguous, so parsing alone cannot lead to understanding.
  • Synonyms for words can’t be used interchangeably in every context, e.g., “minister of agriculture” isn’t “priest of farming.”
  • Natural languages have many exceptions to grammatical rules; there’s no agreed-upon grammar for all uses of a language.

SLIDE 36

Back to Watson

  • Watson used techniques from traditional NLP, but also incorporated newer machine learning techniques.
  • Much of the following is from: Building Watson: An Overview of the DeepQA Project, by David Ferrucci et al., AI Magazine, 2010

SLIDE 37

Back to Watson

“Early on in the project, attempts [...] failed to produce promising results […] We ended up overhauling nearly everything we did, including our basic technical approach, the underlying architecture, metrics, evaluation protocols, engineering practices, and even how we worked together as a team.”

SLIDE 38

Back to Watson

“The system we have built and are continuing to develop, called DeepQA, is a massively parallel probabilistic evidence-based architecture. For the Jeopardy Challenge, we use more than 100 different techniques for analyzing natural language, identifying sources, finding and generating hypotheses, finding and scoring evidence, and merging and ranking hypotheses.”

SLIDE 39

Back to Watson

“What is far more important than any particular technique we use is how we combine them in DeepQA such that overlapping approaches can bring their strengths to bear and contribute to improvements in accuracy, confidence, or speed.”

SLIDE 40

DeepQA architecture

SLIDE 41

Back to Watson: knowledge bases

  • The answer sources and evidence sources are stored in Watson’s system; the internet is not used directly
  • These local data stores are called knowledge bases; many applications use them.

SLIDE 42

How are knowledge bases organized?

  • Some knowledge bases are structured databases – the data is put in a very specific format
  • Other knowledge bases are unstructured or semi-structured – the data is not as rigidly organized
  • E.g., Google stores entire webpages (unstructured). Wikipedia has some structure, but it’s not totally rigid (semi-structured)

SLIDE 43

Indexes help to locate information in a data store

  • Using an index, Watson can look in its knowledge base for data
  • Unlike a book index, Watson’s index will also return how many words into a page/document an item occurs
  • Then to look for quotations of multiple words, it just has to see if the words are next to each other
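This kind of word-plus-position index can be sketched in a few lines. This is a generic positional-index illustration (the example documents are invented), not Watson’s actual data structure.

```python
# A sketch of a positional index: each word maps to (document, position)
# pairs, so multi-word quotations can be found by checking adjacency.
from collections import defaultdict

def build_index(docs):
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(text.lower().split()):
            index[word].append((doc_id, pos))
    return index

def phrase_search(index, phrase):
    """Return ids of documents where the phrase's words appear consecutively."""
    words = phrase.lower().split()
    # Start with positions of the first word, then intersect with the
    # shifted positions of each following word.
    hits = set(index.get(words[0], []))
    for offset, word in enumerate(words[1:], start=1):
        hits &= {(d, p - offset) for d, p in index.get(word, [])}
    return sorted({d for d, _ in hits})

docs = ["to be or not to be", "ask not what your country can do"]
index = build_index(docs)
phrase_search(index, "or not")  # → [0]
```

Shifting each later word’s positions back by its offset turns “are these words adjacent?” into a simple set intersection.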
SLIDE 44

Even after finding evidence, there are many more steps before Watson has an answer

  • Let’s take one last look at Watson behind the scenes…
  • http://www.youtube.com/watch?v=lI-M7O_bRNg&t=3m20s

SLIDE 45

Watson behind the scenes

SLIDE 46

Group discussion: two somewhat contradictory takes on Watson. Who do you agree with most?

  • "The illusion is that the computer is doing the same thing that a very good Jeopardy player would do. It's not. It's doing something sort of different that looks the same on the surface. And every so often you see the cracks." Ken Jennings, Jeopardy player
  • "When I do step back I think it is a very important technical achievement that will reveal both really important applications but it will also reveal a deeper understanding of our intelligence, and that is fascinating." Dave Ferrucci

SLIDE 47

Two somewhat contradictory takes on Watson. Clicker question: Who do you agree with most?

  • "The illusion is that the computer is doing the same

thing that a very good jeopardy player would do. It's not." Ken Jennings, Jeopardy player

  • "When I do step back I think it is a very important

technical achievement that will reveal […] a deeper understanding of our intelligence" Dave Ferrucci

A. Jennings

  • C. Both

B. Ferrucci

  • D. Neither
SLIDE 48

Meanwhile, at Google Brain …

… a different approach to language processing was being explored

  • Even young children can easily process/understand natural language
  • Babies don’t learn language by explicitly learning the parts of speech, grammar, and parsing...
  • Why not try to simulate the brain?

See: The Great AI Awakening, by Gideon Lewis-Kraus
https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html

SLIDE 49

The brain: a mass of highly interconnected neurons

  • dendrites receive inputs
  • axons carry output
  • output is a simple function of the input

SLIDE 50

The brain: a mass of highly interconnected neurons

“What’s important are less the individual neurons themselves than the manifold connections among them. This structure, in its simplicity, has afforded the brain a wealth of adaptive advantages.”

SLIDE 51

Artificial neurons

An artificial neuron with four inputs and one output. If the inputs from top to bottom are 1, 1, 1 and 0, then the weighted sum of inputs is 3+0−2+0=1. This is less than the threshold of 2, and so the output is 0.
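The slide’s neuron can be written directly as code. The weights 3, 0, −2 are read off the arithmetic above; the fourth weight is taken as 0 here (an assumption — its input is 0, so its value doesn’t affect this example), and the threshold is 2.

```python
# The slide's artificial neuron as code: output 1 only if the weighted sum
# of the inputs reaches the threshold. Weights (3, 0, -2, 0) are inferred
# from the slide's arithmetic; the fourth weight is an assumption.
def neuron(inputs, weights=(3, 0, -2, 0), threshold=2):
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

neuron([1, 1, 1, 0])  # weighted sum = 3 + 0 - 2 + 0 = 1 < 2, so output is 0
```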

SLIDE 52

Artificial neurons: clicker question

Is the sum of the weights of the input signals greater than or equal to the threshold in this example?

  • A. Yes
  • B. No
SLIDE 53

Artificial neural networks (ANNs)

Many interconnected artificial neurons: the outputs of some feed back into others

SLIDE 54

Artificial neural networks (ANNs)

McCulloch and Pitts (1943) showed that simple artificial neural networks could carry out basic logical functions.

SLIDE 55

Artificial neural networks (ANNs) can recognize patterns

SLIDE 56

Artificial neural networks (ANNs) can learn

"With life experience, depending on a particular person’s trials and errors, the synaptic connections among pairs of neurons get stronger or weaker. An artificial neural network could do something similar, by gradually altering, on a guided trial-and-error basis, the numerical relationships among artificial neurons. It wouldn’t need to be preprogrammed with fixed rules. It would, instead, rewire itself to reflect patterns in the data it absorbed."

SLIDE 57

An early success at Google Brain: The cat paper

“... a neural network with more than a billion “synaptic” connections could observe raw, unlabeled data and pick out for itself a high-order human concept. The Brain researchers had shown the network millions of still frames from YouTube videos, and [...] network had isolated a stable pattern any toddler or chipmunk would recognize without a moment’s hesitation as the face of a cat.”

SLIDE 58

Let’s hear directly from Google researchers about artificial neural networks

https://www.youtube.com/watch?v=bHvf7Tagt18

SLIDE 59

From cats to language: word embeddings

  • Word embeddings are multi-dimensional maps of language
  • Words correspond to points in a multi-dimensional space (say, 1000 dimensions)
  • Words that are "similar" should be close in this space

SLIDE 60

From cats to language: word embeddings

“If you took the thousand numbers that meant “king” and literally just subtracted the thousand numbers that meant “queen,” you got the same numerical result as if you subtracted the numbers for “woman” from the numbers for “man.”
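The king/queen arithmetic can be shown in miniature with tiny hand-made 2-D “embeddings.” The vectors and axes below are invented for illustration; real embeddings have hundreds of dimensions learned from text, not hand-picked axes.

```python
# Toy 2-D "embeddings" with invented axes [royalty, maleness], just to show
# the king - queen = man - woman arithmetic (real vectors are learned).
vec = {
    "king":  [1, 1],
    "queen": [1, 0],
    "man":   [0, 1],
    "woman": [0, 0],
}

def sub(a, b):
    """Component-wise vector subtraction."""
    return [x - y for x, y in zip(a, b)]

sub(vec["king"], vec["queen"])  # → [0, 1]
sub(vec["man"], vec["woman"])   # → [0, 1], the same direction
```

Both differences point along the “maleness” axis, which is exactly the regularity the quote describes.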

SLIDE 61

From cats to language: word embeddings

And if you took the entire space of the English language and the entire space of French, you could, at least in theory, train a network to learn how to take a sentence in one space and propose an equivalent in the other. You just had to give it millions and millions of English sentences as inputs on one side and their desired French outputs on the other, and over time it would recognize the relevant patterns in words the way that an image classifier recognized the relevant patterns in pixels.

SLIDE 62

From cats to language: word embeddings

  • One more big idea was needed to address the following problem:
  • “The major difference between words and pixels, however, is that all of the pixels in an image are there at once, whereas words appear in a progression over time. You needed a way for the network to “hold in mind” the progression of a chronological sequence — the complete pathway from the first word to the last.”

SLIDE 63

Summary: ideas used to improve language translation at Google Brain

  • Train artificial neural networks to recognize patterns across languages
  • Use word embeddings, where a word is a point in a high-dimensional space, as the input to the networks
  • Handle the “chronological” aspects of language
SLIDE 64

Canadian researchers were at the forefront of developing the underlying ideas

  • Geoff Hinton (U. Toronto, Google)
  • Yoshua Bengio (U. Montreal)
  • Yann LeCun (former student of Hinton’s, now at Facebook)

SLIDE 65

Through Watson, we’ve covered a lot of Artificial Intelligence and related topics

  • Machine Learning
  • Natural Language Processing
  • Information Retrieval