SLIDE 1

Teaching the Basics of NLP and ML in an Introductory Course to Information Science

Apoorv Agarwal
Columbia University

Sunday, September 8, 13

SLIDE 6

COMS1001

  • Introductory course on information science to undergraduates at Columbia University
  • Mostly taken by freshmen and sophomores
  • Assumes no prior programming or math background
  • 10%: what’s a programming language?

SLIDE 10

Student demographics

Math and Engineering majors

Challenge 1: Cannot use math terminology (vector space, dot product, high-dimensional space, etc.)

SLIDE 15

Traditionally taught topics

  • About thirty 75-minute lectures
  • First half: Operating systems, WWW and the Internet, Binary and Machine Language, Spreadsheets, Database systems
  • Second half: Algorithms, Programming in Python

Challenge 2: Introduce NLP/ML in one lecture

SLIDE 20

Overall Strategy

  • Keep definitions simple
  • Use analogies and concrete examples (also observed by Reva Freedman 2005)
  • Take baby steps -- incremental learning
  • Introduce the core concepts in one lecture and build on them using homework and exam problems

SLIDE 26

Strategy

  • Sentiment analysis of tweets
  • Sentiment analysis of movie reviews
  • Email classification into Imp/Not-Imp
  • Gear towards text processing
  • Implement end-to-end SA pipeline

SLIDE 27

Overview

  • Lecture organization
  • Questions asked in class
  • Performance on the mid-term examination
  • Final projects
  • Conclusion

SLIDE 33

Lecture Organization

  • General discussion on how to define intelligence
  • Introduce a concrete application: sentiment analysis of Twitter data
  • Demonstrate annotation process
  • Demonstrate feature extraction
  • Demonstrate a basic classification process
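
The annotation, feature extraction, and classification steps could be demonstrated with something like the sketch below. It is not the code used in the lecture: the example tweets, the function names, and the simple word-vote rule are all assumptions made for illustration.

```python
from collections import Counter

# Annotation step: hypothetical tweets labeled by hand.
ANNOTATED_TWEETS = [
    ("i love this phone", "positive"),
    ("what a great movie", "positive"),
    ("i hate waiting in line", "negative"),
    ("this traffic is awful", "negative"),
]


def extract_features(text):
    """Feature extraction: lowercase the text and split it into single words."""
    return text.lower().split()


def train(examples):
    """Count how often each word appears under each label."""
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in examples:
        counts[label].update(extract_features(text))
    return counts


def classify(counts, text):
    """Every word votes with its training counts; the label with more votes wins."""
    score = 0
    for word in extract_features(text):
        score += counts["positive"][word] - counts["negative"][word]
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"  # the training data gives no evidence either way


model = train(ANNOTATED_TWEETS)
print(classify(model, "great phone"))
print(classify(model, "awful traffic"))
```

A tweet made only of words that never appeared in the annotated data comes back as "unknown", which is a convenient way to start a discussion about training data.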

SLIDE 37

Points we drive home

  1. The machine automatically learns the connotation of words by looking at how often certain words appear in positive and negative tweets.
  2. The machine also learns more complex patterns that have to do with the conjunction and disjunction of features.
  3. The quality and amount of training data are important: if the training data fails to encode a substantial number of the patterns important for classification, the machine is not going to learn well.
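
One possible way to make the first and third points concrete is to score each word by comparing its counts in positive and negative tweets. The sketch below uses a smoothed log ratio, which is an illustrative choice rather than the method from the lecture; the toy data and the function names are assumptions.

```python
import math
from collections import Counter

# Hypothetical toy training data: (tweet, label) pairs.
TRAINING = [
    ("good good day", "pos"),
    ("good food", "pos"),
    ("bad day", "neg"),
    ("bad bad service", "neg"),
]


def word_counts(examples):
    """Return two counters: word counts in positive and in negative tweets."""
    pos, neg = Counter(), Counter()
    for text, label in examples:
        (pos if label == "pos" else neg).update(text.split())
    return pos, neg


def connotation(word, pos, neg):
    """Above 0 leans positive, below 0 leans negative, exactly 0 means no evidence."""
    # Adding 1 to both counts keeps unseen words from causing a division by zero.
    return math.log((pos[word] + 1) / (neg[word] + 1))


pos, neg = word_counts(TRAINING)
for word in ["good", "bad", "day", "pizza"]:
    print(word, round(connotation(word, pos, neg), 2))
```

Here "day" appears equally often on both sides and "pizza" never appears at all, so both score 0: the machine has learned nothing about them. This sketch also assumes the two classes contain a similar number of words; with unbalanced data the raw counts would need to be normalized.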

SLIDE 42

Questions asked in class by students

  1. Could we create and use a dictionary that lists the prior polarity of commonly used words?
  2. If the prediction score for the tweet is high, does that mean the machine is more confident about the prediction?
  3. In the unigram approach, the sequence of words does not matter. But clearly, if “not” does not negate the words containing opinion, then won’t the machine learn a wrong pattern?
  4. If we have too many negative tweets in our training data (as compared to the positive tweets), then would the machine not be predisposed to predict the polarity of an unseen tweet as negative?
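
The third question can be illustrated with a short sketch that compares unigram features with word-pair (bigram) features. Bigrams are introduced here only as one possible answer and are not taken from the slides; the sentences and function names are assumptions.

```python
from collections import Counter


def unigram_features(text):
    """Bag of single words: word order is thrown away."""
    return Counter(text.lower().split())


def bigram_features(text):
    """Bag of adjacent word pairs: keeps a little local word order."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))


a = "the movie was good not bad"
b = "the movie was bad not good"

# Same words, so the unigram view cannot tell the two sentences apart.
print(unigram_features(a) == unigram_features(b))

# The pairs differ: ("not", "bad") occurs only in a, ("not", "good") only in b.
print(bigram_features(a) == bigram_features(b))
```

With unigrams alone the two opposite opinions look identical to the machine, which is exactly the concern the student raised.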

SLIDE 48

Mid-term: Email classification

  • 53 students
  • Required to do only 2 out of the following 4 problems

Problem (25 points)     Average   Std-dev   Median   # students attempted
NLP/ML                  20.54     4.46      22       51
Logic Gates             16.94     6.48      20       36
Database design         13.63     6.48      14       42
Machine Instructions    12.8      6.81      14.5     30

SLIDE 55

Student projects

  • Formulate your own task
  • Collect and annotate data
  • Define the feature space
  • Train and test

Incentive was low. But still, 15/53 students decided to pursue the project and 11 actually managed to finish it.
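
The "train and test" step could look roughly like the sketch below: hold some annotated examples back, build the model on the rest, and measure accuracy on the held-out part. The helper names, the 25% test fraction, and the majority-label baseline are all assumptions for illustration, not the setup used in the course.

```python
import random
from collections import Counter


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and hold out a fraction of it for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, test_examples):
    """Fraction of held-out examples whose predicted label matches the annotation."""
    correct = sum(1 for item, label in test_examples if predict(item) == label)
    return correct / len(test_examples)


# Hypothetical annotated data: (item, label) pairs.
examples = [(f"item {i}", "yes" if i % 2 == 0 else "no") for i in range(8)]
train_set, test_set = train_test_split(examples)

# Baseline "model": always predict the most common label seen in training.
majority_label = Counter(label for _, label in train_set).most_common(1)[0][0]
print(accuracy(lambda item: majority_label, test_set))
```

Any project classifier can be dropped in place of the baseline; a useful sanity check is that it beats this majority-label score on the held-out examples.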

SLIDE 62

The Bechdel Test (bechdeltest.com)

  • A test for movies (Alison Bechdel, 1985)
  • A movie passes this test if all 3 conditions are met:
  • There are at least 2 named female characters
  • They talk to each other
  • They talk about something other than a man
  • Pass only 1 test: The Great Gatsby, Star Trek Into Darkness, Now You See Me, The Internship

SLIDE 66

Conclusion

  • We presented a strategy for teaching basic NLP/ML concepts in an introductory course, in one lecture (supported by HW and exam problems)
  • The basics can be taught without using math terminology
  • Important outcome -- students find Watson playing Jeopardy! and Google’s self-driving car less “magical”

SLIDE 67

Thanks! Questions?