Lecture 1: Introduction Kai-Wei Chang CS @ University of Virginia - - PowerPoint PPT Presentation



SLIDE 1

Lecture 1: Introduction

Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16

CS6501– Natural Language Processing
SLIDE 2

Announcements

  Waiting list: Start attending the first few meetings of the class as if you are registered. Given that some students will drop the class, some space will free up.
  We will use Piazza as an online discussion platform. Please enroll.

SLIDE 3

Staff

  Instructor: Kai-Wei Chang
     Email: nlp16@kwchang.net
     Office: R412 Rice Hall
     Office hour: 2:00 – 3:00, Tue (after class)
     Additional office hour: 3:00 – 4:00, Thu
  TA: Wasi Ahmad
     Email: wua4nw@virginia.edu
     Office: R432 Rice Hall
     Office hour: 4:00 – 5:00, Mon

SLIDE 4

This lecture

  Course Overview
     What is NLP? Why is it important?
     What will you learn from this course?
  Course Information
  What are the challenges?
  Key NLP components

SLIDE 5

What is NLP

 Wiki: Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.

SLIDE 6

Go beyond keyword matching

  Identify the structure and meaning of words, sentences, texts and conversations
  Deep understanding of broad language
  NLP is all around us
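
To make the first point concrete, here is a minimal sketch (not from the slides) of why keyword matching alone falls short: two sentences built from the same words can mean opposite things, yet they look identical to a matcher that ignores word order. The example sentences and the helper name `bag_of_words` are illustrative assumptions.

```python
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Lowercase, split on whitespace, and count tokens (word order is discarded)."""
    return Counter(sentence.lower().split())


a = "the dog bit the man"
b = "the man bit the dog"

# Same keywords, so a keyword matcher cannot tell the two sentences apart ...
same_keywords = bag_of_words(a) == bag_of_words(b)
# ... even though the word order (and therefore the meaning) differs.
same_order = a.split() == b.split()

print(same_keywords, same_order)  # True False
```

Telling these two sentences apart requires knowing who did what to whom, which is the kind of structure the bullets above refer to.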

SLIDE 7

Machine translation

Facebook translation, image credit: Meedan.org
SLIDE 8

Statistical machine translation

Image credit: Julia Hockenmaier, Intro to NLP
SLIDE 9

Dialog Systems

SLIDE 10

Sentiment/Opinion Analysis

SLIDE 11

Text Classification

 Other applications?

www.wired.com
SLIDE 12

Question answering

credit: ifunny.com

'Watson' computer wins at 'Jeopardy'
SLIDE 13

Question answering

 Go beyond search

SLIDE 14

Natural language instruction

https://youtu.be/KkOCeAtKHIc?t=1m28s
SLIDE 15

Digital personal assistant

  Semantic parsing – understand tasks
  Entity linking – “my wife” = “Kellie” in the phone book

credit: techspot.com

More on natural language instruction
SLIDE 16

Information Extraction

 Unstructured text to database entries

Yoav Artzi: Natural language processing
SLIDE 17

Language Comprehension

  Q: Who wrote Winnie the Pooh?
  Q: Where did Chris live?

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
SLIDE 18

What will you learn from this course

 The NLP Pipeline

 Key components for understanding text

 NLP systems/applications

  Current techniques & limitations

 Build realistic NLP tools

SLIDE 19

What’s not covered by this course

  Speech recognition – no signal processing
  Natural language generation
  Details of ML algorithms / theory
  Text mining / information retrieval

SLIDE 20

This lecture

  Course Overview
     What is NLP? Why is it important?
     What will you learn from this course?
  Course Information
  What are the challenges?
  Key NLP components

SLIDE 21

Overview

  New course, first time being offered
     Comments are welcome
  Aimed at first- or second-year PhD students
  Lecture + Seminar
  No course prerequisites, but I assume
     programming experience (for the final project)
     basics of probability, calculus, and linear algebra (HW0)

SLIDE 22

Grading

  No exam & HW -- hooray
  Lectures & forum
     Participate in discussion (additional credits)
  Review quizzes (25%): 3 quizzes
  Critical review report (10%)
  Paper presentation (15%)
  Final project (50%)
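
The four weights above add up to 100%. As a sanity check, here is a small sketch of how a final score could be computed from them. The component scores in the usage example are made up, and the function name `final_score` is hypothetical. Participation credit is left out because the slide does not say how much it is worth.

```python
# Weights as listed on the grading slide (fractions of the final grade).
WEIGHTS = {
    "quizzes": 0.25,
    "critical_review": 0.10,
    "paper_presentation": 0.15,
    "final_project": 0.50,
}


def final_score(scores: dict) -> float:
    """Weighted sum of component scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {
    "quizzes": 80,
    "critical_review": 90,
    "paper_presentation": 100,
    "final_project": 70,
}
print(final_score(example))
```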

SLIDE 23

Quizzes

  Format
     Multiple choice questions
     Fill-in-the-blank
     Short answer questions
  Each quiz: ~20 min in class
  Schedule: see course website
  Closed book, closed notes, closed laptop

SLIDE 24

Critical review report

  1 page maximum
  Pick one paper from the suggested list
  Summarize the paper (use your own words)
  Provide detailed comments
     What can be improved
     Potential future directions
     Other related work
  Some students will be selected to present their critical reviews

SLIDE 25

Paper presentation

  Each group has 2~3 students
  Pick one paper from the suggested readings, or your favorite paper
     Cannot be the same as the critical review report
     Can be related to your final project
     Register your choice early
  15 min presentation + 2 min Q&A
  Will be graded by the instructor, TA, and other students

SLIDE 26

Final Project

  Work in groups (2~3 students)
  Project proposal
     Written report, 2 page maximum
  Project report (35%)
     < 8 pages, ACL format
     Due 2 days before the final presentation
  Project presentation (15%)
     5-min in-class presentation (tentative)

SLIDE 27

Late Policy

  Credit of 48 hours for all the assignments
     Including proposal and final project
     No accumulation
     No more grace period
  No make-up exam
     Unless there is an emergency

SLIDE 28

Cheating/Plagiarism

  No. Ask if you have concerns
  UVA Honor Code: http://www.virginia.edu/honor/

SLIDE 29

Lectures and office hours

  Participation is highly appreciated!
     Ask questions if you are still confused
     Feedback is welcome
     Lead the discussion in this class
  Enroll in Piazza: https://piazza.com/virginia/fall2016/cs6501004

SLIDE 30

Topics of this class

  Fundamental NLP problems
  Machine learning & statistical approaches for NLP
  NLP applications
  Recent trends in NLP

SLIDE 31

What to Read?

 Natural Language Processing

ACL, NAACL, EACL, EMNLP, CoNLL, Coling, TACL
aclweb.org/anthology

 Machine learning

ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ

 Artificial Intelligence

AAAI, IJCAI, UAI, JAIR

SLIDE 32

Questions?

SLIDE 33

This lecture

  Course Overview
     What is NLP? Why is it important?
     What will you learn from this course?
  Course Information
  What are the challenges?
  Key NLP components

SLIDE 34

Challenges – ambiguity

  Word sense ambiguity

SLIDE 35

Challenges – ambiguity

  Word sense / meaning ambiguity

Credit: http://stuffsirisaid.com
SLIDE 36

Challenges – ambiguity

  PP (prepositional phrase) attachment ambiguity

Credit: Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711
SLIDE 37

Challenges – ambiguity

  Ambiguous headlines:
     Include your children when baking cookies
     Hospitals are Sued by 7 Foot Doctors
     Iraqi Head Seeks Arms
     Safety Experts Say School Bus Passengers Should Be Belted

SLIDE 38

Challenges – ambiguity

  Pronoun reference ambiguity

Credit: http://www.printwand.com/blog/8-catastrophic-examples-of-word-choice-mistakes
SLIDE 39

Challenges – language is not static

 Language grows and changes

 e.g., cyber lingo

LOL: Laugh out loud
G2G: Got to go
BFN: Bye for now
B4N: Bye for now
Idk: I don’t know
FWIW: For what it’s worth
LUWAMH: Love you with all my heart
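
One simple way to cope with such abbreviations is a lookup table applied before further processing. The sketch below is an illustrative assumption, not a method from the slides. It uses only the entries listed above, and the function name `expand_lingo` is hypothetical.

```python
# Abbreviations and expansions copied from the list above (keys lowercased).
CYBER_LINGO = {
    "lol": "laugh out loud",
    "g2g": "got to go",
    "bfn": "bye for now",
    "b4n": "bye for now",
    "idk": "I don't know",
    "fwiw": "for what it's worth",
    "luwamh": "love you with all my heart",
}


def expand_lingo(text: str) -> str:
    """Replace each whitespace-separated token found in the table; keep the rest."""
    tokens = text.split()
    return " ".join(CYBER_LINGO.get(tok.lower(), tok) for tok in tokens)


print(expand_lingo("idk g2g"))   # I don't know got to go
print(expand_lingo("yolo lol"))  # yolo laugh out loud  (unknown "yolo" is left as-is)
```

A fixed table like this goes stale as new abbreviations appear, which is the challenge this slide points at.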

SLIDE 40

Challenges – language is compositional

Carefully Slide
SLIDE 41

Challenges – language is compositional

小心: Carefully, Careful, Take Care, Caution
地滑: Slide, Landslip, Wet Floor, Smooth
SLIDE 42

Challenges – scale

  Examples:
     Bible (King James version): ~700K words
     Penn Treebank: ~1M words from the Wall Street Journal
     Newswire collection: 500M+ words
     Wikipedia: 2.9 billion words (English)
     Web: several billions of words

SLIDE 43

This lecture

  Course Overview
     What is NLP? Why is it important?
     What will you learn from this course?
  Course Information
  What are the challenges?
  Key NLP components

SLIDE 44

Part of speech tagging

SLIDE 45

Syntactic (Constituency) parsing

SLIDE 46

Syntactic structure => meaning

Image credit: Julia Hockenmaier, Intro to NLP
SLIDE 47

Dependency Parsing

SLIDE 48

Semantic analysis

  Word sense disambiguation
  Semantic role labeling

Credit: Ivan Titov
SLIDE 49

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

Q: [Chris] = [Mr. Robin] ?

Slide modified from Dan Roth

SLIDE 50

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

Co-reference Resolution

SLIDE 51

Questions?
