Slide 1

CS447: Natural Language Processing
http://courses.engr.illinois.edu/cs447

Lecture 13: Midterm Review

Julia Hockenmaier
juliahmr@illinois.edu
3324 Siebel Center

Slide 2

Midterm Exam

When: Friday, October 12, 2018, in class
Where: DCL 1310
What: Closed-book exam:

  • You are not allowed to use any cheat sheets, computers, calculators, phones, etc. (you shouldn’t have to anyway)
  • Only the material covered in lectures
  • Bring a pen (black/blue) or pencil
  • Short questions — we expect short answers!
  • Tip: If you can’t answer a question, move on to the next one.

You may not be able to complete the whole exam in the time given — there will be a lot of questions, so first do the ones you know how to answer!

Slide 3

Today’s lecture

A quick run through the material we’ve covered, with some example questions. (Not an exhaustive list of possible questions!)

Slide 4

Question types

Define X:
 Provide a mathematical/formal definition of X.
Explain X; Explain what X is/does:
 Use plain English to define X and say what X is/does.
Compute X:
 Return X; show the steps required to calculate it.
Draw X:
 Draw a figure of X.
Show/Prove that X is true/is the case/…:
 This may require a (typically very simple) proof.
Discuss/Argue whether …:
 Use your knowledge (of X, Y, Z) to argue your point.

Slide 5

Fundamentals

Slide 6

Basics about language

Explain Zipf’s law and why it makes NLP difficult.

Explain why we often use statistical models in NLP.

Give two examples of ambiguity and explain how they make natural language understanding difficult.

Slide 7

Basics about language

Explain Zipf’s law and why it makes NLP difficult.

Zipf’s law says that a few words are very frequent, and most words are very rare. This makes NLP difficult because we will always come across rare/unseen words.

Explain why we often use statistical models in NLP.

To handle ambiguity (and make NLP systems more robust/to deal with the coverage problem).

Give two examples of ambiguity and explain why we have to resolve them.

POS ambiguity: back = noun or verb? Need to resolve this to understand the structure of sentences. Word sense ambiguity: bank = river bank or institution. Need to resolve this to understand the meaning of sentences.
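As a study aid (not something you would write in the exam), here is a minimal Python sketch of the rank/frequency counting behind Zipf’s law; the toy corpus is invented for illustration:

```python
from collections import Counter

def zipf_check(tokens):
    """Return (rank, frequency) pairs sorted by descending frequency."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))

# Even on a toy corpus, a few types dominate and most appear only once,
# which is the qualitative pattern Zipf's law describes.
tokens = "the cat sat on the mat the dog sat on the log".split()
ranked = zipf_check(tokens)
```

Under Zipf’s law, frequency is roughly proportional to 1/rank, so most of the vocabulary sits in the long tail of rare words.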

Slide 8

Morphology and finite-state transducers

Slide 9

Morphology

Explain what we mean by derivational morphology, and give an example in a language of your choice.

Draw a finite-state automaton for the language {aⁿbᵐ}.

Explain how we can use finite-state transducers for the morphological analysis of irregular verbs in English.

Slide 10

Morphology

Draw a finite-state automaton for the language {aⁿbᵐ}

[Figure: a two-state automaton. State q0 loops on a; a b-transition leads from q0 to q1; state q1 loops on b.]
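For reference, a minimal Python simulation of such an automaton, assuming both states are accepting (i.e. the language is {aⁿbᵐ} with n, m ≥ 0):

```python
def accepts(s):
    """Two-state DFA for a^n b^m (n, m >= 0): q0 loops on 'a',
    moves to q1 on 'b'; q1 loops on 'b'. Both states accept."""
    state = "q0"
    for ch in s:
        if state == "q0" and ch == "a":
            state = "q0"
        elif state == "q0" and ch == "b":
            state = "q1"
        elif state == "q1" and ch == "b":
            state = "q1"
        else:
            return False  # e.g. an 'a' after a 'b', or an unknown symbol
    return True
```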

Slide 11

Language modeling

Slide 12

Language modeling

Explain: What is a language model?

Explain and define: What is an n-gram language model?

Discuss the advantages and disadvantages of bigram language models over unigram models.

Explain and define how to estimate the parameters of a bigram model.

Explain and define how to evaluate the quality of a language model.
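As a reference for the estimation question, a minimal sketch of relative-frequency (MLE) bigram estimation; the `<s>`/`</s>` padding convention and the toy corpus are illustrative choices:

```python
from collections import Counter

def bigram_mle(corpus):
    """MLE bigram estimates P(w_i | w_{i-1}) = c(w_{i-1}, w_i) / c(w_{i-1})
    from tokenized sentences, padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent + ["</s>"]
        unigrams.update(toks[:-1])          # history counts c(w_{i-1})
        bigrams.update(zip(toks, toks[1:]))  # pair counts c(w_{i-1}, w_i)
    return {bg: c / unigrams[bg[0]] for bg, c in bigrams.items()}

probs = bigram_mle([["the", "cat"], ["the", "dog"]])
```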

Slide 13

Smoothing

Explain what smoothing is, and why it is necessary.

Define add-one smoothing and explain when it can be used.

Discuss the advantages/disadvantages of add-one smoothing.

Define how smoothing can be done via linear interpolation and explain when this technique can be used.
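For reference, minimal sketches of both techniques; the interpolation weight `lam` is a placeholder that would in practice be tuned on held-out data:

```python
def add_one(count_wv, count_w, vocab_size):
    """Add-one (Laplace) estimate of P(v | w):
    (c(w, v) + 1) / (c(w) + V)."""
    return (count_wv + 1) / (count_w + vocab_size)

def interpolate(p_bigram, p_unigram, lam=0.7):
    """Linear interpolation of bigram and unigram estimates:
    lam * P_bigram + (1 - lam) * P_unigram."""
    return lam * p_bigram + (1 - lam) * p_unigram
```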

Slide 14

Hidden Markov Models and POS tagging

Slide 15

POS tagging

Discuss how you would define a POS tag set.

Explain the differences between open and closed word classes.

Explain how to do a quantitative evaluation of a POS tagger.

Slide 16

HMMs

Give the mathematical definition of a bigram HMM.

Explain how to estimate the parameters of a bigram HMM from labeled data.

Explain how the Viterbi algorithm is used for POS tagging with an HMM.

Find the most likely tag sequence for the following sentence (given some HMM).
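As a reference for the last two questions, a minimal Viterbi sketch for a bigram HMM; the toy tag set and probabilities are invented for illustration:

```python
def viterbi(words, tagset, trans, emit, init):
    """Most likely tag sequence under a bigram HMM.
    trans[(t_prev, t)], emit[(t, w)], init[t] are probabilities;
    missing keys are treated as probability 0."""
    # Chart column for the first word: initial prob * emission prob.
    V = [{t: init.get(t, 0.0) * emit.get((t, words[0]), 0.0) for t in tagset}]
    back = []
    for w in words[1:]:
        col, ptr = {}, {}
        for t in tagset:
            best_prev = max(tagset, key=lambda p: V[-1][p] * trans.get((p, t), 0.0))
            col[t] = (V[-1][best_prev] * trans.get((best_prev, t), 0.0)
                      * emit.get((t, w), 0.0))
            ptr[t] = best_prev
        V.append(col)
        back.append(ptr)
    # Follow backpointers from the best final tag.
    last = max(tagset, key=lambda t: V[-1][t])
    seq = [last]
    for ptr in reversed(back):
        seq.append(ptr[seq[-1]])
    return seq[::-1]

# Toy HMM: determiners start sentences, nouns follow determiners.
tagset = ["D", "N"]
init = {"D": 0.9, "N": 0.1}
trans = {("D", "N"): 0.9, ("D", "D"): 0.1, ("N", "N"): 0.5, ("N", "D"): 0.5}
emit = {("D", "the"): 1.0, ("N", "dog"): 1.0}
best = viterbi(["the", "dog"], tagset, trans, emit, init)
```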

Slide 17

Sequence labeling

Slide 18

Sequence labeling

Define the BIO encoding for NP chunking.

Define Maximum Entropy Markov Models.

Explain why MEMMs may be more suitable for named entity recognition than HMMs.

Draw the graphical model of MEMMs.
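For the first question, a minimal sketch of BIO encoding; the span-based input format is an illustrative choice, not the only possible one:

```python
def bio_encode(tokens, chunks):
    """BIO tags for NP chunking: chunks is a list of (start, end) token
    spans (end exclusive). The first token of each chunk gets B-NP,
    later tokens in the chunk get I-NP, everything else gets O."""
    labels = ["O"] * len(tokens)
    for start, end in chunks:
        labels[start] = "B-NP"
        for i in range(start + 1, end):
            labels[i] = "I-NP"
    return labels

labels = bio_encode(["the", "cat", "sat"], [(0, 2)])
```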

Slide 19

Syntax and Context-Free Grammars

Slide 20

Syntax basics

Explain how to determine whether a string is a constituent.

Explain the distinction between arguments and adjuncts.

Slide 21

CFG basics:

Convert the following PCFG rules to Chomsky Normal Form (and preserve the rule probabilities)

(Nonterminals: XP, YP, ZP; Terminals: X, Y, Z)

XP → X YP YP   0.75
XP → XP ZP     0.25

Explain how you can convert a CFG to dependencies.

Slide 22

CFG basics:

Convert the following PCFG rules to Chomsky Normal Form (and preserve the rule probabilities)

(Nonterminals: XP, YP, ZP; Terminals: X, Y, Z)

XP → X YP YP   0.75
XP → XP ZP     0.25

Solution 1:
XP → A YP    0.75
A  → X1 YP   1.00
X1 → X       1.00
XP → XP ZP   0.25

Solution 2:
XP → X1 B    0.75
B  → YP YP   1.00
X1 → X       1.00
XP → XP ZP   0.25

Slide 23

CFG basics:

Explain how you can convert a CFG to dependencies.

Answer: For every rule XP → L1 … Ln X R1 … Rn, identify the head child X among the RHS symbols. All other symbols on the RHS are dependents of the head child.
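A minimal sketch of this head-percolation idea; the `head_of` table (one head-child index per nonterminal) is a deliberate simplification of a real head-finding rule set:

```python
def deps(tree, head_of):
    """tree = (label, children); leaves are word strings.
    head_of[label] gives the index of the head child.
    Returns the head word of the tree and a list of (head, dependent) arcs."""
    label, children = tree
    heads, arcs = [], []
    for child in children:
        if isinstance(child, str):
            heads.append(child)
        else:
            h, sub = deps(child, head_of)
            heads.append(h)
            arcs.extend(sub)
    head = heads[head_of[label]]
    # All non-head children become dependents of the head child's head word.
    arcs.extend((head, d) for i, d in enumerate(heads) if i != head_of[label])
    return head, arcs

head, arcs = deps(("S", [("NP", ["the", "cat"]), ("VP", ["sleeps"])]),
                  {"S": 1, "NP": 1, "VP": 0})
```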

Slide 24

Parsing with CFGs

Slide 25

CKY Questions

Given the following grammar and the following input sentence, fill in the CKY parse chart:
… (input sentence)
… (CFG)

How many parse trees does the input sentence have?

What is the most likely parse tree for this sentence?
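For reference, a minimal CKY recognizer for a grammar in Chomsky Normal Form; the toy grammar and sentence are invented for illustration:

```python
from itertools import product

def cky(words, lexical, binary):
    """CKY recognizer for a CNF grammar.
    lexical: {word: set of nonterminals A with A -> word}
    binary:  {(B, C): set of nonterminals A with A -> B C}
    chart[i][j] holds the nonterminals covering words[i:j]."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(lexical.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for B, C in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= binary.get((B, C), set())
    return chart

lexical = {"the": {"Det"}, "dog": {"N"}, "barks": {"V"}}
binary = {("Det", "N"): {"NP"}, ("NP", "V"): {"S"}}
chart = cky(["the", "dog", "barks"], lexical, binary)
```

The sentence is grammatical iff the start symbol appears in the top cell `chart[0][n]`.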

Slide 26

More on PCFGs

Define how to compute the probability of a parse tree under a PCFG.

Define how to compute the probability of a string under a PCFG.
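A minimal sketch of the first definition (the probability of a tree is the product of the probabilities of all rules used in it); the rule encoding is an illustrative choice:

```python
def tree_prob(tree, rule_prob):
    """P(tree) under a PCFG: the product of rule probabilities.
    tree = (label, children); leaves are word strings.
    rule_prob maps (lhs, rhs-tuple) to a probability."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob[(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c, rule_prob)
    return p

rules = {("S", ("NP", "VP")): 1.0, ("NP", ("cats",)): 0.5, ("VP", ("sleep",)): 0.2}
p = tree_prob(("S", [("NP", ["cats"]), ("VP", ["sleep"])]), rules)
```

The probability of a string is then the sum of `tree_prob` over all of its parse trees (computable with the inside algorithm rather than by enumeration).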

Slide 27

Statistical Parsing/Penn Treebank

Define the Parseval metrics for evaluating statistical (PCFG) parsers.

Explain why basic PCFGs do not perform well, and describe one way to improve their performance. (NB: we’ve covered several such methods in class.)
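A minimal sketch of labeled Parseval precision/recall/F1 over constituents; encoding each constituent as a (label, start, end) span is an illustrative choice:

```python
def parseval(gold, predicted):
    """Labeled Parseval: precision, recall, F1 over (label, start, end)
    constituent spans from the gold and predicted trees."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    p = correct / len(predicted) if predicted else 0.0
    r = correct / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

gold = {("NP", 0, 2), ("VP", 2, 3), ("S", 0, 3)}
pred = {("NP", 0, 2), ("S", 0, 3)}
p, r, f = parseval(gold, pred)
```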

Slide 28

Dependency Grammars

Slide 29

Dependency Grammar Basics

Explain the difference between projective and nonprojective dependencies.

Explain how dependency grammar represents syntactic structures.

Draw the correct dependency tree for the following sentence:
…. (example sentence)

Slide 30

Transition-based parsing

Define what we mean by a parser configuration in the context of transition-based parsing.

Describe the actions that a transition-based parser can perform.

Show the sequence of actions that a transition-based parser has to perform to return the correct dependency tree for the following sentence: … (short input sentence)
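For reference, a minimal sketch of the arc-standard transition system (one common choice; the course may use a different variant). The toy sentence and action sequence are invented for illustration:

```python
def run(words, actions):
    """Arc-standard transition system. A configuration is
    (stack, buffer, arcs). SHIFT moves the next buffer word onto the
    stack; LEFT-ARC makes the stack top the head of the word below it;
    RIGHT-ARC makes the word below the top the head of the top."""
    stack, buf, arcs = [], list(words), []
    for act in actions:
        if act == "SHIFT":
            stack.append(buf.pop(0))
        elif act == "LEFT-ARC":
            dep = stack.pop(-2)            # second-from-top is the dependent
            arcs.append((stack[-1], dep))
        elif act == "RIGHT-ARC":
            dep = stack.pop()              # top is the dependent
            arcs.append((stack[-1], dep))
    return stack, arcs

# "the" <- "cat" <- "sleeps": two LEFT-ARCs after shifting.
stack, arcs = run(["the", "cat", "sleeps"],
                  ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "LEFT-ARC"])
```

Parsing finishes when the buffer is empty and only the root word remains on the stack.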

Slide 31

Good luck!
