Introduction to Ling571 Scott Farrar CLMA, University of Washington



SLIDE 1

Ling571 in the CLMA Program Linguistic Structure

Introduction to Ling571

Scott Farrar CLMA, University of Washington farrar@u.washington.edu January 4, 2010

Scott Farrar CLMA, University of Washington farrar@u.washington.edu Introduction to Ling571

SLIDE 2

Today’s lecture

1. Ling571 in the CLMA Program
   - Shallow processing
   - Deep processing
   - Cross-cutting themes
2. Linguistic Structure

SLIDE 3

Shallow processing

Shallow processing means less reliance on linguistic structures and more reliance on surface (textual/signal) patterns in the data.

Some tasks for shallow processing:
- speech recognition using hidden Markov models
- part-of-speech tagging using n-gram techniques
- information extraction based on text patterns (making minimal use of linguistic knowledge)

Shallow ≠ easy or simple (cf. Ling570).
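To make the pattern-based information extraction task above concrete, here is a minimal sketch in Python. It assumes the only target is dates written like "January 4, 2010"; the pattern, the function name `extract_dates`, and the sample sentence are illustrative assumptions, not part of the course materials.

```python
import re

# Assumption: dates look like "<Month> <day>, <year>"; other formats are ignored.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_PATTERN = re.compile(rf"\b({MONTHS}) (\d{{1,2}}), (\d{{4}})\b")


def extract_dates(text):
    """Return (month, day, year) string triples for every date-like match."""
    return [match.groups() for match in DATE_PATTERN.finditer(text)]


# Hypothetical input sentence, used only to exercise the pattern.
sample = "The first lecture was held on January 4, 2010."
dates = extract_dates(sample)
print(dates)
```

The pattern knows nothing about syntax or meaning; it only matches surface strings, which is what makes this a shallow technique.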


SLIDE 10

Shallow processing task

Morpheme identification: testing, fling, going, bling, go, test

Morphemes: test, *fl, go, *bl, ing

In fact, shallow processing is often used to derive structure for further, deeper processing.
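The morpheme list above can be reproduced with a naive suffix-stripping sketch. This is only one plausible reading of the slide: it assumes the asterisk marks a candidate stem that is not attested as a word in the input list, and the function name `strip_suffix` is hypothetical.

```python
def strip_suffix(words, suffix="ing"):
    """Naively split off a suffix and flag stems not attested as words.

    Assumption: a stem is flagged with '*' when it does not occur on its
    own in the input list (e.g. 'fl' from 'fling').
    """
    vocabulary = set(words)
    candidates = []
    found_suffix = False
    for word in words:
        if word.endswith(suffix) and len(word) > len(suffix):
            found_suffix = True
            stem = word[: -len(suffix)]
            marked = stem if stem in vocabulary else "*" + stem
            if marked not in candidates:
                candidates.append(marked)
    if found_suffix:
        candidates.append(suffix)
    return candidates


words = ["testing", "fling", "going", "bling", "go", "test"]
morphemes = strip_suffix(words)
print(morphemes)
```

The flagged stems show why a purely surface-based split overgenerates: "fling" and "bling" end in "ing" but are not stem plus suffix.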

SLIDE 11

Deep processing

Deep processing means utilizing elaborated linguistic structures.

Some tasks for deep processing:
- deriving structural descriptions of natural language sentences (NL parsing)
- deriving meaning representations from speech (NL understanding)
- generating accurate NL based on meaning representations (NL generation)
- clustering documents based on extracted meaning

Deep processing requires more linguistic knowledge than shallow processing.


SLIDE 17

Linguistic structure

SLIDE 18

End-to-end system

SLIDE 19

Focus of Ling571

SLIDE 20

Deep and shallow processing: similarities

- Both require and can benefit from stochastic and rule-based techniques
- Both require lots of data
- Each has its own core set of algorithms
- The end goal is the same: deriving useful information from natural language

SLIDE 21

What about the CLMA themes?

1. ambiguity resolution
2. evaluation
3. multilingual processing

SLIDE 22

Ambiguity resolution

Language is inherently ambiguous, at every linguistic level (phonological, morphological, syntactic, etc.):

phon: /aiskrim/ (ice-cream or I scream)
morph: un-doable or undo-able
synt: Flying planes can be dangerous.
sem: Every boy kissed a girl.
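The morphological example can be illustrated by enumerating every binary bracketing of the morpheme sequence un + do + able. This is a small sketch under the assumption that morphemes combine pairwise; the function name `bracketings` is hypothetical.

```python
def bracketings(items):
    """Return every binary bracketing of a sequence as nested tuples."""
    if len(items) == 1:
        return [items[0]]
    results = []
    # Try every split point, then combine each left and right analysis.
    for split in range(1, len(items)):
        for left in bracketings(items[:split]):
            for right in bracketings(items[split:]):
                results.append((left, right))
    return results


analyses = bracketings(["un", "do", "able"])
for analysis in analyses:
    print(analysis)
```

The bracketing `("un", ("do", "able"))` corresponds to un-doable ("not doable"), and `(("un", "do"), "able")` corresponds to undo-able ("able to be undone"). Ambiguity resolution means choosing between such analyses.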


SLIDE 26

Evaluation

For each NLP task, we require some measure of success. Consider an information retrieval system (TREC competition, Ask.com):

\[ \text{Precision} = \frac{\text{\# of correct answers given by the system}}{\text{\# of answers given by the system}} \]

\[ \text{Recall} = \frac{\text{\# of correct answers given by the system}}{\text{total \# of possible correct answers}} \]
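The two measures can be computed directly from sets of answers. The sketch below is a minimal illustration; the function name `precision_recall` and the document identifiers are made up for the example.

```python
def precision_recall(system_answers, correct_answers):
    """Compute precision and recall from two collections of answers."""
    system = set(system_answers)
    correct = set(correct_answers)
    hits = len(system & correct)  # correct answers given by the system
    precision = hits / len(system) if system else 0.0
    recall = hits / len(correct) if correct else 0.0
    return precision, recall


# Hypothetical retrieval run: the system returns four documents,
# two of which are among the three relevant ones.
returned = {"d1", "d2", "d3", "d4"}
relevant = {"d1", "d2", "d5"}
precision, recall = precision_recall(returned, relevant)
print(precision, recall)
```

Returning fewer answers tends to raise precision and lower recall, which is why the two are usually reported together.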

What about a parser?


SLIDE 28

Parsing [sent. 2 len. 8]: There is a fly in my soup .

(ROOT
  (S
    (NP (EX There))
    (VP (VBZ is)
      (NP
        (NP (DT a) (VB fly))
        (PP (IN in)
          (NP (PRP$ my) (NN soup)))))
    (. .)))

(ROOT
  (S
    (NP (EX There))
    (VP (VBZ is))
    (NP
      (NP (DT a) (NN fly))
      (PP (IN in)
        (NP (PRP$ my) (NN soup)))))
  (. .))
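One common way to score a parser is to apply precision and recall to labeled constituents, i.e. (label, start, end) spans. The sketch below compares the two trees shown above. It is a simplified illustration, not necessarily the scheme used in the course: it counts part-of-speech nodes and ROOT as constituents, and it treats the first tree as the reference purely for the sake of the example, since the slide does not say which tree is correct. All function names are hypothetical.

```python
def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    position = 0

    def read_node():
        nonlocal position
        position += 1  # consume "("
        label = tokens[position]
        position += 1
        children = []
        while tokens[position] != ")":
            if tokens[position] == "(":
                children.append(read_node())
            else:
                children.append(tokens[position])  # a word
                position += 1
        position += 1  # consume ")"
        return (label, children)

    return read_node()


def labeled_spans(tree):
    """Collect (label, start, end) for every node; ends are exclusive."""
    spans = set()

    def walk(node, start):
        label, children = node
        end = start
        for child in children:
            end = end + 1 if isinstance(child, str) else walk(child, end)
        spans.add((label, start, end))
        return end

    walk(tree, 0)
    return spans


tree_a = ("(ROOT (S (NP (EX There)) (VP (VBZ is) (NP (NP (DT a) (VB fly)) "
          "(PP (IN in) (NP (PRP$ my) (NN soup))))) (. .)))")
tree_b = ("(ROOT (S (NP (EX There)) (VP (VBZ is)) (NP (NP (DT a) (NN fly)) "
          "(PP (IN in) (NP (PRP$ my) (NN soup))))) (. .))")

spans_a = labeled_spans(parse_tree(tree_a))  # assumed reference
spans_b = labeled_spans(parse_tree(tree_b))  # assumed system output
matched = spans_a & spans_b
precision = len(matched) / len(spans_b)
recall = len(matched) / len(spans_a)
print(sorted(spans_a ^ spans_b))  # the spans on which the trees disagree
print(precision, recall)
```

The printed disagreement set makes the differences explicit: the tag on "fly", the extent of the VP, and the extent of S.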


SLIDE 30

Multilingual processing

Linguistic structure, to some extent, applies to all languages. But each language has its own particular structures:

- Word order varies.
- What’s a word?
- How does the language carve up the semantic space?

We’ll look at other languages when necessary, but mostly stick to English.


SLIDE 36

What’s not included in Ling571

- word-level processing and below (570)
- machine learning (572)
- dialogue processing (573)
- machine translation (575)
- speech processing (575)
- information extraction/retrieval, Q/A (575)

SLIDE 37

[see next set of slides]
