SLIDE 1

Computational Models of Discourse: Introduction to Discourse: Coherence and Cohesion, Lexical Chains

Caroline Sporleder

Universität des Saarlandes

Sommersemester 2009, 29.04.2009

Caroline Sporleder csporled@coli.uni-sb.de Computational Models of Discourse

SLIDE 2

New Schedule

29.04.2009 Introduction, Discourse: Coherence and Cohesion

06.05.2009 Cohesion and Local Coherence

  • Lexical Cohesion, Lexical Chains
  • Focus, Centering

13.05.2009 Text Segmentation

  • TextTiling
  • Preparatory Meeting “Essay Scoring”

20.05.2009 Applications (1)

  • Automatic Essay Scoring
  • Preparatory Meeting “Information Ordering”

27.05.2009 Applications (2)

  • Information Ordering for Text Generation
  • Preparatory Meeting “Generating Referring Expressions”

03.06.2009 Generating Referring Expressions

  • rule-based
  • machine learning

SLIDE 3

New Schedule, cont’d

10.06.2009 Co-reference Resolution

  • rule-based
  • supervised machine learning
  • unsupervised machine learning

17.06.2009 Discourse Parsing

  • Discourse Parsing with RST
  • Machine Learning

24.06.2009 Temporal Ordering

01.07.2009 Text Summarisation

  • lexical chains
  • RST-based
  • multi-document
  • argumentative zoning

08.07.2009 Sentiment Analysis

15.07.2009 Dialogue Processing

  • classification of dialogue acts
  • dialogue planning

22.07.2009 Speech?, Psycholinguistic Models?, Recap

SLIDE 4

Discourse Structure

SLIDE 5

Background Reading

Jurafsky & Martin (2000)

  • Ch. 18 (Discourse)
  • Ch. 19 (Dialogue)
  • Ch. 20 (Generation)

Jurafsky & Martin (2008)

  • Ch. 21 (Discourse)
  • Ch. 23 (Summarisation)
  • Ch. 24 (Dialogue)

SLIDE 6

What is a Discourse?

A discourse is a sequence of utterances. But an arbitrary collection of well-formed utterances is not necessarily a “discourse” ⇒ the sequence of utterances has to be coherent:

  • topics which are related
  • events which are connected (e.g. cause-result, temporal succession)
  • utterances which fulfil a purpose in the discourse

SLIDE 9

Coherence

Temporal sequence of events often not enough:

  At 5pm a train arrived in Munich. At 6pm Angela Merkel gave a press conference.

Topical relatedness often also not enough:

  Like most bears, polar bears have 42 teeth. Polar bears are perfectly adapted to living in the polar regions. At the beginning of June polar bear Knut turned one and started to discover his predatory side.

⇒ a discourse is coherent if a plausible discourse structure can be found
⇒ interpreting a discourse means finding the connections between individual sentences (discourse relations, co-reference chains, etc.)

SLIDE 10

Linguistic Models of Discourse Structure

There are many different models of discourse. Typically it is assumed that a discourse consists of:

  • segments
  • relations between segments (discourse/rhetorical relations)

Discourse is structured hierarchically. A minimal/elementary discourse segment is often a clause/sentence:

  ∀w, e: minimal_segment(w, e) ⇒ segment(w, e)
  ∀w1, w2, e1, e2, e: segment(w1, e1) ∧ segment(w2, e2) ∧ DiscourseRel(e1, e2, e) ⇒ segment(w1w2, e)

(w is a sequence of words; e is the described event or state; w1w2 is the concatenation of the two spans)

To interpret a discourse W, one has to show that it is a valid segment: ∃e segment(W, e)
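The recursive definition above can be sketched in code. This is a toy illustration, not any particular discourse theory's implementation: `Segment`, `combine`, and the event labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Toy encoding of the slide's recursive definition: a minimal segment
# covers one clause; two adjacent segments joined by a discourse
# relation form a larger segment describing a combined event.

@dataclass
class Segment:
    words: tuple           # the word span covered
    event: str             # the described event or state (e)
    relation: str = None   # relation that built this segment (None if minimal)

def combine(s1, s2, relation, event):
    """segment(w1,e1) ∧ segment(w2,e2) ∧ DiscourseRel(e1,e2,e) ⇒ segment(w1w2,e)"""
    return Segment(s1.words + s2.words, event, relation)

fell = Segment(("John", "fell."), "e_fall")
pushed = Segment(("Max", "pushed", "him."), "e_push")
discourse = combine(fell, pushed, "Explanation", "e_explained_fall")

# The two-clause discourse is itself a segment covering all the words.
assert discourse.words == ("John", "fell.", "Max", "pushed", "him.")
```

Interpreting a discourse then amounts to finding some way of combining the minimal segments, bottom-up, into one segment covering the whole text.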

SLIDE 11

Linguistic Models of Discourse Structure

Example: Simplified RST

Peter failed the exam because he didn’t study hard enough. He had to spend the holidays preparing for the re-sit while his friends enjoyed themselves at the beach.

(RST tree over the three clauses, with the relations: explanation, result, contrast)

SLIDE 12

Linguistic Models of Discourse Structure

Example: Real RST

Farmington police had to help control traffic recently when hundreds of people lined up to be among the first applying for jobs at the yet-to-open Marriott Hotel. The hotel’s help-wanted announcement for 300 openings was a rare opportunity for many unemployed. The people waiting in line carried a message, a refutation, of the claims that the jobless could be employed if only they showed enough ambition. Every rule has exceptions, but the tragic and too-common tableaux of hundreds or even thousands of people snake-lining up for any task with a paycheck illustrates a lack of jobs, not laziness.

(RST relations in the analysis: Antithesis, Concession, Evidence, Circumstance, Volitional Result, Background)

SLIDE 13

Linguistic Models of Discourse Structure

How do we know that there are segments and relations? ⇒ there are linguistic cues for the existence of both

SLIDE 15

Linguistic reality of segments

John went to the bank to cash a cheque. Then he took the bus to his friend Bill who is a car dealer. He had to buy a car. The company for which he had just started working could not be reached by public transport. He also wanted to talk to Bill about the upcoming football match.

SLIDE 16

Linguistic reality of segments

Discourse segments can be referred to (Webber, 1988):

  It’s always been presumed that when the glaciers receded, the area got very hot. The Folsom men couldn’t adapt, and they died out. That is what is supposed to have happened.

SLIDE 19

Linguistic reality of discourse relations

Discourse relations can be signalled by cue words (discourse markers):

  John hid Peter’s car keys because he was drunk.
  Max helped Peter up again after he had fallen.
  Tom drinks coffee but Sue prefers tea.
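The cue-word idea lends itself to a first, very rough heuristic. The sketch below is hypothetical (the marker-to-relation table and function name are invented for illustration); real markers are often ambiguous, which is exactly why the unsignalled cases on the following slides matter.

```python
# A minimal cue-word heuristic: map a few comparatively unambiguous
# discourse markers to the relation they typically signal.
CUE_RELATIONS = {
    "because": "explanation",
    "after": "temporal",
    "but": "contrast",
    "therefore": "result",
}

def signalled_relation(sentence):
    """Return the relation signalled by the first known cue word, if any."""
    for token in sentence.lower().replace(",", " ").split():
        if token in CUE_RELATIONS:
            return CUE_RELATIONS[token]
    return None  # no overt marker: the relation must be inferred

assert signalled_relation("John hid Peter's car keys because he was drunk.") == "explanation"
assert signalled_relation("Tom drinks coffee but Sue prefers tea.") == "contrast"
assert signalled_relation("John fell. Max pushed him.") is None
```

The last example returns `None`: the Explanation relation there is not marked at all and has to be inferred from world knowledge.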

SLIDE 34

Linguistic reality of discourse relations

Discourse relations influence linguistic interpretation (anaphora resolution, temporal ordering, etc.):

  John can open Bill’s safe. He knows the combination.
  ⇒ Explanation relation (John can open Bill’s safe because John knows the combination.)

  John can open Bill’s safe. He will have to change the combination.
  ⇒ Result relation (John can open Bill’s safe, therefore Bill has to change the combination.)

  John fell. Max pushed him. (push <t fall)
  ⇒ Explanation relation (John fell because Max pushed him.)

  John fell. He broke a leg. (fall <t breaking a leg)
  ⇒ Result relation (John fell and therefore broke his leg.)

SLIDE 37

Coherence vs. Cohesion (Halliday & Hasan, 1976)

If a text is cohesive, it hangs together well. ⇒ Cohesion is about linguistic form.
If a text is coherent, it makes sense. ⇒ Coherence is about underlying structure.

SLIDE 38

Example: Cohesion

Peter failed the exam because he didn’t study hard enough. He had to spend the holidays preparing for the re-sit while his friends enjoyed themselves at the beach.

SLIDE 42

Example: Coherence with little cohesion

Yesterday Peter passed his driving test. Afterwards Peter went to see Klaus. Klaus was happy about the visit because Klaus hadn’t seen Peter for a while. Later Peter and Klaus went to the pub.

SLIDE 44

Example: Cohesion with little Coherence

Yesterday Peter flew to Australia. This country is well-known for its kangaroos. The kangaroos in Cologne Zoo are a favorite of Carla’s. She likes to travel. Gnus are nice animals.

SLIDE 49

Modelling Discourse Structure

. . . is not just about coherence and cohesion.

SLIDE 50

Dimensions of Discourse Structure

Four interdependent aspects/dimensions of discourse structure:

  • (Para-)linguistic structure: linguistic manifestation of discourse structure, e.g., lexical cohesion, cue words, intonation, gesture, referring expressions, etc.
  • Intentional structure: each discourse segment fulfils a purpose (why does a speaker/writer make a given utterance in a given form?)
  • Informational structure: how do the different segments of a discourse relate to each other (which segments are directly related, and which discourse relations hold)?
  • Focus/attentional structure: which entities are salient at a given point in the discourse?

SLIDE 51

Dimensions of Discourse Structure

Linguistic structure is about cohesion. Intentional, informational, and focus structure are about coherence.

SLIDE 52

Linguistic Structure

Linguistic form is often an indicator of discourse structure:

  • discourse connectives (but, because) ⇒ reflect how sentences are related to each other (contrast, explanation, etc.)
  • referring expressions (she, Mary, a girl, the girl who likes ice-cream, . . . ) ⇒ reflect the status of an entity in the discourse (salient, non-salient, new, old, inferred, etc.)
  • semantically related words (flooding . . . torrential rain . . . storm) ⇒ reflect lexical cohesion

SLIDE 56

Informational Structure

John hid Peter’s car keys. He was drunk.
⇒ The fact that John was drunk explains why he hid Peter’s car keys.

Mary likes chocolate, Maggie likes crisps.
⇒ The fact that Maggie likes crisps contrasts with Mary’s liking of chocolate.

SLIDE 59

Intentional Structure

John hid Peter’s car keys. He was drunk.

Possible intention: explain to the listener why John hid Peter’s keys (and why Peter was consequently late for work)

Another possible intention: outline to the listener what consequences John’s drunkenness has (and why something must be done about his binge drinking)

SLIDE 60

Focus Structure

Susan would like to go on a holiday. But she needs to find somebody to do her work while she’s away. She can’t think of anybody to do that. She considered Mike but he’s a bit unreliable. Yesterday he forgot to turn up for an important meeting with a client. The client was very annoyed and said she would never do business with Susan’s company again.

SLIDE 64

Modelling Discourse Structure (for Analysis)

Not all four aspects of discourse structure are equally easy to model

SLIDE 68

Modelling Discourse Structure (for Analysis)

Not all four aspects of discourse structure are equally easy to model:

  • linguistic structure: easy to observe
  • focus structure: relatively strong correlation with linguistic form (pronouns, salient positions in a sentence (subject vs. object)), fairly easy to model
  • informational structure: relatively weak correlation with linguistic form (discourse connectives), difficult to model
  • intentional structure: barely visible in linguistic form, extremely difficult to model on the analysis side

SLIDE 69

How would you go about modelling discourse structure?

SLIDE 73

How would you go about modelling discourse structure?

On the analysis side, all we typically have is linguistic form
⇒ model those aspects of discourse structure which correlate most strongly with linguistic form, using linguistic form as cues for structure

  → focus structure: Centering Theory
  → linguistic structure, lexical cohesion: Lexical Chains

We’ll talk about informational structure (discourse parsing) later.

SLIDE 74

Lexical Chains

SLIDE 75

Lexical Chains

. . . are sequences of semantically related words

From Nineteen Eighty-Four [abridged]:

  . . . the book that he had just taken out of the drawer. It was a peculiarly beautiful book. Its smooth creamy paper was of a kind that had not been manufactured for at least forty years past. He could guess, however, that the book was much older than that. He had seen it lying in the window of a frowsy little junk-shop and had been stricken immediately by an overwhelming desire to possess it. Party members were supposed not to go into ordinary shops. He had slipped inside and bought the book for two dollars fifty. Even with nothing written in it, it was a compromising possession.

  The thing that he was about to do was to open a diary. Winston fitted a nib into the penholder and sucked it to get the grease off. The pen was an archaic instrument, seldom used even for signatures, and he had procured one, furtively and with some difficulty, simply because of a feeling that the beautiful creamy paper deserved to be written on with a real nib instead of being scratched with an ink-pencil. Actually he was not used to writing by hand. He dipped the pen into the ink.

(Source: Graeme Hirst & Alexander Budanitsky, Eurolan-2001 presentation)

SLIDE 76

Computing Lexical Chains

We need:

  • a measure of semantic relatedness between words
  • a chain-building algorithm
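The two ingredients can be combined in a simple greedy chainer. This is a sketch, not the full algorithm of any particular paper: the `RELATED` lookup is a hypothetical stand-in for a real relatedness measure (WordNet distance, distributional similarity), and a serious chainer would also disambiguate word senses.

```python
# Greedy chain building: each content word joins the first existing
# chain containing a word it is related to; otherwise it starts a new
# chain. Relatedness here is a hand-coded toy lookup.
RELATED = {
    frozenset({"pen", "nib", "ink", "pencil", "penholder"}),
    frozenset({"book", "paper", "diary"}),
}

def related(w1, w2):
    """True if the two words are identical or share a hand-coded group."""
    return w1 == w2 or any({w1, w2} <= group for group in RELATED)

def build_chains(words):
    chains = []
    for w in words:
        for chain in chains:
            if any(related(w, c) for c in chain):
                chain.append(w)
                break
        else:
            chains.append([w])  # no related chain found: start a new one
    return chains

words = ["book", "paper", "pen", "nib", "diary", "ink"]
assert build_chains(words) == [["book", "paper", "diary"], ["pen", "nib", "ink"]]
```

On the Nineteen Eighty-Four excerpt above, such a chainer would recover roughly the book/paper/diary and pen/nib/ink chains that make the passage lexically cohesive.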

SLIDE 77

Computing Semantic Relatedness

. . . a hot research topic. Two basic methods:

  • concept distance in a hierarchical lexicon (e.g. WordNet)
  • distributional similarity computed from a corpus
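The second method can be sketched in a few lines: represent each word by its co-occurrence counts with context words and compare the vectors with cosine similarity. The vectors below are hand-built toys (a real system would count them over a corpus).

```python
import math

# Distributional similarity: words with similar contexts get similar
# vectors, so cosine similarity approximates semantic relatedness.
def cosine(v1, v2):
    dot = sum(v1.get(k, 0) * v2.get(k, 0) for k in v1)
    norm = (math.sqrt(sum(x * x for x in v1.values()))
            * math.sqrt(sum(x * x for x in v2.values())))
    return dot / norm if norm else 0.0

# Toy co-occurrence vectors (context word -> count):
pen = {"write": 4, "ink": 3, "paper": 2}
pencil = {"write": 3, "paper": 3, "eraser": 1}
bank = {"money": 5, "account": 2}

# pen is distributionally closer to pencil than to bank:
assert cosine(pen, pencil) > cosine(pen, bank)
```
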

SLIDE 78

WordNet

Structure:

  • synsets: collections of words with the same sense (e.g., {bank, depository financial institution, banking concern, banking company} vs. {bank, river bank})
  • relations between synsets:
      hyponym (e.g., Federal Reserve Bank)
      hypernym (e.g., financial organisation)
      member holonym (e.g., banking industry)
      antonym
      etc.

SLIDE 79

Relatedness based on WordNet

Simple approach:

  • count the path length between two concepts/synsets
  • possibly normalise by the overall depth of the hierarchy, etc.

But: not all paths are equal; e.g., changes in direction weaken similarity.
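The path-counting idea can be shown on a toy hierarchy. The `HYPERNYM` table below is a hand-made stand-in for WordNet (the real lexicon has thousands of synsets and several relation types); links are traversed in both directions when counting.

```python
from collections import deque

# Toy hypernym hierarchy: child -> parent.
HYPERNYM = {
    "apple": "fruit", "carrot": "vegetable",
    "fruit": "produce", "vegetable": "produce",
    "produce": "food", "bank": "institution",
}

def neighbours(c):
    """Concepts one link away (hypernym plus all hyponyms)."""
    ns = [HYPERNYM[c]] if c in HYPERNYM else []
    ns += [w for w, h in HYPERNYM.items() if h == c]
    return ns

def path_length(c1, c2):
    """Shortest number of links between two concepts, or None if unconnected."""
    seen, queue = {c1}, deque([(c1, 0)])
    while queue:
        node, d = queue.popleft()
        if node == c2:
            return d
        for n in neighbours(node):
            if n not in seen:
                seen.add(n)
                queue.append((n, d + 1))
    return None

assert path_length("apple", "carrot") == 4  # apple-fruit-produce-vegetable-carrot
assert path_length("apple", "bank") is None
```

Note that the apple-carrot path changes direction (up to produce, then down again); plain path counting ignores this, which is exactly the weakness the slide points out.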


slide-80
SLIDE 80

Relatedness based on WordNet (Hirst & St-Onge, 1998)

Three types of relations:

1. extra-strong: literal repetition of a word
2. strong:
  • the concepts are in the same synset (e.g. human and person), or
  • the concepts' synsets are connected by a horizontal link (antonymy or similarity relation, e.g. precursor and successor), or
  • one of the words is a compound that includes the other (e.g. school and private school)

3. medium-strong: there exists an allowable path between the concepts' synsets. Medium-strong paths are weighted by

  weight = C − (path length) − k × (number of changes of direction)

  (C and k are empirically set constants)
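The weighting scheme is direct to compute. In this sketch the defaults C = 8 and k = 1 are illustrative values only; the constants are set empirically:

```python
def medium_strong_weight(path_len, direction_changes, C=8, k=1):
    """Weight of a medium-strong path:
    C - (path length) - k * (number of changes of direction).
    The defaults C=8, k=1 are illustrative; C and k are set empirically."""
    return C - path_len - k * direction_changes

# A 3-link path with one change of direction:
print(medium_strong_weight(3, 1))  # 8 - 3 - 1 = 4
```

Longer paths and paths with more changes of direction thus receive lower weights, matching the intuition that changes in direction weaken similarity.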


slide-81
SLIDE 81

Relatedness based on WordNet (Hirst & St-Onge, 1998)

An allowable path contains no more than five links and conforms to one of eight patterns, definable by the following rules:

  • no other direction may precede an upward link
  • at most one change of direction is allowed, except that
  • it is permitted to use a horizontal link to make a transition from an upward to a downward direction
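These rules can be encoded directly. The sketch below checks a path given as a string of link directions ('u' = upward, 'd' = downward, 'h' = horizontal); it implements the three rules literally rather than enumerating the eight patterns:

```python
import re

def is_allowable(path):
    """Check whether a link sequence is an allowable path.
    `path` is a string of directions, e.g. 'uud' = up, up, down.
    A sketch encoding the three stated rules directly."""
    # No more than five links (and at least one).
    if not 1 <= len(path) <= 5:
        return False
    # Rule 1: no other direction may precede an upward link,
    # i.e. all upward links form a prefix of the path.
    if not re.fullmatch(r"u*[dh]*", path):
        return False
    # Rule 2: at most one change of direction ...
    changes = sum(a != b for a, b in zip(path, path[1:]))
    if changes <= 1:
        return True
    # ... except that a horizontal link may mediate an
    # upward-to-downward transition (pattern u+h+d+, two changes).
    return bool(re.fullmatch(r"u+h+d+", path))
```

For example, `is_allowable("uhd")` is True (horizontal link mediating up-to-down), while `is_allowable("du")` is False (an upward link preceded by a downward one).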


slide-82
SLIDE 82

Relatedness based on WordNet (Hirst & St-Onge, 1998)

Figure 2: (a) Patterns of paths allowable in medium-strong relations and (b) patterns of paths not allowable. (Each vector denotes one or more links in the same direction.)

slide-83
SLIDE 83

Relatedness based on WordNet (Hirst & St-Onge, 1998)

Figure 3: Example of a regular relation between two words: apple and carrot are connected upward through fruit and {produce, green_goods} and downward through {vegetable, veggie}. (@ = hypernymy, ~ = hyponymy)


slide-84
SLIDE 84

Relatedness based on WordNet

Problems/Disadvantages


slide-85
SLIDE 85

Relatedness based on WordNet

Problems/Disadvantages:

  • availability of a hierarchical lexicon
  • coverage of the lexicon
  • sparse and dense areas in the hierarchy are not comparable (normalisation necessary, not a solved problem)
  • only “classical” relations (hypernymy, hyponymy, antonymy etc.); can’t model fuzzy relations, e.g. fire and coals


slide-86
SLIDE 86

Relatedness based on distributional similarity

Possible measures:

  • Pointwise Mutual Information (PMI):

      I(x, y) = log2 [ P(x, y) / (P(x) · P(y)) ]

  • PMI over-inflates low-frequency events; better:

      I_corrected(x, y) = log2 [ P(x, y) / (P(x) · P(y)) × min(freq(x), freq(y)) / (min(freq(x), freq(y)) + 1) ]

  • cosine of the angle between the co-occurrence vectors of two words
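As a concrete reference, the three measures above can be written out as follows (the probabilities and frequencies are assumed to come from corpus counts; the example values are made up):

```python
import math

def pmi(p_xy, p_x, p_y):
    """Pointwise mutual information: log2( P(x,y) / (P(x)P(y)) )."""
    return math.log2(p_xy / (p_x * p_y))

def corrected_pmi(p_xy, p_x, p_y, freq_x, freq_y):
    """PMI discounted by min(freq)/(min(freq)+1) to damp the
    over-inflation of low-frequency events."""
    m = min(freq_x, freq_y)
    return math.log2(p_xy / (p_x * p_y) * m / (m + 1))

def cosine(u, v):
    """Cosine of the angle between two co-occurrence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Independent words (P(x,y) = P(x)P(y)) get PMI 0; the corrected
# score stays slightly below it, and more so at low frequencies.
print(pmi(0.25, 0.5, 0.5))
print(corrected_pmi(0.25, 0.5, 0.5, 1000, 1000))
```

The discount factor min(freq(x), freq(y)) / (min(freq(x), freq(y)) + 1) approaches 1 for frequent word pairs (leaving PMI nearly unchanged) but pulls the score down sharply when one of the words is rare.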


slide-87
SLIDE 87

Relatedness based on distributional similarity

Problems/Disadvantages


slide-88
SLIDE 88

Relatedness based on distributional similarity

Problems/Disadvantages:

  • conflation of word senses
  • sometimes unpredictable results (corpus size, domain, similarity measure etc. play a role)


slide-89
SLIDE 89

Chain-Building Algorithms

Basic Idea: place two words in the same chain if their relatedness is above threshold t


slide-90
SLIDE 90

Chain-Building Algorithms

Basic idea: place two words in the same chain if their relatedness is above a threshold t.

Design decisions:

  • can a word be placed in several chains?
  • does a word have to be related to all other words in the chain, or just to one other word? If it has to be related to all words, does the average similarity have to be above the threshold, or the minimum/maximum?
  • greedy vs. non-greedy chain building (and its interaction with word-sense disambiguation)
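One concrete instantiation of these decisions (a word joins at most one chain, relatedness to a single chain member suffices, chains are built greedily in text order; the relatedness function and threshold below are toy placeholders):

```python
def build_chains(words, relatedness, t=0.5):
    """Greedy single-link chaining: append each word to the first
    chain containing at least one word related to it above
    threshold t; otherwise start a new chain."""
    chains = []
    for w in words:
        for chain in chains:
            if any(relatedness(w, c) >= t for c in chain):
                chain.append(w)
                break
        else:  # no sufficiently related chain found
            chains.append([w])
    return chains

# Toy relatedness: 1.0 if two words share a category, else 0.0.
CATEGORY = {"apple": "food", "pear": "food",
            "bus": "transport", "train": "transport"}

def rel(a, b):
    return 1.0 if CATEGORY.get(a) == CATEGORY.get(b) else 0.0

print(build_chains(["apple", "bus", "pear", "train"], rel))
# [['apple', 'pear'], ['bus', 'train']]
```

Swapping in a different membership test (all members above t, or average/minimum similarity) or allowing a word in several chains changes the design point without changing the basic loop.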


slide-91
SLIDE 91

Chain-Building Algorithms (Hirst & St-Onge 1998)

Figure 5: A word starting a new chain. (The word man has six synsets, each opening a candidate interpretation of the chain.)


slide-92
SLIDE 92

Chain-Building Algorithms (Hirst & St-Onge 1998)

Figure 6: Push the same word (extra-strong relation: literal repetition).


slide-93
SLIDE 93

Chain-Building Algorithms (Hirst & St-Onge 1998)

Figure 7: Push an antonym (strong relation: woman is linked to man via a horizontal link).


slide-94
SLIDE 94

Chain-Building Algorithms (Hirst & St-Onge 1998)

Figure 8: Updated chain after insertion: only the synsets man {adult male} and woman {adult female} remain, disambiguating both words.


slide-95
SLIDE 95

Lexical Chains. So what are they good for?
