Using Discourse Information for Paraphrase Extraction
Michaela Regneri & Rui Wang
Saarland University / DFKI GmbH (Saarbrücken, Germany)
EMNLP-CoNLL 2012, Jeju, Korea
Paraphrase Resources
- ...are important. (RTE, Machine Translation,
Question Answering, ...)
- many approaches create paraphrase resources
from monolingual parallel corpora
- hardly any approach exploits discourse
information
- we show that discourse information helps to
extract sentential paraphrases and phrase-level paraphrase fragments
Paraphrasing & Discourse Knowledge
Paraphrase pair: "She gives Foreman one shot." / "Cuddy agrees to give him one chance to prove himself."
In context (one recap): When House leaves, Foreman pushes for his job. She gives Foreman one shot. Foreman meets with Thirteen and Chris Taub.
In context (another recap): Once he goes, Foreman asks to take over as head of diagnostics. Cuddy agrees to give him one chance to prove himself. Foreman, Hadley, and Taub get the conference room ready and Foreman explains that he'll be in charge.
- distributional hypothesis applied to
sentences & discourse context
- coreference resolution
Outline
- Paraphrasing & Discourse Knowledge √
- System Overview
- Evaluation
System Overview
[Figure: system overview, with the Discourse Information components highlighted]
- input: recaps of House M.D. = parallel corpus with parallel discourse structures
- + Multiple Sequence Alignment + semantic similarity -> sentence-level paraphrases
("The psychiatrist suggests him to get a hobby" / "Nolan tells House to take up a hobby.")
- + word alignments + coreference resolution + dependency trees -> paraphrase fragments
("get a hobby" / "take up a hobby")
A Parallel Corpus
- different summaries of House MD episodes
- entirely parallel discourse structure (linear
sequential order, like events on screen)
- intermediate length, lots of sources on the web
- We’re working on Season 6: 20 episodes x 8
recaps (14735 sentences)
- easy to extend (2 hours for data collection)
- Preprocessing: sentence splitting, parsing
Sequence Alignment
- Sequence Alignment arranges two
sequences so as to align as many similar (equal) elements as possible
- compute the alignment with the
lowest cost, given costs / scores for
- gap introduction
- matching two items
- Multiple Sequence Alignment (MSA)
generalizes this task for arbitrarily many sequences
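The pairwise case can be sketched with the standard Needleman-Wunsch dynamic program. This is a minimal illustration, not the system described on these slides: the function name `align`, the toy score function and the gap value are assumptions, and it maximizes a score (equivalent to minimizing a cost with the signs flipped).

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def align(
    a: Sequence[T],
    b: Sequence[T],
    score: Callable[[T, T], float],
    gap: float = -1.0,
) -> List[Tuple[Optional[T], Optional[T]]]:
    """Globally align a and b, maximizing the total score (Needleman-Wunsch)."""
    n, m = len(a), len(b)
    # best[i][j] = best total score for aligning a[:i] with b[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # match two items
                best[i - 1][j] + gap,  # gap in b
                best[i][j - 1] + gap,  # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs: List[Tuple[Optional[T], Optional[T]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and best[i][j] == best[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or best[i][j] == best[i - 1][j] + gap):
            pairs.append((a[i - 1], None))  # None marks a gap
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    return pairs[::-1]


# Toy usage: items are letters, equal items score +1, unequal items -1.
pairs = align(list("ABCD"), list("ACD"), lambda x, y: 1.0 if x == y else -1.0)
```

Here `B` has no counterpart in the second sequence, so it is paired with a gap. Multiple Sequence Alignment extends the same idea to more than two sequences, which this sketch does not attempt.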
[Figure: example sequences, gaps and the resulting alignment]
Sentence Matching with MSA
(cf. Regneri et al. 2010)
- recaps = sequences of
sentences
- alignment score for two
sentences = vector-based semantic similarity
- constant gap costs
- aligned sentences =
paraphrases
- high context similarity +
high semantic similarity = alignment
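As a rough illustration of a vector-based score for two sentences, the sketch below computes cosine similarity over plain word-count vectors. The slides only say "vector-based semantic similarity", so the bag-of-words representation, the whitespace tokenization and the name `cosine_similarity` are simplifying assumptions, not the measure actually used.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a: str, sentence_b: str) -> float:
    """Cosine of the angle between two bag-of-words count vectors."""
    va = Counter(sentence_a.lower().split())
    vb = Counter(sentence_b.lower().split())
    # Dot product over the words of sentence_a (missing words count as 0).
    dot = sum(count * vb[word] for word, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty sentence is similar to nothing
    return dot / (norm_a * norm_b)


# Toy usage with the running example from the recaps.
sim = cosine_similarity(
    "She gives Foreman one shot",
    "Cuddy agrees to give him one chance to prove himself",
)
```

A function like this could serve as the match score in a sequence alignment, with a constant value for gaps. Pure word overlap is a weak signal for paraphrases with little shared vocabulary, which is one reason to combine it with the sequential context.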
[Figure: three recaps as sentence sequences (s1.1 s1.2 s1.3, s2.1 s2.2 s2.3, s3.1 s3.2 s3.3); sequential discourse information + semantic sentence similarity -> MSA with Paraphrases]
recap 1        recap 2        recap 3
sentence 1.1   ∅              ∅
sentence 1.2   sentence 2.1   sentence 3.1
∅              ∅              sentence 3.2
sentence 1.3   ∅              sentence 3.3
Sample Results of the MSA
Sentences from the aligned rows of recap 1, recap 2, recap 3 and recap 4:
- She gives Foreman one shot.
- Cuddy agrees to give him one chance to prove himself.
- Foreman insists he deserves a chance and Cuddy gives in, warning him he gets one shot.
- Foreman meets with Thirteen and Chris Taub.
- They decide that it might be CRPS and Foreman orders a spinal stimulation.
- Thirteen and Taub go to see the patient, who thinks he has mercury poisoning from eating too much fish.
- He suggests they give him a blood test for mercury poisoning.
- The millionaire has checked on the Internet and believes that he has mercury poisoning caused by sushi.
- Vince disagrees, checks on the Internet, and suggests mercury poisoning brought on by the sushi he eats constantly.
- He's also researching his case on the internet and asks for a blood test to rule out the diagnosis.
- Foreman is upset Thirteen and Taub did the blood test (which does not reveal any poisoning) without consulting him.
- He argues that his symptoms don't match up exactly with CRPS and asks them to give him a blood test for heightened mercury levels.
- He asks them to run one blood test to check for mercury.
Paraphrase Fragments
- Most aligned sentence pairs overlap, but they
don’t cover exactly the same content
- We want to extract smaller sentence parts (of
different sizes) that match
- Test the advantages of coreference resolution
He argues that his symptoms don't match up exactly with CRPS and asks them to give him a blood test for heightened mercury levels.
He asks them to run one blood test to check for mercury.
Basic Fragment Extraction
(cf. Wang & Callison-Burch 2011)
- aligned recaps as parallel corpora
for Machine Translation (“translate” EN -> EN)
- compute word alignments for
aligned sentences (Giza++)
- a fragment pair is a sequence of
aligned word pairs
- apply smoothing & heuristics
to determine fragment boundaries (-> minimal enclosing chunks)
- discard trivial fragments
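The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the heuristics of Wang & Callison-Burch: the word alignment links are written by hand rather than produced by Giza++, "smoothing" is reduced to tolerating a few unaligned words inside a run, the chunk-boundary step is omitted, and the names `extract_fragments`, `max_gap` and `min_len` are hypothetical.

```python
from typing import List, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def extract_fragments(
    src: List[str],
    tgt: List[str],
    links: List[Link],
    max_gap: int = 1,
    min_len: int = 2,
) -> List[Tuple[str, str]]:
    """Turn runs of aligned word pairs into fragment pairs."""
    runs: List[List[Link]] = []
    for link in sorted(links):
        # Smoothing: tolerate up to max_gap unaligned source words inside a run.
        if runs and link[0] - runs[-1][-1][0] <= max_gap + 1:
            runs[-1].append(link)
        else:
            runs.append([link])
    fragments = []
    for run in runs:
        src_span = src[run[0][0] : run[-1][0] + 1]
        tgt_idx = [j for _, j in run]
        tgt_span = tgt[min(tgt_idx) : max(tgt_idx) + 1]
        too_short = len(src_span) < min_len or len(tgt_span) < min_len
        # A pair of identical word sequences is a trivial paraphrase.
        trivial = [w.lower() for w in src_span] == [w.lower() for w in tgt_span]
        if not too_short and not trivial:
            fragments.append((" ".join(src_span), " ".join(tgt_span)))
    return fragments


src = "Vince tells them to give him a blood test for heightened mercury levels".split()
tgt = "He asks them to run a blood test to check for mercury".split()
# Hand-written links for illustration: Vince-He, give-run, a-a, blood-blood,
# test-test, for-for, mercury-mercury.
links = [(0, 0), (4, 4), (6, 5), (7, 6), (8, 7), (9, 10), (11, 11)]
fragments = extract_fragments(src, tgt, links)
```

With these links, the isolated pair Vince-He is dropped as too short, and the remaining links form one run that yields a single fragment pair.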
[Figure: sentence alignments -> word alignments, shown for the sentence pair below]
Vince tells them to give him a blood test for heightened mercury levels.
He asks them to run a blood test to check for mercury.
VP/PP Fragment Extraction
- two types of fragments:
- phrases with a verb &
same syntactic category
- prepositional phrases
- discard complete sentences
and trivial fragments
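To make the idea of category-based fragments concrete, the sketch below collects every VP and PP from a toy parse tree and skips the root, so the complete sentence is never returned. It is only an illustration: the nested-tuple tree format, the hand-written parse and the name `vp_pp_fragments` are assumptions, and it uses a constituency-style tree for simplicity rather than the representation used in the actual system.

```python
from typing import List, Tuple, Union

Tree = Union[str, tuple]  # a word, or (label, child, child, ...)


def words(tree: Tree) -> List[str]:
    """Collect the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]


def vp_pp_fragments(tree: Tree) -> List[Tuple[str, str]]:
    """Return (label, text) for every VP or PP strictly below the root."""
    found: List[Tuple[str, str]] = []

    def visit(node: Tree, is_root: bool) -> None:
        if isinstance(node, str):
            return
        if node[0] in ("VP", "PP") and not is_root:
            found.append((node[0], " ".join(words(node))))
        for child in node[1:]:
            visit(child, False)

    visit(tree, True)
    return found


# Hand-written toy parse (an assumption, not parser output).
tree = (
    "S",
    ("NP", "Vince"),
    ("VP", "tells", ("NP", "them"),
        ("VP", "to", "give", ("NP", "him"), ("NP", "a", "blood", "test"),
            ("PP", "for", ("NP", "mercury")))),
)
candidates = vp_pp_fragments(tree)
```

Candidate fragments from two aligned sentences could then be paired only when their labels agree (VP with VP, PP with PP), which is one way to read the "same syntactic category" condition.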
Vince tells them to give him a blood test for heightened mercury levels.
He asks them to run a blood test to check for mercury.
VP fragment pair: "give him a blood test for heightened mercury levels" / "to run a blood test to check for mercury"
Outline
- Paraphrasing & Discourse Knowledge √
- System Overview √
- Evaluation
Evaluation: Sentence Matching
- Baselines to measure contribution of semantic
similarity & MSA:
- MSA with BLEU as score function
- Clustering (no sequential information) with
Vector Similarities
- Clustering with BLEU scores
Evaluation: Sentence Matching
- Evaluation Set: from each baseline and the
system, pick 400 pairs labelled as paraphrase; add 400 completely random pairs
- 2 annotators label each pair as paraphrase,
containment, related or unrelated
- conflicts resolved by 3rd annotator
- for the final evaluation, we divide the set into
unrelated pairs and good matches
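Once every pair carries a binary gold label (good match vs. unrelated) and a binary system decision, the usual scores follow directly. The sketch below uses the standard definitions of precision, recall, F-score and accuracy; the function name `binary_metrics` and the toy labels are assumptions, not data from the evaluation.

```python
from typing import Dict, List


def binary_metrics(predicted: List[bool], gold: List[bool]) -> Dict[str, float]:
    """Precision, recall, F1 and accuracy for binary decisions.

    True means "good match", False means "unrelated".
    """
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(not p and g for p, g in zip(predicted, gold))
    tn = sum(not p and not g for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    accuracy = (tp + tn) / len(gold) if gold else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Toy usage with made-up labels for four pairs.
scores = binary_metrics(
    predicted=[True, True, False, False],
    gold=[True, False, True, False],
)
```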
Evaluation: Sentence Matching
[Bar chart: Precision, Recall, F-Score and Accuracy (y-axis 0.25 to 1.00) for Random, Cluster + BLEU, Cluster + Vector, MSA + BLEU and MSA + Vector]
Evaluation: Fragment Extraction
- evaluation of 3 main configurations: Basic
(=Alignments + Chunker), VP/PP (clauses + PPs), VP/PP + Coreference Resolution (preprocessing)
- Gold Standard: 150 pairs per configuration
(~same labeling scheme as for sentences)
- Precision is evaluated against gold standard
- Recall is hard to determine; we report productivity
instead (= #fragments per sentence pair)
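The two quantities can be computed as in the small sketch below. The counts are invented for illustration, and the derived "good fragments per sentence pair" figure (precision times productivity) is an assumption about how the two numbers might be combined, not a formula stated on the slides.

```python
from typing import Dict, List


def fragment_scores(
    num_sentence_pairs: int,
    num_fragment_pairs: int,
    judged_good: List[bool],
) -> Dict[str, float]:
    """Precision on an annotated sample plus productivity over all input pairs."""
    # Productivity: extracted fragment pairs per input sentence pair.
    productivity = num_fragment_pairs / num_sentence_pairs
    # Precision: share of annotated fragment pairs judged to be good.
    precision = sum(judged_good) / len(judged_good)
    return {
        "precision": precision,
        "productivity": productivity,
        "good_per_sentence_pair": precision * productivity,
    }


# Hypothetical numbers: 200 sentence pairs yield 300 fragment pairs,
# and 3 of 4 sampled fragment pairs are judged good.
result = fragment_scores(200, 300, [True, True, False, True])
```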
Evaluation: Fragment Extraction
[Bar chart: Precision and Productivity (y-axis 0.25 to 1.00) for Basic, VP/PP and VP/PP + Coref]
Influence of Discourse Information on Fragment Extraction
[Bar chart: Precision and Productivity (y-axis 0.25 to 1.00) for Cluster + VP and MSA + VP]
30x more good fragment pairs per sentence pair
Conclusion
- Discourse Knowledge for Paraphrase Extraction
- a new, highly parallel corpus
- Multiple Sequence Alignment for sentence
matching
- (grammatical) paraphrase fragments
- discourse information gives big advantages in all
processing stages
Future Work
- use MSA with clauses instead of sentences
- with a temporal classifier as preprocessing, use
arbitrary comparable corpora
- align actual discourse trees (e.g. in RST or SDRT
style)
Dataset in supplementary material:
http://www.aclweb.org/supplementals/D/D12/D12-1084.Attachment.zip