Investigating Citation Linkage as a Sentence Similarity Measurement Task using Deep Learning
Computer Science
Sudipta Singha Roy, Robert E. Mercer, Felipe Urra
(ssinghar@uwo.ca)
The University of Western Ontario
relatedness task.
sentence with each sentence from the reference document and then tries to determine which sentence pair is semantically similar and which pair is not.
sentence and the citation linkage task has been designed as a semantic relatedness measurement task at the sentence level.
classification which operates on sentence pairs.
sentences from a cited paper given a citation sentence.
sixty thousand sentence pairs from the biomedical domain.
sentences from different biomedical domains.
cited sentences given a citation sentence.
domain expert.
23 citation sentences and 3857 candidate cited sentences.
the accuracy they achieved was low.
approaches to determine the citation linkage between citation and cited sentence pairs.
sentence level.
compute the textual semantic similarity.
Jaccard similarity
Network and cosine similarity.
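Of the measures named above, Jaccard similarity is the simplest to illustrate. The following is a minimal sketch, assuming lowercased whitespace tokenization; the slides do not state which tokenization was used, and the function name `jaccard_similarity` is illustrative.

```python
def jaccard_similarity(sentence_a: str, sentence_b: str) -> float:
    """Token-set overlap: |A & B| / |A | B|.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Two empty sentences: define the similarity as 0.0.
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


if __name__ == "__main__":
    citing = "the cell divides"
    cited = "the cell grows"
    print(jaccard_similarity(citing, cited))
```

Because the score depends only on shared surface tokens, two sentences that paraphrase each other with different vocabulary score low, which is one reason such a measure serves as a baseline rather than as the main model.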
linkage task for the biomedical domain with 3857 sentence pairs.
positive.
model biased.
pairs over three biomedical topics: cell biology, biochemistry, and chemical biology.
rather by Sent2Vec
resulting sentence vectors as a measure of semantic similarity of the two sentences in each pair.
negative.
validation and test purposes.
wide spectrum of biomedical journals
vectors.
portion of the human annotated dataset from Houngbo and Mercer's work.
the reference articles
these reference articles are manually collected from the web
Source: PubMed
28,310 Research Papers
112 Reference Papers
2,289 Citing Papers
Manually Collected
475,807 Sentence Pairs
Pretrained Sent2Vec
Cosine Similarity > 0.57?
Yes: 31,624 +ve Sentence Pairs
No: 37,274 -ve Sentence Pairs (using -ve algorithm)
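The labelling decision in this flow can be sketched as follows. This is a minimal illustration, assuming the two sentence vectors have already been produced by a pretrained Sent2Vec model; the vectors below are toy values, and the names `cosine_similarity` and `is_positive_pair` are illustrative, not part of any library.

```python
import math

SIMILARITY_THRESHOLD = 0.57  # cut-off shown in the flow diagram


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


def is_positive_pair(citing_vec, cited_vec, threshold=SIMILARITY_THRESHOLD):
    """True when the pair would be labelled +ve under the cut-off."""
    return cosine_similarity(citing_vec, cited_vec) > threshold


if __name__ == "__main__":
    # Toy 2-d vectors standing in for Sent2Vec sentence embeddings.
    print(is_positive_pair([1.0, 0.0], [1.0, 1.0]))
    print(is_positive_pair([1.0, 0.0], [1.0, 2.0]))
```

Pairs that fail the test are only candidates for the negative class; the flow applies a separate sampling step before they enter the corpus.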
For each citing sentence Ci:
    if the number of +ve samples is n and n > 0:
        choose n -ve samples randomly where the citing sentence is Ci
    else if n == 0:
        choose 5 -ve samples randomly where the citing sentence is Ci
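The sampling rule above might be written in Python as below. This is a hedged sketch, not the authors' code: the pair representation `(citing_id, cited_sentence)`, the function name `sample_negatives`, the fixed seed, and the cap on the sample size when a citing sentence has too few candidates are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_negatives(positives, negative_candidates, default_k=5, seed=0):
    """Pick -ve pairs per citing sentence, following the rule on the slide.

    positives, negative_candidates: lists of (citing_id, cited_sentence).
    """
    rng = random.Random(seed)  # assumption: seeded for repeatability

    positives_per_citing = defaultdict(int)
    for citing_id, _ in positives:
        positives_per_citing[citing_id] += 1

    candidates_by_citing = defaultdict(list)
    for citing_id, cited in negative_candidates:
        candidates_by_citing[citing_id].append(cited)

    sampled = []
    for citing_id, candidates in candidates_by_citing.items():
        n = positives_per_citing.get(citing_id, 0)
        k = n if n > 0 else default_k
        # Assumption: never request more samples than are available.
        k = min(k, len(candidates))
        sampled.extend((citing_id, c) for c in rng.sample(candidates, k))
    return sampled


if __name__ == "__main__":
    pos = [("C1", "p1"), ("C1", "p2")]
    neg = [("C1", f"n{i}") for i in range(10)]
    neg += [("C2", f"m{i}") for i in range(10)]
    for pair in sample_negatives(pos, neg):
        print(pair)
```

Matching the number of negatives to the number of positives per citing sentence keeps the two classes roughly balanced, while the fallback of 5 ensures that citing sentences with no positive match still contribute negative examples.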
Source: PubMed
28,310 Research Papers
112 Reference Papers
2,289 Citing Papers
Manually Collected
475,807 Sentence Pairs
Pretrained Sent2Vec
Cosine Similarity > 0.57?
Yes: 31,624 +ve Sentence Pairs
No: 37,274 -ve Sentence Pairs (as described in the text)
Final Corpus: 68,898 Sentence Pairs
portion of the human annotated corpus created by Houngbo and Mercer.
positive samples.
Names and Citations
semantic relatedness between citing and cited sentence pairs.
Infersent (Modified from Conneau et al., 2017)
Infersent with Bi-LSTM and Max-pooling (Conneau et al., 2017)
[Figure: input sentence s_cited / s_citing]
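The max-pooling encoder named on this slide can be illustrated with a small sketch. In the original InferSent design (Conneau et al., 2017), a sentence vector is the element-wise maximum over the Bi-LSTM hidden states, and the two sentence vectors u and v are combined as (u, v, |u - v|, u * v) before classification. Whether the modified architecture used here keeps that exact combination is not stated on the slides, so treat the second function as an assumption. The hidden states are toy numbers, not the output of a trained Bi-LSTM.

```python
def max_pool_over_time(hidden_states):
    """Element-wise max across per-token hidden vectors.

    hidden_states: list of equal-length lists, one per token.
    """
    return [max(dimension) for dimension in zip(*hidden_states)]


def pair_features(u, v):
    """Concatenate u, v, |u - v| and u * v, as in the original InferSent."""
    abs_diff = [abs(a - b) for a, b in zip(u, v)]
    product = [a * b for a, b in zip(u, v)]
    return list(u) + list(v) + abs_diff + product


if __name__ == "__main__":
    # Two tokens, two hidden dimensions each (toy values).
    s_citing = max_pool_over_time([[1, 5], [3, 2]])
    s_cited = max_pool_over_time([[0, 1], [3, 1]])
    print(s_citing, s_cited)
    print(pair_features(s_citing, s_cited))
```

Max-pooling yields a fixed-size vector regardless of sentence length, which lets the citing and cited sentences share one encoder.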
Infersent with Inner & Hierarchical Attention (Conneau et al., 2017)
[Figure: inner attention and hierarchical attention; input sentence s_cited / s_citing]
Infersent with Hierarchical ConvNet (Conneau et al., 2017)
[Figure: input sentence s_cited / s_citing]
[Figure: bootstrapping, step 1. Synthetic corpus annotated by Sent2Vec, split into Partition 1, Partition 2, Partition 3. Train: Partition 1. Model output: Annotated 1.]
[Figure: bootstrapping, step 2. Remaining data: Partition 3, Partition 2, Annotated 1. Train: Partition 2 and Annotated 1. Model output: Annotated 2.]
[Figure: bootstrapping, step 3. Remaining data: Partition 3, Annotated 2, Annotated 1. Train: Partition 3, Annotated 2, and Annotated 1. Model output: Annotated 3.]
[Figure: bootstrapping, final step. Train: Annotated 3, Annotated 2, and Annotated 1. Test: human-annotated test data. Output: performance analysis.]
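The bootstrapping diagrams show only box labels, so the exact procedure cannot be recovered from the slides alone. One possible reading is sketched here: at each step the model is trained on the current partition together with all previously re-annotated partitions, then relabels the current partition, and the final model is trained on the three re-annotated partitions. That reading, the one-dimensional stand-in "model" (a score cut-off halfway between the class means), and the names `train`, `annotate`, and `bootstrap` are all assumptions for illustration, not the deep models used in the talk.

```python
from statistics import mean


def train(examples):
    """Stand-in model: a cut-off halfway between the two class means.

    examples: list of (score, label) with label in {0, 1}; both classes
    must be present.
    """
    positives = [score for score, label in examples if label == 1]
    negatives = [score for score, label in examples if label == 0]
    return (mean(positives) + mean(negatives)) / 2


def annotate(cutoff, examples):
    """Relabel examples with the stand-in model, ignoring old labels."""
    return [(score, 1 if score > cutoff else 0) for score, _ in examples]


def bootstrap(partitions):
    """Assumed loop: train on partition k plus earlier annotated data,
    relabel partition k, and repeat; then train the final model."""
    annotated = []
    for partition in partitions:
        earlier = [ex for part in annotated for ex in part]
        cutoff = train(partition + earlier)
        annotated.append(annotate(cutoff, partition))
    final_cutoff = train([ex for part in annotated for ex in part])
    return final_cutoff, annotated


if __name__ == "__main__":
    parts = [
        [(0.9, 1), (0.1, 0), (0.3, 1)],  # (0.3, 1) is a noisy label
        [(0.8, 1), (0.2, 0)],
        [(0.7, 1), (0.3, 0)],
    ]
    final_cutoff, annotated = bootstrap(parts)
    print(annotated[0])
    print(final_cutoff)
```

In this toy run the noisy label in the first partition is flipped during relabelling, which is the kind of correction a bootstrapping loop is meant to provide. Evaluation against the human-annotated test data would follow as a separate step.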
FastText:
Sent2Vec:
accuracy was divided by 5.
M1: Hier. ConvNet; M2: Bi-LSTM Max-pool; M3: Inner Attn.; M4: Hier. Attn.; Boot: Bootstrapped.
sentences from a cited paper given a citation sentence.
sixty thousand sentence pairs from the biomedical domain.
sentences from different biomedical domains.
sentence and sub sentence spans.
be applied.
for method citation sentences. It would be appropriate to have humans annotate a test set covering a variety of citation types and to see how well the proposed method performs on this expanded test set.