Generating Links to Background Knowledge: A Case Study Using Narrative Radiology Reports
Jiyin He (1), Maarten de Rijke (2), Merlijn Sevenster (3), Rob van Ommering (3), Yuechen Qian (3)
(1) CWI; (2) University of Amsterdam; (3) Philips Research
1
2
explanation or background information - Anchor detection
explanation or background information - Target finding
3
findings, diagnoses and recommendations for followup actions
generation with Wikipedia in general domain
data
MeSH, ICD-9, ICD-10
4
Miner (Milne and Witten 2008)
content?
a manually annotated test collection
5
anchor text, the more likely it will be used as an anchor text again.
an anchor text and the target page
target pages by measuring the relatedness of a WP page and the context of the phrase
detection
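The anchor-detection intuition on this slide (a phrase that is often used as anchor text is likely to be an anchor again) can be sketched as a simple ratio, often called keyphraseness in the wikification literature. This is a minimal illustration, not the implementation of either baseline system: the `documents` structure and the function name are assumptions, and matching is done by naive substring search.

```python
def keyphraseness(phrase, documents):
    """Fraction of documents containing `phrase` in which it is a link anchor.

    `documents` is a hypothetical list of (text, anchors) pairs, where
    `anchors` is the set of phrases hyperlinked in that document.
    Substring matching is used for brevity, so "brain" also matches
    "brainstem"; a real system would match on token boundaries.
    """
    phrase = phrase.lower()
    containing = 0  # documents in which the phrase occurs at all
    linked = 0      # documents in which the phrase occurs as an anchor
    for text, anchors in documents:
        if phrase in text.lower():
            containing += 1
            if phrase in {a.lower() for a in anchors}:
                linked += 1
    return linked / containing if containing else 0.0
```

A phrase would then be proposed as an anchor when its score exceeds some threshold; the threshold itself is a tuning choice not given on the slide.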
6
engine
concept that reasonably covers the topic was sought
7
8
System           Anchor detection     Target finding       Overall
                 P     R     F        P     R     F        P     R     F
Wikify! (Lesk)   0.35  0.16  0.22     0.4   0.4   0.4      0.14  0.07  0.09
Wikify! (ML)     0.35  0.16  0.22     0.69  0.69  0.69     0.25  0.12  0.16
WM               0.35  0.36  0.36     0.84  0.84  0.84     0.29  0.3   0.3
9
nouns, 32% are nouns with one or more modifiers
usually short and with less complicated structure
10
                  Occurrences in WP links   Coverage (%)   Example
Exact match       923                       14.3           “brain” (Report) & “brain” (WP)
Partial match     1,038                     16.1           “infarction” (Report) & “cerebellar infarction” (WP)
Sub-exact match   5,257                     81.6           “acute cerebral infarction” (Report) & “cerebral infarction” (WP)
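The three match types can be illustrated with a small classifier over token sequences. The definitions here are inferred only from the three examples in the table (exact: same phrase; partial: the report phrase occurs inside a longer WP anchor; sub-exact: a WP anchor occurs inside a longer report phrase), so they are an assumption and may differ from the definitions used in the study. The function names are hypothetical.

```python
def contains(longer, shorter):
    """True if token list `shorter` occurs contiguously inside `longer`."""
    k = len(shorter)
    return any(longer[i:i + k] == shorter
               for i in range(len(longer) - k + 1))

def match_type(report_anchor, wp_anchor):
    """Classify how a report anchor relates to one Wikipedia anchor text.

    Returns "exact", "partial", "sub-exact", or None (assumed definitions,
    see the lead-in above).
    """
    r = report_anchor.lower().split()
    w = wp_anchor.lower().split()
    if r == w:
        return "exact"
    if contains(w, r):   # report phrase is part of a longer WP anchor
        return "partial"
    if contains(r, w):   # WP anchor is part of a longer report phrase
        return "sub-exact"
    return None
```

Because one report anchor can relate differently to different WP anchors, the categories can overlap across the collection, which is consistent with the coverage figures summing to more than 100.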
11
anchor (EA); Outside-anchor (OA); Single-word-anchor (SWA)
12
based approach
sequences Sa
white matter disease: {white, matter, disease, white matter, matter disease, white matter disease}
candidates c based on their target probability:
The more often a page is linked to a phrase, the more likely it should be linked to it again.
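The two steps on this slide, enumerating sub-sequences of an anchor and ranking candidate pages by target probability, can be sketched as follows. This is an illustration under stated assumptions rather than the LiRa implementation: `link_counts` is a hypothetical mapping from anchor text to counts of the pages it links to, and target probability is taken to be the relative frequency of each page among those links.

```python
from collections import Counter

def sub_anchors(anchor):
    """All contiguous word sub-sequences of `anchor`, shortest first."""
    words = anchor.split()
    n = len(words)
    return [
        " ".join(words[i:i + size])
        for size in range(1, n + 1)
        for i in range(n - size + 1)
    ]

def rank_targets(sub_anchor, link_counts):
    """Rank candidate pages for `sub_anchor` by assumed target probability.

    `link_counts` (hypothetical) maps anchor text -> Counter of target
    pages, counting how often that text links to each page.
    Returns (page, probability) pairs, most probable first.
    """
    counts = link_counts.get(sub_anchor, Counter())
    total = sum(counts.values())
    return [(page, count / total) for page, count in counts.most_common()]
```

For the slide's example, `sub_anchors("white matter disease")` yields the six sub-sequences listed above, and each one can then be looked up with `rank_targets` to collect candidate pages.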
13
(a, c) as “link” or “non-link”
candidate page; weighted by the similarity of the sub-anchor to the original anchor
page about neuroradiology?
Avg
14
lowest confidence score is chosen
highest confidence score is chosen
15
16
System    P     R     F
LiRa      0.9   0.8   0.85
Wikify!   0.35  0.16  0.22
WM        0.35  0.36  0.36
Results of anchor detection
LiRa: system using our proposed approach
17
System           P     R     F
LiRa             0.8   0.8   0.8
Wikify! (Lesk)   0.4   0.4   0.4
Wikify! (ML)     0.69  0.69  0.69
Results of target finding for anchors identified by Wikify!

System   P     R     F
LiRa     0.89  0.89  0.89
WM       0.84  0.84  0.84
Results of target finding for annotated anchors

System           P     R     F
LiRa             0.68  0.68  0.68
Wikify! (Lesk)   0.13  0.13  0.13
Wikify! (ML)     0.26  0.26  0.26
Results of target finding for annotated anchors
18
System           P     R     F
LiRa             0.65  0.58  0.61
Wikify! (Lesk)   0.14  0.07  0.09
Wikify! (ML)     0.25  0.12  0.16
WM               0.29  0.3   0.3
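As a worked check on how the F column relates to P and R, the sketch below assumes F is the balanced F-measure (harmonic mean of precision and recall). The slides do not state the formula, but the reported values are consistent with it.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)

# LiRa overall: P = 0.65, R = 0.58
print(round(f_measure(0.65, 0.58), 2))
```

Recomputing from the rounded P and R shown on the slides can differ from the reported F in the last digit, since the reported figures were presumably computed from unrounded values.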
[Figure: plot with axes log(rank) and log(frequency)]
mass, vestibular nerves, brain, Virchow-Robin space, meningioma, Warthin’s tumor, frontal, Wegener’s granulomatosis, white matter, xanthogranulomas
19
Group       1      2        3       4      5       6
Frequency   >100   51-100   11-50   6-10   2-5     1
#Anchors    116    108      527     482    1,399   2,149
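The grouping in this table can be expressed as a small binning function. This sketch assumes the unlabeled ranges (>100, 51-100, 11-50, 6-10, 2-5, 1) are anchor frequencies; the function name is hypothetical.

```python
def frequency_group(freq):
    """Map an anchor frequency (assumed integer >= 1) to its group 1-6."""
    if freq < 1:
        raise ValueError("frequency must be at least 1")
    # (exclusive lower bound, group number), checked from highest to lowest
    bounds = [(100, 1), (50, 2), (10, 3), (5, 4), (1, 5)]
    for lower, group in bounds:
        if freq > lower:
            return group
    return 6  # frequency of exactly 1
```

Counting `frequency_group` over all anchors in a collection would reproduce a row like #Anchors above.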
20
21
22