NERank: Bringing Order to Named Entities from Texts
Chengyu Wang1, Rong Zhang1, Xiaofeng He1, Guomin Zhou2, Aoying Zhou1
1) Institute for Data Science and Engineering,
East China Normal University
2) Zhejiang Police College
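– $\theta_{i,j}$: probability of topic $t_j$ in document $d_i$
– $\hat{\varphi}_{i,j}$: probability of normalized entity $e_j$ in topic $t_i$
– Prior probability: $pr(t_i) = \frac{1}{|D|} \sum_{j=1}^{|D|} \theta_{j,i}$
– Entity richness: $er(t_i) = \frac{1}{\cdots} \sum_{j=1}^{|E|} \hat{\varphi}_{i,j}$
– Topic specificity: $ts(t_i) = \begin{cases} 0, & pr(t_i) < \epsilon \\ \dfrac{1}{\cdots \sum_{j=1}^{|D|} \theta_{j,i} \log_2 \theta_{j,i}}, & pr(t_i) \ge \epsilon \end{cases}$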
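As a rough illustration of the first two topic features, the sketch below computes the prior probability and the entity richness from small probability tables. The names `prior_probability`, `entity_richness`, and the `norm` parameter are hypothetical and not taken from the slides; `norm` stands in for the normalizing factor of entity richness and defaults to 1 purely for illustration. Topic specificity is left out.

```python
def prior_probability(theta):
    """Average topic probability over all documents.

    theta[j][i] is assumed to hold the probability of topic i in document j.
    """
    n_docs = len(theta)
    n_topics = len(theta[0])
    return [sum(row[i] for row in theta) / n_docs for i in range(n_topics)]


def entity_richness(phi_hat, norm=1.0):
    """Sum of normalized-entity probabilities per topic, divided by `norm`.

    phi_hat[i][j] is assumed to hold the probability of entity j in topic i.
    `norm` is a placeholder for the normalizing factor (an assumption).
    """
    return [sum(row) / norm for row in phi_hat]


# Toy data: 2 documents x 2 topics, and 2 topics x 2 entities.
theta = [[0.5, 0.5],
         [1.0, 0.0]]
phi_hat = [[0.2, 0.1],
           [0.0, 0.4]]

pr = prior_probability(theta)
er = entity_richness(phi_hat)
```

A topic that is dominant in many documents gets a high prior, while a topic whose probability mass falls mostly on named entities gets a high entity richness.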
– $r_0(t_i) = W^{T} \cdot F(t_i)$
– $F(t_i) = \langle pr(t_i), er(t_i), ts(t_i) \rangle$
– $\sum_i w_i = 1$
– For two topics $t_i$ and $t_j$, if $t_i$ is a more important topic than $t_j$, we have $r_0(t_i) > r_0(t_j)$
– Optimization objective: $\|W\|_2^2 + C \cdot \sum_{i,j} \xi_{i,j}$
– Constraints: $W^{T} \cdot F(t_i) - W^{T} \cdot F(t_j) \ge 1 - \xi_{i,j}$
– Train a linear SVM classifier to learn the weights
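To make the pairwise formulation concrete, the following sketch evaluates the objective for a given weight vector, taking each slack as the smallest non-negative value that satisfies its constraint. It does not train an SVM. The function names, the toy feature values, and the choice C = 1 are assumptions made for illustration only.

```python
def score(w, f):
    """Linear score W^T . F(t) for one topic."""
    return sum(wi * fi for wi, fi in zip(w, f))


def pairwise_objective(w, features, pairs, c):
    """Squared L2 norm of w plus c times the total slack.

    Each pair (i, j) states that topic i is more important than topic j.
    The slack for a pair is max(0, 1 - (score_i - score_j)).
    """
    slack_total = 0.0
    for i, j in pairs:
        margin = score(w, features[i]) - score(w, features[j])
        slack_total += max(0.0, 1.0 - margin)
    return sum(wi * wi for wi in w) + c * slack_total


# Hypothetical feature vectors <pr, er, ts> for two topics.
features = {"t1": (0.5, 0.6, 0.9), "t2": (0.1, 0.2, 0.0)}
pairs = [("t1", "t2")]  # t1 is labelled as more important than t2

obj_large = pairwise_objective((1.0, 1.0, 1.0), features, pairs, c=1.0)
obj_small = pairwise_objective((0.2, 0.3, 0.5), features, pairs, c=1.0)
```

With the larger weights the margin exceeds 1, so only the norm term contributes. With the smaller weights the pair violates the margin and pays a slack penalty instead.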
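– $r(t_i) = r_0(t_i)$
– Following TDT (Topic-Doc-Topic) meta path (with prob. $\alpha > 0$): $t_i \to d_j$ with probability $\frac{\theta_{j,i}}{\sum_{d_l \in D} \theta_{l,i}}$, then $d_j \to t_k$ with probability $\theta_{j,k}$
– Following TET (Topic-Entity-Topic) meta path (with prob. $\beta > 0$): $t_i \to e_j$ with probability $\frac{\hat{\varphi}_{i,j}}{\sum_{e_k \in E} \hat{\varphi}_{i,k}}$, then $e_j \to t_k$ with probability $\frac{\hat{\varphi}_{k,j}}{\sum_{t_l \in T} \hat{\varphi}_{l,j}}$
– Random jump (with prob. $1 - \alpha - \beta > 0$)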
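The two meta paths can be turned into topic-to-topic transition matrices by chaining their two steps and summing over the intermediate document or entity. The sketch below is one way to do that under the assumption that no topic or entity has an all-zero row or column; the function names and toy tables are hypothetical.

```python
def tdt_matrix(theta):
    """Topic -> document -> topic transition probabilities.

    theta[j][i] is assumed to hold the probability of topic i in document j.
    Step 1 normalizes over documents for the start topic; step 2 uses the
    document's own topic distribution.
    """
    n_docs, n_topics = len(theta), len(theta[0])
    col_sums = [sum(theta[j][i] for j in range(n_docs)) for i in range(n_topics)]
    return [[sum(theta[j][i] / col_sums[i] * theta[j][k] for j in range(n_docs))
             for k in range(n_topics)]
            for i in range(n_topics)]


def tet_matrix(phi_hat):
    """Topic -> entity -> topic transition probabilities.

    phi_hat[i][j] is assumed to hold the probability of entity j in topic i.
    Step 1 normalizes over entities for the start topic; step 2 normalizes
    over topics for the chosen entity.
    """
    n_topics, n_entities = len(phi_hat), len(phi_hat[0])
    row_sums = [sum(phi_hat[i]) for i in range(n_topics)]
    col_sums = [sum(phi_hat[i][j] for i in range(n_topics)) for j in range(n_entities)]
    return [[sum(phi_hat[i][j] / row_sums[i] * phi_hat[k][j] / col_sums[j]
                 for j in range(n_entities))
             for k in range(n_topics)]
            for i in range(n_topics)]


theta = [[0.5, 0.5],
         [1.0, 0.0]]
phi_hat = [[0.2, 0.1],
           [0.0, 0.4]]

p_tdt = tdt_matrix(theta)
p_tet = tet_matrix(phi_hat)
```

Each row of either matrix sums to 1 when every document's topic distribution sums to 1, so both behave as transition matrices over topics.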
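– $T_n = \alpha \cdot \bar{\Theta}^{T} \Theta \cdot T_{n-1} + \beta \cdot \tilde{\hat{\Phi}} \bar{\hat{\Phi}}^{T} \cdot T_{n-1} + (1 - \alpha - \beta)\, T_0$
– $T_n = M^{n} T_0 + (1 - \alpha - \beta) \sum_{i=0}^{n-1} M^{i} T_0$
– where $M = \alpha \cdot \bar{\Theta}^{T} \Theta + \beta \cdot \tilde{\hat{\Phi}} \bar{\hat{\Phi}}^{T}$ (bar: column-normalized, tilde: row-normalized)
– $\lim_{n \to \infty} T_n = \lim_{n \to \infty} M^{n} T_0 + (1 - \alpha - \beta) \lim_{n \to \infty} \sum_{i=0}^{n-1} M^{i} T_0$
– $\lim_{n \to \infty} M^{n} T_0 = 0$ (because $\bar{\Theta}^{T} \Theta$ and $\tilde{\hat{\Phi}} \bar{\hat{\Phi}}^{T}$ are transition matrices with $0 < \alpha + \beta < 1$)
– $\lim_{n \to \infty} \sum_{i=0}^{n-1} M^{i} T_0 = (I - M)^{-1} T_0$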
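– $\lim_{n \to \infty} T_n = (1 - \alpha - \beta)(I - M)^{-1} T_0$
– $\lim_{n \to \infty} T_n = (1 - \alpha - \beta)\,(I - \alpha \cdot \bar{\Theta}^{T} \Theta - \beta \cdot \tilde{\hat{\Phi}} \bar{\hat{\Phi}}^{T})^{-1} T_0$
– $\cdots^{T} (I - \alpha \cdot \bar{\Theta}^{T} \Theta - \beta \cdot \tilde{\hat{\Phi}} \bar{\hat{\Phi}}^{T})^{-1} T_0$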
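The closed form can be checked numerically against the iterative update. The sketch below uses a small made-up 2x2 example: the matrices `a` and `b` stand in for the two meta-path transition matrices, and the values of `alpha`, `beta`, and `t0` are arbitrary assumptions. The linear system is solved with a plain Gaussian elimination so the example needs no external libraries.

```python
def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


alpha, beta = 0.4, 0.3          # assumed path probabilities, alpha + beta < 1
a = [[0.5, 0.5], [0.2, 0.8]]    # stand-in for the TDT transition matrix
b = [[0.9, 0.1], [0.3, 0.7]]    # stand-in for the TET transition matrix
t0 = [0.6, 0.4]                 # assumed initial topic ranks

m = [[alpha * a[i][j] + beta * b[i][j] for j in range(2)] for i in range(2)]
jump = 1.0 - alpha - beta

# Iterative update: T_n = M T_{n-1} + (1 - alpha - beta) T_0
t = t0[:]
for _ in range(200):
    t = [x + jump * y for x, y in zip(mat_vec(m, t), t0)]

# Closed form: solve (I - M) x = (1 - alpha - beta) T_0
i_minus_m = [[(1.0 if i == j else 0.0) - m[i][j] for j in range(2)] for i in range(2)]
closed = solve(i_minus_m, [jump * v for v in t0])
```

Solving the linear system avoids forming the inverse explicitly. For a large number of topics the iterative update may still be the cheaper option.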