SLIDE 10 Proximity Matrix
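$$
P \;=\;
\begin{array}{c|cccc|cccc}
 & \multicolumn{4}{c|}{\text{Terms}} & \multicolumn{4}{c}{\text{Concepts}} \\ \hline
 & 1 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
\text{Terms} & 0 & 1 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
 & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & 0 & 0 & \cdots & 1 & 0 & 0 & \cdots & 0 \\ \hline
 & 0 & 0 & \cdots & 0 & 1 & a & \cdots & b \\
\text{Concepts} & 0 & 0 & \cdots & 0 & a & 1 & \cdots & c \\
 & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
 & 0 & 0 & \cdots & 0 & b & c & \cdots & 1
\end{array}
$$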
$$S = \lambda_1 S_{BOW} + \lambda_2 S_{OLC} + (1 - \lambda_1 - \lambda_2)(1 - D_{cat})$$
$S_{BOW}$: content-based; $S_{OLC}$: outlink category-based; $D_{cat}$: distance-based
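$$
P_{ij} =
\begin{cases}
1 & \text{if } c_i \text{ and } c_j \text{ are synonyms;} \\
\mu^{-\mathrm{depth}} & \text{if } c_i \text{ and } c_j \text{ are hyponyms;} \\
S & \text{if } c_i \text{ and } c_j \text{ are associative concepts.}
\end{cases}
$$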
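The two rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the string labels for the relation type, and the choice to raise an error for any other relation are hypothetical, since the slide does not say how unrelated concept pairs are handled or which values of λ1, λ2 and µ are used.

```python
def combined_similarity(s_bow, s_olc, d_cat, lam1, lam2):
    """S = lam1*S_BOW + lam2*S_OLC + (1 - lam1 - lam2)*(1 - D_cat)."""
    return lam1 * s_bow + lam2 * s_olc + (1.0 - lam1 - lam2) * (1.0 - d_cat)


def proximity_entry(relation, mu=None, depth=None, s=None):
    """Entry P_ij for a pair of concepts, following the three cases on the slide.

    `relation` is a hypothetical label: "synonym", "hyponym" or "associative".
    """
    if relation == "synonym":
        return 1.0
    if relation == "hyponym":
        # mu ** (-depth): the weight shrinks as the hierarchy distance grows.
        return mu ** (-depth)
    if relation == "associative":
        # s is the combined similarity S computed by combined_similarity().
        return s
    # Assumption: the slide lists no other case, so anything else is rejected.
    raise ValueError(f"unsupported relation: {relation!r}")


# Example with made-up inputs (not taken from the slides).
s = combined_similarity(s_bow=0.5, s_olc=0.5, d_cat=0.5, lam1=0.4, lam2=0.4)
p_assoc = proximity_entry("associative", s=s)
p_hypo = proximity_entry("hyponym", mu=2.0, depth=3)
```

Keeping S as a separate function mirrors the slide, where S is defined once and then reused as the matrix entry for associative concepts.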
Building Semantic Kernels
Example sentence: "Machine learning, statistical learning and data mining are related subjects."

Original BOW vector:
<machine:1, statistical:1, learn:2, data:1, mine:1, relate:1, subject:1>

Find Wikipedia concepts and keep as is, giving φ(d):
<relate:1, subject:1; machine learning:1, statistical learning:1, data mining:1; ... >

× P (excerpt of the concept block):

  ...                      Machine Learning   Statistical Learning   Data Mining   Artificial Intelligence   ...
  Machine Learning         1                  0.6276                 0.4044        0.1314                    ...
  Statistical Learning     0.6276             1                      0.2839        0.1146                    ...
  Data Mining              0.4044             0.2839                 1             0.0792                    ...
  Artificial Intelligence  0.1314             0.1146                 0.0792        1                         ...
  ...                      ...                ...                    ...           ...

= Enriched document vector representation:
<relate:1, subject:1; machine learning:1, statistical learning:1, data mining:1; artificial intelligence:0.3252>

$$\tilde{\phi}(d) = \phi(d)\,P$$
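As a rough check of the worked example, the concept part of φ(d)P can be computed directly from the four concept rows shown on the slide. The sketch below is pure Python. The variable and function names are hypothetical, and it covers only the visible 4×4 excerpt of P, not the full matrix.

```python
# Concept order follows the table on the slide.
concepts = [
    "machine learning",
    "statistical learning",
    "data mining",
    "artificial intelligence",
]

# Visible excerpt of the concept block of P, copied from the slide.
P_concepts = [
    [1.0,    0.6276, 0.4044, 0.1314],
    [0.6276, 1.0,    0.2839, 0.1146],
    [0.4044, 0.2839, 1.0,    0.0792],
    [0.1314, 0.1146, 0.0792, 1.0],
]

# Concept part of phi(d): the sentence mentions the first three concepts once
# each and does not mention "artificial intelligence".
phi_concepts = [1.0, 1.0, 1.0, 0.0]


def row_times_matrix(row, matrix):
    """Row vector times matrix, i.e. phi(d) P restricted to this block."""
    n_cols = len(matrix[0])
    return [
        sum(row[i] * matrix[i][j] for i in range(len(row)))
        for j in range(n_cols)
    ]


enriched = dict(zip(concepts, row_times_matrix(phi_concepts, P_concepts)))
for name, weight in enriched.items():
    print(f"{name}: {weight:.4f}")
```

The weight for "artificial intelligence" comes out as 0.1314 + 0.1146 + 0.0792 = 0.3252, matching the slide. The raw product also raises the weights of the concepts already present (for example, machine learning becomes 1 + 0.6276 + 0.4044), whereas the slide lists them at 1. The sketch reports the plain product and does not try to guess how the slide treats those entries.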