CareerMap: Visualizing Career Trajectory

SLIDE 1

CareerMap: Visualizing Career Trajectory

Kan Wu, Jie Tang, Zhou Shao, Xinyi Xu, Bo Gao & Shu Zhao

Dept. of Computer Science, Tsinghua University

Science China Information Sciences

SLIDE 2

Challenges

  • Challenge 1: Data incompletion

– Solution: Spatial-Temporal Factor Graph Model (STFGM)

  • Challenge 2: Name ambiguity

– Solution: Unified Probabilistic Models [1]

  • Challenge 3: Visualizing many scholars’ merged trajectories on the map, e.g. 100 people moving from Boston to New York

– Solution: Hotspot detection algorithm

[1] Jie Tang, A.C.M. Fong, Bo Wang, and Jing Zhang. A Unified Probabilistic Framework for Name Disambiguation in Digital Library. IEEE TKDE, 2012.

SLIDE 3

Architecture

[Figure: system architecture. Component labels: Affiliation Extraction, Smoothing, Career Trajectory Extraction, Hotspot Detection, Analytic Visualization, Analysis, Visualization]

SLIDE 4

Spatial-Temporal Factor Graph Model

  • The general idea

– Find the coauthor with a known affiliation who shares the same affiliation as the target author, whose affiliation is missing.

Each green point, grouped under a common time t, represents a tuple <Time t, Author $a_{i1}$, Author $a_{i2}$>. It is an observation instance in which $a_{i1}$ is the target author and $a_{i2}$ is a coauthor whose affiliation at t is known. Each observation instance is associated with a hidden binary variable that indicates whether the two authors share an affiliation: the value is 1 if they belong to the same affiliation at that time, and 0 otherwise.
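The observation instances described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the names `Observation`, `build_observations`, `infer_affiliation`, the input layout, and the 0.5 threshold are all assumptions, and the probabilities `p_same` stand in for whatever the trained model would output for the hidden variable.

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One tuple <Time t, target author, coauthor with known affiliation>."""
    year: int
    target: str
    coauthor: str


def build_observations(papers, target, known_affiliation):
    """Build observation instances for one target author.

    papers: list of (year, [author names])            (hypothetical layout)
    known_affiliation: dict (author, year) -> affiliation
    """
    observations = []
    for year, authors in papers:
        if target not in authors:
            continue
        for author in authors:
            # Keep only coauthors whose affiliation is known in that year.
            if author != target and (author, year) in known_affiliation:
                observations.append(Observation(year, target, author))
    return observations


def infer_affiliation(observations, p_same, known_affiliation, threshold=0.5):
    """For each year, copy the affiliation of the coauthor most likely to
    share it (hidden variable = 1), if that probability passes the threshold.

    p_same: dict Observation -> assumed model probability that the hidden
    variable is 1. The threshold is an illustrative choice.
    """
    best = {}
    for ob in observations:
        p = p_same[ob]
        if p >= threshold and p > best.get(ob.year, (0.0, None))[0]:
            best[ob.year] = (p, known_affiliation[(ob.coauthor, ob.year)])
    return {year: aff for year, (_, aff) in best.items()}


# Toy data, invented purely for illustration.
papers = [(2010, ["A", "B", "C"]), (2011, ["A", "D"]), (2012, ["B", "C"])]
known = {("B", 2010): "Univ X", ("C", 2010): "Lab Y", ("D", 2011): "Univ X"}
obs = build_observations(papers, "A", known)
p_same = {obs[0]: 0.9, obs[1]: 0.2, obs[2]: 0.4}
result = infer_affiliation(obs, p_same, known)
```

In this toy run only the 2010 affiliation is filled in, because the single 2011 observation falls below the assumed threshold.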

SLIDE 5

Spatial-Temporal Factor Graph Model (continued)

  • Attribute factor

– captures the features of each tuple <Time t, Author $a_{i1}$, Author $a_{i2}$>

  • Space factor

– captures the correlation between hidden variables at the same time
– $N_S$ denotes the set of all space relations

  • Time factor

– captures the correlation between hidden variables at different times
– $N_T$ denotes the set of all time relations
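The factor definitions themselves are not preserved in this transcript. As a hedged sketch, factor graph models of this kind are commonly instantiated with exponential-linear factors. The feature functions $\phi$, $\psi_S$, $\psi_T$ and the normalizer $Z$ below are assumptions, while the weight vectors $\omega$, $\gamma$, $\delta$ follow the parameter names on the model learning slide.

```latex
% Assumed exponential-linear instantiation; not taken from the slides.
% x_i: attributes of observation instance i, y_i: its hidden binary variable.
\begin{align}
  f(y_i, x_i) &= \exp\{\omega^{T} \phi(y_i, x_i)\}
      && \text{(attribute factor)} \\
  g(y_i, y_j) &= \exp\{\gamma^{T} \psi_S(y_i, y_j)\}, \quad (i, j) \in N_S
      && \text{(space factor)} \\
  h(y_i, y_k) &= \exp\{\delta^{T} \psi_T(y_i, y_k)\}, \quad (i, k) \in N_T
      && \text{(time factor)} \\
  P(Y \mid X; \theta) &= \frac{1}{Z}
      \prod_{i} f(y_i, x_i)
      \prod_{(i, j) \in N_S} g(y_i, y_j)
      \prod_{(i, k) \in N_T} h(y_i, y_k)
\end{align}
```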

SLIDE 6

Spatial-Temporal Factor Graph Model (continued)

  • Model Learning

– Maximize the likelihood of the observed data.
– $\theta \triangleq (\omega^{T}, \gamma^{T}, \delta^{T})^{T}$ denotes the model parameters to be learned.
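As a hedged illustration of what maximizing the likelihood involves, assume the model is log-linear, $P(Y \mid X; \theta) = \exp\{\theta^{T} S(X, Y)\} / Z(\theta)$, where $S$ aggregates all factor features and $Z$ is the normalizer. Both $S$ and $Z$, the learning rate $\eta$, and the fully labeled setting are assumptions made for this sketch; the slides do not state the training procedure.

```latex
% Assumed log-linear model; standard maximum-likelihood identities.
\begin{align}
  \theta^{*} &= \arg\max_{\theta} \; \log P(Y \mid X; \theta) \\
  \log P(Y \mid X; \theta) &= \theta^{T} S(X, Y) - \log Z(\theta) \\
  \nabla_{\theta} \log P(Y \mid X; \theta)
      &= S(X, Y) - \mathbb{E}_{P(Y' \mid X; \theta)}\big[ S(X, Y') \big] \\
  \theta &\leftarrow \theta + \eta \, \nabla_{\theta} \log P(Y \mid X; \theta)
\end{align}
```

The gradient is the difference between the observed feature counts and their expectation under the current model, so each update moves the model's expected features toward the observed ones.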

SLIDE 7

Smoothing

  • The general idea

– Use a weight to reflect the confidence of an affiliation at a given time.
– Use the number of papers with the affiliation at time t as the weight.
– Given weights $w_1$ and $w_2$ at times $t_1$ and $t_2$, compute a weight center $t_c$ between them.
– If affiliation information between $t_1$ and $t_2$ is missing:
– ∀t ($t_1 < t < t_c$), Affiliation(a, t) = Affiliation(a, $t_1$)
– ∀t ($t_c < t < t_2$), Affiliation(a, t) = Affiliation(a, $t_2$)
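The gap-filling rule above can be sketched as follows. The formula for the weight center is not preserved in this transcript, so the sketch assumes one plausible choice: the gap is split in proportion to the weights, so that the better-supported affiliation covers more of the missing years. The function names, the input layout, and the handling of t exactly equal to the weight center are likewise assumptions.

```python
def weight_center(t1, w1, t2, w2):
    """Assumed split point between t1 and t2 (not the slides' formula).

    The interval is divided in proportion to the weights, so a larger w1
    pushes the center toward t2 and extends the affiliation held at t1.
    """
    return t1 + (t2 - t1) * w1 / (w1 + w2)


def smooth(known):
    """Fill missing years between consecutive known affiliations.

    known: dict year -> (affiliation, weight), where the weight is the
    number of papers carrying that affiliation in that year.
    Returns dict year -> affiliation, with the gaps filled.
    """
    years = sorted(known)
    filled = {year: known[year][0] for year in years}
    for t1, t2 in zip(years, years[1:]):
        aff1, w1 = known[t1]
        aff2, w2 = known[t2]
        tc = weight_center(t1, w1, t2, w2)
        for t in range(t1 + 1, t2):
            # t1 < t < tc takes the earlier affiliation; tc < t < t2 takes
            # the later one. t == tc is unspecified on the slide and is
            # assigned to the later affiliation here as an arbitrary choice.
            filled[t] = aff1 if t < tc else aff2
    return filled


# Toy data, invented for illustration: 3 papers in 2000, 1 paper in 2004.
trajectory = smooth({2000: ("Univ X", 3), 2004: ("Univ Y", 1)})
```

With these toy weights the assumed center is 2003, so 2001 and 2002 inherit the 2000 affiliation and 2003 takes the 2004 one.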

SLIDE 8

Example of scholar career trajectory extraction

SLIDE 9

Hotspot detection algorithm

  • The general idea

– Heat centers have more neighbors than their surrounding points.
– Heat centers "absorb" their surrounding points as neighbors. If a point is "absorbed" by a heat center, its own neighbor set is emptied.
– Finally, the points left with nonempty neighbor sets are the heat centers.
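The absorb-and-empty idea above can be sketched as a small greedy procedure. This is a minimal sketch under stated assumptions rather than the authors' algorithm: the fixed `radius`, the Euclidean distance, the densest-first visiting order, and the function name `detect_hotspots` are all choices made here for illustration.

```python
from math import dist


def detect_hotspots(points, radius):
    """Return the indices of heat centers among 2-D points.

    Assumptions (not from the slides): two points are neighbors when their
    Euclidean distance is at most `radius`, and points are visited from the
    most neighbors to the fewest.
    """
    n = len(points)
    neighbors = [
        {j for j in range(n) if j != i and dist(points[i], points[j]) <= radius}
        for i in range(n)
    ]
    absorbed = set()
    # Visit denser points first so they get the chance to absorb the others.
    order = sorted(range(n), key=lambda i: len(neighbors[i]), reverse=True)
    for i in order:
        if i in absorbed or not neighbors[i]:
            continue
        # Absorb every neighbor that no earlier center has claimed.
        remaining = {j for j in neighbors[i] if j not in absorbed}
        for j in remaining:
            absorbed.add(j)
            neighbors[j] = set()  # an absorbed point loses its neighbors
        neighbors[i] = remaining
    # Points left with a nonempty neighbor set are the heat centers.
    return [i for i in range(n) if neighbors[i]]


# Toy coordinates, invented for illustration: two clusters and one outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
centers = detect_hotspots(points, radius=1.5)
```

In this toy run each cluster collapses to a single center, and the isolated point is not reported because it has no neighbors.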

SLIDE 10

Trajectory map generated by CareerMap

SLIDE 11

Analytic Visualization

SLIDE 12

Some Interesting Case Studies

SLIDE 13

Some Interesting Case Studies (continued)

SLIDE 14

Some Interesting Case Studies (continued)

SLIDE 15

Summary

1. We introduce the challenges of building CareerMap, a system for visualizing scholars’ career trajectories.
2. Architecture, technologies, and main features of the system.
3. Some interesting case studies.

SLIDE 16

Thanks

Q&A
