Learning Urban Community Structures: A Collective Embedding Perspective with Periodic Spatial-temporal Mobility Graphs
Pengyang Wang, Yanjie Fu, Jiawei Zhang, Xiaolin Li, Dan Lin
Outline
¨ Background and Motivation
¨ Definition and Problem Statement
¨ Methodology
¨ Application
¨ Evaluation
¨ Conclusion
Background and Motivation
¨ Urban life is getting more diverse and vibrant
Urban community
Why do we study urban communities?
§ Spatial Imbalance
- vibrancy differences between communities
Challenges & Insights
¨ Challenge I – Graph construction
How to unify and represent the POIs and human periodic mobility records as a set of mobility graphs?
¨ Insight I
a set of periodic spatial-temporal mobility graphs
Challenges & Insights
¨ Challenge II – Collective embedding
How to collectively learn the embeddings of POIs from multiple periodic mobility graphs?
¨ Insight II
Collective deep auto-encoder
Challenges & Insights
¨ Challenge III - Embedding aggregation
How to align and aggregate POI embeddings for community structure representation learning?
¨ Insight III
unsupervised graph-based weighting method
Definition I
¨ Urban communities
[Figure: residential complex and its neighborhood area, radius = 1 km]
Definition II
¨ Mobility Graph
Definition III
¨ Periodic Mobility Graphs
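A periodic mobility graph set can be pictured as one weighted POI-to-POI graph per period. The sketch below assumes seven periods (one per day of the week, in line with t ∈ {1, ..., 7} in the embedding model) and a hypothetical trip record of the form (weekday, origin POI, destination POI). The record layout and the function name `build_periodic_graphs` are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict


def build_periodic_graphs(trips, num_periods=7):
    """Return one weighted edge dict per period: graphs[t][(src, dst)] = count."""
    graphs = [defaultdict(float) for _ in range(num_periods)]
    for weekday, src, dst in trips:
        # Each trip adds one unit of weight to the edge of its own period.
        graphs[weekday][(src, dst)] += 1.0
    return graphs


# Hypothetical trips: (weekday index 0..6, origin POI, destination POI)
trips = [(0, "mall", "cafe"), (0, "mall", "cafe"), (5, "park", "mall")]
graphs = build_periodic_graphs(trips)
```

Keeping the periods as separate graphs, rather than summing them, is what lets a later embedding step see how connectivity changes across the week.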
Problem Statement
¨ Given
- Residential communities (locations, POIs)
- Human mobility (e.g., taxi GPS traces)
¨ Objective
- Learning representations of static spatial configurations
- Learning representations of the dynamic human mobility connectivity of POIs in the community
¨ Core tasks
- Constructing the periodic mobility graph set for a community
- Collectively embedding POIs from the periodic mobility graphs
- Aligning and aggregating POI embeddings into a community embedding
Framework Overview
Methodology
¨ Periodic Mobility Graph Construction
¨ Collective POI Embedding
¨ Aligning and Aggregating POI Embeddings to Community Embeddings
Periodic Mobility Graph Construction
Propagate visit probability
[Figure: visit probability (0.0 to 0.8) versus distance to destination (100 to 700 m)]
the closer, the more likely to visit?
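A drop-off point rarely coincides with a single POI, so the visit has to be spread over nearby POIs as probabilities. The sketch below assumes a simple exponential distance-decay kernel with a hypothetical scale parameter `scale_m`, purely for illustration. The curve plotted on the slide is not recoverable here, and the question mark in the slide text suggests the true relationship may not be monotone.

```python
import math


def visit_probabilities(distances_m, scale_m=200.0):
    """Normalize an assumed exponential distance-decay kernel over candidate POIs."""
    # Assumption: score = exp(-distance / scale); the slide's actual curve may differ.
    scores = {poi: math.exp(-d / scale_m) for poi, d in distances_m.items()}
    total = sum(scores.values())
    return {poi: s / total for poi, s in scores.items()}


# Hypothetical distances (meters) from one drop-off point to nearby POIs
probs = visit_probabilities({"cafe": 50.0, "gym": 300.0, "bank": 600.0})
```

Swapping in a different kernel only changes the `scores` line; the normalization step stays the same.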
Collective POI Embedding
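Encoder:
$$
\begin{aligned}
y^{(k),1}_{i,t} &= \sigma\big(W^{(k),1}_{i,t}\, p^{(k)}_{i,t} + b^{(k),1}_{i,t}\big), && \forall t \in \{1, 2, \cdots, 7\}, \\
y^{(k),r}_{i,t} &= \sigma\big(W^{(k),r}_{i,t}\, y^{(k),r-1}_{i,t} + b^{(k),r}_{i,t}\big), && \forall r \in \{2, 3, \cdots, o\}, \\
y^{(k),o+1}_{i} &= \sigma\Big(\sum_{t} \big(W^{(k),o+1}_{t}\, y^{(k),o}_{i,t} + b^{(k),o+1}_{t}\big)\Big), \\
z^{(k)}_{i} &= \sigma\big(W^{(k),o+2}\, y^{(k),o+1}_{i} + b^{(k),o+2}\big)
\end{aligned}
$$
Decoder:
$$
\begin{aligned}
\hat{y}^{(k),o+1}_{i} &= \sigma\big(\hat{W}^{(k),o+2}\, z^{(k)}_{i} + \hat{b}^{(k),o+2}\big), \\
\hat{y}^{(k),o}_{i,t} &= \sigma\big(\hat{W}^{(k),o+1}_{t}\, \hat{y}^{(k),o+1}_{i} + \hat{b}^{(k),o+1}_{t}\big), \\
\hat{y}^{(k),r-1}_{i,t} &= \sigma\big(\hat{W}^{(k),r}_{i,t}\, \hat{y}^{(k),r}_{i,t} + \hat{b}^{(k),r}_{i,t}\big), && \forall r \in \{2, 3, \cdots, o\}, \\
\hat{p}^{(k)}_{i,t} &= \sigma\big(\hat{W}^{(k),1}_{i,t}\, \hat{y}^{(k),1}_{i,t} + \hat{b}^{(k),1}_{i,t}\big)
\end{aligned}
$$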
Loss Function:
$$
\mathcal{L}^{(k)} = \sum_{t \in \{1, 2, \ldots, 7\}} \sum_{i} \left\| \big(p^{(k)}_{i,t} - \hat{p}^{(k)}_{i,t}\big) \odot v^{(k)}_{i,t} \right\|_2^2
$$
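The encoder, decoder, and loss can be sketched as a single forward pass. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses one private layer per period (o = 1), random untrained weights, toy layer sizes, and treats v as a per-entry weight vector (the slides do not define it). All function and key names (`dense`, `forward`, `"fuse"`, and so on) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(W, b, x):
    """sigma(W x + b) for a list-of-rows matrix W."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]


def rand_layer(rng, n_out, n_in):
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def forward(p, params):
    """p: list of T period vectors for one POI. Returns (z, p_hat)."""
    T = len(p)
    # Encoder: one private layer per period, then a fusion layer summed over periods.
    y1 = [dense(*params["enc"][t], p[t]) for t in range(T)]
    fused_pre = [0.0] * len(params["fuse"][0][1])
    for t in range(T):
        W, b = params["fuse"][t]
        for r, (row, bi) in enumerate(zip(W, b)):
            fused_pre[r] += sum(w * v for w, v in zip(row, y1[t])) + bi
    y_fused = [sigmoid(a) for a in fused_pre]
    z = dense(*params["lat"], y_fused)
    # Decoder: mirror of the encoder, splitting back into per-period branches.
    y_fused_hat = dense(*params["lat_dec"], z)
    y1_hat = [dense(*params["unfuse"][t], y_fused_hat) for t in range(T)]
    p_hat = [dense(*params["dec"][t], y1_hat[t]) for t in range(T)]
    return z, p_hat


def weighted_loss(p, p_hat, v):
    """Sum over periods and entries of ((p - p_hat) * v) ** 2."""
    return sum(((a - b) * w) ** 2
               for pt, pht, vt in zip(p, p_hat, v)
               for a, b, w in zip(pt, pht, vt))


rng = random.Random(0)
T, n, h, f, d = 7, 4, 3, 3, 2  # periods, input, hidden, fused, latent sizes
params = {
    "enc": [rand_layer(rng, h, n) for _ in range(T)],
    "fuse": [rand_layer(rng, f, h) for _ in range(T)],
    "lat": rand_layer(rng, d, f),
    "lat_dec": rand_layer(rng, f, d),
    "unfuse": [rand_layer(rng, h, f) for _ in range(T)],
    "dec": [rand_layer(rng, n, h) for _ in range(T)],
}
p = [[rng.random() for _ in range(n)] for _ in range(T)]
v = [[1.0] * n for _ in range(T)]
z, p_hat = forward(p, params)
loss = weighted_loss(p, p_hat, v)
```

The point of the shared fusion layer is that a single latent vector `z` has to reconstruct all seven period inputs at once, which is what makes the embedding collective rather than seven independent ones.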
Aligning and Aggregating POI Embeddings to Community Embeddings
¨ Graph based weighting method
[Figure: POI similarity graph over POI #1 to POI #5, with edges labeled by pairwise similarity]
Graph based weighting method
¨ Weight Calculation
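$$
\hat{G}^{(k)}[s, l] = \sum_{p_i \in \Phi_s} \tilde{G}^{(k)}[i, l] \times w^{(k)}_{l}
$$
$$
w^{(k)}_{l} = \frac{\sum_{i \in c_k} \sum_{j \in c_k} \mathrm{sim}_{i,j} \times \big| \tilde{G}^{(k)}[i, l] - \tilde{G}^{(k)}[j, l] \big|}{M}
$$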
If the l-th dimension of the latent feature is meaningful, then when POIs p_i and p_j are very similar, their difference on the l-th dimension should be very small. Conversely, if the l-th dimension is not meaningful, |G̃(k)[i, l] − G̃(k)[j, l]| will increase; and if p_i and p_j are very similar, sim_{i,j} will further penalize |G̃(k)[i, l] − G̃(k)[j, l]|.
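The weighting and aggregation step can be sketched in a few lines. The code follows the formulas as they read on this slide, taking the fraction as the similarity-weighted sum divided by M. It assumes that M is a normalizing constant and that Φ_s is a group of POI indices (for example, one POI category); neither is defined on the slide. Function names and toy values are hypothetical.

```python
def dimension_weights(G, sim, M):
    """w[l] = sum_i sum_j sim[i][j] * |G[i][l] - G[j][l]| / M (as read from the slide)."""
    n, num_dims = len(G), len(G[0])
    return [sum(sim[i][j] * abs(G[i][l] - G[j][l])
                for i in range(n) for j in range(n)) / M
            for l in range(num_dims)]


def aggregate(G, groups, w):
    """G_hat[s][l] = sum over POIs i in groups[s] of G[i][l] * w[l]."""
    return [[sum(G[i][l] for i in members) * w[l] for l in range(len(w))]
            for members in groups]


# Toy POI embeddings (3 POIs, 2 latent dimensions) and a pairwise similarity matrix
G = [[0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
sim = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
w = dimension_weights(G, sim, M=2.0)
G_hat = aggregate(G, groups=[[0, 1], [2]], w=w)
```

In the toy data, POIs 0 and 1 are similar but disagree on dimension 0 and agree on dimension 1, so only dimension 0 accumulates a nonzero score.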
Application I
¨ Predicting Willingness to Pay (WTP)
$$
r = \frac{P_f - P_i}{P_i}
$$
where P_f is the final price and P_i is the initial price.
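As a worked example of the WTP target, the rate r is the relative change from the initial price to the final price. The prices below are hypothetical, and the function name `return_rate` is illustrative only.

```python
def return_rate(initial_price, final_price):
    """r = (P_f - P_i) / P_i, the relative price change used as the WTP target."""
    if initial_price <= 0:
        raise ValueError("initial price must be positive")
    return (final_price - initial_price) / initial_price


# Hypothetical prices for one community
r = return_rate(initial_price=40000.0, final_price=46000.0)
```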
Application II
¨ Spotting vibrant urban communities
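$$
u_k = \frac{2 \times \mathrm{freq}(k) \times \mathrm{div}(k)}{\mathrm{freq}(k) + \mathrm{div}(k)}
$$
where u_k is the urban vibrancy value, freq(k) is the density of consumer activities, and div(k) is the diversity of consumer activities.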
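The vibrancy score can be sketched as an F-measure-style combination of density and diversity. This reading, with a sum in the denominator, is an assumption: a product in the denominator would make the score a constant 2. The sketch also assumes both inputs are already normalized to [0, 1]; the function name and sample values are hypothetical.

```python
def vibrancy(freq, div):
    """Harmonic-mean style fusion of activity density and activity diversity."""
    if freq + div == 0:
        return 0.0  # no activity and no diversity: define the score as zero
    return 2.0 * freq * div / (freq + div)


# Hypothetical normalized density and diversity for one community
u = vibrancy(freq=0.8, div=0.4)
```

A harmonic-style fusion rewards communities that are both dense and diverse: a high value on one side cannot compensate for a very low value on the other.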
Evaluation
¨ Data Description
From Beijing City
The Application of WTP Prediction
¨ Baselines
v Explicit Features (EF): (i) POI numbers per category; (ii) average commute distance; (iii) average commute speed; (iv) average commute time; (v) number of mobilities; (vi) average distance between POIs.
v Latent Features (LF): the latent features learned by the proposed collective embedding method.
v The combination of EF and LF (ELF).
v Variation of step 1 (V-1): using distance-based matching of the records.
v Variation of step 2 (V-2): computing the POI embedding as an average of the embeddings.
v Variation of step 3 (V-3): averaging over the POI embeddings.
¨ Evaluation Metric
v Root-Mean-Square Error (RMSE)
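For reference, RMSE over predicted and observed WTP rates can be computed as below. This is the standard definition; the sample values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Hypothetical observed and predicted rates for three communities
err = rmse([0.10, 0.20, 0.30], [0.10, 0.25, 0.20])
```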
The Application of WTP Prediction
¨ Results
Spotting vibrant urban communities
¨ Baselines
v Learning to Rank
(1) MART: a boosted tree model, specifically a linear combination of the outputs of a set of regression trees.
(2) RankBoost (RB): a boosted pairwise ranking method, which trains multiple weak rankers and combines their outputs as the final ranking.
(3) LambdaMART (LM): the boosted tree version of LambdaRank.
(4) ListNet (LN): a listwise ranking model with permutation top-k ranking likelihood as the objective function.
(5) RankNet (RN): uses a neural network to model the underlying probabilistic cost function.
v Feature Set
(1) Explicit features (2) Latent features (3) Explicit & latent features
Evaluation
¨ Evaluation Metrics
v Root-Mean-Square Error (RMSE)
v Normalized Discounted Cumulative Gain (NDCG@N)
- Evaluates the ranking performance at Top-N
v Kendall’s Tau Coefficient (Tau)
- Measures the overall ranking accuracy
v F-measure@N
- A community is “high-vibrancy” if its rating > 3
- A community is “low-vibrancy” if its rating < 3
- Measures the ranking precision and recall at Top-N
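The two ranking metrics can be sketched as follows. The slides do not state the gain function, so the common 2^rel − 1 gain with a log2 position discount is assumed for NDCG, and Kendall's Tau is computed in its simple tau-a form, without tie correction. Function names are illustrative.

```python
import math


def dcg_at_n(relevances, n):
    """Discounted cumulative gain over the first n items of a ranked list."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:n]))


def ndcg_at_n(ranked_relevances, n):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_n(sorted(ranked_relevances, reverse=True), n)
    return dcg_at_n(ranked_relevances, n) / ideal if ideal > 0 else 0.0


def kendall_tau(a, b):
    """Tau-a: (concordant pairs - discordant pairs) / total pairs."""
    n = len(a)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)
```

For example, `ndcg_at_n([3, 2, 1], 3)` scores a perfectly ordered list, and `kendall_tau([1, 2, 3], [3, 2, 1])` compares a ranking with its exact reverse.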
Overall performance
[Figure: NDCG@N and F-measure@N for N = 5, 10, 15, 20, and Kendall's Tau, for the feature sets ELF, LF, EF, V-1, V-2, and V-3 combined with the rankers MART, RN, and RB]
Comparison with Representation Learning Algorithms
[Figure: four panels of NDCG@N for N = 5, 10, 15, 20, comparing Our Model, NMF, RBM, and Skip-gram]
Investigation of Community Structure Properties
¨ Community Connectivities.
Investigation of Community Structure Properties
¨ The Learned Representation of the Community Structure
[Figure: Community 1 and Community 2] Visualization of the learned structure representations of two similar communities
Conclusion
¨ We formulate the problem as a learning task over multiple mobility graphs of POIs and propose a novel collective embedding framework.
¨ We start with a probabilistic propagation method to unify and represent static POIs and dynamic human mobility records as periodic spatial-temporal mobility graphs.
¨ We then develop a collective embedding method to learn the embeddings of POIs from the obtained mobility graphs.
¨ Based on the POI embeddings, we further propose an unsupervised graph-based weighted aggregation method to obtain community embeddings.
¨ Experiments on Beijing data, covering WTP prediction and vibrant community spotting, indicate that the method is effective.
Thanks!