Characterizing Individual Behavior from Interaction History

Apr 05, 2024


SLIDE 1

Characterizing Individual Behavior from Interaction History

Patrick Perry NYU Stern

SLIDE 2

Case Study: UCI Online Network

Online community for University of California, Irvine (Opsahl & Panzarasa, 2009)
Dataset covers seven-month period: April - October 2004
2000 users, 60K messages
Goal: Characterize user messaging behavior

SLIDE 3

Degrees Are Not Enough

[Figure: Out Degree versus In Degree, both axes on a log scale from 1 to 500]

Can we do better?

SLIDE 4

Agenda

1. Framework for studying interaction histories
2. Macroscopic behavior
3. Microscopic behavior

SLIDE 5

Events, Not Links

Messages

Time   Sender   Receiver
t1     i1       j1
t2     i2       j2
...    ...      ...
tn     in       jn

SLIDE 6

Point Process Model

Messages from $i$ to $j$, modeled via the intensity $\lambda_t(i, j)$:

$$\lambda_t(i, j)\, dt = \mathrm{Prob}\{i \text{ sends to } j \text{ in } [t, t + dt)\}$$

SLIDE 7

Key Insight: Use Past History

Hypotheses:
If you send me a message, I am likely to respond.
If I have sent you a message in the past, I am likely to repeat this action in the future.
These effects all decay with time.

SLIDE 8

History-Dependent Covariates

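$$\mathrm{send}^{(k)}(i, j) = \#\{i \to j \text{ in } I^{(k)}\}, \qquad \mathrm{receive}^{(k)}(i, j) = \#\{j \to i \text{ in } I^{(k)}\}$$

[Figure: timeline marking the intervals $I^{(1)}$, $I^{(2)}$, $I^{(3)}$ at 1 day, 2 days, and 4 days]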
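A minimal sketch of how the send and receive counts above could be computed from an event list. It assumes the intervals are dyadic, that is $I^{(1)} = [t - 1\text{ day}, t)$, $I^{(2)} = [t - 2\text{ days}, t - 1\text{ day})$, $I^{(3)} = [t - 4\text{ days}, t - 2\text{ days})$, which is one reading of the 1/2/4-day labels. The function names, the hour-based timestamps, and the `(time, sender, receiver)` tuple layout are hypothetical.

```python
from bisect import bisect_left

HOURS_PER_DAY = 24

def interval_counts(times, t, edges_days=(0, 1, 2, 4)):
    """Count events falling in [t - far, t - near) for consecutive edge pairs.

    `times` is a sorted list of event times in hours; `edges_days` is an
    assumed dyadic grid (0, 1, 2, 4 days back from t).
    """
    counts = []
    for near, far in zip(edges_days, edges_days[1:]):
        lo = bisect_left(times, t - far * HOURS_PER_DAY)   # first event >= t - far
        hi = bisect_left(times, t - near * HOURS_PER_DAY)  # first event >= t - near
        counts.append(hi - lo)
    return counts

def history_covariates(events, i, j, t):
    """Return (send, receive) count vectors for the pair (i, j) at time t."""
    sent = sorted(s for s, a, b in events if (a, b) == (i, j))
    received = sorted(s for s, a, b in events if (a, b) == (j, i))
    return interval_counts(sent, t), interval_counts(received, t)

# Hypothetical events: (time in hours, sender, receiver)
events = [(12, 1, 2), (60, 1, 2), (90, 2, 1), (95, 1, 2)]
send, receive = history_covariates(events, 1, 2, t=96)
print(send, receive)  # [1, 1, 1] [1, 0, 0]
```

Sorting once per pair and using binary search keeps each lookup logarithmic in the number of past messages; a production implementation would more likely update the counts incrementally as events arrive.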

SLIDE 9

Cox Proportional Intensity Model

$$\lambda_t(i, j) = \bar\lambda_t(i) \exp\{\beta^T x_t(i, j)\}$$

$\lambda_t(i, j)\, dt$: Prob{i sends j a message in time [t, t + dt)}
$\bar\lambda_t(i)$: Baseline intensity for sender i
$x_t(i, j)$: Vector of time-varying covariates
$\beta$: Vector of coefficients

(Butts 2008, Vu et al. 2011, POP & Wolfe 2013)
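As a small illustration of the formula above, the intensity for one sender-receiver pair is the sender's baseline scaled by the exponential of a linear score. The function name and the plain-list representation of $\beta$ and $x_t(i, j)$ are assumptions made for this sketch, not part of the talk.

```python
import math

def intensity(baseline, beta, x):
    """Return baseline * exp(beta . x) for one (sender, receiver) pair.

    baseline : value of the sender's baseline intensity at time t
    beta     : list of coefficients
    x        : list of covariate values x_t(i, j), same length as beta
    """
    if len(beta) != len(x):
        raise ValueError("beta and x must have the same length")
    score = sum(b * v for b, v in zip(beta, x))
    return baseline * math.exp(score)

# With all covariates at zero the pair sits exactly at the baseline rate.
print(intensity(2.0, [1.8, 0.7], [0, 0]))  # 2.0
```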

SLIDE 10

Interpretation

$$\lambda_t(i, j) = \bar\lambda_t(i) \exp\{\beta^T x_t(i, j)\}$$

$\bar\lambda_t(i)$: Treated as a nuisance parameter, estimated non-parametrically.

$\beta_k$: Increasing $[x_t(i, j)]_k$ by one unit while holding all other covariates constant is associated with multiplying the message rate by $e^{\beta_k}$.

SLIDE 11

Example: Self-Reinforcing Send

$[x_t(i, j)]_1 = \#\{i \to j \text{ in } [t - 1\text{ day}, t)\}$
$[x_t(i, j)]_2 = \#\{i \to j \text{ in } [t - 1\text{ week}, t - 1\text{ day})\}$

$$\lambda_t(i, j) = \bar\lambda_t(i) \exp\{1.8\,[x_t(i, j)]_1 + 0.7\,[x_t(i, j)]_2\}$$

Every sent message is associated with an $e^{1.8}$-fold increase for 1 day, followed by an $e^{0.7}$-fold increase for 6 days (relative to the baseline). After one week, the message is not associated with a change in rate.
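To make the time profile of this example concrete, the sketch below evaluates the multiplier contributed by a single sent message as a function of its age. The function name is hypothetical; the coefficients 1.8 and 0.7 are the ones from the example above.

```python
import math

def rate_multiplier(age_days, beta_recent=1.8, beta_older=0.7):
    """Multiplier on the i -> j rate from one message sent `age_days` ago.

    A message of age a lies in [t - 1 day, t) when 0 < a <= 1 and in
    [t - 1 week, t - 1 day) when 1 < a <= 7.
    """
    if 0 < age_days <= 1:
        return math.exp(beta_recent)
    if 1 < age_days <= 7:
        return math.exp(beta_older)
    return 1.0  # older than one week: no association with the rate

for age in (0.5, 3, 10):
    print(age, round(rate_multiplier(age), 2))
```

Half a day after sending, the multiplier is about 6.05; three days after, about 2.01; after ten days it is back to 1.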

SLIDE 12

Example: Response Model

$[x_t(i, j)]_1 = \#\{j \to i \text{ in } [t - 1\text{ day}, t)\}$
$[x_t(i, j)]_2 = \#\{j \to i \text{ in } [t - 1\text{ week}, t - 1\text{ day})\}$

$$\lambda_t(i, j) = \bar\lambda_t(i) \exp\{1.8\,[x_t(i, j)]_1 - 0.3\,[x_t(i, j)]_2\}$$

Every received message is associated with an $e^{1.8}$-fold increase for 1 day, followed by an $e^{0.3}$-fold decrease for 6 days (relative to the baseline). After one week, the message is not associated with a change in rate.
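For a sense of scale, the two factors in this example evaluate as follows for a single received message, with all other covariates held fixed:

```latex
\begin{align*}
e^{1.8}  &\approx 6.05 && \text{(first day: rate about six times the baseline)} \\
e^{-0.3} &\approx 0.74 && \text{(next six days: rate about 26\% below the baseline)}
\end{align*}
```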

SLIDE 13

Users Respond to Messages

[Figure: coefficient of $\mathrm{receive}^{(k)}(i, j) = \#\{j \to i \text{ in } I^{(k)}\}$; x-axis: Time Elapsed (Days), 1 to 128; y-axis: Effect, 0.8 to 6]

SLIDE 14

Users Repeat Past Behavior

[Figure: coefficient of $\mathrm{send}^{(k)}(i, j) = \#\{i \to j \text{ in } I^{(k)}\}$; x-axis: Time Elapsed (Days), 1 to 128; y-axis: Effect, 0.8 to 6]

SLIDE 15

(1) receiving is associated with responding
(2) users repeat their past behaviors
(3) effect (2) decays faster than effect (1)

[Figure: two panels, receive and send; x-axis: Time Elapsed (Days), 1 to 128; y-axis: Effect, 0.8 to 6]

SLIDE 16

Same behavior for each user?

SLIDE 17

Micro-level Model

Old Model:

$$\lambda_t(i, j) = \bar\lambda_t(i) \exp\{\beta^T x_t(i, j)\}$$

New Model:

$$\lambda_t(i, j) = \bar\lambda_t(i) \exp\{\beta_i^T x_t(i, j)\}, \qquad \beta_i \sim \mathrm{Normal}(\mu, \Sigma)$$

(Related model: DuBois et al. 2013)
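The sketch below simulates what the new model assumes about the population: each user draws a coefficient vector from a shared normal distribution, so the same history produces a different rate multiplier for each sender. The values of `mu` and `Sigma`, the two-covariate setup, and the helper name are hypothetical and chosen only for illustration; they are not estimates from the UCI data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters for a two-covariate model
mu = np.array([1.8, 0.7])
Sigma = np.array([[0.25, 0.05],
                  [0.05, 0.25]])

n_users = 2000
# One row of coefficients per user: beta_i ~ Normal(mu, Sigma)
betas = rng.multivariate_normal(mu, Sigma, size=n_users)

def user_multiplier(beta_i, x):
    """exp(beta_i^T x): factor applied to user i's baseline intensity."""
    return float(np.exp(beta_i @ x))

# Same covariate vector for every user, different multipliers
x = np.array([1.0, 0.0])
multipliers = np.array([user_multiplier(b, x) for b in betas])
print(multipliers.min(), np.median(multipliers), multipliers.max())
```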

SLIDE 18

Estimating User-Specific Coefficients

Fitting time: 3 CPU hours
2000 sets of coefficients (one set for each user)
Need summarization method to visualize

SLIDE 19

Visualize by Factor Analysis

2000 sets of coefficients (one set for each user)
Reduce dimensionality via principal components
First 2 components explain 87% of variance

[Figure: Variance Explained (%) by Component, components 1 to 35; y-axis from 10 to 60]
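A minimal sketch of the dimensionality reduction step, using a singular value decomposition of the centered coefficient matrix. The data here are synthetic (200 users, 6 coefficients, generated from 2 latent factors plus noise) and stand in for the fitted user-specific coefficients, which are not reproduced in these slides; the function name is hypothetical.

```python
import numpy as np

def pca(coefs, n_components=2):
    """Return (scores, explained) for a users-by-coefficients matrix.

    scores    : projection of each user onto the leading components
    explained : fraction of total variance carried by every component
    """
    centered = coefs - coefs.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = centered @ vt[:n_components].T
    return scores, explained

rng = np.random.default_rng(0)
# Synthetic stand-in data, not the UCI estimates
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
coefs = latent @ loadings + 0.1 * rng.normal(size=(200, 6))

scores, explained = pca(coefs)
print("variance explained by first 2 components:", explained[:2].sum())
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives a per-user scatter of the first two component scores.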

SLIDE 20

User-specific Principal Component Scores

[Figure: scatter plot of Component 2 versus Component 1 scores; percentage labels on the plot: 22%, 12%, 43%, 9%, 2%, 12%]

SLIDE 21

Variation in Response

[Figure: receive; x-axis: Time Elapsed (Days), 1 to 128; y-axis: Effect, 0.05 to 50]

SLIDE 22

Variation in Repetition

[Figure: send; x-axis: Time Elapsed (Days), 1 to 128; y-axis: Effect, 0.05 to 50]

SLIDE 23

[Figure: two panels, send and receive; x-axis: Time Elapsed (Days), 1 to 128; y-axis: Effect, 0.05 to 50]

(1) two dimensions of behavior
(2) large range of response rates, similar qualitative patterns
(3) some users repeat, others innovate; big effects in both directions

SLIDE 24

Comparing Macro and Micro

[Figure: receive and send panels shown at two effect scales, 0.05 to 50 and 0.8 to 6; x-axis: Time Elapsed (Days), 1 to 128; y-axis: Effect]

SLIDE 25

Theory for Macro Case

Theorem (POP & Wolfe): Under regularity conditions, the MPLE $\hat\beta_n$ satisfies:

1. $\hat\beta_n \to \beta$
2. $\sqrt{n}\,(\hat\beta_n - \beta) \to \mathrm{Normal}(0, \Sigma(\beta))$

Related results:
Cox (1975): heuristic argument (“under mild conditions implying some degree of independence... and that the information values are not too disparate”)
Andersen & Gill (1982): survival analysis, fixed time interval

SLIDE 26

Implementation

$$\mathrm{PL}_{t_n}(\beta) = \prod_{t_m \le t_n} \frac{e^{\beta^T x_{t_m}(i_m, j_m)}}{\sum_j e^{\beta^T x_{t_m}(i_m, j)}}$$

Product: loop over all messages
Inner sum: loop over all receivers
Naïve: O(messages × receivers)
With bookkeeping: O(messages + receivers)
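The naïve evaluation described above can be sketched as two nested loops over messages and candidate receivers, here on the log scale. This is an illustration of the O(messages × receivers) version only, not the talk's optimized implementation; the function signature and the `covariates` callback are hypothetical, and no receivers (including the sender) are excluded from the inner sum.

```python
import math

def log_partial_likelihood(beta, messages, receivers, covariates):
    """Naive log partial likelihood.

    beta       : list of coefficients
    messages   : list of (t, i, j) tuples sorted by time
    receivers  : list of candidate receiver ids
    covariates : function (t, i, j) -> list of floats, i.e. x_t(i, j)
    """
    def score(t, i, j):
        return sum(b * v for b, v in zip(beta, covariates(t, i, j)))

    total = 0.0
    for t, i, j in messages:                    # loop over all messages
        denom = sum(math.exp(score(t, i, r))    # loop over all receivers
                    for r in receivers)
        total += score(t, i, j) - math.log(denom)
    return total

# Hypothetical toy input: one covariate that flags receiver 1
messages = [(1.0, 0, 1), (2.0, 1, 0), (3.0, 0, 2)]
receivers = [0, 1, 2, 3]
flag_receiver_1 = lambda t, i, j: [1.0 if j == 1 else 0.0]
print(log_partial_likelihood([0.0], messages, receivers, flag_receiver_1))
```

With `beta = [0.0]` every receiver is equally likely, so the result is `-3 * log(4)`.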

SLIDE 27

Implementation Trick: Sparsity

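Note: $x_t(i, j) = x_0(i, j) + d_t(i, j)$

Inner sum:

$$\sum_j e^{\beta^T x_t(i, j)} = \sum_j e^{\beta^T x_0(i, j)} + \sum_j \left[ e^{\beta^T x_t(i, j)} - e^{\beta^T x_0(i, j)} \right]$$

The bracketed term vanishes wherever $d_t(i, j) = 0$.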
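A small sketch of how this decomposition could be exploited: the sum over the initial covariates is computed once, and each later evaluation only touches receivers whose covariates have changed. The dictionary-based storage, the function names, and the toy numbers are assumptions for illustration.

```python
import math

def dot(beta, x):
    return sum(b * v for b, v in zip(beta, x))

def inner_sum_sparse(beta, x0, base_sum, deltas):
    """Sum over j of exp(beta . x_t(i, j)) using a precomputed base sum.

    x0       : dict receiver -> initial covariates x_0(i, j)
    base_sum : sum over j of exp(beta . x_0(i, j)), computed once
    deltas   : dict receiver -> d_t(i, j), holding only nonzero entries
    """
    total = base_sum
    for j, d in deltas.items():
        xt = [a + b for a, b in zip(x0[j], d)]
        total += math.exp(dot(beta, xt)) - math.exp(dot(beta, x0[j]))
    return total

# Hypothetical toy input: 5 receivers, only receiver 3 has changed
beta = [0.5, -0.2]
x0 = {j: [float(j % 2), 1.0] for j in range(5)}
deltas = {3: [2.0, 0.0]}

base_sum = sum(math.exp(dot(beta, x)) for x in x0.values())
fast = inner_sum_sparse(beta, x0, base_sum, deltas)

# Brute-force check over all receivers
zero = [0.0, 0.0]
slow = sum(
    math.exp(dot(beta, [a + b for a, b in zip(x0[j], deltas.get(j, zero))]))
    for j in x0
)
print(abs(fast - slow) < 1e-12)
```

The cost per evaluation is proportional to the number of entries in `deltas` rather than to the number of receivers.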

SLIDE 28

Implementation Trick: Structure

Initial sum: $\sum_j e^{\beta^T x_0(i, j)}$

Redundancy in $\{x_0(i, 1), x_0(i, 2), \dots, x_0(i, J)\}_{i = 1, 2, \dots}$

SLIDE 29

More Details

Computing $d_t(i, j)$
Self-loops
Similar tricks for gradient, Hessian
Numerical overflow
R package forthcoming
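The slides do not say how numerical overflow is handled. One standard remedy for sums of exponentials such as the partial-likelihood denominator is the log-sum-exp trick, sketched here as an assumption rather than as the talk's method; the function name is hypothetical.

```python
import math

def log_sum_exp(values):
    """Return log(sum(exp(v) for v in values)) without overflow.

    Subtracting the maximum keeps every exponent at or below zero,
    so math.exp never exceeds 1.
    """
    m = max(values)
    if m == float("-inf"):
        return m  # all terms are exp(-inf) = 0
    return m + math.log(sum(math.exp(v - m) for v in values))

# Direct evaluation of exp(1000.0) raises OverflowError; this does not.
print(log_sum_exp([1000.0, 1000.0]))
```

Working with the log of the denominator fits naturally with evaluating the log partial likelihood, where the denominator only enters through its logarithm.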

SLIDE 30

Summary

1. Events, not links
2. Point process model captures behavior
3. User-specific coefficients allow for heterogeneity