Characterizing Individual Behavior from Interaction History Patrick - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Characterizing Individual Behavior from Interaction History

Patrick Perry NYU Stern

slide-2
SLIDE 2

Case Study: UCI Online Network

Online community for University of California, Irvine (Opsahl & Panzarasa, 2009)

  • Dataset covers a seven-month period: April to October 2004
  • 2000 users, 60K messages
  • Goal: characterize user messaging behavior

slide-3
SLIDE 3

Degrees Are Not Enough

[Figure: plot of users by In Degree and Out Degree; both axes run from 1 to 500 on a log scale.]

Can we do better?

slide-4
SLIDE 4

Agenda

  • 1. Framework for studying interaction histories
  • 2. Macroscopic behavior
  • 3. Microscopic behavior
slide-5
SLIDE 5

Events, Not Links

Messages

Time     Sender   Receiver
$t_1$    $i_1$    $j_1$
$t_2$    $i_2$    $j_2$
...      ...      ...
$t_n$    $i_n$    $j_n$

slide-6
SLIDE 6

Point Process Model

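Model via intensity $\lambda_t(i, j)$:

$$\lambda_t(i, j)\, dt = \mathrm{Prob}\{i \text{ sends to } j \text{ in } [t, t + dt)\}$$

[Figure: messages from $i$ to $j$ marked as points along the time axis $t$.]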

slide-7
SLIDE 7

Key Insight: Use Past History

Hypotheses:

  • If you send me a message, I am likely to respond.
  • If I have sent you a message in the past, I am likely to repeat this action in the future.
  • These effects all decay with time.

slide-8
SLIDE 8

History-Dependent Covariates

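$$\mathrm{send}^{(k)}_t(i, j) = \#\{i \to j \text{ in } I^{(k)}_t\}, \qquad \mathrm{receive}^{(k)}_t(i, j) = \#\{j \to i \text{ in } I^{(k)}_t\}$$

[Figure: timeline showing the intervals $I^{(1)}_t$, $I^{(2)}_t$, $I^{(3)}_t$ extending back from time $t$, with marks at 1 day, 2 days, and 4 days.]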
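The history-dependent counts can be sketched in a few lines of Python. This is an illustration only: the slide does not spell out the interval endpoints, so the sketch assumes $I^{(1)}_t = [t - 1, t)$, $I^{(2)}_t = [t - 2, t - 1)$, and $I^{(3)}_t = [t - 4, t - 2)$ with time measured in days. The names `count_in_intervals` and `history_covariates` are hypothetical.

```python
from bisect import bisect_left

# Assumed interval boundaries, in days before t (not stated on the slide):
# I(1) = [t-1, t), I(2) = [t-2, t-1), I(3) = [t-4, t-2).
BOUNDARIES = [0.0, 1.0, 2.0, 4.0]


def count_in_intervals(times, t, boundaries=BOUNDARIES):
    """Count events in each half-open interval [t - far, t - near).

    `times` must be sorted in ascending order.
    """
    counts = []
    for near, far in zip(boundaries, boundaries[1:]):
        lo = bisect_left(times, t - far)   # first event at or after t - far
        hi = bisect_left(times, t - near)  # first event at or after t - near
        counts.append(hi - lo)
    return counts


def history_covariates(events, t, i, j):
    """Return the (send, receive) count vectors for the pair (i, j) at time t.

    `events` is a list of (time, sender, receiver) tuples.
    """
    sent = sorted(tm for tm, s, r in events if s == i and r == j)
    received = sorted(tm for tm, s, r in events if s == j and r == i)
    return count_in_intervals(sent, t), count_in_intervals(received, t)


events = [(0.5, "a", "b"), (2.7, "b", "a"), (3.2, "a", "b"), (3.9, "a", "b")]
send, receive = history_covariates(events, 4.0, "a", "b")
print(send, receive)
```

Each count vector has one entry per interval, so the pair (send, receive) supplies six covariates for the model under these assumed boundaries.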

slide-9
SLIDE 9

Cox Proportional Intensity Model

$$\lambda_t(i, j) = \bar{\lambda}_t(i) \exp\{\beta^T x_t(i, j)\}$$

  • $\lambda_t(i, j)\, dt$: Prob{$i$ sends $j$ a message in time $[t, t + dt)$}
  • $\bar{\lambda}_t(i)$: baseline intensity for sender $i$
  • $x_t(i, j)$: vector of time-varying covariates
  • $\beta$: vector of coefficients

(Butts 2008, Vu et al. 2011, POP & Wolfe 2013)

slide-10
SLIDE 10

Interpretation

$$\lambda_t(i, j) = \bar{\lambda}_t(i) \exp\{\beta^T x_t(i, j)\}$$

  • $\bar{\lambda}_t(i)$: treated as a nuisance parameter, estimated non-parametrically.
  • $\beta_k$: increasing $[x_t(i, j)]_k$ by one unit while holding all other covariates constant is associated with multiplying the message rate by a factor of $e^{\beta_k}$.

slide-11
SLIDE 11

Example: Self-Reinforcing Send

$$[x_t(i, j)]_1 = \#\{i \to j \text{ in } [t - 1\text{ day},\, t)\}$$
$$[x_t(i, j)]_2 = \#\{i \to j \text{ in } [t - 1\text{ week},\, t - 1\text{ day})\}$$
$$\lambda_t(i, j) = \bar{\lambda}_t(i) \exp\{1.8\,[x_t(i, j)]_1 + 0.7\,[x_t(i, j)]_2\}$$

Every sent message is associated with an $e^{1.8}$-fold increase for 1 day, followed by an $e^{0.7}$-fold increase for 6 days (relative to the baseline). After one week, the message is not associated with a change in rate.
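To make the example concrete, the following sketch evaluates the multiplier $\exp\{\beta^T x_t(i, j)\}$ for a single past message at several later times. It is only an illustration of the arithmetic: times are assumed to be in days, and the function name `rate_multiplier` is hypothetical.

```python
import math

# Coefficients from the example: 1.8 for sends in the last day,
# 0.7 for sends between one week ago and one day ago.
BETA = (1.8, 0.7)


def rate_multiplier(send_times, t, beta=BETA):
    """Return exp(beta . x_t) for the self-reinforcing send example.

    Times are in days. x1 counts sends in [t - 1, t);
    x2 counts sends in [t - 7, t - 1).
    """
    x1 = sum(1 for s in send_times if t - 1 <= s < t)
    x2 = sum(1 for s in send_times if t - 7 <= s < t - 1)
    return math.exp(beta[0] * x1 + beta[1] * x2)


# One message sent at time 0, evaluated half a day, three days,
# and eight days later.
for t in (0.5, 3.0, 8.0):
    print(t, round(rate_multiplier([0.0], t), 2))
```

The three multipliers come out to roughly 6.05, 2.01, and 1, matching the pattern described above: a large boost on the first day, a smaller boost for the rest of the week, and no effect afterwards.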

slide-12
SLIDE 12

Example: Response Model

Every received message is associated with an $e^{1.8}$-fold increase for 1 day, followed by an $e^{0.3}$-fold decrease for 6 days (relative to the baseline). After one week, the message is not associated with a change in rate.

$$[x_t(i, j)]_1 = \#\{j \to i \text{ in } [t - 1\text{ day},\, t)\}$$
$$[x_t(i, j)]_2 = \#\{j \to i \text{ in } [t - 1\text{ week},\, t - 1\text{ day})\}$$
$$\lambda_t(i, j) = \bar{\lambda}_t(i) \exp\{1.8\,[x_t(i, j)]_1 - 0.3\,[x_t(i, j)]_2\}$$
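As a worked instance of the arithmetic, suppose $i$ has received exactly one message from $j$ and there is no other history between them. The multipliers on the baseline rate are then approximately:

$$e^{1.8} \approx 6.05 \quad \text{(first day)}, \qquad e^{-0.3} \approx 0.74 \quad \text{(days 2 through 7)}.$$

So the rate at which $i$ sends to $j$ is about six times the baseline during the first day, and about 26% below the baseline for the following six days.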

slide-13
SLIDE 13

Users Respond to Messages

[Figure: coefficient of $\mathrm{receive}^{(k)}_t(i, j) = \#\{j \to i \text{ in } I^{(k)}_t\}$; Effect (0.8 to 6) plotted against Time Elapsed (Days), 1 to 128.]

slide-14
SLIDE 14

Users Repeat Past Behavior

[Figure: coefficient of $\mathrm{send}^{(k)}_t(i, j) = \#\{i \to j \text{ in } I^{(k)}_t\}$; Effect (0.8 to 6) plotted against Time Elapsed (Days), 1 to 128.]

slide-15
SLIDE 15

(1) Receiving is associated with responding
(2) Users repeat their past behaviors
(3) Effect (2) decays faster than effect (1)

[Figure: two panels, receive and send, each plotting Effect (0.8 to 6) against Time Elapsed (Days), 1 to 128.]

slide-16
SLIDE 16

Same behavior for each user?

slide-17
SLIDE 17

Micro-level Model

Old Model:

$$\lambda_t(i, j) = \bar{\lambda}_t(i) \exp\{\beta^T x_t(i, j)\}$$

New Model:

$$\lambda_t(i, j) = \bar{\lambda}_t(i) \exp\{\beta_i^T x_t(i, j)\}, \qquad \beta_i \sim \mathrm{Normal}(\mu, \Sigma)$$

(Related model: DuBois et al. 2013)

slide-18
SLIDE 18

Estimating User-Specific Coefficients

  • Fitting time: 3 CPU hours
  • 2000 sets of coefficients (one set for each user)
  • Need summarization method to visualize

slide-19
SLIDE 19

Visualize by Factor Analysis

  • 2000 sets of coefficients (one set for each user)
  • Reduce dimensionality via principal components
  • First 2 components explain 87% of variance

[Figure: Variance Explained (%) for each Component, components 1 to 35.]
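The idea of summarizing many coefficient vectors by the share of variance in the leading components can be sketched without any libraries. This is not the computation behind the slide: it is a minimal illustration that swaps in power iteration with deflation on a toy data set, and the function names and data are hypothetical.

```python
import math


def covariance(rows):
    """Covariance matrix of `rows` (one row per user), dividing by n."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[c] for r in rows) / n for c in range(p)]
    centered = [[r[c] - means[c] for c in range(p)] for r in rows]
    return [[sum(x[a] * x[b] for x in centered) / n for b in range(p)]
            for a in range(p)]


def top_eigenpair(mat, iters=200):
    """Largest eigenvalue and its eigenvector for a symmetric positive
    semidefinite matrix, found by power iteration."""
    p = len(mat)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    value = sum(v[a] * mat[a][b] * v[b] for a in range(p) for b in range(p))
    return value, v


def variance_explained(rows, k):
    """Fraction of total variance captured by the first k components."""
    mat = covariance(rows)
    p = len(mat)
    total = sum(mat[a][a] for a in range(p))  # trace equals total variance
    captured = 0.0
    for _ in range(k):
        value, v = top_eigenpair(mat)
        captured += value
        # Deflate: subtract the component just found.
        mat = [[mat[a][b] - value * v[a] * v[b] for b in range(p)]
               for a in range(p)]
    return captured / total


# Toy "coefficient" vectors for six hypothetical users.
rows = [(2, 0, 0), (-2, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 0.5), (0, 0, -0.5)]
share = variance_explained(rows, 2)
print(round(share, 3))
```

Power iteration is used here only to keep the sketch dependency-free; a real analysis would more likely call a linear algebra library's eigendecomposition or SVD.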

slide-20
SLIDE 20

User-specific Principal Component Scores

[Figure: scatter plot of user-specific scores, Component 1 against Component 2, annotated with the percentages 22%, 12%, 43%, 9%, 2%, 12%.]

slide-21
SLIDE 21

Variation in Response

[Figure: receive effect; Effect (0.05 to 50) plotted against Time Elapsed (Days), 1 to 128.]

slide-22
SLIDE 22

Variation in Repetition

[Figure: send effect; Effect (0.05 to 50) plotted against Time Elapsed (Days), 1 to 128.]

slide-23
SLIDE 23

[Figure: two panels, send and receive, each plotting Effect (0.05 to 50) against Time Elapsed (Days), 1 to 128.]

(1) Two dimensions of behavior
(2) Large range of response rates, similar qualitative patterns
(3) Some users repeat, others innovate; big effects in both directions

slide-24
SLIDE 24

Comparing Macro and Micro

[Figure: four panels comparing the macro-level and micro-level effect curves for receive and send; Effect plotted against Time Elapsed (Days), 1 to 128.]

slide-25
SLIDE 25

Theory for Macro Case

Theorem (POP & Wolfe): Under regularity conditions, the MPLE $\hat{\beta}_n$ satisfies:

  • 1. $\hat{\beta}_n \xrightarrow{P} \beta$
  • 2. $\sqrt{n}\,(\hat{\beta}_n - \beta) \xrightarrow{d} \mathrm{Normal}(0, \Sigma(\beta))$

Related results:

  • Cox (1975): heuristic argument (“under mild conditions implying some degree of independence... and that the information values are not too disparate”)
  • Andersen & Gill (1982): survival analysis, fixed time interval

slide-26
SLIDE 26

Implementation

$$\mathrm{PL}_{t_n}(\beta) = \prod_{t_m \le t_n} \frac{e^{\beta^T x_{t_m}(i_m, j_m)}}{\sum_j e^{\beta^T x_{t_m}(i_m, j)}}$$

  • Product: loop over all messages
  • Sum: loop over all receivers
  • Naïve: O(messages × receivers)
  • With bookkeeping: O(messages + receivers)
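The naïve double loop can be written out directly. The sketch below computes the log of the partial likelihood and is meant only to show where the O(messages × receivers) cost comes from. The function signature, the `covariates` callback, and the toy covariate `favors_b` are assumptions made for illustration, and the sender is not excluded from the receiver set.

```python
import math


def log_partial_likelihood(events, receivers, covariates, beta):
    """Naive log partial likelihood: O(messages x receivers).

    events     -- list of (time, sender, receiver) tuples, sorted by time
    receivers  -- list of all possible receivers
    covariates -- function (t, i, j) -> list of floats, x_t(i, j)
    beta       -- list of coefficients
    """
    def score(t, i, j):
        return sum(b * x for b, x in zip(beta, covariates(t, i, j)))

    total = 0.0
    for t, i, j in events:                     # loop over all messages
        denom = sum(math.exp(score(t, i, r))   # loop over all receivers
                    for r in receivers)
        total += score(t, i, j) - math.log(denom)
    return total


def favors_b(t, i, j):
    """Hypothetical covariate: 1 if the receiver is "b", else 0."""
    return [1.0 if j == "b" else 0.0]


events = [(1.0, "a", "b"), (2.0, "c", "b")]
receivers = ["a", "b", "c"]
value = log_partial_likelihood(events, receivers, favors_b, [math.log(2.0)])
print(value)
```

With a coefficient of log 2, receiver "b" carries weight 2 against weight 1 for each of the others, so each event contributes log(2/4) to the total.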

slide-27
SLIDE 27

Implementation Trick: Sparsity

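Inner sum:

$$\sum_j e^{\beta^T x_t(i, j)} = \sum_j e^{\beta^T x_0(i, j)} + \sum_j \left[ e^{\beta^T x_t(i, j)} - e^{\beta^T x_0(i, j)} \right]$$

where $x_t(i, j) = x_0(i, j) + d_t(i, j)$.

Note: the terms of the second sum vanish whenever $d_t(i, j) = 0$, so only receivers with nonzero $d_t(i, j)$ contribute to it.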
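A small sketch of the decomposition, assuming the static part $x_0(i, j)$ is stored for every receiver and the dynamic part $d_t(i, j)$ is stored only where it is nonzero. The base sum is computed once and only the correction is recomputed as history changes. The names `base_sum` and `inner_sum` and the dictionary layout are hypothetical.

```python
import math


def dot(beta, x):
    """Inner product beta . x."""
    return sum(b * v for b, v in zip(beta, x))


def base_sum(beta, x0):
    """Sum over all receivers of exp(beta . x_0(i, j)); computed once."""
    return sum(math.exp(dot(beta, x)) for x in x0.values())


def inner_sum(base, beta, x0, d):
    """Sum over j of exp(beta . x_t(i, j)), where x_t = x_0 + d_t.

    x0 -- dict: receiver -> static covariate vector x_0(i, j)
    d  -- dict: receiver -> d_t(i, j), holding only the receivers
          where d_t(i, j) is nonzero
    """
    correction = 0.0
    for j, dj in d.items():  # touches only receivers with nonzero d_t
        static = dot(beta, x0[j])
        correction += math.exp(static + dot(beta, dj)) - math.exp(static)
    return base + correction


beta = [0.3, -0.2]
x0 = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
d = {"b": [2.0, 0.0]}  # only receiver "b" has recent history
base = base_sum(beta, x0)
fast = inner_sum(base, beta, x0, d)
print(fast)
```

The cost of `inner_sum` depends on the number of entries in `d`, not on the total number of receivers, which is the point of the trick.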

slide-28
SLIDE 28

Implementation Trick: Structure

Initial sum:

$$\sum_j e^{\beta^T x_0(i, j)}$$

Redundancy in $\{x_0(i, 1), x_0(i, 2), \ldots, x_0(i, J)\}_{i=1}^{I}$
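The slide does not say how the redundancy is exploited. One possibility, shown here purely as an assumption, is that many receivers share the same static covariate vector (for example when $x_0$ is built from a few categorical traits), so the initial sum can be taken over distinct vectors weighted by their counts. The function name `initial_sum_grouped` is hypothetical.

```python
import math
from collections import Counter


def initial_sum_grouped(beta, x0_rows):
    """Sum over j of exp(beta . x_0(i, j)), grouping identical vectors.

    x0_rows -- list of static covariate vectors, one per receiver
    """
    counts = Counter(tuple(x) for x in x0_rows)
    return sum(n * math.exp(sum(b * v for b, v in zip(beta, x)))
               for x, n in counts.items())


beta = [0.5, -1.0]
# Five receivers, but only two distinct covariate vectors.
x0_rows = [[1, 0], [1, 0], [0, 1], [1, 0], [0, 1]]
grouped = initial_sum_grouped(beta, x0_rows)
print(grouped)
```

Under this assumption the number of exponentials evaluated equals the number of distinct vectors rather than the number of receivers.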

slide-29
SLIDE 29

More Details

  • Computing $d_t(i, j)$
  • Self-loops
  • Similar tricks for gradient, Hessian
  • Numerical overflow
  • R package forthcoming
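On the numerical overflow point: a common way to evaluate the log of a sum of exponentials, such as the denominator of the partial likelihood, is the log-sum-exp shift. The slides do not say which approach is used, so this is a generic sketch of that standard technique rather than the author's method.

```python
import math


def log_sum_exp(scores):
    """Compute log(sum(exp(s) for s in scores)) without overflow.

    Subtracting the maximum keeps every exponent at or below zero.
    """
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


# math.exp(1000.0) on its own raises OverflowError, but the shifted
# form stays finite.
result = log_sum_exp([1000.0, 1000.0])
print(result)
```

Each term of the log partial likelihood can then be written as the observed score minus `log_sum_exp` of the scores over all receivers.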

slide-30
SLIDE 30

Summary

  • 1. Events, not links
  • 2. Point process model captures behavior
  • 3. User-specific coefficients allow for heterogeneity