The Dynamics of Repeat Consumption



slide-1
SLIDE 1

The Dynamics of Repeat Consumption

Ashton Anderson

Stanford University

Ravi Kumar, Andrew Tomkins, Sergei Vassilvitskii

Google

Thursday, April 10, 2014

slide-2
SLIDE 2

repeat consumption

a lot of consumption is repeat consumption

what factors determine what we reconsume?

given a set of previously-consumed candidates, predict which item a user will choose to reconsume

slide-3
SLIDE 3

consumption data

  • BrightKite: location checkins
  • G+: public location checkins
  • MapClicks: clicks on Google Maps businesses
  • MapClicks-Food: clicks on Google Maps restaurants

slide-4
SLIDE 4

consumption data

  • WikiClicks: all clicks on English Wikipedia pages by Google users
  • YouTube: last 10K video watches of users
  • YouTube-Music: YouTube restricted to music videos

slide-5
SLIDE 5

baselines

  • Yes: radio playlists from hundreds of US radio stations* (to compare against non-individual consumption data)
  • Shakespeare: full text of Shakespeare’s works, with each letter considered an item (to compare against data with repetitions)

* available at http://www.cs.cornell.edu/~shuochen/

slide-6
SLIDE 6

the dynamics of repeat consumption

  • 1. empirical analysis
  • 2. models
  • 3. experiments

slide-7
SLIDE 7

the dynamics of repeat consumption

  • 1. empirical analysis
  • 2. models
  • 3. experiments

slide-8
SLIDE 8

empirical analysis

what are the empirical traits of reconsumed items?

slide-9
SLIDE 9

popularity

individual popularity: are users generally exploiting or exploring?

slide-10
SLIDE 10

popularity

more frequently consumed items are more likely to be reconsumed

slide-11
SLIDE 11

recency

how does the recency of consumption affect the likelihood of reconsumption?

to answer this question, we use a cache-based analysis technique

slide-12
SLIDE 12

recency

consider a cache of size k=3:

slide-13
SLIDE 13

recency

process a consumption history using optimal offline caching (replace item that occurs furthest in the future)

slide-14
SLIDE 14

recency

consumption history: a b b c d e b a c d c d

slide-15
SLIDE 15

recency

consumption history: a b b c d e b a c d c d

cache: a

Hits: 0 Misses: 1

slide-16
SLIDE 16

recency

consumption history: a b b c d e b a c d c d

cache: a b

Hits: 0 Misses: 2

slide-17
SLIDE 17

recency

consumption history: a b b c d e b a c d c d

cache: a b

Hits: 1 Misses: 2

slide-18
SLIDE 18

recency

consumption history: a b b c d e b a c d c d

cache: a b c

Hits: 1 Misses: 3

slide-19
SLIDE 19

recency

consumption history: a b b c d e b a c d c d

cache: a b d

Hits: 1 Misses: 4

slide-20
SLIDE 20

recency

consumption history: a b b c d e b a c d c d

cache: e b d

Hits: 1 Misses: 5

slide-21
SLIDE 21

recency

consumption history: a b b c d e b a c d c d

cache: e b d

Hits: 2 Misses: 5

slide-22
SLIDE 22

recency

consumption history: a b b c d e b a c d c d

cache: e b d

Hits: 3 Misses: 5

slide-23
SLIDE 23

recency

consumption history: a b b c d e b a c d c d

cache: a b d

Hits: 3 Misses: 6

slide-24
SLIDE 24

recency

consumption history: a b b c d e b a c d c d

cache: a c d

Hits: 3 Misses: 7

slide-25
SLIDE 25

recency

consumption history: a b b c d e b a c d c d

cache: a c d

Hits: 4 Misses: 7

slide-26
SLIDE 26

recency

consumption history: a b b c d e b a c d c d

cache: a c d

Hits: 5 Misses: 7
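The hit and miss counting walked through on the preceding slides can be sketched in a few lines. This is a minimal illustration, assuming a cache that starts empty and an eviction rule that removes the cached item whose next use is furthest in the future; the function name `belady_hits` is hypothetical and not taken from the talk.

```python
def belady_hits(history, k):
    """Count (hits, misses) for optimal offline caching with cache size k."""
    cache = set()
    hits = misses = 0
    for i, item in enumerate(history):
        if item in cache:
            hits += 1
            continue
        misses += 1
        if len(cache) >= k:
            # Evict the cached item whose next use is furthest in the future;
            # items that never occur again count as infinitely far away.
            def next_use(cached):
                for t in range(i + 1, len(history)):
                    if history[t] == cached:
                        return t
                return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(item)
    return hits, misses


history = "a b b c d e b a c d c d".split()
hits, misses = belady_hits(history, k=3)
hit_ratio = hits / (hits + misses)
print(hits, misses)  # 5 7, the final counts shown on the slides
```

Only the totals are compared here; which item is evicted when several cached items never recur is arbitrary and does not change the counts.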

slide-27
SLIDE 27

recency

the hit ratio is an indication of the degree to which recency is displayed in a consumption history

slide-28
SLIDE 28

recency

Real consumption sequences display a significant amount of recency

slide-29
SLIDE 29

recency

Baseline datasets don’t display recency (Yes even shows anti-recency)

slide-30
SLIDE 30

empirical analysis

  • user-level item popularity is generally a positive predictor
  • recency is the strongest effect

slide-31
SLIDE 31

the dynamics of repeat consumption

  • 1. empirical analysis
  • 2. models
  • 3. experiments

slide-32
SLIDE 32

models

goal: develop a simple mathematical framework powerful enough to explain patterns of reconsumption we observe in real data

slide-33
SLIDE 33

models

first, fix vocabulary of items $E$

a consumption history for user $u$ is $X_u = x_1, \ldots$, where each $x_i \in E$

at each step, user picks next item to consume using some function of consumption history

slide-34
SLIDE 34

quality model

natural hypothesis: item quality dictates consumption behavior

associate score $s(e)$ for each $e \in E$, and at each step next item is chosen proportionally to its score:

$$P(x_i = e) = \frac{s(e)}{\sum_{e' \in E} s(e')}$$
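As a rough illustration of the quality model, the sketch below normalizes a score table into choice probabilities and draws the next item from it. The scores and the helper names `quality_probs` and `sample_next` are made up for the example.

```python
import random


def quality_probs(scores):
    """P(x_i = e) = s(e) / sum of s(e') over all items e'."""
    total = sum(scores.values())
    return {item: score / total for item, score in scores.items()}


def sample_next(scores, rng):
    """Draw the next item with probability proportional to its score."""
    items = list(scores)
    return rng.choices(items, weights=[scores[e] for e in items], k=1)[0]


scores = {"a": 1.0, "b": 3.0, "c": 4.0}  # hypothetical quality scores
probs = quality_probs(scores)
next_item = sample_next(scores, random.Random(0))
```

Note that under this model the choice does not depend on the consumption history at all, only on the fixed scores.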

slide-35
SLIDE 35

recency model

since recency is the strongest empirical effect, we formulate a copying model based on it

at every step i, user copies item at position i-j proportional to weight w(i-j)

slide-36
SLIDE 36

recency model

since recency is the strongest empirical effect, we formulate a copying model based on it

at every step i, user picks item at position i-j proportional to weight w(i-j)

consumption history: a b b c d e b a c d c d ?

slide-37
SLIDE 37

recency model

since recency is the strongest empirical effect, we formulate a copying model based on it

at every step i, user picks item at position i-j proportional to weight w(i-j)

consumption history: a b b c d e b a c d c d ?

weights w: w(1) w(2) w(3) w(4) w(5) w(6) w(7) w(8) w(9) w(10) w(11) w(12)

slide-38
SLIDE 38

recency model

since recency is the strongest empirical effect, we formulate a copying model based on it

at every step i, user picks item at position i-j proportional to weight w(i-j)

consumption history: a b b c d e b a c d c d ?

e.g.: $P(x_i = d) \sim w(2) + w(5) + w(8)$

slide-39
SLIDE 39

recency model

since recency is the strongest empirical effect, we formulate a copying model based on it

at every step i, user picks item at position i-j proportional to weight w(i-j)

$$P(x_i = e) = \frac{\sum_{j<i} I(x_j = e)\, w(i-j)}{\sum_{j<i} w(i-j)}$$
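The recency model's choice probability can be sketched as follows. The toy history, the weight function `w(d) = 1/d`, and the name `recency_probs` are assumptions made for illustration, not the learned weights from the talk.

```python
def recency_probs(history, w):
    """P(next = e): total weight w(i - j) of earlier positions j holding e,
    divided by the total weight of all earlier positions."""
    i = len(history)  # index of the position about to be filled
    mass = {}
    for j, item in enumerate(history):
        mass[item] = mass.get(item, 0.0) + w(i - j)
    total = sum(mass.values())
    return {item: m / total for item, m in mass.items()}


history = ["a", "b", "a"]      # hypothetical toy history
w = lambda d: 1.0 / d          # hypothetical weights, largest for distance 1
probs = recency_probs(history, w)
# "a" sits at distances 3 and 1, "b" at distance 2:
# P(a) = (1/3 + 1) / (1/3 + 1/2 + 1) = 8/11, P(b) = 3/11
```

Because the weights of all positions holding the same item are simply summed, this is the additivity assumption stated as code.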

slide-40
SLIDE 40

recency model

we assume additivity in weights

thought experiment: learn weights, and compare additivity prediction to actual likelihoods from copying

slide-41
SLIDE 41

recency model

very small deviations from additivity

slide-42
SLIDE 42

hybrid model

combination of recency and quality

$$P(x_i = e) = \frac{\sum_{j<i} I(x_j = e)\, w(i-j)\, s(x_j)}{\sum_{j<i} w(i-j)\, s(x_j)}$$

e.g.: $P(x_i = d) \sim \bigl(w(2) + w(5) + w(8)\bigr) \cdot s(d)$
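A minimal sketch of the hybrid choice probability, where each earlier position contributes its recency weight times the quality score of the item it holds. The history, weights, scores, and the name `hybrid_probs` are hypothetical.

```python
def hybrid_probs(history, w, s):
    """P(next = e) proportional to the sum over earlier positions j holding e
    of w(i - j) * s(e), normalized over all earlier positions."""
    i = len(history)
    mass = {}
    for j, item in enumerate(history):
        mass[item] = mass.get(item, 0.0) + w(i - j) * s[item]
    total = sum(mass.values())
    return {item: m / total for item, m in mass.items()}


history = ["a", "b", "a"]      # hypothetical toy history
w = lambda d: 1.0 / d          # hypothetical recency weights
s = {"a": 1.0, "b": 2.0}       # hypothetical quality scores
probs = hybrid_probs(history, w, s)
# mass(a) = (1/3 + 1) * 1 = 4/3, mass(b) = (1/2) * 2 = 1
# P(a) = 4/7, P(b) = 3/7
```

Setting every score to the same value recovers the pure recency model.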

slide-43
SLIDE 43

learning model parameters

quality model: simply the empirical fraction of occurrences

$$s(e) = \frac{1}{k} \sum_{i=1}^{k} I(x_i = e)$$
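The empirical-fraction estimate is straightforward to compute. The sketch below applies it to the example history used earlier in the talk; the function name `empirical_scores` is an assumption.

```python
from collections import Counter


def empirical_scores(history):
    """s(e) = (number of occurrences of e) / k, with k the history length."""
    k = len(history)
    return {item: count / k for item, count in Counter(history).items()}


history = "a b b c d e b a c d c d".split()
scores = empirical_scores(history)
# b, c, and d each occur 3 times out of 12, a twice, e once
```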

slide-44
SLIDE 44

learning model parameters

recency and hybrid models: maximize likelihood with stochastic gradient ascent

$$LL = \log \prod_{i \in R} \left( \frac{\sum_{j<i} I(x_i = x_j)\, w(i-j)\, s(x_j)}{\sum_{j<i} w(i-j)\, s(x_j)} \right)$$
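The objective can be evaluated directly for a given set of weights and scores. The sketch below assumes that R is the set of positions whose item already occurred earlier in the history (the slide does not define R), and uses a hypothetical toy history, weight function, and scores.

```python
import math


def log_likelihood(history, w, s):
    """Sum of log P(x_i) under the hybrid model over repeat positions."""
    ll = 0.0
    for i, item in enumerate(history):
        if item not in history[:i]:
            continue  # assumption: only repeat consumptions belong to R
        num = sum(w(i - j) * s[history[j]]
                  for j in range(i) if history[j] == item)
        den = sum(w(i - j) * s[history[j]] for j in range(i))
        ll += math.log(num / den)
    return ll


history = ["a", "b", "a"]      # hypothetical toy history
w = lambda d: 1.0 / d          # hypothetical recency weights
s = {"a": 1.0, "b": 1.0}       # equal scores reduce this to the recency model
ll = log_likelihood(history, w, s)
# only position 2 is a repeat: P = w(2) / (w(2) + w(1)) = 1/3
```

A gradient-ascent learner would repeatedly adjust `w` and `s` to increase this value; that loop is not shown here.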

slide-45
SLIDE 45

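learning model parameters

weight update:

$$\frac{\partial LL}{\partial w(\delta)} = \sum_{i \in R} \begin{cases} \dfrac{s(x_i)}{A_i(x_i = x_j)} - \dfrac{s(x_i)}{A_i(1)} & \text{if } x_i = x_{i-\delta}, \\[1ex] -\dfrac{s(x_i)}{A_i(1)} & \text{otherwise} \end{cases}$$

score update:

$$\frac{\partial LL}{\partial s(e)} = \sum_{i \in R} \begin{cases} 1 - \dfrac{A_i(x_j = e)}{A_i(1)} & \text{if } x_i = e, \\[1ex] -\dfrac{A_i(x_j = e)}{A_i(1)} & \text{otherwise} \end{cases}$$

alternating updates to local maximum (not jointly convex)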

slide-46
SLIDE 46

the dynamics of repeat consumption

  • 1. empirical analysis
  • 2. models
  • 3. experiments

slide-47
SLIDE 47

experiments

scores for quality model

slide-48
SLIDE 48

experiments

learned recency weights

slide-49
SLIDE 49

experiments

log-likelihood per item of models, normalized by log-likelihood of hybrid model (which is 1.0)

slide-50
SLIDE 50

experiments

hybrid always wins, but recency model is close

slide-51
SLIDE 51

experiments

recency beats quality

slide-52
SLIDE 52

experiments

learning per-item quality scores always beats setting scores to be equal to popularity

slide-53
SLIDE 53

experiments

recency without scores > recency using popularity as quality scores

slide-54
SLIDE 54

experiments

learned quality scores are quite different from popularity (Kendall-Tau coefficient of 0.44)

slide-55
SLIDE 55

experiments

can our weights be compressed?

currently, we learn a weight for each possible previous position

slide-56
SLIDE 56

experiments

weights follow power law with exponential cutoff

$$\Pr[x] \propto (x + \gamma)^{-\alpha} e^{-\beta x}$$
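To illustrate how three parameters can stand in for one weight per position, the sketch below evaluates the power law with exponential cutoff at each distance and normalizes the result. The parameter values and function names are hypothetical and are not the fitted values from the talk.

```python
import math


def cutoff_power_law(x, alpha, beta, gamma):
    """Unnormalized weight (x + gamma)^(-alpha) * exp(-beta * x)."""
    return (x + gamma) ** (-alpha) * math.exp(-beta * x)


def compressed_weights(max_distance, alpha, beta, gamma):
    """Normalized weights for distances 1..max_distance."""
    raw = [cutoff_power_law(x, alpha, beta, gamma)
           for x in range(1, max_distance + 1)]
    total = sum(raw)
    return [r / total for r in raw]


# hypothetical parameters, chosen only to show the shape
weights = compressed_weights(12, alpha=1.5, beta=0.05, gamma=1.0)
```

With positive `alpha` and `beta`, the weights decrease with distance, so the most recent positions dominate.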

slide-57
SLIDE 57

experiments

log-likelihood of variants of recency model (full recency model set to 1.0)

similar results for hybrid model

slide-58
SLIDE 58

conclusion

  • studied repeat consumption across many domains
  • found recency and quality to be strong empirical effects in characterizing reconsumption
  • developed quality, recency, and hybrid models
  • validated these models on lots of real data

slide-59
SLIDE 59

thanks!


slide-62
SLIDE 62

recency

two problems:

  • 1. hit ratio depends on number of unique items in the sequence
  • 2. some number of hits is expected

slide-63
SLIDE 63

recency

solutions:

  • 1. use normalized hit ratio: divide hit ratio by $1 - u/c$, the upper bound on hit ratio
  • 2. compare to normalized hit ratios on randomly shuffled version of sequences

another baseline: compare to optimal stable cache (fraction of consumptions accounted for by top k items)
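The normalization and the two baselines can be sketched as below. This assumes u is the number of unique items and c the total number of consumptions in the sequence (the slide does not define them), so that at least u of the c consumptions must miss. All function names are hypothetical.

```python
import random
from collections import Counter


def normalized_hit_ratio(hits, history):
    """Hit ratio divided by its upper bound 1 - u/c."""
    c = len(history)         # assumed meaning of c: total consumptions
    u = len(set(history))    # assumed meaning of u: unique items
    bound = 1 - u / c
    if bound == 0:
        return 0.0           # no repeats, so no hits are possible
    return (hits / c) / bound


def stable_cache_fraction(history, k):
    """Fraction of consumptions accounted for by the k most frequent items."""
    top_k = Counter(history).most_common(k)
    return sum(count for _, count in top_k) / len(history)


def shuffled_copy(history, seed=0):
    """Randomly shuffled version of the sequence, as a no-recency baseline."""
    return random.Random(seed).sample(history, len(history))


history = "a b b c d e b a c d c d".split()
normalized = normalized_hit_ratio(5, history)  # 5 hits from the k=3 walkthrough
stable = stable_cache_fraction(history, 3)
shuffled = shuffled_copy(history)
```

For the example history, u = 5 and c = 12, so the 5 hits observed with k = 3 normalize to 5/7, and the top three items account for 9 of the 12 consumptions.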

slide-64
SLIDE 64

satiation

no evidence of satiation in our data