Computer Mediated Transactions, Hal Varian, Google, April 7



slide-1
SLIDE 1


Computer Mediated Transactions

Hal Varian Google April 7

slide-2
SLIDE 2

Outline -- what does CMT enable?

  • 1. Data extraction and analysis
  • 2. Personalization and customization
  • 3. Experimentation and continuous improvement
  • 4. Contractual innovation

There is now a computer in the middle of most economic transactions. What does this enable?

slide-3
SLIDE 3

Data extraction and analysis

slide-4
SLIDE 4

Initial claims: good leading indicator for recessions

Grey bars indicate recessions

slide-5
SLIDE 5

Google Correlate with initial claims data

slide-6
SLIDE 6

Initial claims and [unemployment filing]

slide-7
SLIDE 7

Nowcasting initial claims

Predict NSA initial claims ($y_t$), using lagged values of initial claims and contemporaneous queries on [unemployment filing] ($x_t$)

Base: $y_t = a_0 + a_1 y_{t-1} + a_{52} y_{t-52} + e_t$

Trends: $y_t = a_0 + a_1 y_{t-1} + a_{52} y_{t-52} + b x_t + e_t$

Result: $R^2$ goes from 80.8% to 87.6%
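
The base-versus-trends comparison can be sketched on synthetic data. This is only an illustration under stated assumptions: the series, the coefficients, and the helper names (`ols`, `r_squared`, `nowcast_r2`) are made up here and are not the claims data or the code behind the slide.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                f = a[i][col] / a[col][col]
                a[i] = [v - f * p for v, p in zip(a[i], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

def r_squared(rows, y):
    """In-sample R^2 of the least-squares fit of y on rows."""
    coef = ols(rows, y)
    fitted = [sum(c * v for c, v in zip(coef, row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def nowcast_r2(y, x, seasonal_lag=52):
    """R^2 of the base model (lags 1 and 52) and of the model that adds x_t."""
    ts = range(seasonal_lag, len(y))
    target = [y[t] for t in ts]
    base = [[1.0, y[t - 1], y[t - seasonal_lag]] for t in ts]
    trends = [row + [x[t]] for row, t in zip(base, ts)]
    return r_squared(base, target), r_squared(trends, target)

# Synthetic weekly data (assumption): x stands in for a query index.
rng = random.Random(0)
n = 400
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.6 * y[-1] + x[t] + rng.gauss(0, 0.5))

r2_base, r2_trends = nowcast_r2(y, x)
print(f"base R^2 = {r2_base:.3f}, with query R^2 = {r2_trends:.3f}")
```

Because the two models are nested, the in-sample $R^2$ can only go up when $x_t$ is added; out-of-sample evaluation is the fairer test of a nowcast.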

slide-8
SLIDE 8

How can we make variable selection easier?

Big data: rows or columns?

How to choose best predictors?

Simple correlation? Judgment? Stepwise regression? Lasso, LARS, Elastic Net?

Spike-and-slab regression

  • Kalman filter for trend and seasonality
  • George-McCulloch [1997]; Madigan-Raftery [1994] for regression
  • Prior probability variable is included (spike)
  • Prior probability distribution over coefficient value (slab)
  • Sample from simulated posterior, average to get prediction
  • See Scott and Varian (2012, 2013) for details
  • Download R package from CRAN (BoomSpikeSlab, bsts)
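
To make the idea of a posterior inclusion probability concrete, here is a deliberately simplified stand-in. It is not the spike-and-slab sampler in BoomSpikeSlab or bsts: it enumerates every subset of a few predictors and weights each model by prior times exp(-BIC/2), a rough approximation to the marginal likelihood. The data, the prior value 0.25, and the function names are assumptions for illustration.

```python
import itertools
import math
import random

def ols_rss(rows, y):
    """Residual sum of squares of the least-squares fit (normal equations)."""
    k = len(rows[0])
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                f = a[i][col] / a[col][col]
                a[i] = [v - f * p for v, p in zip(a[i], a[col])]
    coef = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(c * v for c, v in zip(coef, row))) ** 2
               for row, yi in zip(rows, y))

def inclusion_probabilities(X, y, prior_inclusion=0.25):
    """Approximate posterior probability that each column of X is in the model."""
    n, k = len(y), len(X[0])
    log_weight = {}
    for mask in itertools.product([0, 1], repeat=k):
        rows = [[1.0] + [v for v, m in zip(row, mask) if m] for row in X]
        bic = n * math.log(ols_rss(rows, y) / n) + len(rows[0]) * math.log(n)
        size = sum(mask)
        log_prior = (size * math.log(prior_inclusion)
                     + (k - size) * math.log(1 - prior_inclusion))
        log_weight[mask] = log_prior - 0.5 * bic
    top = max(log_weight.values())
    total = sum(math.exp(w - top) for w in log_weight.values())
    return [sum(math.exp(w - top) for m, w in log_weight.items() if m[j]) / total
            for j in range(k)]

# Synthetic data (assumption): only the first two of four predictors matter.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 1) for row in X]
probs = inclusion_probabilities(X, y)
print([round(p, 3) for p in probs])
```

Enumeration only works for a handful of candidates; with thousands of query series, sampling from the posterior as the slide describes is the practical route.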

slide-9
SLIDE 9

New Home Sales in US

slide-10
SLIDE 10

Raw correlation

slide-11
SLIDE 11

Predictors chosen by model

slide-12
SLIDE 12

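Incremental fit plots

model: $y_t = \mathrm{trend}_t + \mathrm{seasonal}_t + b_1 x_{1t} + b_2 x_{2t}$

plot1: $y_t = \mathrm{trend}_t$

plot2: $y_t = \mathrm{trend}_t + \mathrm{seasonal}_t$

plot3: $y_t = \mathrm{trend}_t + \mathrm{seasonal}_t + b_1 x_{1t}$

plot4: $y_t = \mathrm{trend}_t + \mathrm{seasonal}_t + b_1 x_{1t} + b_2 x_{2t}$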

Visualize how much each predictor contributes to model fit
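
A numeric version of the same idea: add components one at a time and track the error of the running fit. In this sketch the components are known by construction (synthetic data, hypothetical names); in a real model they would be the estimated trend, seasonal, and regression terms.

```python
import math
import random

def incremental_mae(y, components):
    """Mean absolute error of the running sum after each component is added."""
    running = [0.0] * len(y)
    errors = []
    for name, part in components:
        running = [r + p for r, p in zip(running, part)]
        mae = sum(abs(a - b) for a, b in zip(y, running)) / len(y)
        errors.append((name, mae))
    return errors

# Synthetic monthly series (assumption): trend + seasonal + two predictors + noise.
rng = random.Random(6)
n = 240
trend = [0.05 * t for t in range(n)]
seasonal = [2.0 * math.sin(2 * math.pi * t / 12) for t in range(n)]
x1_part = [1.5 * rng.gauss(0, 1) for _ in range(n)]  # plays the role of b1 * x1
x2_part = [1.0 * rng.gauss(0, 1) for _ in range(n)]  # plays the role of b2 * x2
y = [a + b + c + d + rng.gauss(0, 0.2)
     for a, b, c, d in zip(trend, seasonal, x1_part, x2_part)]

errors = incremental_mae(
    y, [("trend", trend), ("seasonal", seasonal), ("x1", x1_part), ("x2", x2_part)]
)
for name, mae in errors:
    print(f"after adding {name}: MAE = {mae:.3f}")
```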

slide-13
SLIDE 13

Trend

slide-14
SLIDE 14

Seasonal

slide-15
SLIDE 15

[appreciation rate]

slide-16
SLIDE 16

[irs 1031]

slide-17
SLIDE 17

[century 21 realtors]

slide-18
SLIDE 18

[real estate purchase]

slide-19
SLIDE 19

[80-20 mortgage]

slide-20
SLIDE 20

One month ahead forecast

Does 23% better than simple AR1 model

slide-21
SLIDE 21

Geo-amplification

You can do the same thing for any geographically distributed variable

  • Find queries or query categories that are predictive of that variable
  • Make predictions/extrapolations to other geographies

Many applications

  • Social science
  • Policy
  • Marketing
  • Politics

Example: New York Times index of “hard places” (June 26, 2014)

slide-22
SLIDE 22

Where are the hardest places to live in the U.S.?

slide-23
SLIDE 23

What queries are associated with “hard places”?

Based on state level data and Google Correlate

slide-24
SLIDE 24

What queries are associated with “easy places”?

Based on state level data and Google Correlate

slide-25
SLIDE 25

Customization and personalization

slide-26
SLIDE 26

Assembled in America

slide-27
SLIDE 27

Predictors of survey response

slide-28
SLIDE 28

Top and bottom cities' predicted score

Top

  • Kershaw, SC: 83.2%
  • Summersville, WV: 82.8%
  • Grundy, VA: 82.8%
  • Chesnee, SC: 82.7%
  • Duffield, VA: 82.5%
  • Norton, VA: 82.3%
  • Jonesville, VA: 82.2%
  • Walnut Cove, NC: 82.2%
  • Weston, WV: 82.2%
  • Ennice, NC: 82.1%

Bottom

  • Calipatria, CA: 40.2%
  • Fremont, CA: 40.2%
  • Mountain View, CA: 40.8%
  • San Jose, CA: 41.4%
  • Berkeley, CA: 41.4%
  • Redmond, WA: 41.5%
  • Glendale, CA: 41.5%
  • Cupertino, CA: 41.6%
  • Palo Alto, CA: 41.7%
  • Daggett, CA: 41.9%

slide-29
SLIDE 29

Assembled in America by DMA

slide-30
SLIDE 30

Experimentation and continuous improvement

slide-31
SLIDE 31

“To find out what happens when you change something, it is necessary to change it.” George Box

Causal inference

slide-32
SLIDE 32

Experiments: gold standard for causality

What goes wrong with observational data?

$y_t = x_t b + e_t$ = observed + unobserved

  • Correlation: if you observe x, what is a good prediction for y?
  • Causality: what happens to y if you change x?
  • Confounder: something unobserved that affects both x and y
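
A small simulation shows the gap between the two questions. Everything here is synthetic and assumed for illustration: the true effect of x on y is set to 1, and an unobserved confounder drives both x and y in the observational case.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2)
n = 10_000
true_effect = 1.0
confounder = [rng.gauss(0, 1) for _ in range(n)]  # unobserved by the analyst

# Observational data: x responds to the confounder.
x_obs = [c + rng.gauss(0, 1) for c in confounder]
y_obs = [true_effect * x + 2.0 * c + rng.gauss(0, 1)
         for x, c in zip(x_obs, confounder)]

# Experimental data: x is randomized, so it is independent of the confounder.
x_exp = [rng.gauss(0, 1) for _ in range(n)]
y_exp = [true_effect * x + 2.0 * c + rng.gauss(0, 1)
         for x, c in zip(x_exp, confounder)]

slope_obs = ols_slope(x_obs, y_obs)
slope_exp = ols_slope(x_exp, y_exp)
print(f"observational slope = {slope_obs:.2f}, experimental slope = {slope_exp:.2f}")
```

The observational slope is a fine predictor of y given x, but it overstates what would happen if x were changed; the randomized slope recovers the effect that was built into the simulation.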

slide-33
SLIDE 33

Advertising

Q: How do you know your advertising works?
A: Every December I increase my ad spend...

slide-34
SLIDE 34

Advertising

Q: How do you know your advertising works?
A: Every December I increase my ad spend...and every December my sales go up!

slide-35
SLIDE 35

Advertising

Q: How do you know your advertising works?
A: Every December I increase my ad spend...and every December my sales go up!

“Christmas holidays” are a confounding variable. Here the solution is obvious, but what happens if you can’t observe the confounders?

slide-36
SLIDE 36

Train, test, treat, compare

  • 1. Train a model on historical data
  • 2. Test the model on a holdout
  • 3. Apply treatment at some time
  • 4. Compare observed outcome with the treatment to the counterfactual prediction of model
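
The four steps can be sketched end to end on synthetic data. This is a minimal illustration, assuming a single control series `z` that the treatment does not affect and a simple linear model; the lift of 2.0 is known only because the data are simulated.

```python
import math
import random

def fit_line(z, y):
    """Return (intercept, slope) of the least-squares line y = a + b z."""
    mz, my = sum(z) / len(z), sum(y) / len(y)
    slope = (sum((a - mz) * (b - my) for a, b in zip(z, y))
             / sum((a - mz) ** 2 for a in z))
    return my - slope * mz, slope

rng = random.Random(3)
n, train_end, treat_start = 300, 150, 200
true_lift = 2.0  # synthetic treatment effect (assumption)
z = [10 + 2 * math.sin(2 * math.pi * t / 50) + rng.gauss(0, 0.3) for t in range(n)]
y = [5 + 3 * zt + rng.gauss(0, 0.5) + (true_lift if t >= treat_start else 0.0)
     for t, zt in enumerate(z)]

# 1. Train a model on historical data.
intercept, slope = fit_line(z[:train_end], y[:train_end])

def predict(zt):
    return intercept + slope * zt

# 2. Test the model on a holdout period before the treatment.
holdout = range(train_end, treat_start)
holdout_mae = sum(abs(y[t] - predict(z[t])) for t in holdout) / len(holdout)

# 3. Treatment starts at treat_start (already built into the synthetic y).
# 4. Compare the observed outcome with the counterfactual prediction.
treated = range(treat_start, n)
estimated_lift = sum(y[t] - predict(z[t]) for t in treated) / len(treated)
print(f"holdout MAE = {holdout_mae:.2f}, estimated lift = {estimated_lift:.2f}")
```

The holdout error in step 2 indicates how far the counterfactual can be trusted: a lift smaller than the holdout error would be hard to distinguish from model noise.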

slide-37
SLIDE 37

Compare outcome to counterfactual

slide-38
SLIDE 38

Actual and natural experiments

You want randomized experiments to reduce systematic effects. Sometimes you get randomization “for free”.

Impact of class size on performance

  • Why are classes larger in some schools than others?
  • In Israel maximum class size is 40. Classes with 41 are split in two.
  • Can identify causal effect of class size on performance

Impact of ad impressions on movie revenue

Super Bowl facts

  • Ads are bought long before teams are chosen
  • Home cities of participating teams see elevated viewership
  • Natural randomization
slide-39
SLIDE 39

Experimentation capability should be coded in

static code:

    const threshold = 3.14
    if (x > threshold) do something

learning code:

    param threshold = {3.13, 3.14, 3.15}
    performance = (num_right, num_wrong)
    if (x > threshold) do something
    report performance

Research challenge: How to turn legacy code into learning code?

Nice example: Keith Winstein et al., An Experimental Study of the Learnability of Congestion Control
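
One possible shape for "learning code", as a sketch only. The class name `LearningThreshold`, the uniform random choice among candidates, and the ground-truth rule `x > 3.15` are all assumptions made for this example; the slide does not specify how candidates are chosen or how right and wrong are judged.

```python
import random
from collections import defaultdict

class LearningThreshold:
    """A parameter that tries several candidate values and tracks performance."""

    def __init__(self, candidates, seed=0):
        self.candidates = list(candidates)
        # threshold -> [num_right, num_wrong]
        self.performance = defaultdict(lambda: [0, 0])
        self._rng = random.Random(seed)

    def choose(self):
        """Pick a candidate uniformly at random for this decision."""
        return self._rng.choice(self.candidates)

    def record(self, threshold, was_right):
        self.performance[threshold][0 if was_right else 1] += 1

    def report(self):
        """Fraction of correct decisions observed for each threshold tried."""
        return {t: right / (right + wrong)
                for t, (right, wrong) in self.performance.items()}

rng = random.Random(7)
learner = LearningThreshold([3.13, 3.14, 3.15])
for _ in range(3000):
    x = rng.uniform(3.0, 3.3)
    threshold = learner.choose()
    decision = x > threshold   # "do something" when True
    truth = x > 3.15           # hypothetical ground truth, for illustration
    learner.record(threshold, decision == truth)

report = learner.report()
best = max(report, key=report.get)
print(report, "best:", best)
```

Uniform random choice keeps the comparison between candidates unbiased; a production system might shift traffic toward the better-performing values over time.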

slide-40
SLIDE 40

Contractual innovation

slide-41
SLIDE 41

What is a contract?

“If you do this, I’ll do that.”

But how do you verify “this” and “that”?

Can only contract on things that can be observed and verified...
slide-42
SLIDE 42

What is a contract?

“If you do this, I’ll do that.”

But how do you verify “this” and “that”?

Can only contract on things that can be observed and verified…

But with a computer in the middle of the transaction, lots more can be verified.

slide-43
SLIDE 43

Examples of contracts

  • “You take me to my hotel on the best route, I will pay you.”
  • “You use the car and send me a monthly payment.”
  • “You drive this rental car safely, I will give you a discount.”
  • “You display an ad that brings someone to my store, I will pay you.”

slide-44
SLIDE 44

Summary

  • 1. Data extraction and analysis
  • a. Can use searches to nowcast economic activity
  • 2. Personalization and customization
  • a. Can customize ads to different geos
  • 3. Experimentation and continuous improvement
  • a. Can use ML to estimate causal impact via train-test-treat-compare cycle
  • 4. Contractual innovation
  • a. As more things become observable, more contracts become viable

slide-45
SLIDE 45

Appendix

slide-46
SLIDE 46

Advertise a movie about surfing

  • Honolulu: $1 ad spend, $10 ticket sales
  • Fargo: $0.10 ad spend, $1 ticket sales

Ticket sales = 10 x ad spend fits the data perfectly...

slide-47
SLIDE 47

Advertise a movie about surfing

  • Honolulu: $1 ad spend, $10 ticket sales
  • Fargo: $0.10 ad spend, $1 ticket sales

Ticket sales = 10 x ad spend fits the data perfectly...

But do you really believe that if you increased spend to $1 in Fargo, you would get 10 times the ticket sales?

slide-48
SLIDE 48

Ads and confounders

“Interest in surfing” is a confounding variable

Happens all the time in economics since people choose x (observing things you don’t observe.)

  • Causal effect of college on education?
  • Causal effect of fertilizer on yield?
  • Causal effect of health care on income?

slide-49
SLIDE 49

Super Bowl as a natural ad experiment

  • 1. Viewership in home cities of teams that are playing is about 10-15% higher than elsewhere.
  • 2. Ads are purchased long before it is known who is playing

Advertiser buys ad slot, then 2-3 months later two “random” cities get 10-15% more ad exposure.

slide-50
SLIDE 50

Regression discontinuity

Impact of class size on performance

  • Why are classes larger in some schools than others?
  • In Israel maximum class size is 40. Classes with 41 are split in two.
  • Can identify causal effect of class size on performance

What would happen to auto fatalities if you changed the minimum drinking age?

  • 20.5 year olds are a lot like 21.5 year olds
  • So looking at people on each side of the threshold can give estimate of causal effect
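
A regression discontinuity estimate can be sketched as two local linear fits, one on each side of the threshold, with the causal effect read off as the gap between them at the cutoff. The data below are synthetic, with a jump of 1.5 built in at age 21; the bandwidth of one year and the function names are assumptions for illustration, not real fatality data.

```python
import random

def boundary_value(points, cutoff):
    """Fit y = a + b (r - cutoff) by least squares; return a, the fit at the cutoff."""
    xs = [r - cutoff for r, _ in points]
    ys = [v for _, v in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return my - slope * mx

def rd_estimate(running, outcome, cutoff, bandwidth):
    """Jump in the outcome at the cutoff, using points within the bandwidth."""
    pairs = list(zip(running, outcome))
    left = [(r, v) for r, v in pairs if cutoff - bandwidth <= r < cutoff]
    right = [(r, v) for r, v in pairs if cutoff <= r <= cutoff + bandwidth]
    return boundary_value(right, cutoff) - boundary_value(left, cutoff)

# Synthetic data (assumption): smooth age trend plus a jump at the threshold.
rng = random.Random(5)
true_jump = 1.5
age = [rng.uniform(19, 23) for _ in range(5000)]
outcome = [10 - 0.5 * (a - 21) + (true_jump if a >= 21 else 0.0) + rng.gauss(0, 1)
           for a in age]

jump = rd_estimate(age, outcome, cutoff=21.0, bandwidth=1.0)
print(f"estimated jump at the cutoff = {jump:.2f}")
```

The bandwidth is a trade-off: a narrow window keeps the two groups comparable but leaves fewer observations, so the estimate gets noisier.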

slide-51
SLIDE 51

Regression discontinuity