Wrap up & Experimentation, CS147L Lecture 8, Mike Krieger

SLIDE 1

Wrap up & Experimentation

CS147L Lecture 8 Mike Krieger

Wednesday, November 25, 2009

SLIDE 2

Intro

SLIDE 3

Welcome back!

SLIDE 4

By the end of today...

  • Questions from implementations
  • A few implementation loose ends
  • A/B testing primer
  • Google Analytics

SLIDE 5

Questions?

SLIDE 6

Loose ends

SLIDE 7

Floaty bar

SLIDE 8

Canonical implementation

  • Gmail's mobile web app

SLIDE 9

Gmail Demo

SLIDE 10

Fitts Thumb

  • Though hand input is not quite the same as mouse input, the general principle applies:
  • Minimize thumb-moving distance
  • Make targets even larger than you think they need to be (thumbs are clumsy)

SLIDE 11

Getting plugin

  • Included with jQTouch under extensions/
  • Copy jqt.floaty.js into your JS folder

SLIDE 12

Integrating & customizing

SLIDE 13

Initializing

SLIDE 14

Styling

SLIDE 15

Demo

  • floaty.html

SLIDE 16

Bottom bar

SLIDE 17

UITabBarController on iPhone

SLIDE 18

This won't work...

  • Traditional approach: position: fixed at bottom: 0
  • Or, a div with overflow: hidden and a bottom bar with absolute position and bottom: 0

SLIDE 19

On the iPhone

  • Scrolling scrolls entire page
  • Floaty bar is probably the way to go...
  • You could hack it up, but your users would have to learn to two-finger scroll for everything

SLIDE 20

Full screening

SLIDE 21

App mode

  • Only engaged when users click the home button (might want to prompt them, or do it beforehand)

SLIDE 22

App mode

SLIDE 23

A/B Testing

SLIDE 24

Why A/B test?

  • You can have the best designers & great PMs...
  • But nothing beats seeing what on earth people actually do
  • Differences in usability, virality, and revenue

SLIDE 25

Framework

  • Selecting an experiment
  • Choosing variations
  • Selecting / sampling users
  • Deploying & serving variations
  • Measuring user behavior
  • Analyzing results

SLIDE 26

What makes a good experiment?

  • Measuring user funnels through a sale
  • Click-through rates for links
  • Time spent / time until an action is taken
  • Performance questions
  • Email newsletters

SLIDE 27

(continued)

  • Minor tweaks to site design
  • Flows with a clear goal

SLIDE 28

In sum—

  • A measurable user behavior that you believe will be modulated by tweaks in design

SLIDE 29

What A/B testing won't tell you

  • Is it aesthetically pleasing?
  • Is it fun?
  • Is it accessible?
  • Is this even what my company should be doing?

SLIDE 30

As Buxton would say

  • A/B testing will help you get the design right, but can't help you get the right design in the first place

SLIDE 31

Hillclimbing

controlled web experiments

SLIDE 32

But really...

controlled web experiments

SLIDE 33

Choosing Variations

  • Think in terms of variables
  • Spectrum of choices
  • If time (and participant pool) allow, look at interactions, too

SLIDE 34

Examples

  • Twitter homepage call to action: "Join the Conversation" or "Get Started"

SLIDE 35

Iteration

SLIDE 36

Google Homepage

SLIDE 37

...experiment?

  • People actively want to join Google's A/B tests
  • But can use interest/reactions as proxies for results in this case

SLIDE 38

Selecting/sampling users

  • Two general approaches

SLIDE 39

The naive way

  • Every time a user loads a page, they have a random chance of ending up in a bucket

SLIDE 40

Why doesn't this work?

  • Order effects
  • Confuses the users, who want a consistent experience

  • Random functions not so random

SLIDE 41

Using a hashing function

  • Suggested by Kohavi in his Web Experimentation paper

  • How it's implemented at Meebo

SLIDE 42

General idea

  • Be as consistent as possible per user
  • If we can, use User ID (across computers)
  • If we can't, use a cookie (at least consistent at one computer)

  • At the very worst, assign randomly
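The fallback order above can be sketched in a few lines of Python. This is only an illustration, not the Meebo implementation: `pick_identifier` and its parameters are hypothetical names, and the random fallback assumes you would store the generated ID in a cookie yourself so the next visit is consistent.

```python
import uuid


def pick_identifier(user_id=None, cookie_id=None):
    """Return (source, identifier), preferring the most stable option."""
    if user_id is not None:
        # Best case: the same value on every computer the user logs in from.
        return ("user_id", str(user_id))
    if cookie_id is not None:
        # Second best: stable for one browser on one computer.
        return ("cookie", str(cookie_id))
    # Last resort: a fresh random ID (hypothetical; persist it in a cookie).
    return ("random", uuid.uuid4().hex)
```

For example, `pick_identifier(cookie_id="abc")` returns `("cookie", "abc")` when no user ID is known.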

SLIDE 43

MD5

  • Hashing function; not great for cryptographic use but fine for our purposes
  • Problem: hashes will be long strings and we actually want a probability distribution

SLIDE 44

Solution

  • Hash the unique identifier plus the experiment name
  • Get the hexadecimal digest of the resulting hash
  • Convert to a decimal and see where it falls along the range from 0 to the maximum number in the distribution

SLIDE 45

In other words...

SLIDE 46

(continued)

SLIDE 47

Notes

  • Will be evenly distributed from 0 to 1.0
  • We can use this probability to bucket people
  • Given the same input, will result in same number every time
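A minimal Python sketch of the hashing approach described in the last few slides, under stated assumptions: the function names are hypothetical, the `:` separator between identifier and experiment name is an arbitrary choice, and the buckets are assumed to be equal in size.

```python
import hashlib


def bucket_value(identifier, experiment):
    """Map (identifier, experiment) to a repeatable number between 0 and 1."""
    key = f"{identifier}:{experiment}".encode("utf-8")
    digest = hashlib.md5(key).hexdigest()  # 32 hexadecimal characters
    # Convert the hex digest to an integer and scale by the size of the range.
    return int(digest, 16) / 16 ** len(digest)


def assign_variation(identifier, experiment, variations):
    """Split the 0..1 range into equal slices, one per variation."""
    value = bucket_value(identifier, experiment)
    # min() guards the edge case where the scaled value rounds up to 1.0.
    index = min(int(value * len(variations)), len(variations) - 1)
    return variations[index]
```

Calling `assign_variation("user-123", "button_color", ["blue", "red", "green"])` returns the same color on every call, and including the experiment name in the hash means the same user can land in different buckets for different experiments.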

SLIDE 48

Deploying & serving variations

  • For prototypes, much can be hard-coded
  • For real production use, infrastructure can make life easier in the long run

SLIDE 49

One easy way

SLIDE 50

In the long term

  • Build out a front-end to turn experiments on/off, or use a config file
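One way the config-file option might look, sketched in Python. Everything here is an assumption made for illustration: the JSON layout, the experiment names, and the convention that the first listed variation is the control that everyone sees when an experiment is switched off.

```python
import json

# Hypothetical config file contents; in practice this would live on disk.
CONFIG_TEXT = """
{
  "button_color": {"enabled": true,  "variations": ["blue", "red", "green"]},
  "new_signup":   {"enabled": false, "variations": ["control", "treatment"]}
}
"""


def load_experiments(text):
    """Parse the experiment config from JSON text."""
    return json.loads(text)


def active_variations(experiments, name):
    """Return the variations that may be served for an experiment."""
    experiment = experiments.get(name)
    if experiment is None:
        return []  # unknown experiment: serve nothing special
    if not experiment["enabled"]:
        return experiment["variations"][:1]  # switched off: control only
    return experiment["variations"]
```

Flipping `"enabled"` in the file turns an experiment on or off without touching application code.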

SLIDE 51

Overriding Javascript

  • Problem: you already have most of your functions defined, but want your treatment to do something slightly different

SLIDE 52

Encapsulation

SLIDE 53

Overwriting

SLIDE 54

Monkeypatching

  • Idea: we want to do mostly the same thing, but do something before/afterwards that's slightly different, or modify the input
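The wrap-and-delegate idea is the same in any dynamic language. As an illustration it is sketched here in Python rather than JavaScript; `Tracker`, `send`, and `patch_send` are hypothetical names invented for this example.

```python
class Tracker:
    """Stand-in for code that already exists in the control condition."""

    def send(self, message):
        return f"sent:{message}"


def patch_send(cls, experiment_tag):
    """Replace cls.send with a wrapper that tags the input, then delegates."""
    original = cls.send  # keep a reference to the original function

    def patched(self, message):
        tagged = f"[{experiment_tag}] {message}"  # modify the input first
        return original(self, tagged)             # then do the same thing as before

    cls.send = patched
    return original  # returned so the caller can undo the patch
```

After `patch_send(Tracker, "treatment")`, every existing `Tracker` instance sends tagged messages; assigning the returned function back to `Tracker.send` restores the control behavior.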

SLIDE 55

How to

SLIDE 56

Measuring behavior

  • Are people actually doing something different?

  • Using log lines, or writing straight to DB

SLIDE 57

Normal Log lines

10.32.109.7 - - [18/Nov/2009:22:36:32 -0800] "GET /courses/cs147/images/media.jpg HTTP/1.1" 200 191910 "http://hci.stanford.edu/courses/cs147/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-us) AppleWebKit/531.21.8 (KHTML, like Gecko) Version/4.0.4 Safari/531.21.10"

SLIDE 58

Tracking Log Lines

10.32.109.7 - - [18/Nov/2009:22:36:32 -0800] "GET /track?condition=bluebutton&event=click&timebeforeclick=500 HTTP/1.1" 200 191910 "http://hci.stanford.edu/courses/cs147/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2; en-us) AppleWebKit/531.21.8 (KHTML, like Gecko) Version/4.0.4 Safari/531.21.10"

SLIDE 59

How to process?

  • In the small: use Python
  • In the large: use Hadoop and Pig
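For the "in the small" case, a short Python script can pull the tracking parameters out of log lines like the ones shown earlier. This is a sketch under assumptions: the function name is hypothetical, the regular expression only handles GET requests in the log format shown, and only the `/track` path with `condition` and `event` parameters is counted.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Matches the request portion of a log line, e.g. "GET /track?... HTTP/1.1"
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def count_events(lines):
    """Count (condition, event) pairs found in /track requests."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a GET request we understand
        url = urlsplit(match.group(1))
        if url.path != "/track":
            continue  # ordinary page or image request, not a tracking hit
        params = parse_qs(url.query)
        condition = params.get("condition", ["unknown"])[0]
        event = params.get("event", ["unknown"])[0]
        counts[(condition, event)] += 1
    return counts
```

Feeding it an open log file (`count_events(open("access.log"))`, with a hypothetical file name) yields a counter keyed by pairs such as `("bluebutton", "click")`.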

SLIDE 60

Pig, ultrabriefly

  • (because I think this will be huge in a year or so)

SLIDE 61

Pig

  • SQL-like language built on top of Hadoop
  • Makes writing Map/Reduce tasks really quick
  • In use at Yahoo!, Twitter, Meebo, etc.

SLIDE 62

Pig sample code

SLIDE 63

And the best part...

  • Will compile & run for you over as many machines as necessary

SLIDE 64

PHP Logging scripts

  • Wouldn't work for production data
  • Fine for any A/B tests or just logging / instrumentation you want to do

SLIDE 65

DB Schema

SLIDE 66

Logging events

SLIDE 67

Example

SLIDE 68

HTML

SLIDE 69

JS

SLIDE 70

Closing the loop

  • Get the data out & aggregate
  • Visualize!

SLIDE 71

Reading data from SQLite

  • report.php
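The course scripts are PHP; purely as an illustration of the same log-then-aggregate loop, here is a sketch using Python's built-in `sqlite3` module. The table layout, column names, and function names are all hypothetical and are not taken from the course's schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create a hypothetical events table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " variation TEXT NOT NULL,"
        " event_name TEXT NOT NULL,"
        " created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return db


def log_event(db, variation, event_name):
    """Record one event for one experimental variation."""
    db.execute(
        "INSERT INTO events (variation, event_name) VALUES (?, ?)",
        (variation, event_name),
    )
    db.commit()


def counts_per_variation(db, event_name="click"):
    """Aggregate: how many times did each variation produce this event?"""
    rows = db.execute(
        "SELECT variation, COUNT(*) FROM events"
        " WHERE event_name = ? GROUP BY variation ORDER BY MIN(id)",
        (event_name,),
    )
    return rows.fetchall()
```

The result is a list of `(variation, count)` pairs, ordered by when each variation was first seen, ready to be reshaped for charting.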

SLIDE 72

Basic code

SLIDE 73

Reformat as data series

SLIDE 74

Result

[{"label":"blue","data":[[0,2]]},{"label":"red","data": [[1,12]]},{"label":"green","data":[[2,1]]}]
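The JSON above is a list of series, each with a label and one `[x, y]` point. A small Python sketch of that reshaping step follows, assuming the aggregated counts arrive as `(label, count)` pairs; the function name is hypothetical and the course's own version lives in report.php.

```python
import json


def to_flot_series(counts):
    """Turn (label, count) pairs into one single-point series per label."""
    return [
        {"label": label, "data": [[index, count]]}
        for index, (label, count) in enumerate(counts)
    ]


counts = [("blue", 2), ("red", 12), ("green", 1)]
# Compact separators reproduce the whitespace-free JSON shown on the slide.
print(json.dumps(to_flot_series(counts), separators=(",", ":")))
```

Using the position in the list as the x value gives each condition its own slot on the horizontal axis.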

SLIDE 75

Analysis options

  • Excel
  • Tableau
  • R
  • JavaScript or Flash graphing/visualization libraries

SLIDE 76

Briefly: flot

  • jQuery plugin
  • Super useful for basic graphing & charting needs

  • Also handles time-series data well
  • http://code.google.com/p/flot/

SLIDE 77

From report.php to flot

SLIDE 78

Demo

flot.html

SLIDE 79

Even better: Protovis

  • Stanford Graphics lab project
  • http://vis.stanford.edu/protovis/

SLIDE 80

Significant change?

  • Chi-Squared test

SLIDE 81

Chi-Squared

  • Idea: measure whether a particular distribution of measures deviates significantly from expected

SLIDE 82

Null hypothesis

  • Button color has no impact on click-through rate

SLIDE 83

Testing

χ2 = Σ (O(i) - E(i))^2 / E(i)

Where: O(i) is the observed frequency and E(i) is the expected frequency for category i

Degrees of freedom = (number of categories) - 1

SLIDE 84

Sample data

Blue   Red   Green   Total
267    267   266     800

Significant?

SLIDE 85

Sample data

Blue   Red   Green   Total
270    250   280     800

Significant?

SLIDE 86

Sum the differences

Blue: (270-267)^2 / 267 = 0.03
Red: (250-267)^2 / 267 = 1.08
Green: (280-267)^2 / 267 = 0.63

χ2 = 0.03 + 1.08 + 0.63 = 1.74

SLIDE 87

Look it up in table

  • (or use R/SPSS/something fancier)
  • http://www2.lv.psu.edu/jxm57/irp/chisquar.html

SLIDE 88

Significant?

p ≈ 0.4 (we want p < 0.05): not significant

SLIDE 89

Sample data

Blue   Red   Green   Total
240    230   330     800

Significant?

SLIDE 90

Sum the differences

Blue: (240-267)^2 / 267 = 2.73
Red: (230-267)^2 / 267 = 5.13
Green: (330-267)^2 / 267 = 14.87

χ2 = 2.73 + 5.13 + 14.87 = 22.73

SLIDE 91

Significant?

p < 0.001: highly significant
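Both worked examples can be checked with a few lines of Python. This is a sketch with hypothetical function names. It assumes the expected count is the same for every color, and the p-value shortcut is valid only for 2 degrees of freedom (three categories), where the chi-squared upper tail has the closed form exp(-χ2 / 2). For other cases, use a table or a statistics package.

```python
import math


def chi_square(observed, expected=None):
    """Sum of (O - E)^2 / E; defaults to equal expected counts."""
    if expected is None:
        mean = sum(observed) / len(observed)
        expected = [mean] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_two_df(statistic):
    """Upper-tail p-value for exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2)


for clicks in ([270, 250, 280], [240, 230, 330]):
    stat = chi_square(clicks)
    print(clicks, round(stat, 2), p_value_two_df(stat) < 0.05)
```

Because the code uses the unrounded expected count of 800 / 3 instead of 267, its statistics differ from the hand calculation in the second decimal place; the conclusions are the same.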

SLIDE 92

Wrap-up

  • Can be a bit of work
  • But can lead to amazing insights
  • Plus, data analysis & visualization is really fun

SLIDE 93

Google Analytics

SLIDE 94

Why?

  • It's free!
  • For a small company or project, much better than rolling out some of these analysis tools for yourself

SLIDE 95

Signing up

  • http://analytics.google.com

SLIDE 96

Integrating JS

Insert before </body>

SLIDE 97

Tip: Tracking JS events

The _trackEvent function takes (category, action, optional_label, optional_value)

SLIDE 98

Using Website Optimizer

  • Attached to AdWords
  • Provides A/B testing tools with integrated confidence interval and significant difference calculations

SLIDE 99

Demo

SLIDE 100

Final notes

SLIDE 101

Where to go from here?

  • Four fun things to explore

SLIDE 102

Building full web apps in Django

  • Templating
  • Wrapping Request/Response
  • Interfacing with the database
  • Forms

SLIDE 103

Some code from courseapp

SLIDE 104

and the view...

SLIDE 105

Trying out Google App Engine

  • Great for weekend projects that could become something more
  • Django templating built-in
  • Can also use most of Django with app-engine-patch (http://code.google.com/p/app-engine-patch/)

  • And, it'll scale if you need it to

SLIDE 106

Adapting for native app

  • PhoneGap lets you access native app features from JavaScript
  • Can continue your class projects if you want to take them further
  • Caveat: will have to get dev account

SLIDE 107

Using Mechanical Turk for app feedback

  • And quick testing of ideas

SLIDE 108

Evaluating Scenarios

SLIDE 109

Evaluating Scenarios

SLIDE 110

Evaluating Scenarios

SLIDE 111

SLIDE 112

SLIDE 113

SLIDE 114

Reactions

  • “Personally I prefer the idea of storyboard one. This is because the user freely walks around the museum as they would traditionally, yet automatically receive info about exhibits - a virtual guide without user input. Much more impressive.”
  • “Personally I think the idea from storyboard one is more compelling. The reason for this is that I would be interested in finding out interesting information about a piece of artwork or a particular artist that I couldn't just get at the museum. The map of the museum is something that I can get at the museum on a piece of paper that doesn't require me to be pulling out my phone and wasting the battery to get to an exhibit.”

SLIDE 115

Reactions

  • “Students who are visiting for a school assignment and have limited time to partake in all the exhibits would definitely find that option helpful.”
  • “I liked storyboard 2 because of [the use of] cell phones in a physical space.”
  • Self-reported as non-designers

SLIDE 116

Final plug

  • We're hiring at Meebo!
  • User Experience, Usability, UI...
  • meebo.com/jobs or email me directly at mike.krieger@meebo-inc.com

SLIDE 117

Thanks!

SLIDE 118

Q’s?
