Dynamic Micro Targeting: Fitness-Based Approach to Predicting Individual Preferences (PowerPoint PPT Presentation)


SLIDE 1

Dynamic Micro Targeting: Fitness-Based Approach to Predicting Individual Preferences

Tianyi Jiang, Alexander Tuzhilin
Leonard N. Stern School of Business, New York University
February 2007

SLIDE 2

Personalization Research

From Amazon shopping to choosing your politician, personalizing your decision choices via Data Mining

SLIDE 3

Research Questions

  • How to effectively segment the customer base?
  • What is the “ideal” segmentation of the customer base?
  • Is it practically achievable?
  • What is the distribution of the segment sizes in this ideal segmentation scheme?
  • Is it better to partition customers and products together to achieve better targeting?

SLIDE 4

Customer Segmentation via Direct Grouping Methods

Direct grouping:

  • partition customers C into segments: combine the transactional data of customers Cm, Cm+1, …, Cn into a group Pi = (Cm, Cm+1, …, Cn)
  • build a predictive model for each group
  • define a fitness score for the model, e.g. RAE, RME, etc.
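The three-step recipe above can be sketched as follows; `fit_model` and `fitness` are illustrative stand-ins for whatever predictive model and error measure are plugged in (e.g. a classifier scored by RAE), not the paper's actual implementation:

```python
# Sketch of direct grouping: a partition is a list of customer groups;
# each group's transactions are pooled, one model is built per group,
# and a fitness score is computed for that model.
def direct_grouping_fitness(partition, transactions, fit_model, fitness):
    scores = []
    for group in partition:                    # group = (Cm, ..., Cn)
        pooled = []
        for customer in group:                 # combine transactional data
            pooled.extend(transactions[customer])
        model = fit_model(pooled)              # build a predictive model
        scores.append(fitness(model, pooled))  # e.g. RAE, RME
    return scores

# Toy usage: the "model" is the mean of a group's purchase totals,
# and fitness is the mean absolute error of that constant predictor.
tx = {"C1": [10, 12], "C2": [11], "C3": [50]}
mean_model = lambda data: sum(data) / len(data)
mae = lambda m, data: sum(abs(x - m) for x in data) / len(data)
print(direct_grouping_fitness([("C1", "C2"), ("C3",)], tx, mean_model, mae))
```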

SLIDE 5

Customer Segmentation via Direct Grouping Methods (Example)

  • C are Amazon customers
  • Pi is the customers from New York City
  • X1, …, Xp are these customers’ demographic and purchase attributes, such as age, gender, day of purchase, purchase total, etc.
  • Y is the dependent variable: will they buy a product while visiting Amazon.com
  • The model predicts these customers’ propensity to purchase during an Amazon.com visit
  • The fitness score is the Relative Absolute Error

SLIDE 6

Optimal Customer Segmentation (OCS) Problem

Given the customer base C of N customers and a predictive model:

  • Partition C into the set of mutually exclusive, collectively exhaustive segments P = {P1, ..., Pk}
  • Build a predictive model for each segment Pi
  • Find the optimal partitioning P = {P1, ..., Pk} so that the objective function is maximized over all possible partitions P, where f(Pi) is the fitness function for segment Pi and the weight αi specifies the “importance” of segment i.
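Spelled out, with f(Pi) the fitness of the model built on segment Pi and αi its weight as defined above (the symbol J for the objective is our shorthand, not necessarily the paper's notation):

```latex
\max_{P=\{P_1,\dots,P_k\}} \; J(P) = \sum_{i=1}^{k} \alpha_i \, f(P_i)
```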

SLIDE 7

OCS Solution Space

  • Theorem. The OCS problem is NP-hard.

Therefore, settle for a suboptimal solution:

  • find suboptimal polynomial-time customer segmentation methods that provide reasonable fitness

SLIDE 8

Related Work

  • Combinatorial Optimization Problems in Operations Research (Land et al. 1960, Guignard et al. 1987, Gomory 1958)
  • Customer segmentation and clustering in Marketing Research: clustering, mixture models (Wedel et al. 2000)
  • Data Mining Research on Customer Segmentation: basket shopping, hierarchical, & pattern-based clustering (Brijs et al. 2001, Jiang et al. 2006, Yang et al. 2003)

SLIDE 9

Traditional Segmentation Methods

Hierarchical Clustering (HC)

  • compute some summary statistics from customers’ demographic and transactional data
  • consider these statistics as points in an n-dimensional space
  • group customers into segments by applying various clustering algorithms to these n-dimensional points.

* Jiang, Tuzhilin, “Segmenting Customers from Population to Individuals: Does 1-to-1 Keep Your Customers Forever?” TKDE 18(10), 2006
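A minimal single-linkage sketch of the last two steps in pure Python; the slide does not specify which clustering algorithm or linkage is used, so all names and the choice of single linkage here are illustrative:

```python
# Single-linkage agglomerative clustering: repeatedly merge the two
# closest clusters of customer summary-statistic vectors until k remain.
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def agglomerate(points, k):
    """Returns k clusters, each a list of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None                        # (distance, i, j) of closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Two visually obvious groups of customer summary vectors.
pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
print(agglomerate(pts, 2))  # [[0, 1], [2, 3]]
```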

SLIDE 10

Traditional Segmentation Methods

Affinity Propagation (AP)

  • Given n unique customers, AP identifies a set of training points, called exemplars, as cluster centers by recursively propagating “affinity messages” among training points.
  • Similar to greedy K-medoids algorithms, AP picks exemplars as cluster centers during every iteration, where each exemplar in our study is a single customer represented by his/her summary-statistics vector.
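AP itself is a message-passing algorithm; as a rough analogue of the greedy K-medoids behavior the slide compares it to, here is a K-medoids sketch in which each exemplar is one customer's summary-statistics vector (a simplified stand-in, not Frey & Dueck's actual affinity propagation):

```python
# Greedy K-medoids: assign points to the nearest exemplar, then re-pick
# each exemplar as the cluster member minimizing total in-cluster distance.
def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def k_medoids(points, k, iters=10):
    medoids = list(range(k))               # seed with the first k customers
    for _ in range(iters):
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):     # assignment step
            m = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[m].append(i)
        medoids = [min(members,            # exemplar update step
                       key=lambda c: sum(dist(points[c], points[j])
                                         for j in members))
                   for members in clusters.values()]
    return sorted(medoids)

pts = [(0.0, 0.0), (0.2, 0.1), (6.0, 6.0), (6.1, 5.9)]
print(k_medoids(pts, 2))  # exemplars: customers 0 and 2
```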

SLIDE 11

Suboptimal Efficient Solution of the OCS Problem using Direct Grouping

Iterative Merge (IM) Method:

  • start with segments containing individual customers,
  • iteratively merge two existing segments SegA and SegB at a time when
    I. the predictive model based on the combined data performs better, and
    II. combining SegA with any other existing segment would have resulted in worse performance than the combination of SegA and SegB.
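A toy sketch of the merge loop; `fitness` stands in for "train a predictive model on the pooled segment data and score it" (higher is better), which this illustration replaces with a simple size-minus-spread score so that it runs:

```python
# Iterative Merge sketch: greedily merge the pair of segments whose pooled
# data yields the best fitness, provided the merge beats both parts alone.
def iterative_merge(segments, fitness):
    segments = [list(s) for s in segments]
    while True:
        best = None                                  # (fitness, i, j)
        for i in range(len(segments)):
            for j in range(i + 1, len(segments)):
                f = fitness(segments[i] + segments[j])
                # condition I: combined model beats either part alone
                if f > fitness(segments[i]) and f > fitness(segments[j]):
                    # condition II approximated greedily: keep the best pair
                    if best is None or f > best[0]:
                        best = (f, i, j)
        if best is None:
            return segments
        _, i, j = best
        segments[i] += segments[j]
        del segments[j]

# Toy fitness: larger segments are better, widely spread values are penalized.
score = lambda seg: len(seg) - 5 * (max(seg) - min(seg))
print(iterative_merge([[1.0], [1.1], [5.0], [5.1]], score))
# [[1.0, 1.1], [5.0, 5.1]]
```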

SLIDE 12

Product Types × Customer Matrix

( √ stands for a purchase of product type by customer)

                 Customer 1   Customer 2   …   Customer N
Product Type1        √
Product Type2        √            √
…                    …            …        …       …
Product TypeL                     √                √
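A matrix like this is naturally kept as a mapping from product type to the set of customers who purchased it (√ = membership); all names here are illustrative, not from the datasets used in the paper:

```python
# Sparse product-type x customer purchase matrix as a dict of sets.
purchases = {
    "Type1": {"cust1"},
    "Type2": {"cust1", "cust2"},
    "TypeL": {"cust2", "custN"},
}

def bought(product_type, customer):
    """True iff the customer has a check mark for this product type."""
    return customer in purchases.get(product_type, set())

print(bought("Type2", "cust2"))  # True
print(bought("Type1", "cust2"))  # False
```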


SLIDE 13

Micro Targeting Method

Iterative Merge Products (IM_Prod):

  • start with segments containing individual customers’ specific product-type transaction data
  • Bootstrap operation to merge small segments based on K-nearest neighbors of customers’ product-type and demographic summary-statistics vectors
  • Run IM with customers’ product-type segments
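The bootstrap step can be approximated with a 1-nearest-neighbor merge of undersized segments, a deliberate simplification of the K-NN operation described above; `min_size` and the centroid distance are our assumptions, not parameters from the paper:

```python
# Fold every segment smaller than min_size into the segment whose centroid
# of summary-statistics vectors is nearest (1-NN analogue of the bootstrap).
def centroid(seg):
    dim = len(seg[0])
    return [sum(v[d] for v in seg) / len(seg) for d in range(dim)]

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def bootstrap_small(segments, min_size=2):
    segments = [list(s) for s in segments]
    while len(segments) > 1:
        small = [i for i, s in enumerate(segments) if len(s) < min_size]
        if not small:
            break
        i = small[0]
        ci = centroid(segments[i])
        j = min((k for k in range(len(segments)) if k != i),
                key=lambda k: dist(ci, centroid(segments[k])))
        segments[j].extend(segments[i])   # merge into nearest neighbor
        del segments[i]
    return segments

segs = [[(0.0, 0.0)], [(0.1, 0.1)], [(5.0, 5.0), (5.0, 5.1)]]
print(bootstrap_small(segs, min_size=2))  # two segments of size 2
```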

SLIDE 14

Empirical Comparisons of Different Approaches

Comparing Three Segmentation Approaches:

  • Statistics based
  • Direct grouping based
  • Micro Targeting based

Across five dimensions:

  • Types of datasets (ComScore, Nielsen, Synthetic data)
  • Types of customers (high- vs. low-volume)
  • Types of predictive models (classifiers J48 & Naïve Bayes)
  • Dependent variables (3 variables per dataset)
  • Performance measures:
    • Root Mean Squared Error (RME)
    • Relative Absolute Error (RAE)
    • Correctly Classified Instances (CCI)
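For reference, the three performance measures in their textbook forms (lower RME/RAE and higher CCI are better); the function names are ours:

```python
import math

def rme(actual, pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred))
                     / len(actual))

def rae(actual, pred):
    """Relative absolute error: total |error| relative to a mean predictor."""
    mean = sum(actual) / len(actual)
    return (sum(abs(a - p) for a, p in zip(actual, pred))
            / sum(abs(a - mean) for a in actual))

def cci(actual, pred):
    """Fraction of correctly classified instances."""
    return sum(a == p for a, p in zip(actual, pred)) / len(actual)

print(rme([1, 2, 3], [1, 2, 5]))    # sqrt(4/3)
print(rae([1, 2, 3], [1, 2, 5]))    # 2 / (1 + 0 + 1) = 1.0
print(cci(["a", "b"], ["a", "a"]))  # 0.5
```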
SLIDE 15

Data Sets: Customer Types & Transaction Counts

DataSet    Customer Type  % of Total Population  Families  Total Transactions  Avg. Transactions per Family
ComScore   High           5%                     2,230     137,157             62
ComScore   Low            5%                     2,230     24,344              11
Nielsen    High           10%                    156       28,985              186
Nielsen    Low            10%                    156       5,007               32
Syn-High   High           100%                   2,048     204,800             100
Syn-Low    Low            100%                   2,048     20,480              10

SLIDE 16

Statistical Significance

We apply the Mann-Whitney rank test to compare any two performance distributions across

  • 6 datasets
  • 3 variables
  • 2 classifiers
  • 3 performance measures

for a total of 108 pair-wise distribution tests between any two segmentation approaches
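The Mann-Whitney U statistic that underlies the rank test can be computed directly from its pair-counting definition (ties contribute 1/2); a real analysis would use a library implementation with a normal approximation for the p-value:

```python
# U statistic: count pairs (x in a, y in b) with x > y, half-counting ties.
def mann_whitney_u(a, b):
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

print(mann_whitney_u([3, 4, 5], [1, 2, 3]))  # 8.5
```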

SLIDE 17

Statistical Significance

The null hypothesis for comparing distributions generated by methods A and B for a performance measure is:

(I)
H0: The distribution of a performance measure generated by method A is not different from the distribution of the performance measure generated by method B.
H1+: The distribution of the performance measure generated by method A differs from that generated by method B in the positive direction.
H1-: The distribution of the performance measure generated by method A differs from that generated by method B in the negative direction.

SLIDE 18

Empirical Results Comparing All Methods

Performance tests across all statistics-based segmentation methods for Hypothesis Test (I) at 95% significance level (numbers in columns H1+ and H1- indicate the number of statistical tests that reject hypothesis H0. Total significance tests per method to method comparison pair is 108)

Methods    HC (H1+ / H1-)   IM (H1+ / H1-)   IM_Prod (H1+ / H1-)
AP         66 / 18          12 / 57            / 108
HC                           6 / 90            / 108
IM_Prod    108 / 96

SLIDE 19

Empirical Results

Sample CCI score distributions

(“Day of the Week” prediction across High & Low-Volume ComScore Customers)

[Figure: CCI score distribution panels for IM and IM_Prod on High- and Low-Volume datasets]

SLIDE 20

Empirical Results

Error distributions

(“Day of the Week” prediction across High & Low-Volume ComScore Customers)

[Figure: RAE and RME error distribution panels for IM and IM_Prod, High- and Low-Volume datasets]

SLIDE 21

Empirical Results

Segment Size Distribution Generated by IM_Prod and IM

[Figure: histograms of segment size vs. number of segments for IM and IM_Prod, High- and Low-Volume datasets]

SLIDE 22

Empirical Results

Customer Segment Membership Count Distribution

[Figure: customer segment-membership frequency distributions, High- and Low-Volume datasets]

SLIDE 23

Empirical Results

Generated segments in the “Segment Count” × “Average CCI per segment” × “Number of Purchases in Segment” space

[Figure: generated segments plotted in this space for IM and IM_Prod, High- and Low-Volume datasets]

SLIDE 24

IM_Prod Computational Expense

SLIDE 25

Conclusions

  • Partitioning customers based on micro targeting results in the formation of “better” customer segmentations than traditional clustering-based and fitness-based direct grouping approaches
  • Micro targeting produces smaller segments than Direct Grouping methods
  • The above results add support for Micro Segmentation (partitioning based on both customer and product types) approaches to personalization

SLIDE 26

Future Research

  • Improve the method not just in terms of predictive accuracy, but also in terms of standard marketing-oriented performance measures such as customer value, profitability, and other economics-based performance measures
  • Investigate scalability and generalizability of our approach on different types of very large real-world datasets, and extend it to handle incremental or time-series data
