PERSONALIZING SEARCH RESULTS IN REAL-TIME

Grebennikov Roman / findify.io


  1. PERSONALIZING SEARCH RESULTS IN REAL-TIME
     Grebennikov Roman / findify.io
     @public_void_grv
     grv@dfdx.me

  2. ABOUT FINDIFY
     - white-label eCommerce SaaS search
     - 1500 stores, 20M products
     - 50M customers per month

  3. FINDIFY IN 2014
     - UI-focused Shopify search addon
     - Backed by Elasticsearch
     - Nothing special about product ranking

  4. RANKING IS IMPORTANT: nobody is scrolling down

  5. RANKING IS IMPORTANT: no second search

  6. RANKING IS IMPORTANT: no second visit

  7. TYPICAL CUSTOMER SESSION
     1. Arrive on a landing/product page (0s)
     2. Click on product collections (+10s)
     3. Make a search (+20s)
     4. Leave forever (+30s)

  8. BETTER RANKING

  9. BETTER RANKING?

  11. AI ML (LINEAR REGRESSION)
      Algorithm      Conversion  AOV
      Elasticsearch  baseline    baseline
      Regression     +3.1%       +2.5%

  12. REINVENTING THE WHEEL
      - Learn to Rank
      - LambdaMART
      - XGBoost/LightGBM/CatBoost

  13. ELASTICSEARCH INTEGRATION

  14. ELASTICSEARCH INTEGRATION

  15. TRAINING
      - Historical click/purchase data
      - Model per merchant
      - Optimize for NDCG, watch for conversion

  16. FEATURE GROUPS
      - search: # of terms, # of filters
      - product: price, # of pageviews
      - variant: color, size
      - current session: price sensitivity, # of searches
      - historical sessions: # of sessions
      - product and search: # of pageviews within context
      - + different time windows

  18. MIXED RESULTS

  19. MIXED RESULTS
      Algorithm      Conversion     AOV
      Elasticsearch  baseline       baseline
      Regression     +3.1%          +2.5%
      LMART v1       +6.1% (+8.1%)  no data

  20. TRAINING ISSUES
      - Historical click/purchase data
      - Model per merchant
      - Optimize for NDCG

  21. POSITIVE FEEDBACK LOOP

  22. POSITIVE FEEDBACK LOOP

  23. POSITION BIAS: CUSTOMERS ARE CLICKING ONLY ON FIRST PRODUCTS

  24. RANDOM RANKING

  25. RANDOM RANKING
      Algorithm      Conversion     AOV
      Elasticsearch  baseline       baseline
      Regression     +3.1%          +2.5%
      LMART v1       +6.1% (+8.1%)  no data
      Random         -2.8%          -1.3%

  26. POSITION BIAS
      L. Li, W. Chu, J. Langford, R. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation.
      - Exploration and exploitation segments
      - Un-biasing the training data

  27. EXPLORATION SEGMENT
      - tiny segment, 0.1-1% of traffic
      - first page is shuffled
      - used for training
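
One way to picture the exploration segment is the sketch below. It is not Findify's implementation: the hashing scheme, the `EXPLORATION_SHARE` and `PAGE_SIZE` values, and both function names are assumptions made for illustration. It routes a fixed share of sessions into the segment and shuffles only their first page, so that clicks logged there are not shaped by the ranking model's own ordering.

```python
import hashlib
import random

EXPLORATION_SHARE = 0.01  # assumed: upper end of the 0.1-1% range on the slide
PAGE_SIZE = 24            # hypothetical number of products on the first page


def in_exploration_segment(session_id: str, share: float = EXPLORATION_SHARE) -> bool:
    """Deterministically decide whether a session belongs to the exploration segment."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket value between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < share


def rank_for_session(session_id: str, ranked_ids: list[str]) -> list[str]:
    """Shuffle only the first page for exploration sessions; leave other sessions as ranked."""
    if not in_exploration_segment(session_id):
        return list(ranked_ids)
    first_page, rest = ranked_ids[:PAGE_SIZE], ranked_ids[PAGE_SIZE:]
    # Seed with the session id so the same session keeps seeing the same order.
    rng = random.Random(session_id)
    rng.shuffle(first_page)
    return first_page + rest
```

Hashing the session id instead of drawing a random number per request keeps a visitor in the same segment for the whole session, which makes the logged clicks easier to attribute.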

  28. TRAINING ISSUES
      - Unbiased (not historical) click/purchase data
      - Model per merchant
      - Optimize for NDCG

  29. MODEL PER MERCHANT
      - Low-traffic merchants
      - Onboarding and data collection time
      - Sacrificing ranking for "Exploration segment"

  30. SUGGESTIONS HACKATHON
      - Replace heuristics with ML
      - Simpler problem than search
      - All features are language-specific
      - small, medium, large merchant

  31. BETTER SUGGESTIONS?

  32. MODEL TRANSPLANT from large-traffic merchant to small-traffic:

  33. GENERIC MODEL
      - More training samples
      - More diverse dataset
      - No need for per-merchant data collection
      - All features need to be scaled
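
The last point can be illustrated with a small sketch. The slides do not say which scaling was used, so the per-merchant z-score below, the `scale_per_merchant` name, and the row layout (dicts with a `merchant` key) are all assumptions. The idea is that a raw price of 3000 means something different in a jewelry store than in a sticker shop, so each feature is expressed relative to the merchant's own distribution before one shared model sees it.

```python
from statistics import mean, pstdev


def scale_per_merchant(rows: list[dict], feature: str) -> list[dict]:
    """Z-score one numeric feature within each merchant's own rows."""
    by_merchant: dict[str, list[float]] = {}
    for row in rows:
        by_merchant.setdefault(row["merchant"], []).append(row[feature])

    # Per-merchant mean and population standard deviation.
    stats = {m: (mean(v), pstdev(v)) for m, v in by_merchant.items()}

    scaled = []
    for row in rows:
        mu, sigma = stats[row["merchant"]]
        # A constant feature carries no signal; map it to 0.0 instead of dividing by zero.
        z = (row[feature] - mu) / sigma if sigma > 0 else 0.0
        scaled.append({**row, feature: z})
    return scaled
```

With this scaling, prices of 1000 and 3000 at one merchant and 1 and 3 at another both map to -1.0 and 1.0, so the model sees "cheap for this store" and "expensive for this store" rather than absolute amounts.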

  34. TRAINING ISSUES
      - Unbiased (not historical) click/purchase data
      - Generic model (not model per merchant)
      - Optimize for NDCG

  35. NDCG
      - 1.0 - good, 0.0 - bad, 0.4-0.7 - normal
      - compares perfect ranking with real
      - what is a perfect ranking?
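
For readers who have not met the metric, here is a minimal sketch of NDCG using the common linear-gain, log2-discount formulation (some libraries use a `2^rel - 1` gain instead; the slides do not say which variant was used). The function names are illustrative. The input is the list of relevance labels in the order the products were actually shown.

```python
import math


def dcg(relevances: list[float]) -> float:
    """Discounted cumulative gain: the item at 1-based position i is discounted by log2(i + 1)."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(relevances: list[float]) -> float:
    """DCG of the shown order divided by the DCG of the ideal (descending) order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

For example, `ndcg([1, 0, 0])` is 1.0 because the only relevant product is already first, while `ndcg([0, 0, 1])` is 0.5 because the same product sits in third place. The "perfect ranking" is simply the labels sorted in descending order, so the metric is only as meaningful as the labels themselves.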

  36. PERFECT RANKING

  37. STANLEY BONG ISSUE
      - Rank improved from #20 to #1
      - Never bought
      - Costs $3500

  38. STANLEY BONG ISSUE: over-optimized for clicks

  39. PERFECT RANKING

  40. TRAINING ISSUES
      - Unbiased (not historical) click/purchase data
      - Generic model (not model per merchant)
      - Optimize for NDCG (with proper weights)
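
A rough sketch of what "proper weights" could mean follows. The talk does not give the actual weights, so the values in `EVENT_GAIN`, the event names, and both function names are hypothetical. The point is only that a purchase should count for more than a click when building the relevance labels, so that a product that is clicked often but never bought does not define the ideal ranking.

```python
# Hypothetical gains per event type; the talk does not state the real values.
EVENT_GAIN = {"purchase": 3.0, "cart": 2.0, "click": 1.0}


def relevance_label(events: list[str]) -> float:
    """Label a product by the strongest event it received for a query."""
    return max((EVENT_GAIN.get(e, 0.0) for e in events), default=0.0)


def ideal_order(events_by_product: dict[str, list[str]]) -> list[str]:
    """Products sorted by label, strongest first; ties keep their original order."""
    return sorted(
        events_by_product,
        key=lambda p: relevance_label(events_by_product[p]),
        reverse=True,
    )
```

Under these assumed weights, a product with five clicks and no purchase gets label 1.0 and is placed below a product with a single purchase (label 3.0) in the ideal order that NDCG compares against.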

  41. RESULTS

  42. NDCG WITH PERSONALIZATION
      Algorithm       NDCG (offline)
      Random          0.544
      Popularity      0.578
      Elasticsearch   0.601
      Regression      0.615
      LMART v1        ~0.621
      LMART unbiased  0.635

  43. NDCG AND BUSINESS METRICS
      Algorithm       NDCG    CTR       Conversion           AOV
      Elasticsearch   0.601   baseline  baseline             baseline
      Random          0.544   -7.1%     -2.8%                -1.3%
      Regression      0.615   -1.1%     +3.1%                +2.5%
      LMART v1        ~0.621  no data   +6.1%, +8.1% (est)   no data
      LMART unbiased  0.635   no data   no data

  44. CONCLUSION
      - Better ranking = more $$$
      - A lot of pitfalls
      - Multiply development estimates by
