Predicting AsiaYo Users’ Spending for Improving Search Results



SLIDE 1

Predicting AsiaYo Users’ Spending for Improving Search Results

Travis Greene, Martin Hsia, Letitia She, Leo Lee

SLIDE 2

Business Goal

Stakeholders: AsiaYo’s managers

Assumptions:
1. Model trained only on previous bookings
2. Avg. AsiaYo customers are price sensitive
3. All users are new users

Challenge:
1. Plenty of places to book a room for a trip
2. Avg. 3 trips per year

Opportunity: Improve search results to increase conversion rate

  • Improving the default sorting (AY Sort)
  • Reduce time for user to search and decide
  • UX is improved by predicting customers’ budgets
  • Better conversion rate

SLIDE 3

Data Mining Goal

Goal: Predict the amount users will spend nightly

Task: Predictive and supervised task. We are taking past customers’ transaction data and predicting new users’ per night spending

Outcome: A predicted amount paid per night (numeric)

SLIDE 4

Implementation

User’s Predicted Budget $1000
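
One way a predicted budget could feed the default sort is sketched below. This is an assumption about how the ranking might work, not the method shown in the slides: listings are ordered by how close their nightly price is to the predicted budget. The `Listing` type, its field names, and the sample prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical listing record; field names are illustrative only.
    name: str
    price_per_night: float


def sort_by_budget(listings, predicted_budget):
    """Order listings by distance between nightly price and predicted budget."""
    return sorted(
        listings,
        key=lambda item: abs(item.price_per_night - predicted_budget),
    )


# Made-up listings for illustration.
listings = [
    Listing("A", 2500.0),
    Listing("B", 950.0),
    Listing("C", 1200.0),
    Listing("D", 400.0),
]
ranked = sort_by_budget(listings, predicted_budget=1000.0)
print([item.name for item in ranked])  # ['B', 'C', 'D', 'A']
```

A production sort would likely blend this distance with other ranking signals rather than use it alone.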

SLIDE 5

INPUT

  • Guests
  • Nights
  • Check-in/out day
  • Created at day/month/time
  • Lead time
  • Days of the Week (Fri., Sat.)
  • Platform
  • User/Accom. Country
  • Accom. City

OUTPUT

  • AVG. amount paid/night

SLIDE 6

Data Description

History order transaction data: 50,546 rows, 16 columns

Data Pre-process

1. Remove internal test data, outliers, unnecessary rows
2. Bin time, convert to day of week, keep months. Compute time differences between order creation & check-in date
3. Create new column #per_night

Partition

Training ~40,000 data rows, Testing ~10,000 data rows (80/20 split)
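
The pre-processing and partition steps can be sketched roughly as follows. The column names, the time-of-day bin edges, and the sample orders are all assumptions made for illustration; the slides do not show the actual schema or code.

```python
import random
from datetime import datetime

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical raw orders; column names are assumptions, not AsiaYo's schema.
orders = [
    {"created_at": "2017-03-01 21:15", "check_in": "2017-03-10", "nights": 2, "amount_paid": 3000.0},
    {"created_at": "2017-05-20 09:40", "check_in": "2017-05-26", "nights": 1, "amount_paid": 1800.0},
    {"created_at": "2017-07-02 14:05", "check_in": "2017-08-01", "nights": 3, "amount_paid": 4500.0},
    {"created_at": "2017-09-11 23:50", "check_in": "2017-09-15", "nights": 2, "amount_paid": 2200.0},
    {"created_at": "2017-11-30 07:30", "check_in": "2017-12-24", "nights": 4, "amount_paid": 8000.0},
]


def time_bin(hour):
    """Bin the hour of day into coarse periods (bin edges are an assumption)."""
    if hour < 6:
        return "night"
    if hour < 12:
        return "morning"
    if hour < 18:
        return "afternoon"
    return "evening"


def preprocess(order):
    created = datetime.strptime(order["created_at"], "%Y-%m-%d %H:%M")
    check_in = datetime.strptime(order["check_in"], "%Y-%m-%d")
    return {
        "created_bin": time_bin(created.hour),
        "created_month": created.month,
        "check_in_dow": DAYS[check_in.weekday()],
        # Lead time: whole days between order creation and check-in.
        "lead_time_days": (check_in.date() - created.date()).days,
        "nights": order["nights"],
        # Target column: amount paid per night.
        "per_night": order["amount_paid"] / order["nights"],
    }


rows = [preprocess(order) for order in orders]

# 80/20 split after shuffling with a fixed seed for reproducibility.
rng = random.Random(0)
rng.shuffle(rows)
cut = int(0.8 * len(rows))
train, test = rows[:cut], rows[cut:]
print(len(train), len(test))  # 4 1
```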

SLIDE 7

Methods

Ensemble RMSE: 846.08 (36% improvement over the naive baseline)
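
For readers unfamiliar with the comparison, here is a minimal sketch of how an RMSE and its relative improvement over a naive baseline can be computed. It assumes the naive model predicts the training mean for every row and that "improvement" means the relative reduction in RMSE; the slides do not state either definition. All numbers are toy values and not AsiaYo data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy per-night amounts, invented for illustration (not AsiaYo data).
train_y = [1000.0, 1500.0, 2000.0, 2500.0]
test_y = [1200.0, 1800.0, 2400.0]
model_pred = [1300.0, 1700.0, 2200.0]  # stand-in for ensemble predictions

# Naive baseline (assumed): predict the training mean for every test row.
naive_value = sum(train_y) / len(train_y)
naive_pred = [naive_value] * len(test_y)

rmse_model = rmse(test_y, model_pred)
rmse_naive = rmse(test_y, naive_pred)

# Relative improvement: share of the naive RMSE removed by the model.
improvement = 1.0 - rmse_model / rmse_naive
print(f"model={rmse_model:.2f} naive={rmse_naive:.2f} improvement={improvement:.0%}")
```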

5 x 5 cross validation
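
"5 x 5 cross validation" is read here as 5-fold cross validation repeated 5 times, which is an assumption; the slide does not spell it out. Under that reading, the split generation could look like this sketch (the function name and parameters are hypothetical):

```python
import random


def repeated_kfold(n_rows, k=5, repeats=5, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold CV repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_rows))
        rng.shuffle(idx)  # a fresh shuffle per repeat gives different folds
        for fold in range(k):
            valid = idx[fold::k]  # every k-th shuffled index forms one fold
            valid_set = set(valid)
            train = [i for i in idx if i not in valid_set]
            yield train, valid


splits = list(repeated_kfold(n_rows=20))
print(len(splits))  # 25 train/validation pairs
```

Averaging the validation error over all 25 pairs gives a more stable estimate than a single 5-fold pass, at five times the training cost.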

SLIDE 8

Performance Evaluation

[Figure: residual plots for the glm and gbm models (residuals_glm, residuals_gbm)]

SLIDE 9
SLIDE 10

Recommendations

Reducing Model Prediction Error

  • Country/City models
  • Connect previous transaction history to search results
  • Use UTM Source data as predictor
  • Filtering categories based on counts
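
"Filtering categories based on counts" could mean several things. One common interpretation, assumed here, is to lump rarely seen category levels (for example, cities with very few bookings) into a single "Other" level before modelling. The function name, the threshold, and the sample values are hypothetical.

```python
from collections import Counter


def lump_rare(values, min_count, other="Other"):
    """Replace categories seen fewer than `min_count` times with `other`."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


# Made-up city values for illustration.
cities = ["Taipei", "Taipei", "Taipei", "Tokyo", "Tokyo", "Hualien", "Seoul"]
print(lump_rare(cities, min_count=2))
# ['Taipei', 'Taipei', 'Taipei', 'Tokyo', 'Tokyo', 'Other', 'Other']
```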

Algorithmic Considerations

  • Prediction intervals
  • Speed vs. accuracy trade-off
  • Hyper-parameter tuning
  • Log (price)
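
The "Log (price)" item presumably refers to modelling the logarithm of the nightly price instead of the raw price. A minimal sketch of that idea, using a single made-up predictor and plain least squares rather than the glm/gbm models from the slides, might look like this:

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


# Toy data, invented for illustration: nights booked vs. price per night.
nights = [1, 2, 3, 4, 5]
price = [2400.0, 1900.0, 1500.0, 1200.0, 950.0]

# Fit on log(price) so that large prices do not dominate the squared error.
a, b = fit_line(nights, [math.log(p) for p in price])


def predict_price(n):
    # Back-transform to the price scale; exp() keeps predictions positive.
    return math.exp(a + b * n)


print(round(predict_price(3)))
```

Note that exponentiating a prediction made on the log scale tends to track the typical (median-like) price rather than the mean, so errors should be compared on the original price scale when judging against an RMSE such as the one reported earlier.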

Business Policy

  • Booking lead times
  • Booking time of day
  • Booking behavior on key dates
  • Collect more personal/demographic data