Predicting Carrier Load Cancellations



slide-1
SLIDE 1

PREDICTING CARRIER LOAD CANCELLATIONS

AUTHORS
Ali Al-Habib
Nicolas Favier

ADVISOR
Dr. Christopher Mejia

MIT Center for Transportation & Logistics

Research Fest

May 22, 2018

slide-2
SLIDE 2

AGENDA

1 INTRODUCTION
Trucking industry background & load cancellation impacts

2 DATA ANALYSIS
Descriptive analytics of load cancellations over a three-year dataset

3 MODELING
Predictive models applied to the dataset to identify the main cancellation drivers

4 RESULTS
Model results presented as confusion matrices, followed by analysis of the results

5 CONCLUSION
Recommended actions and future research challenges

slide-3
SLIDE 3

INTRODUCTION

1

slide-4
SLIDE 4

MOTIVATION

Estimated Impact ≅ $4.6B/year

400 Million Truckloads
185 Million FTL Truckloads
32 Million Cancellations
~$145/cancellation

Source: Freight Facts and Figures, by U.S. Department of Transportation Bureau of Transportation Statistics 2015; CSCMP’s Annual State of Logistics Report, by AT Kearney; & Data Analysis from the sponsor company
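
The headline estimate appears to be the cancellation count multiplied by the per-cancellation cost. A quick check using only the figures on this slide is sketched below; the variable names are illustrative, and counting the 32 million cancellations against the 185 million FTL truckloads is an assumption about how the slide's numbers relate.

```python
# Figures quoted on the slide; variable names are illustrative.
ftl_truckloads = 185_000_000   # full-truckload (FTL) shipments per year
cancellations = 32_000_000     # cancelled loads per year
cost_per_cancellation = 145    # approximate dollars per cancellation

# Estimated yearly cost of cancellations.
annual_impact = cancellations * cost_per_cancellation

# Assumption: cancellations are measured against FTL truckloads.
cancellation_share = cancellations / ftl_truckloads

print(f"Estimated impact: ${annual_impact / 1e9:.2f}B per year")
print(f"Share of FTL loads cancelled: {cancellation_share:.1%}")
```

The product comes to about $4.64B, consistent with the ≅ $4.6B shown on the slide.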

slide-5
SLIDE 5

PROCESS

3-YEAR Dataset of Full Truckloads
3.6M records of full truckloads during 2015, 2016, 2017

Main Drivers for Truckload Cancellations
Descriptive analytics to identify the main cancellation drivers

Predictive Model to Predict Cancellation Probability
Evaluating different models to predict future load cancellations

slide-6
SLIDE 6

POTENTIAL CANCELLATION DRIVERS

Cancellations: Load Impact, Shipper Impact, Carrier Impact, Other Impacts

Carrier Impact
Sub-categories: Carrier History Impact, Carrier Characteristics Impact, Carrier Issues Impact
Factors: Carrier Size, Carrier Type, Loads/Year, Bounce/Carrier, Carrier ID, Carrier Length of Relationship, Safety Rating, Number of Claims/Incidence

Shipper Impact
Sub-categories: Facility Impact, Shipper Characteristics Impact, Shipment History Impact
Factors: Shipper ID, Facility, Industry, Shipper Length of Relationship, Shipper Size, Facility Dwell Time, Shipments/Year

Other Impacts
Sub-categories: Internal Factors Impact, External Factors Impact
Factors: Carrier Rep, Weather, Natural Disaster, Geography, Rep Tenure

Load Impact
Sub-categories: Price Impact, Load Characteristics Impact, Trip Characteristics Impact, Contract Characteristics Impact
Factors: Day of the Week, Book Time, Load Time, Load ID, Origin, Destination, Number of Stops, Load Cost, Load Rate, Spot Price, Appointment Type, Lead Time, Empty Time, High Risk, High Value, Book Lead Time, Service Level, On-Time Delivery, On-Time PickUp, Equipment Type, Dead Head, Length of Haul, Duration, Weight, Loading Time, Unloading Time, Contract Type, Load Changes, Carrier Conference

slide-7
SLIDE 7

DATA ANALYSIS

2

slide-8
SLIDE 8

BEHAVIOR OVER TIME

[Figure: Cancellation Ratios over time. Monthly values from 2015-1 to 2017-10 on a 0% to 25% axis. Series: Contract Cancellation Ratio, Spot Cancellation Ratio, Total Cancellation Ratio]
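
A monthly cancellation ratio of the kind plotted here could be computed as cancelled loads divided by all loads, per month and per contract type. The sketch below assumes a hypothetical record layout of (month, contract_type, cancelled) and uses made-up sample rows; it is not the authors' code or data.

```python
from collections import defaultdict


def monthly_cancellation_ratios(loads):
    """Return {(month, series): cancelled / total} for each contract type and overall.

    `loads` is an iterable of (month, contract_type, cancelled) tuples,
    a hypothetical layout chosen for this sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [cancelled loads, all loads]
    for month, contract_type, cancelled in loads:
        # Count each load once for its own series and once for the total.
        for key in ((month, contract_type), (month, "total")):
            counts[key][0] += int(cancelled)
            counts[key][1] += 1
    return {key: c / n for key, (c, n) in counts.items()}


# Made-up sample rows, for illustration only.
sample = [
    ("2015-1", "contract", False),
    ("2015-1", "contract", True),
    ("2015-1", "spot", True),
    ("2015-1", "spot", False),
    ("2015-1", "spot", False),
    ("2015-1", "spot", False),
]
ratios = monthly_cancellation_ratios(sample)
```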

slide-9
SLIDE 9

LOCATION FACTOR

Loads & Cancellation Ratios by city

slide-10
SLIDE 10

SHIPPERS & CARRIER FACTORS

Cancellation Ratios by shipper industry
Cancellation Ratios by carrier length of relation with the company
slide-11
SLIDE 11

TIME FACTORS

Cancellation Ratios by duration between booking & load pickup
Cancellation Ratios by pickup time
Cancellation Ratios by day of the week

slide-12
SLIDE 12

MODELING

3

slide-13
SLIDE 13

DATA PREPARATION

1 Load-Level Data
Convert data from stop level to load level

2 Outliers Processing
Remove outlier records to avoid undesired impact

3 Correlation
Remove correlated attributes using Correlation & Multi-Collinearity Analysis

4 Predictor Screening
Identify the most significant predictors in the data

5 Build the Model
Build multiple models to predict cancellations & assess results
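
The first two preparation steps can be sketched as follows. The field names (load_id, weight, cancelled), the choice to sum weight across stops, and the 1.5 × IQR outlier rule are all assumptions made for illustration; the slides do not say which aggregation or outlier rule the authors used.

```python
from collections import defaultdict
from statistics import quantiles


def stops_to_loads(stops):
    """Step 1: collapse stop-level rows into one record per load."""
    loads = defaultdict(lambda: {"num_stops": 0, "weight": 0.0, "cancelled": False})
    for stop in stops:
        load = loads[stop["load_id"]]
        load["num_stops"] += 1
        load["weight"] += stop["weight"]  # assumed aggregation: sum over stops
        load["cancelled"] = load["cancelled"] or stop["cancelled"]
    return dict(loads)


def drop_outliers(loads, field, k=1.5):
    """Step 2: keep loads whose `field` lies within k * IQR of the quartiles."""
    values = [record[field] for record in loads.values()]
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return {lid: rec for lid, rec in loads.items() if low <= rec[field] <= high}


# Made-up stop-level rows; load "F" carries an implausible weight.
stops = [
    {"load_id": "A", "weight": 5, "cancelled": False},
    {"load_id": "A", "weight": 5, "cancelled": False},
    {"load_id": "B", "weight": 11, "cancelled": False},
    {"load_id": "C", "weight": 12, "cancelled": True},
    {"load_id": "D", "weight": 12, "cancelled": False},
    {"load_id": "E", "weight": 13, "cancelled": False},
    {"load_id": "F", "weight": 500, "cancelled": False},
]
loads = stops_to_loads(stops)
cleaned = drop_outliers(loads, "weight")
```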

slide-14
SLIDE 14

MODELING

LOGISTIC REGRESSION
  • Categorical Output
  • Self-Explanatory
  • Used as Main Model

MACHINE LEARNING (NEURAL NETWORKS, RANDOM FOREST, K-NEAREST NEIGHBOR)
  • Multiple Algorithms
  • Harder to Explain
  • Used to Validate Logistic Regression Results

slide-15
SLIDE 15

RESULTS

4

slide-16
SLIDE 16

AVAILABLE DATASET

MODEL RESULTS PREDICTOR SCREENING

              Predicted No   Predicted Yes     Total
Actual No          652,501           2,956   655,457
Actual Yes         129,727           1,971   131,698
Total              782,228           4,927   787,155

Error: 16.86%
Missed Bounces: 98.50%

Model                Error %   Missed Bounces
Neural Networks       16.73%           99.95%
Random Forest         16.61%           99.48%
K-Nearest Neighbor    19.90%           84.44%
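
The two summary figures on this slide can be reproduced from the four confusion-matrix cells. The definitions below are inferred from the slide's numbers rather than stated by the authors: Error is the share of all loads that were misclassified, and Missed Bounces is the share of actual cancellations predicted as "No". The function name is illustrative.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Return (error rate, missed-bounce rate) from confusion-matrix counts."""
    total = tn + fp + fn + tp
    error = (fp + fn) / total          # misclassified loads over all loads
    missed_bounces = fn / (fn + tp)    # cancellations the model failed to flag
    return error, missed_bounces


# Cell counts from the available-dataset confusion matrix on this slide.
error, missed = confusion_metrics(tn=652_501, fp=2_956, fn=129_727, tp=1_971)
print(f"Error: {error:.2%}, Missed Bounces: {missed:.2%}")
```

An error of roughly 17% sits close to the share of loads that are actually cancelled (131,698 of 787,155), so a model that predicts "No" almost everywhere scores about the same, which is why the missed-bounce rate is the more telling figure.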

slide-17
SLIDE 17

DATA ENRICHMENT

CANCELLATION RATIOS
Carrier (80887) & City (Rochelle) Bounce Ratio = 1/12 = 0.08333
Average of the CarrierCity Bounce Ratio for each stop
Aggregated carrierCityBounce Ratio on Load-Level
Repeated loads are counted only once for the ratio calculation

SEVERE WEATHER DATA*

ENRICHED DATASET

*Source: National Centers for Environmental Information
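
One way to read the ratio calculation above is sketched here: count each load once per carrier-city pair, divide bounced loads by all loads for that pair, then average the pair ratios over the stops of a load. The tuple layout, the function names, and the second city (Chicago) are hypothetical; only the carrier 80887 / Rochelle ratio of 1/12 comes from the slide.

```python
from collections import defaultdict


def carrier_city_bounce_ratios(stops):
    """Return {(carrier, city): bounced loads / loads}, counting each load once.

    `stops` is an iterable of (load_id, carrier, city, bounced) tuples.
    """
    seen = set()
    counts = defaultdict(lambda: [0, 0])  # pair -> [bounced loads, all loads]
    for load_id, carrier, city, bounced in stops:
        key = (load_id, carrier, city)
        if key in seen:  # repeated load for the same pair: count only once
            continue
        seen.add(key)
        counts[(carrier, city)][0] += int(bounced)
        counts[(carrier, city)][1] += 1
    return {pair: b / n for pair, (b, n) in counts.items()}


def load_level_ratio(load_stops, ratios):
    """Average the carrier-city ratio over the (carrier, city) stops of one load."""
    values = [ratios[(carrier, city)] for carrier, city in load_stops]
    return sum(values) / len(values)


# Carrier 80887 in Rochelle: 12 loads, 1 bounced (as on the slide).
history = [(f"R{i}", 80887, "Rochelle", i == 0) for i in range(12)]
history.append(("R0", 80887, "Rochelle", True))  # repeated load, ignored
# Hypothetical second city: 4 loads, 1 bounced.
history += [(f"C{i}", 80887, "Chicago", i == 0) for i in range(4)]

ratios = carrier_city_bounce_ratios(history)
new_load = [(80887, "Rochelle"), (80887, "Chicago")]
aggregated = load_level_ratio(new_load, ratios)
```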

slide-18
SLIDE 18

ENRICHED DATASET

MODEL RESULTS PREDICTOR SCREENING

              Predicted No   Predicted Yes     Total
Actual No          638,652          16,880   655,532
Actual Yes          52,155          79,468   131,623
Total              690,807          96,348   787,155

Error: 8.77%
Missed Bounces: 39.62%

Model                Error %   Missed Bounces
Neural Networks        8.67%           39.04%
Random Forest          8.70%           42.13%
K-Nearest Neighbor     9.33%           44.32%

slide-19
SLIDE 19

ADDITIONAL DATASET

ENRICHED DATASET
Dataset (~3-year data): Training (80%), Testing (20%)
Cancellation Ratios Calculation (100%): Ratios

              Predicted No   Predicted Yes     Total
Actual No          638,652          16,880   655,532
Actual Yes          52,155          79,468   131,623
Total              690,807          96,348   787,155

Error: 8.77%
Missed Bounces: 39.62%

NEW DATASET
Additional 3-month data

              Predicted No   Predicted Yes     Total
Actual No           59,883           3,735    63,618
Actual Yes           8,903           1,722    10,625
Total               68,786           5,457    74,243

Error: 17.02%
Missed Bounces: 83.79%

Model                Error %   Missed Bounces
Neural Networks       16.78%           84.70%
Random Forest         16.19%           87.98%
K-Nearest Neighbor    16.41%           86.66%

slide-20
SLIDE 20

UNPREDICTABILITY TESTING

PREDICTION TIME HORIZON
Additional 3-month data: 7-day Horizon (3%)

              Predicted No   Predicted Yes     Total
Actual No            2,147              31     2,178
Actual Yes             176              44       220
Total                2,323              75     2,398

Error: 8.63%
Missed Bounces: 80.00%

AVAILABLE HISTORICAL DATA
Additional 3-month data: <= 10 Historical Records (67%), > 10 Historical Records (33%)

              Predicted No   Predicted Yes     Total
Actual No           21,449             368    21,817
Actual Yes           2,222             542     2,764
Total               23,671             910    24,581

Error: 10.54%
Missed Bounces: 80.39%

slide-21
SLIDE 21

MULTIPLE CLUSTERS, MULTIPLE MODELS

                                                      Test Error   Missed Bounces
Logistic Regression (Threshold=0.5) - Base Scenario       17.02%           83.79%

Cost Clustering
  Low Cost (<= $500)                                      18.20%           99.06%
  Mid Cost                                                16.67%           98.46%
  High Cost (>= $6000)                                     8.49%          100.00%

Miles Clustering
  Same day delivery (<= 250 mi)                           16.07%           99.18%
  Next Day delivery                                       18.08%           98.18%
  Long Haul (>= 550 mi)                                   18.08%           98.18%

Book To pickup Hours Clustering
  Less than 24h                                            8.53%          100.00%
  Between 24h and 48h                                     16.91%          100.00%
  Between 48h and 72h                                     20.58%           99.99%
  More than 72h                                           22.33%           99.58%
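
Assigning each load to the clusters listed on this slide could look like the sketch below, after which a separate model would be fit per cluster. The cut-offs for cost and miles come from the slide; how loads falling exactly on 24, 48, or 72 hours are assigned is not stated, so the boundary handling here is an assumption, as are the function names.

```python
def cost_cluster(cost):
    """Cost clusters from the slide: <= $500 low, >= $6000 high, otherwise mid."""
    if cost <= 500:
        return "Low Cost"
    if cost >= 6000:
        return "High Cost"
    return "Mid Cost"


def miles_cluster(miles):
    """Miles clusters from the slide: <= 250 same day, >= 550 long haul."""
    if miles <= 250:
        return "Same day delivery"
    if miles >= 550:
        return "Long Haul"
    return "Next Day delivery"


def book_to_pickup_cluster(hours):
    """Hours between booking and pickup; boundary handling is assumed."""
    if hours < 24:
        return "Less than 24h"
    if hours <= 48:
        return "Between 24h and 48h"
    if hours <= 72:
        return "Between 48h and 72h"
    return "More than 72h"


# Hypothetical load: $1,200 cost, 600 miles, booked 30 hours before pickup.
example = (cost_cluster(1200), miles_cluster(600), book_to_pickup_cluster(30))
```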

slide-22
SLIDE 22

THRESHOLD SENSITIVITY ANALYSIS

slide-23
SLIDE 23

THRESHOLD SENSITIVITY ANALYSIS

[Figure: Loads (axis up to 40,000) and % of Bounces predicted correctly of total Bounces (axis 0% to 80%) plotted against Threshold (axis up to 0.60). Series: FN (Missed Bounces), FP (Missed Not Bounces), Bounces Predicted Correctly (%)]
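
A threshold sensitivity analysis of this kind can be sketched by re-scoring the same predicted probabilities at several cut-offs and counting the false negatives and false positives at each one. The function name, the use of `>=` at the cut-off, and the sample probabilities are assumptions for illustration; the data below is made up.

```python
def threshold_sweep(probabilities, actual, thresholds):
    """Return (threshold, FN, FP, share of bounces caught) for each threshold.

    `probabilities` are predicted cancellation probabilities and `actual`
    are booleans (True = the load bounced).
    """
    total_bounces = sum(actual)
    rows = []
    for t in thresholds:
        predicted = [p >= t for p in probabilities]  # assumed: flag when p >= t
        fn = sum(a and not p for a, p in zip(actual, predicted))  # missed bounces
        fp = sum(p and not a for a, p in zip(actual, predicted))  # false alarms
        rows.append((t, fn, fp, (total_bounces - fn) / total_bounces))
    return rows


# Made-up scores and outcomes, for illustration only.
probabilities = [0.05, 0.15, 0.20, 0.35, 0.55, 0.70]
actual = [False, False, True, False, True, True]

rows = threshold_sweep(probabilities, actual, thresholds=[0.5, 0.17])
for t, fn, fp, caught in rows:
    print(f"threshold={t:.2f}  FN={fn}  FP={fp}  bounces caught={caught:.0%}")
```

Lowering the threshold trades missed bounces for false alarms, which is the tradeoff the chart's FN and FP series describe.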

slide-24
SLIDE 24

CONCLUSION

5

slide-25
SLIDE 25

NEXT STEPS

THRESHOLD CHANGE
  • Use the model with a lower threshold (0.17)
  • Predict up to 42% of cancelled loads
  • Tradeoff ratio 4:1 (predicted cancellation : actual cancellation)

FURTHER RESEARCH
  • Surveys to capture range of cancellation reasons
  • Record actual reasons for each cancellation
  • Capture details related to these reasons
  • Record additional information for each load:
      • Loads sequence at truck level
      • Carrier booked capacity
      • Rejection Rate
slide-26
SLIDE 26

CHALLENGES

LOAD SEQUENCE SCENARIO
OVERBOOKING SCENARIO

[Diagram labels: COMPANY A, COMPANY B, COMPANY C, SELECTED ROUTE]

slide-27
SLIDE 27

Q&A

Ali Al-Habib Nicolas Favier

THANK YOU!