Machine Learning @ Amazon
Ralf Herbrich Amazon
6/29/17 1
Overview: Machine Learning in Practice; Probabilities; Finite Resource; Machine Learning @ Amazon; Forecasting; Machine Translation; Visual Systems; Conclusions
to each logical statement A.
function (Wiberg, 1996)
b c a
[Figure: data Y1 to Y7 and parameters ϑ1 to ϑ5. Components: Message Passing ("Communicate"), Belief Store ("Memory"), Data Messages ("Compute")]
Facebook
2015 Annual Revenue                 $17,928,000,000.00*
Daily Revenue                       $49,117,808.22
Number of DAU                       1,038,000,000**
Number of Story Candidates          1,500***
Number of Daily Stories             1.557E+12
Maximum Cost per Story Candidate    $0.0000315
*http://www.statista.com/statistics/277229/facebooks-annual-revenue-and-net-income/
**http://www.statista.com/statistics/346167/facebook-global-dau/
***https://www.facebook.com/business/news/News-Feed-FYI-A-Window-Into-News-Feed
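The figures in the table follow from simple arithmetic, which the short sketch below reproduces. It assumes a 365-day year and that the number of daily stories is the number of daily active users times the story candidates per user; both assumptions match the slide's numbers. The variable names are illustrative and do not come from the talk.

```python
# Inputs copied from the slide; names are illustrative only.
ANNUAL_REVENUE_USD = 17_928_000_000.00
DAILY_ACTIVE_USERS = 1_038_000_000
STORY_CANDIDATES_PER_USER = 1_500

# Assumption: revenue is spread evenly over a 365-day year.
daily_revenue = ANNUAL_REVENUE_USD / 365

# Assumption: every daily active user has the full candidate set scored once per day.
daily_stories = DAILY_ACTIVE_USERS * STORY_CANDIDATES_PER_USER

# Upper bound on spend per scored candidate if all daily revenue went into ranking.
max_cost_per_candidate = daily_revenue / daily_stories

print(f"Daily revenue:              ${daily_revenue:,.2f}")
print(f"Daily story candidates:     {daily_stories:.3E}")
print(f"Max cost per candidate:     ${max_cost_per_candidate:.7f}")
```

The bound is deliberately generous: it spends the entire daily revenue on scoring candidates, so any realistic compute budget per prediction is smaller still.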
It’s power, stupid! Some constraints might not be obvious: building new datacenters and powering them is non-trivial. Example: 1 GPU box = 20 CPU boxes (in terms of power consumption)
ML Seattle ML Bangalore S9 A9 A2Z
Ivona ML Berlin Evi ML Cambridge ML Los Angeles
Retail
Forecasting
Prediction
Prediction
Customers
Recommendation
Detection
Seller
Crawling
Catalog
Classification
validation
Digital
Extraction
Detection
Recognition
Acquisition
Example fashion product to illustrate the challenges of forecasting (plot of sales/demand over time):
Training Range: Non-fashion items have longer training ranges that we can leverage. We need to share information across new and old products.
Seasonality: This item has Christmas seasonality with higher growth over time. This is where we need growth features in addition to date features.
Missing Features or Input: Unexplained spikes in demand are likely caused by missing features or incomplete input data.
Learning Model Parameters Forecasting
P(zi t | θ) ∼
Typical midsize dataset:
Binary classification #1
Binary classification #2
Count regression z-2
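One plausible reading of the three stages listed above is a likelihood that first decides whether demand is zero, then whether it is exactly one, and otherwise models the remainder z - 2 as a count. The sketch below is a minimal illustration of that reading, not the model from the talk: the logistic link for the two binary stages, the Poisson distribution with a log link for the count stage, and the names `multistage_log_likelihood`, `y0`, `y1` and `y2` are all assumptions.

```python
import math


def sigmoid(y):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-y))


def multistage_log_likelihood(z, y0, y1, y2):
    """Log-probability of a non-negative integer demand z.

    y0, y1, y2 are hypothetical real-valued scores, one per stage:
      stage 0 (binary): P(z = 0)          = sigmoid(y0)
      stage 1 (binary): P(z = 1 | z > 0)  = sigmoid(y1)
      stage 2 (count):  z - 2 | z >= 2    ~ Poisson(exp(y2))
    """
    if z < 0:
        raise ValueError("demand must be a non-negative integer")

    p_zero = sigmoid(y0)
    if z == 0:
        return math.log(p_zero)

    p_one = sigmoid(y1)
    if z == 1:
        return math.log(1.0 - p_zero) + math.log(p_one)

    # Count regression on the shifted value k = z - 2 (assumed Poisson, log link).
    k = z - 2
    rate = math.exp(y2)
    log_poisson = k * y2 - rate - math.lgamma(k + 1)
    return math.log(1.0 - p_zero) + math.log(1.0 - p_one) + log_poisson


# Usage: the stage probabilities multiply, so the masses over all z sum to one.
total = sum(math.exp(multistage_log_likelihood(z, 0.3, -0.2, 0.5)) for z in range(200))
print(f"Total probability mass over z = 0..199: {total:.6f}")
```

Splitting the likelihood this way gives the very common values 0 and 1 their own parameters, which suits intermittent demand where most observations are zero.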
P(zi t | θ) ∼
[Figure: graphical model over five time steps t = 1, ..., 5 with nodes y_{t,2}, y_{t,1}, y_{t,0}, l_{t,2}, l_{t,1}, l_{t,0}, z_t and x_t]
Latent State Multistage Likelihood
Bridge GLM
Products Lifetime Profit Human Translation Machine Translation Selection Gap
MXNet: https://github.com/awslabs/sockeye
New Automated Inspection Current Inspection
Computer Vision
Age → Strawberry ID
2016 (c) Amazon
predictions about the future!