Machine Learning @ Amazon Ralf Herbrich Amazon 6/29/17 1

Overview
• Machine Learning in Practice
  • Probabilities
  • Finite Resource
• Machine Learning @ Amazon
  • Forecasting
  • Machine Translation
  • Visual Systems
• Conclusions and Challenges

Machine Learning: Formal Definition
• Labelled Data
• Unlabelled Data
• Probability is a central concept in Machine Learning!

Why Probability?
1. Mathematics of Uncertainty (Cox’ axioms)

Cox Axioms: Probabilities and Beliefs
• Design: System must assign a degree of plausibility P(A) to each logical statement A.
• Axiom: P(A) is a real number
• P(A) is independent of Boolean rewrite
• [formula not recoverable from the source]
• P must be a probability measure!
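The formulas on this slide did not survive extraction. As a reminder of what Cox-style conditions usually look like, here is the textbook formulation; it is an assumption that the slide used this form, and the function names F and S and the conditioning context C are notation introduced here, not taken from the slide.

```latex
\begin{align*}
  &\text{(1) Real-valued plausibility:} && P(A \mid C) \in \mathbb{R} \\
  &\text{(2) Conjunction:} && P(A \wedge B \mid C) = F\bigl(P(A \mid C),\, P(B \mid A \wedge C)\bigr) \\
  &\text{(3) Negation:} && P(\neg A \mid C) = S\bigl(P(A \mid C)\bigr) \\[4pt]
  &\text{Consequence (product rule):} && P(A \wedge B \mid C) = P(A \mid C)\, P(B \mid A \wedge C) \\
  &\text{Consequence (sum rule):} && P(A \mid C) + P(\neg A \mid C) = 1
\end{align*}
```

Under suitable regularity conditions, and up to a monotone rescaling of the plausibility scale, conditions (1) to (3) together with consistency under logically equivalent rewrites lead to the product and sum rules, which is the sense in which P must be a probability measure.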

Why Probability?
1. Mathematics of Uncertainty (Cox’ axioms)
2. Variables and Factors map to Memory & CPU

Factor Graphs
• Definition: Graphical representation of product structure of a function (Wiberg, 1996)
• Nodes: Factors and Variables
• Edges: Dependencies of factors on variables.
• Semantic: Local variable dependency of factors
[Figure: example factor graph over variables a, b, c]
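To make the definition concrete, the following is a minimal sketch of a factor graph as a data structure. The three binary variables a, b, c mirror the labels in the slide's figure, but the factor functions and their numeric values are hypothetical, chosen only for illustration. The marginal is computed by brute-force enumeration, not by message passing.

```python
from itertools import product

# Hypothetical example: three binary variables (names follow the slide's figure).
variables = {"a": (0, 1), "b": (0, 1), "c": (0, 1)}

# Each factor is (scope, function). The scope lists the variables the factor
# depends on; these scope entries are exactly the edges of the factor graph.
# The numeric values are made up for illustration.
factors = [
    (("a",), lambda a: 0.7 if a == 1 else 0.3),
    (("b",), lambda b: 0.4 if b == 1 else 0.6),
    (("a", "b", "c"), lambda a, b, c: 0.9 if c == (a ^ b) else 0.1),
]


def joint(assignment):
    """Evaluate the function as the product of its local factors."""
    value = 1.0
    for scope, fn in factors:
        value *= fn(*(assignment[name] for name in scope))
    return value


def marginal(name):
    """Normalised marginal of one variable, by summing over all assignments."""
    totals = {v: 0.0 for v in variables[name]}
    names = list(variables)
    for values in product(*(variables[n] for n in names)):
        assignment = dict(zip(names, values))
        totals[assignment[name]] += joint(assignment)
    z = sum(totals.values())
    return {v: t / z for v, t in totals.items()}


def edges():
    """List the (factor index, variable name) edges of the graph."""
    return [(i, name) for i, (scope, _) in enumerate(factors) for name in scope]


if __name__ == "__main__":
    print(edges())
    print(marginal("c"))
```

Enumeration costs time exponential in the number of variables; the point of the factor-graph representation is that each factor only touches the variables in its scope, which is what message-passing algorithms exploit.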

Factor Graphs and Cloud Computing
[Figure: belief store (“Memory”) holding ϑ and ϑ1 … ϑ5; message passing (“Communicate”); data messages (“Compute”) over Y1 … Y7]

Factor Graphs and MXNet

Finite Resource: Cost

Economics 101
• Profit = Revenue – Cost
• In the long run, a business that generates negative profits is not viable!

Facebook 2015
Annual Revenue                      $17,928,000,000.00*
Daily Revenue                       $49,117,808.22
Number of DAU                       1,038,000,000**
Number of Story Candidates          1,500***
Number of Daily Stories             1.557E+12
Maximum Cost per Story Candidate    $0.0000315

It’s power, stupid!
Some constraints might not be obvious: building new datacenters and powering them is non-trivial.
Example: 1 GPU box = 20 CPU boxes (in terms of power consumption)

*http://www.statista.com/statistics/277229/facebooks-annual-revenue-and-net-income/
**http://www.statista.com/statistics/346167/facebook-global-dau/
***https://www.facebook.com/business/news/News-Feed-FYI-A-Window-Into-News-Feed
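The Facebook 2015 figures on this slide can be re-derived with a few lines of arithmetic. The sketch below assumes a 365-day year and reads the slide's "1,500" as story candidates per daily active user per day; the variable names are illustrative.

```python
# Figures as quoted on the slide (Facebook, 2015).
annual_revenue_usd = 17_928_000_000.00
daily_active_users = 1_038_000_000
# Assumption: the slide's "1,500" is candidates per active user per day.
story_candidates_per_user_per_day = 1_500

# Revenue available per day, assuming a 365-day year.
daily_revenue_usd = annual_revenue_usd / 365

# Total number of story candidates that must be scored each day.
daily_story_candidates = daily_active_users * story_candidates_per_user_per_day

# Break-even budget: spending more than this per candidate exceeds revenue.
max_cost_per_candidate_usd = daily_revenue_usd / daily_story_candidates

print(f"Daily revenue:              ${daily_revenue_usd:,.2f}")
print(f"Daily story candidates:     {daily_story_candidates:.3E}")
print(f"Max cost per candidate:     ${max_cost_per_candidate_usd:.7f}")
```

The last figure is an upper bound on the total cost of scoring one candidate, because it spends the entire revenue and leaves no profit.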

Locations
[Map labels: S9, ML Berlin, ML Seattle, ML Cambridge, A2Z, Ivona, A9, Evi, ML Los Angeles, ML Bangalore]

Machine Learning Opportunities @ Amazon

Retail
• Demand Forecasting
• Vendor Lead Time Prediction
• Pricing
• Packaging
• Substitute Prediction

Customers
• Product Recommendation
• Product Search
• Visual Search
• Product Ads
• Shopping Advice
• Customer Problem Detection

Seller
• Fraud Detection
• Predictive Help
• Seller Search & Crawling

Catalog
• Browse-Node Classification
• Meta-data validation
• Review Analysis
• Hazmat Prediction

Digital
• Named-Entity Extraction
• XRay
• Plagiarism Detection
• Echo Speech Recognition
• Knowledge Acquisition

Demand Forecasting
Example fashion product to illustrate the challenges of forecasting.
• Training Range: Non-fashion items have longer training ranges that we can leverage. We need to share information across new and old products.
• Missing Features or Input: Unexplained spikes in demand are likely caused by missing features or incomplete input data.
• Seasonality: This item has Christmas seasonality with higher growth over time. This is where we need growth features in addition to date features.

Learning and Prediction
[Figure: sales/demand over time, with a Learning range and a Forecasting range; Model Parameters θ; samples drawn from P(z_it | θ)]

Slow Moving Inventory
Typical midsize dataset:
• About 5M items
• About 4.5B item-days
• About 98% zero demand
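A quick calculation shows what these round numbers imply. This is only a sketch built on the approximate figures above; the derived quantities are not stated on the slide.

```python
# Approximate figures from the slide.
items = 5_000_000
item_days = 4_500_000_000
zero_demand_percent = 98

# Average length of history per item, in days.
days_per_item = item_days // items

# Item-days with at least one unit of demand (integer arithmetic avoids
# floating-point rounding on the percentage).
nonzero_item_days = item_days * (100 - zero_demand_percent) // 100
zero_item_days = item_days - nonzero_item_days

print(f"Average history per item: {days_per_item} days")
print(f"Non-zero item-days:       {nonzero_item_days:,}")
print(f"Zero item-days:           {zero_item_days:,}")
```

With only about one item-day in fifty carrying any demand, a model that treats the zero case separately from positive counts is a natural fit.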

Sampling Predictions
Samples are drawn from P(z_it | θ) in stages:
• 0 or ≥1? Binary classification #1
• 1 or ≥2? Binary classification #2
• If ≥2: Count regression on z − 2
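The three stages above can be sketched as a small sampler plus the matching log-likelihood. The slide does not say which count distribution is used in the third stage; a Poisson distribution is assumed here purely for illustration, and the function and parameter names (p_positive, p_at_least_two, excess_rate) are hypothetical. In a real model these three quantities would be functions of features and latent state rather than constants.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, running = 0, rng.random()
    while running > threshold:
        count += 1
        running *= rng.random()
    return count


def sample_demand(p_positive, p_at_least_two, excess_rate, rng):
    """Sample one demand value z from the three-stage model."""
    # Stage 1: is demand 0 or >= 1?
    if rng.random() >= p_positive:
        return 0
    # Stage 2: given z >= 1, is demand 1 or >= 2?
    if rng.random() >= p_at_least_two:
        return 1
    # Stage 3: given z >= 2, model the excess z - 2 with a count distribution
    # (Poisson is an assumption, not taken from the slide).
    return 2 + sample_poisson(excess_rate, rng)


def log_likelihood(z, p_positive, p_at_least_two, excess_rate):
    """Log-probability of an observed demand z under the same model."""
    if z == 0:
        return math.log(1.0 - p_positive)
    if z == 1:
        return math.log(p_positive) + math.log(1.0 - p_at_least_two)
    k = z - 2
    log_poisson = k * math.log(excess_rate) - excess_rate - math.lgamma(k + 1)
    return math.log(p_positive) + math.log(p_at_least_two) + log_poisson


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_demand(0.02, 0.3, 1.5, rng) for _ in range(10_000)]
    print("share of zero-demand draws:", draws.count(0) / len(draws))
```

Splitting the likelihood this way lets the two binary stages absorb the large mass at 0 and 1, so the count regression only has to fit the rare larger values.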

[Figure: graphical model over five time steps t = 1 … 5 with inputs x_t, latent state l_{t,k} (Latent State), stage variables y_{t,k} (Multistage Likelihood) for k = 0, 1, 2, and observations z_t]

In Practice
[Figure: three panels of the graphical model over inputs x_t, latent states l_{t,k}, stage variables y_{t,k} and observations z_t, for t = 1 … 5]

Modelling Out of Stock
[Figure labels: GLM; Bridge]

Product Machine Translation (2013 – 2015)
[Figure: Lifetime Profit plotted against Products, with regions labelled Human Translation, Machine Translation and Selection Gap]

Sockeye
• Sequence-to-sequence Neural Machine Translation package built on MXNet: https://github.com/awslabs/sockeye
• Supports both CPU and GPU encoding and decoding
• Training
• Translating

Automated Produce Inspection: The Goal
[Figure labels: Current Inspection; Computer Vision; New Automated Inspection]

Challenges
• Illumination
• Clutter/Occlusions
• Viewpoint
• Scale
• Intra-class variability

Predicting Longevity
[Figure axes: Strawberry ID; Age →]
2016 (c) Amazon

Age Aligned Strawberries (Test Set)

Conclusions
• Machine Learning “translates” data from the past into accurate predictions about the future!
• In practice, probabilistic models and finite resources matter.
• Machine Learning helps to improve customer experience at Amazon!

Thanks!
