Product2Vec: A Multi-task Recurrent Neural Network for Product Embeddings

Arijit Biswas, Mukul Bhutani and Subhajit Sanyal
Machine Learning, Amazon, Bangalore, India
{barijit,mbhutani,subhajs}@amazon.com


SLIDE 1

Product2Vec: A Multi-task Recurrent Neural Network for Product Embeddings

Arijit Biswas, Mukul Bhutani and Subhajit Sanyal
Machine Learning, Amazon, Bangalore, India
{barijit,mbhutani,subhajs}@amazon.com

SLIDE 2

The Collaborators

Mukul Bhutani, Machine Learning, Amazon
Subhajit Sanyal, Machine Learning, Amazon

SLIDE 3

A Product in an E-commerce Company

Product attributes

  • Title
  • Color
  • Size
  • Material
  • Category
  • Item Type
  • Hazardous indicator
  • Batteries required
  • High Value
  • Target Gender
  • Weight
  • Offer
  • Review
  • Price
  • View Count
SLIDE 4

Motivation

  • Billions of products in the inventory
  • Diverse set of ML problems involving products
  • Product recommendation
  • Duplicate Product Detection
  • Product Safety Classification
  • Price Estimation
  • ....
  • Any ML application needs a good set of features
  • What is a good and useful featurization for products?
SLIDE 5

A Naïve Featurization

  • Bag-of-words: TF-IDF representations
  • Title
  • Description
  • Bullet Points etc.
  • Although effective, often difficult to use in practice:
  • Overfitting
  • Computationally and storage inefficient
  • Not semantically meaningful
  • Increases the number of parameters in downstream ML algorithms
  • Dense Low-dimensional Features could alleviate these issues
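As a concrete illustration of the baseline, a bag-of-words TF-IDF featurizer can be written in a few lines of plain Python. The tiny corpus and the smoothing used here are illustrative, not taken from the paper.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute TF-IDF vectors for a list of tokenized documents.

    Each product title becomes a sparse, high-dimensional vector:
    one dimension per vocabulary word, most entries zero.
    """
    vocab = sorted({w for doc in docs for w in doc})
    n = len(docs)
    # document frequency: number of documents containing each word
    df = Counter(w for doc in docs for w in set(doc))
    # smoothed inverse document frequency
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[w] / len(doc) * idf[w] for w in vocab])
    return vocab, vectors

titles = [
    "red cotton shirt".split(),
    "blue cotton shirt".split(),
    "wooden dining table".split(),
]
vocab, vecs = tfidf_vectors(titles)
# Dimensionality equals vocabulary size; a real catalog needs thousands of dims.
print(len(vocab), len(vecs[0]))
```

Even on three titles, each vector already spans the full vocabulary, which is exactly the storage and parameter blow-up the slide describes.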
SLIDE 6

Summary of Contributions

  • We propose a novel product representation approach
  • Dense, Low-dimensional, Generic
  • As good as TF-IDF representation
  • A Discriminative Multi-task Neural Network is trained
  • Different signals pertaining to a product are explicitly injected
  • Static: color, material, weight, size, sub-category
  • Dynamic: price, popularity, views
  • The learned representations should be generic
  • The title of a product is fed into a bidirectional LSTM
  • Hidden representation is “product embedding” or “product feature”
  • Training: Embedding is fed to multiple classification/regression/decoding units
  • Trained Jointly
  • Referred to as a Multi-task Recurrent Neural Network (MRNet)
SLIDE 7

Prior Work

  • Word/Document Embeddings
  • Word2Vec [Mikolov, 2013]
  • Paragraph2Vec/Doc2Vec [Mikolov, 2014]
  • Product Embeddings
  • Prod2Vec [Grbovic, KDD 2015]
  • Meta-Prod2Vec [Vasile, Recsys 2016]
  • Designed for product recommendation
  • Traditionally, Multi-task Learning is used for correlated tasks
  • We use multi-task learning to make the product representations generic!

SLIDE 8

MRNet: Our Approach

[Figure: MRNet architecture. Input words from the product title (Word 1, Word 2, …, Word T) feed a bi-directional LSTM; its output is the embedding layer (the product representation), which feeds five jointly trained task heads: classification of static signals (Color, Size, Material, Category, Item Type, Hazardous, High-value, Target Gender, Weight), classification and regression on dynamic signals (Offers, Reviews, Price, # Views), and decoding of the TF-IDF representation of the title (5000 dim.).]

  • Different product signals are injected into MRNet
  • To make the embedding generic
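A minimal NumPy sketch of the multi-task idea in the figure. For brevity the bidirectional LSTM is replaced by an average of word embeddings, and all sizes, task heads, and weights are illustrative placeholders, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HIDDEN = 100, 16, 32   # illustrative sizes, not the paper's

# Stand-in for the title encoder: word embeddings averaged, then projected.
# (MRNet uses a bidirectional LSTM; a mean keeps the sketch short.)
word_emb = rng.normal(size=(VOCAB, EMB))
W_enc = rng.normal(size=(EMB, HIDDEN))

def encode(title_word_ids):
    """Shared 'product embedding' consumed by every task head."""
    return np.tanh(word_emb[title_word_ids].mean(axis=0) @ W_enc)

# Task-specific heads, all reading the same shared embedding:
W_color = rng.normal(size=(HIDDEN, 10))    # classification head: 10 colors
W_price = rng.normal(size=(HIDDEN, 1))     # regression head: price
W_dec   = rng.normal(size=(HIDDEN, 5000))  # decoding head: TF-IDF of title

def forward(title_word_ids):
    z = encode(title_word_ids)
    color_logits = z @ W_color
    price = z @ W_price
    tfidf_recon = z @ W_dec
    return z, color_logits, price, tfidf_recon

z, logits, price, recon = forward([3, 17, 42])
print(z.shape, logits.shape, price.shape, recon.shape)
```

The key design point the slide makes is visible here: every head backpropagates into the same encoder, so the shared embedding `z` must serve all tasks at once.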

SLIDE 9

Loss and Optimization

SLIDE 10

Loss and Optimization

SLIDE 11

Loss and Optimization

SLIDE 12

Loss and Optimization

  • Joint Optimization
  • Gradient is computed w.r.t full loss
  • Alternating Optimization
  • Randomly one task loss is selected
  • Backpropagation is performed with that loss
  • Only the weights of that task and task-invariant layers are updated
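The two optimization schemes can be sketched on a toy model: one shared (task-invariant) linear layer feeding two task heads with squared losses. The data, dimensions, and learning rate are illustrative, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=4)                           # features of one toy example
shared = rng.normal(size=(4, 3))                 # task-invariant layer
heads = [rng.normal(size=3) for _ in range(2)]   # one weight vector per task
targets = [1.0, -1.0]
LR = 0.01

def losses():
    h = x @ shared
    return [0.5 * (h @ heads[t] - targets[t]) ** 2 for t in range(2)]

def alternating_step(t):
    """Backpropagate one task's loss: only that task's head and the
    task-invariant (shared) layer are updated."""
    h = x @ shared
    err = h @ heads[t] - targets[t]
    g_head, g_shared = err * h, err * np.outer(x, heads[t])
    heads[t] -= LR * g_head
    shared[:, :] -= LR * g_shared

def joint_step():
    """Gradient of the full (summed) loss: every head and the shared layer move."""
    h = x @ shared
    g_shared = np.zeros_like(shared)
    for t in range(2):
        err = h @ heads[t] - targets[t]
        g_shared += err * np.outer(x, heads[t])
        heads[t] -= LR * err * h
    shared[:, :] -= LR * g_shared

# Alternating optimization: pick one task at random each step.
for _ in range(50):
    alternating_step(int(rng.integers(2)))
print("losses after 50 alternating steps:", losses())
```

Note how `alternating_step` never touches the other task's head, while `joint_step` moves everything; that is exactly the distinction the bullets draw.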
SLIDE 13

Loss and Optimization

SLIDE 14

Product Group Agnostic Embeddings

[Figure: embeddings specific to each product group (PG1, PG2, …, PGN) are mapped through fully connected linkages to a PG-agnostic embedding with sparsity enforced, and back out to each PG-specific embedding.]

Products organized as Product Groups (PGs):

  • Furniture, Jewelry, Books, Home, Clothes etc.

Signals are often product group specific:

  • Weights of Home items are different from Jewelry
  • Sizes of clothes (XL, XXL etc.) are different from furniture (king, queen)

  • Embeddings are learned for each product group
  • A sparse Autoencoder is used to obtain a PG-agnostic embedding
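A minimal sketch of the sparse-Autoencoder step, assuming a ReLU code with an L1 penalty as the sparsity mechanism (the slide does not spell out these details); all dimensions, weights, and the penalty strength are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
PG_DIM, AGNOSTIC_DIM = 64, 32      # illustrative sizes
W_enc = rng.normal(scale=0.1, size=(PG_DIM, AGNOSTIC_DIM))
W_dec = rng.normal(scale=0.1, size=(AGNOSTIC_DIM, PG_DIM))
L1 = 0.01                          # assumed sparsity weight

def relu(v):
    return np.maximum(v, 0.0)

def pg_agnostic(pg_embedding):
    """Map a PG-specific embedding to the shared, sparsity-enforced code."""
    return relu(pg_embedding @ W_enc)

def autoencoder_loss(pg_embedding):
    code = pg_agnostic(pg_embedding)
    recon = code @ W_dec
    # reconstruction error + L1 penalty that pushes the code toward sparsity
    return np.mean((recon - pg_embedding) ** 2) + L1 * np.abs(code).sum()

x = rng.normal(size=PG_DIM)        # a PG-specific product embedding
print(autoencoder_loss(x))
```

Training such an autoencoder over products from all PGs yields the single code space in which, say, Furniture and Jewelry embeddings become comparable.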

SLIDE 15

Datasets

  • Plugs: If a product has an electrical plug or not
  • Binary, 205K samples
  • SIOC: If a product ships in its own container
  • Binary, 296K samples
  • Browse Category classification
  • Multi-class, 150K samples
  • Ingestible Classification
  • Binary, 1500 samples
  • SIOC (unseen population)
  • Binary, 150K training and 271 test samples
SLIDE 16

Experimental Results

Baseline: TF-IDF-LR

Proposed MRNet is comparable to TF-IDF-LR in most scenarios!

SLIDE 17

Qualitative Results

SLIDE 18

Language Agnostic MRNet-Product2Vec

Products from different marketplaces have their metadata in the language native to that region. We train a multi-modal Autoencoder to link representations of products pertaining to different marketplaces.

[Figure: a multi-modal Autoencoder whose hidden layer connects concatenated UK and FR embeddings on the input side to UK and FR embeddings on the output side.]

Training data split:

  • 1/3 input: [Embedding:UK, Embedding:FR], output: [Embedding:UK, Embedding:FR]
  • 1/3 input: [Embedding:UK, (0,0,...,0)], output: [(0,0,...,0), Embedding:FR]
  • 1/3 input: [(0,0,...,0), Embedding:FR], output: [Embedding:UK, (0,0,...,0)]
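The training-data construction on this slide can be sketched directly: concatenate the two marketplace embeddings and zero-mask one side in two of the three splits. Dimensions and data are illustrative, and the autoencoder itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM = 8                           # per-marketplace embedding size (illustrative)
uk = rng.normal(size=(9, DIM))    # UK embeddings of 9 shared products
fr = rng.normal(size=(9, DIM))    # FR embeddings of the same products

def make_training_pairs(uk, fr):
    """Build the three 1/3 splits from the slide by zero-masking one side."""
    n = len(uk)
    zeros = np.zeros_like(uk)
    a, b = n // 3, 2 * n // 3
    both_in  = np.hstack([uk[:a],  fr[:a]])      # both sides in, both out
    both_out = np.hstack([uk[:a],  fr[:a]])
    uk_in    = np.hstack([uk[a:b], zeros[a:b]])  # UK in -> predict FR
    fr_out   = np.hstack([zeros[a:b], fr[a:b]])
    fr_in    = np.hstack([zeros[b:], fr[b:]])    # FR in -> predict UK
    uk_out   = np.hstack([uk[b:],  zeros[b:]])
    X = np.vstack([both_in, uk_in, fr_in])
    Y = np.vstack([both_out, fr_out, uk_out])
    return X, Y

X, Y = make_training_pairs(uk, fr)
print(X.shape, Y.shape)   # (9, 16) each
```

Fitting the autoencoder on these pairs forces its hidden layer to translate between the two marketplaces, since masked inputs must still reconstruct the other language's embedding.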

SLIDE 19

Qualitative Results (Language Agnostic)

Nearest neighbors of French products in the UK marketplace.

SLIDE 20

Conclusion and Future Work

  • Propose a method for generic e-commerce product representation
  • Inject various product signals into its embedding
  • Comparable results w.r.t. the sparse, high-dimensional baseline
  • Product group agnostic embeddings
  • Language agnostic embeddings
  • Incorporate more signals: more generic
  • Include product image information
SLIDE 21