Deep Semantic Matching for Amazon Product Search - PowerPoint PPT Presentation

SLIDE 1

Deep Semantic Matching for Amazon Product Search

Yiwei Song

Amazon Product Search

SLIDE 2

Deep Semantic Matching for Amazon Product Search

SLIDE 3

Amazon Product Search

  • Amazon is the 4th most popular site in the US [1]
  • The majority of Amazon retail revenue is attributed to search
  • Nearly half of US internet users start product searches on Amazon [2]

[1] https://www.alexa.com/topsites/countries/US
[2] https://retail.emarketer.com/article/more-product-searches-start-on-amazon/5b92c0e0ebd40005bc4dc7ae

SLIDE 4

Semantic Matching in Product Search

  • The goal of semantic matching is to reduce customers’ effort to shop
      • Fewer query reformulations
      • Bridge the vocabulary gap between customers’ queries and product descriptions

SLIDE 5

What is a match for a query?

“health shampoo”

Zion Health Adama Clay Minerals Shampoo, 16 Fluid Ounce, by Zion Health
(Lexical Match)

SLIDE 6

What is a match for a query?

“health shampoo”

Zion Health Adama Clay Minerals Shampoo, 16 Fluid Ounce, by Zion Health
(Lexical Match)

ArtNaturals Organic Moroccan Argan Oil Shampoo and Conditioner Set - (2 x 16 Fl Oz / 473ml) - Sulfate Free - Volumizing & Moisturizing - Gentle on Curly & Color Treated Hair - Infused with Keratin, by ArtNaturals
(Semantic Match)

SLIDE 7

What is a match for a query?

“countertop wine fridge”

Antarctic Star 17 Bottle Wine Cooler/Cabinet Refigerator Small Wine Cellar Beer Counter Top Fridge Quiet Operation Compressor Freestanding Black, by Antarctic Star
(Lexical Match)

SLIDE 8

What is a match for a query?

“countertop wine fridge”

Antarctic Star 17 Bottle Wine Cooler/Cabinet Refigerator Small Wine Cellar Beer Counter Top Fridge Quiet Operation Compressor Freestanding Black, by Antarctic Star
(Lexical Match)

DELLA 048-GM-48197 Beverage Center Cool Built-in Cooler Mini Refrigerator w/ Lock - Black/Stainless Steel, by DELLA
(Semantic Match)

SLIDE 9

Semantic Matching augments Lexical Matching

[Diagram: for the query “iphone xr case”, lexical matching looks up each token in an inverted index to retrieve posting lists of products (e.g., “iphone” → P1, P10, P12, P100). In parallel, a neural network maps the query to an embedding, and a KNN search over precomputed product embeddings (e.g., P1: 0.12598, 0.058533, -0.09845, …) returns semantic matches. Lexical and semantic matches are merged for ranking.]
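
A minimal numpy sketch of this retrieval flow, under assumptions: brute-force KNN stands in for the production index, embeddings are unit-normalized, and all names are illustrative.

```python
import numpy as np

def knn_search(query_emb: np.ndarray, product_embs: np.ndarray, k: int) -> np.ndarray:
    # Inner product equals cosine similarity when rows are unit-normalized.
    scores = product_embs @ query_emb
    return np.argsort(-scores)[:k]

def retrieve(query_emb: np.ndarray, product_embs: np.ndarray, lexical_ids, k: int = 100):
    semantic_ids = knn_search(query_emb, product_embs, k)
    # Union of lexical and semantic candidates, deduplicated, is passed to ranking.
    return list(dict.fromkeys(list(lexical_ids) + list(semantic_ids)))
```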

SLIDE 10

Neural Network Representation Model

[Diagram: a two-tower representation model. Query text and document text are each encoded by a neural network into a query embedding and a document embedding; a similarity function compares the two embeddings.]

SLIDE 11

Data

“artistic iphone 6s case” → purchased

SLIDE 12

Data

“artistic iphone 6s case” → purchased / impressed but not purchased

SLIDE 13

Data

“artistic iphone 6s case” → purchased / impressed but not purchased / random
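
A sketch of how these three sources might be assembled into training triples; `training_examples` and its parameters are hypothetical names, not the talk's actual pipeline.

```python
import random

def training_examples(query, purchased, impressed, catalog, n_random=8):
    # Purchased products are strong positives for the query.
    examples = [(query, p, "purchased") for p in purchased]
    # Impressed-but-not-purchased products sit between positives and negatives.
    examples += [(query, p, "impressed") for p in impressed if p not in purchased]
    # Randomly sampled catalog products serve as easy negatives.
    examples += [(query, random.choice(catalog), "random") for _ in range(n_random)]
    return examples
```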

SLIDE 14

Loss Function

[Diagram: target similarity between Query_Embed and Product_Embed for “artistic iphone 6s case”: random → low, impressed but not purchased → medium, purchased → high.]

SLIDE 15

Loss Function

  • For purchases:

    \[ \mathrm{loss}(y, \hat{y}) = \begin{cases} 0, & \hat{y} \ge 0.9 \\ (\hat{y} - 0.9)^2, & \hat{y} < 0.9 \end{cases} \]

  • For impressed but not purchased:

    \[ \mathrm{loss}(y, \hat{y}) = \begin{cases} 0, & \hat{y} \le 0.55 \\ (\hat{y} - 0.55)^2, & \hat{y} > 0.55 \end{cases} \]

  • For randomly-sampled:

    \[ \mathrm{loss}(y, \hat{y}) = \begin{cases} 0, & \hat{y} \le 0.2 \\ (\hat{y} - 0.2)^2, & \hat{y} > 0.2 \end{cases} \]

SLIDE 16

Loss Function

SLIDE 17

N-gram Average Neural Network

[Diagram: the query and the product title (plus product attributes) each pass through an N-gram parser. Query n-grams and title n-grams look up a shared embedding layer; each side is averaged, normalized, and passed through an activation and a dense layer, yielding the query embedding and the product embedding, which are compared by cosine similarity.]
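
A minimal numpy sketch of one tower, under assumptions the diagram leaves open: tanh as the activation, a single dense projection shared by both towers, and `ids` already produced by the n-gram parser. All names are illustrative.

```python
import numpy as np

def encode(ids: np.ndarray, embedding: np.ndarray, dense_w: np.ndarray) -> np.ndarray:
    vec = embedding[ids].mean(axis=0)      # average the n-gram embeddings
    vec = vec / np.linalg.norm(vec)        # normalize
    vec = np.tanh(vec)                     # activation (tanh is an assumption)
    out = dense_w @ vec                    # dense projection
    return out / np.linalg.norm(out)       # unit length, so cosine is a dot product

def score(query_ids, title_ids, embedding, dense_w) -> float:
    # Both towers share the embedding layer; cosine similarity of unit vectors.
    return float(encode(query_ids, embedding, dense_w) @ encode(title_ids, embedding, dense_w))
```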

SLIDE 18

N-gram Average Neural Network

“artistic iphone 6s case”

"artistic", "iphone", "6s", "case", "artistic#iphone", "iphone#6s", "6s#case", "artistic#iphone#6s", "iphone#6s#case", "#ar", "art", "rti", …, "#ca", "cas", "ase", "se#"

SLIDE 19

N-gram Average Neural Network

“artistic iphone 6s case”

"artistic", "iphone", "6s", "case", "artistic#iphone", "iphone#6s", "6s#case", "artistic#iphone#6s", "iphone#6s#case", "#ar", "art", "rti", …, "#ca", "cas", "ase", "se#" Out of Vocab? "iphone#6s" "se#" "artistic#iphone" "artistic" "artistic#iphone#6s" No Yes Hash() Embedding Matrix Vocab Size OOV Bucket Size Embedding Size

Build N-gram vocab by frequency Hash OOV N-gram to a bin to group low count tokens
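
A sketch of this lookup rule; `vocab`, `vocab_size`, and `oov_buckets` are illustrative names, and md5 stands in for whatever stable hash is actually used.

```python
import hashlib

def ngram_to_id(ngram: str, vocab: dict, vocab_size: int, oov_buckets: int) -> int:
    # In-vocabulary n-grams map to their own row of the embedding matrix.
    if ngram in vocab:
        return vocab[ngram]
    # OOV n-grams share one of `oov_buckets` extra rows; a stable hash keeps
    # the mapping deterministic across training and serving.
    digest = int(hashlib.md5(ngram.encode("utf-8")).hexdigest(), 16)
    return vocab_size + digest % oov_buckets
```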

SLIDE 20

N-gram Average Neural Network

[Plot: MAP vs. training epoch]

  • Word Unigram baseline on small dataset
SLIDE 21

N-gram Average Neural Network

[Plot: MAP vs. training epoch]

  • Word Unigram baseline on small dataset
  • Use more data
SLIDE 22

N-gram Average Neural Network

[Plot: MAP vs. training epoch]

  • Word Unigram baseline on small dataset
  • Use more data
  • Add Word Bigram
SLIDE 23

N-gram Average Neural Network

[Plot: MAP vs. training epoch]

  • Word Unigram baseline on small dataset
  • Use more data
  • Add Word Bigram
  • Add Character Trigram
SLIDE 24

N-gram Average Neural Network

[Plot: MAP vs. training epoch]

  • Word Unigram baseline on small dataset
  • Use more data
  • Add Word Bigram
  • Add Character Trigram
  • Add OOV hashing for ngrams
SLIDE 25

N-gram Average Neural Network

[Plot: MAP vs. training epoch]

  • Word Unigram baseline on small dataset
  • Use more data
  • Add Word Bigram
  • Add Character Trigram
  • Add OOV hashing for ngrams
  • More tokens/parameters overfit on the small dataset

SLIDE 26

Increase Vocab Size by Model Parallelism

[Plot: MAP vs. training epoch for models with 180 MM, 500 MM, and 3000 MM parameters]

Performance increases with more parameters in the model
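
One way to grow the vocabulary beyond a single device's memory, sketched under assumptions: round-robin row sharding, with shards simulated as in-process arrays rather than separate devices.

```python
import numpy as np

class ShardedEmbedding:
    """Row-shards a large embedding matrix across devices (simulated here)."""

    def __init__(self, num_rows: int, dim: int, num_shards: int):
        rows_per_shard = -(-num_rows // num_shards)  # ceiling division
        # In a real deployment each shard would live on its own GPU/host.
        self.shards = [np.zeros((rows_per_shard, dim), dtype=np.float32)
                       for _ in range(num_shards)]
        self.num_shards = num_shards

    def lookup(self, ids: np.ndarray) -> np.ndarray:
        # Round-robin placement: id -> (shard = id % S, local row = id // S).
        return np.stack([self.shards[i % self.num_shards][i // self.num_shards]
                         for i in ids])
```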

SLIDE 27

Structured Product Features

[Diagram: the product title embedding is concatenated with structured product features (e.g., sales, review rating) and passed through a dense layer to produce the final product embedding.]
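
A minimal sketch of this combination step, with assumed weight names and a tanh activation:

```python
import numpy as np

def product_embedding(title_emb: np.ndarray, features: np.ndarray,
                      dense_w: np.ndarray, dense_b: np.ndarray) -> np.ndarray:
    # Concatenate the title embedding with structured features such as
    # sales and review rating, then project through a dense layer.
    x = np.concatenate([title_emb, features])
    out = np.tanh(dense_w @ x + dense_b)   # activation is an assumption
    return out / np.linalg.norm(out)       # unit-normalize for cosine KNN
```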

SLIDE 28

Still Day 1

[Diagram: the lexical-plus-semantic retrieval pipeline from Slide 9, shown again: query embedding, KNN search over product embeddings, and merging of lexical and semantic matches.]

SLIDE 29

Still Day 1

SLIDE 30

Thank you

Questions? Want to join us?

https://www.amazon.jobs/en/teams/search.html
