ML for the industry Part 1
MLSS 2016 – Cádiz Nicolas Le Roux Criteo
Why such a class?
Companies are an ever-growing opportunity for ML researchers
Academics know about the publications of these companies
...but not about the less
Structured data
Unstructured data
[Figure: vector with entries indexed 0, 1, 2, ..., i, ..., (P-2), (P-1)]
≈ CTR("http://google.com")
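The indexing above can be sketched as a hashing trick: a raw categorical value such as a URL is hashed into one of P slots of a vector, and the slide suggests the stored entry approximates that URL's CTR. This is only a minimal sketch; the choice of MD5, the value P = 16, the stored number 0.02, and the helper names `bucket` and `lookup` are all assumptions made for illustration.

```python
import hashlib

P = 16  # number of slots (hypothetical; the slides only call it P)

def bucket(feature: str, size: int = P) -> int:
    """Map a raw categorical value to an index in [0, size)."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % size

# One shared vector for all categorical values; colliding values share a slot.
w = [0.0] * P
w[bucket("http://google.com")] = 0.02  # made-up value standing in for a CTR

def lookup(feature: str) -> float:
    """Read the entry associated with a raw categorical value."""
    return w[bucket(feature)]
```

Because the hash is deterministic, no dictionary from URLs to indices has to be stored; the price is that two different URLs may land in the same slot.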
$\frac{1}{1 + e^{-w^\top x}} \approx e^{w^\top x} = \prod_i e^{w_i x_i}$
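A quick numerical check may help here: when the score w·x is strongly negative (low click probability), the logistic function 1/(1 + exp(-w·x)) is close to exp(w·x), which in turn factorizes into a product of per-feature terms exp(w_i x_i). The weights and features below are made up for illustration.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights and binary features giving a low-probability event.
w = [-3.0, -1.5, -2.0]
x = [1.0, 1.0, 1.0]

z = sum(wi * xi for wi, xi in zip(w, x))                      # w . x = -6.5
exact = sigmoid(z)                                            # logistic prediction
approx = math.exp(z)                                          # exponential approximation
product = math.prod(math.exp(wi * xi) for wi, xi in zip(w, x))  # per-feature factors

relative_error = (approx - exact) / exact                     # equals exp(z)
```

The relative error of the approximation is exactly exp(w·x), so it shrinks as the predicted probability gets smaller.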
The values in M are the same as those in w!
M = [matrix figure; its entries repeat the values 1.1, 2.3, 3.7 and 5.9]
The structure is determined by the hashing function
increase statistical efficiency
$\min_w \sum_i \ell(w, x_i, y_i)$
$\min_w \sum_i \ell(w, x_i, y_i) + \lambda \|w\|_1$
$\min_w \sum_i \ell(w, x_i, y_i) + \lambda \sum_g \|w_g\|_2$
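The three objectives above appear to be a sum of per-example losses, first unregularized, then with an L1 penalty on the weights, then with a sum of L2 norms over groups of weights. The sketch below evaluates such objectives; the logistic loss, the grouping, and every function name are assumptions, since the slides do not name the loss.

```python
import math

def logistic_loss(w, x, y):
    """Log loss for one example with label y in {0, 1} (assumed loss)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    # log(1 + exp(z)) - y*z, written to stay stable for large |z|
    return max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z

def l1_penalty(w, lam):
    """lam times the sum of absolute values of the weights."""
    return lam * sum(abs(wi) for wi in w)

def group_penalty(w, groups, lam):
    """lam times the sum over groups of the L2 norm of each group's weights."""
    return lam * sum(math.sqrt(sum(w[i] ** 2 for i in g)) for g in groups)

def objective(w, data, penalty=0.0):
    """Sum of per-example losses plus an already-computed penalty term."""
    return sum(logistic_loss(w, x, y) for x, y in data) + penalty
```

For instance, `objective(w, data, l1_penalty(w, 0.1))` gives the L1 variant, and passing `group_penalty(w, groups, 0.1)` gives the grouped variant, where each group could hold all the weights of one categorical feature.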
M = [matrix figure; its entries repeat the values 1.1, 2.3, 3.7 and 5.9]
Rendle, S. Factorization machines. In Data Mining (ICDM), 2010 IEEE 10th International Conference on (pp. 995-1000). IEEE.
           gobernie.com                drumpf4ever.com             hillaryous.com
S&W        f(w_bernie + w_S&W)         f(w_drumpf + w_S&W)         f(w_hillary + w_S&W)
Carebear   f(w_bernie + w_carebear)    f(w_drumpf + w_carebear)    f(w_hillary + w_carebear)
JP Morgan  f(w_bernie + w_JPMorgan)    f(w_drumpf + w_JPMorgan)    f(w_hillary + w_JPMorgan)
           gobernie.com                drumpf4ever.com             hillaryous.com
S&W        f(w_bernie,S&W)             f(w_drumpf,S&W)             f(w_hillary,S&W)
Carebear   f(w_bernie,carebear)        f(w_drumpf,carebear)        f(w_hillary,carebear)
JP Morgan  f(w_bernie,JPMorgan)        f(w_drumpf,JPMorgan)        f(w_hillary,JPMorgan)
           gobernie.com                drumpf4ever.com             hillaryous.com
S&W        f(w_bernie · w_S&W)         f(w_drumpf · w_S&W)         f(w_hillary · w_S&W)
Carebear   f(w_bernie · w_carebear)    f(w_drumpf · w_carebear)    f(w_hillary · w_carebear)
JP Morgan  f(w_bernie · w_JPMorgan)    f(w_drumpf · w_JPMorgan)    f(w_hillary · w_JPMorgan)
(in this last table each w is a vector and · is a dot product)
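The three publisher/advertiser grids above can be read as three ways of parametrizing the score of a pair: a sum of one weight per publisher and one per advertiser, one weight per pair, and a dot product of one vector per publisher and one per advertiser. A minimal sketch follows; taking f to be the logistic function, the embedding size of 2, and all numeric values are assumptions chosen only for illustration.

```python
import math

def f(z: float) -> float:
    """Link function; assumed here to be the logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

PUBLISHERS = ["bernie", "drumpf", "hillary"]
ADVERTISERS = ["S&W", "Carebear", "JPMorgan"]

# 1) Additive: one scalar per publisher plus one per advertiser (3 + 3 weights).
w_pub = {"bernie": -2.0, "drumpf": -1.0, "hillary": -1.5}
w_adv = {"S&W": 0.5, "Carebear": -0.5, "JPMorgan": 0.0}

def additive(p: str, a: str) -> float:
    return f(w_pub[p] + w_adv[a])

# 2) Cross-feature: one scalar per (publisher, advertiser) pair (3 * 3 weights).
w_cross = {(p, a): -1.0 for p in PUBLISHERS for a in ADVERTISERS}
w_cross[("drumpf", "S&W")] = 1.0

def crossed(p: str, a: str) -> float:
    return f(w_cross[(p, a)])

# 3) Factorized: one small vector per publisher and per advertiser ((3 + 3) * 2 weights).
v_pub = {"bernie": [1.0, -1.0], "drumpf": [-1.0, 1.0], "hillary": [0.5, 0.5]}
v_adv = {"S&W": [-1.0, 1.0], "Carebear": [1.0, -1.0], "JPMorgan": [0.5, 0.5]}

def factorized(p: str, a: str) -> float:
    return f(sum(u * v for u, v in zip(v_pub[p], v_adv[a])))
```

The additive form cannot express pair-specific affinities, the cross-feature form needs a weight for every pair (including pairs never seen in training), and the factorized form lets pairs share statistical strength through the vectors.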
Standard cross-features
predict the clicks
He et al. Practical Lessons from Predicting Clicks on Ads at Facebook. ADKDD
categorical features
Gathering data Generating logs Learning the model
Gathering data Generating logs Learning the model Gain
Stochastic methods
Batch methods
their values
$P(C=1 \mid Y=0, E) = \int_{D>E} P(C=1, D)\, dD$
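The integral over delays D larger than the elapsed time E can be made concrete under an additional modelling assumption. The sketch below assumes a conversion probability p and an exponentially distributed delay with rate `rate`, so the joint density of (C = 1, D = d) is p · rate · exp(-rate · d); both the exponential form and the numbers used are assumptions for illustration, not something the slide states. Under that assumption the integral has the closed form p · exp(-rate · E), which the code compares with a plain trapezoidal integration.

```python
import math

def joint_density(d: float, p: float, rate: float) -> float:
    """Density of (C = 1, D = d) under the assumed exponential delay model."""
    return p * rate * math.exp(-rate * d)

def tail_closed_form(elapsed: float, p: float, rate: float) -> float:
    """Integral of the joint density over D > elapsed, in closed form."""
    return p * math.exp(-rate * elapsed)

def tail_numeric(elapsed: float, p: float, rate: float,
                 horizon: float = 100.0, steps: int = 100_000) -> float:
    """Trapezoidal approximation of the same integral on [elapsed, elapsed + horizon]."""
    h = horizon / steps
    total = 0.5 * (joint_density(elapsed, p, rate)
                   + joint_density(elapsed + horizon, p, rate))
    for k in range(1, steps):
        total += joint_density(elapsed + k * h, p, rate)
    return total * h
```

For example, with p = 0.1, rate = 0.5 and an elapsed time of 2.0, both functions give roughly 0.1 · exp(-1); the longer one waits without seeing a conversion, the smaller this mass becomes.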
Chapelle, O. Modeling delayed feedback in display advertising. KDD
Michaeli et al. Semi-supervised single- and multi-domain regression with multi-domain training. Information and Inference