 
Look-a-likes: How Internet Giants Reach the Most Relevant Audience at Scale
Moran Gavish, Outbrain, mgavish@outbrain.com
Big Data Moscow, October 11th, 2018
Outbrain’s Mission: Helping people discover great content
• +550M unique monthly global audience
• +250B recommendations served monthly
• 6000 servers across 3 data centers
• 35K requests/sec @ 50ms
Online Marketing KPIs: Marketers optimize by their marketing objectives
• TRAFFIC: Visitors
• ENGAGEMENT: Time spent, Video views, Comments, Social sharing
• ACTION: Purchases, Leads, Downloads, Sign-ups
Returning Audience
• Past engagement is a great predictor of future action.
• Therefore, users who engaged in the past (e.g. visited the marketer’s website) are much more likely to convert and hence will be targeted aggressively.
Remarketing: how it works from an advertiser’s perspective
What are “Lookalike Audiences”?
• A “Lookalike Audience” is a way to reach new people who are similar to the “engaged” (seed) population.
Consider the following online retailer…
Case Study
• The retailer has a list of 1K users who visited its website but did not complete a purchase (“seed users”).
• The seed users respond very well to the retailer’s online campaign.
• However, the seed list’s scale is marginal.
Schematic Conceptual View: Original Seed Users → Amplified Look-a-like Audience
• A lookalike audience amplifies reach by targeting users who are similar to the seed users.
Schematic Product View
• New LAL input parameters (Modeling Request): Seed users, Required Reach
• Flow: Create LAL Model → Models Repository → Online Serving with LAL Scoring (1ms)
• Real-time vs Offline LAL Scoring
Q: How to identify LAL users?
A: Classification with confidence.
[Figure: Seed Users]
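One possible reading of "classification with confidence", combined with the "Required Reach" input parameter from the product view, is to rank non-seed users by classifier score and keep as many as the reach calls for. The sketch below assumes that reading; the function name, the score dictionary, and the user ids are hypothetical and do not come from the slides.

```python
from typing import Dict, List, Set, Tuple


def select_lookalikes(scores: Dict[str, float],
                      seed_ids: Set[str],
                      required_reach: int) -> Tuple[List[str], float]:
    """Return the top-scoring non-seed users and the implied score threshold.

    `scores` maps a user id to the classifier's confidence that the user
    resembles the seed population (hypothetical input format).
    """
    # Seed users are already targeted, so only non-seed users are candidates.
    candidates = [(uid, s) for uid, s in scores.items() if uid not in seed_ids]
    # Highest confidence first.
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    chosen = candidates[:required_reach]
    # The lowest score that still made the cut acts as the serving threshold.
    threshold = chosen[-1][1] if chosen else float("inf")
    return [uid for uid, _ in chosen], threshold


scores = {"u1": 0.91, "u2": 0.15, "u3": 0.78, "u4": 0.64, "seed1": 0.99}
audience, threshold = select_lookalikes(scores, {"seed1"}, required_reach=2)
```

Raising the required reach lowers the threshold, which is the usual trade-off between audience size and similarity to the seed.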
How do marketing platforms get to know their online audience?
• SEARCH GRAPH: what they are searching for
• SOCIAL GRAPH: what they are sharing (driven by ego)
• INTEREST GRAPH: what they are reading & watching
Dissonance between Consuming and Sharing Content
[Figure: content ranges from "More Consume" to "More Share"]
• Source: For Your Eyes Only: Consuming vs. Sharing Content, April 4, 2016 | Roy Sasson, Ram Meshulam
How is user data represented?
• The user is represented by the content she consumed, i.e. the categories of the articles she read in the past and the websites where she read them.
• There is neither identification nor demographic data.
What are the negative examples for training the classifier?
• Positive examples: Seed Users
• Perhaps an “unbiased sample” of the general population?
1st LAL Classifier: Seed Users vs. General Population (“Unbiased sample”)
1st LAL Classifier: Seed Users vs. General Population (“Unbiased sample”)
[Figure: geographic split of the two groups; shares shown are 50%, 40%, 5% and 5% (Rest of World), with a "90% from" callout]
Siloed LAL Modeling
• Observation: “Commonly, the seed users are associated with a small set of distinct silos.”
Sub-Silos
• Country – Language – Platform triplets (equivalence classes)
  – E.g. “US_English_Desktop”, “ES_Spanish_Mobile”, etc.
• Important property: at serving time, a request corresponds to exactly one Silo.
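The "one request, one Silo" property can be sketched as a key lookup. This is a minimal illustration, assuming the key is simply the three fields joined with underscores as in the examples above; the `Request` type, the model names, and the `models_by_silo` mapping are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Request(NamedTuple):
    """Hypothetical serving-time request carrying the three silo fields."""
    country: str
    language: str
    platform: str


def silo_key(req: Request) -> str:
    # Country_Language_Platform, e.g. "US_English_Desktop".
    return f"{req.country}_{req.language}_{req.platform}"


# Hypothetical repository: each silo holds only the LAL models trained for it.
models_by_silo: Dict[str, List[str]] = {
    "US_English_Desktop": ["lal_model_a", "lal_model_b"],
    "ES_Spanish_Mobile": ["lal_model_c"],
}


def candidate_models(req: Request) -> List[str]:
    # A request maps to exactly one silo, so only that silo's models are scored.
    return models_by_silo.get(silo_key(req), [])
```

Because the lookup is by a single key, models from other silos never need to be evaluated for a given request.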
User Data Example
• The user data is:
  – Aggregated
  – Decayed
  – Top-K
  – Un-structured (JSON)
• Normalization and Flattening are required

Categories (What)    Page Views #
Politics             26
Basketball           18
Investing            14
Justice              10
Dining               8
Marketing            4
Music                3
Autos                1
Celebrities          1

Websites (Where)                             Page Views #
http://www.cnn.com/news                      30
http://www.espn.com/basketball               17
http://fox-news/news                         12
http://www.wired.com/journalists             6
http://www.the-sun.com/men                   5
http://www.outbrain.com/blog/                4
http://www.foodsdictionary.co.il/recipes     3
http://www.geektime.co.il/                   3
http://www.maariv.co.il/culture              2
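The slide only says the profile is aggregated, decayed, and truncated to the top K entries; it does not say how. As a rough sketch, the code below assumes exponential decay with a half-life, which is a common choice but not confirmed by the source. The function name, the event format, and the half-life value are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def decayed_top_k(events: Iterable[Tuple[str, float]],
                  half_life_days: float,
                  k: int) -> List[Tuple[str, float]]:
    """Aggregate (category, days_ago) page views into a decayed top-K profile."""
    scores: Dict[str, float] = defaultdict(float)
    for category, days_ago in events:
        # Assumed decay: a page view loses half its weight every half-life.
        scores[category] += 0.5 ** (days_ago / half_life_days)
    # Highest decayed score first; ties are broken by name for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


events = [("Politics", 0), ("Politics", 30), ("Autos", 60), ("Dining", 0)]
profile = decayed_top_k(events, half_life_days=30, k=2)
```

With a 30-day half-life, the two Politics views contribute 1.0 and 0.5, while the 60-day-old Autos view contributes only 0.25 and falls out of the top 2.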
Within a Silo (all users share the same Country – Language – Platform)
• Repository of “Neutral” users
• Important property: higher homogeneity of the user profiles.
• Flattening and dimensionality reduction
Classification Formulation
• Concatenate User Categories (100 features) and Web Domains (500 features) into a unified sparse feature-vector
• Rows of the sparse Users-Features Matrix: Seed 1, Seed 2, Seed 3, Neutral 1, Neutral 2, Neutral 3, Neutral 4, Neutral 5
• Best classification performance achieved using Random-Forests
• White-box classification algorithms
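The concatenation step can be sketched with a sparse dictionary representation: category features occupy positions 0 to 99 and web-domain features are shifted by 100. The block sizes come from the slide; the tiny index tables and the function name are hypothetical placeholders for the real vocabularies.

```python
from typing import Dict

N_CATEGORIES = 100  # size of the category block, from the slide
N_DOMAINS = 500     # size of the web-domain block, from the slide

# Hypothetical vocabularies; real ones would list ~100 and ~500 entries.
CATEGORY_INDEX = {"Politics": 0, "Basketball": 1, "Investing": 2}
DOMAIN_INDEX = {"www.cnn.com": 0, "www.espn.com": 1}


def unified_vector(category_weights: Dict[str, float],
                   domain_weights: Dict[str, float]) -> Dict[int, float]:
    """Concatenate both blocks into one sparse {feature_index: weight} vector."""
    vec: Dict[int, float] = {}
    for name, weight in category_weights.items():
        if name in CATEGORY_INDEX and weight:
            vec[CATEGORY_INDEX[name]] = weight
    for name, weight in domain_weights.items():
        if name in DOMAIN_INDEX and weight:
            # Domain features start right after the category block.
            vec[N_CATEGORIES + DOMAIN_INDEX[name]] = weight
    return vec


row = unified_vector({"Politics": 0.6, "Investing": 0.4}, {"www.espn.com": 1.0})
```

Stacking one such row per seed (label 1) and per neutral user (label 0) gives the Users-Features matrix that a classifier is trained on.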
Schematic LAL View
Towards Productization
• Requirement 1: Score 1K models in less than 1 ms
• Requirement 2: Reasonable memory footprint (per additional LAL model)
• Requirement 3: Work for any size of training set (even when numOfSeedUsers << numOfFeatures)
• Implication: Reduction of Dimensionality
Dimensionality Reduction
• Observation: The features within user profiles are highly correlated.
• Recall: Principal components analysis (PCA) is a procedure for identifying a smaller number of uncorrelated variables, called "principal components", from a large set of data. The goal of PCA is to explain the maximum amount of variance with the fewest principal components.
• Analogy: Eigen Faces
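To make the PCA recap concrete, here is a minimal NumPy sketch that fits principal components through an SVD of the centred data. It runs on synthetic data with two hidden factors, not on real user profiles, and the helper names `fit_pca` and `transform` are illustrative rather than anything from the talk.

```python
import numpy as np


def fit_pca(X: np.ndarray, n_components: int):
    """Fit PCA via SVD; returns the mean, components, and explained variance ratios."""
    mean = X.mean(axis=0)
    # SVD of the centred data: the rows of Vt are the principal directions.
    _, singular_values, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:n_components]
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return mean, components, explained[:n_components]


def transform(X: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    # Project centred rows onto the principal directions.
    return (X - mean) @ components.T


# Synthetic, highly correlated data: 6 observed features driven by 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 6))

mean, components, explained = fit_pca(X, n_components=2)
Z = transform(X, mean, components)
```

Because the six features are driven by two factors, two components capture almost all of the variance, which is the same effect the talk relies on when user-profile features are highly correlated.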
Eigen Profiles

PCA #   Eigen Profile (human interpretation)
1       Politics
2       MSN no Sport
3       Sport Fans
4       Autos-Investing-Computers (Men?)
5       Celebrities
6       Investing
7       CNN over Fox (Liberals?)
8       News junkies
9       Stock Markets
10      Dining
11      Television but not Celebrities
12      Football but not Basketball
13      Travel but not Autos and not Dining
14      Baseball
15      Interpersonal Relationships
16      Football and Basketball but not Baseball
17      Education
18      War and Conflict but not Travel
19      Lifestyle
Eigen Profiles Transform
• Input: sparse Users-Features Matrix of size (S + N) × 600, with rows Seed 1, Seed 2, …, Neutral 1, Neutral 2, … and columns User Categories (100 features) plus Web Domains (500 features)
• Transform: 600 × 250 matrix whose columns are Eigen Profile 1, Eigen Profile 2, …, Eigen Profile 7, … (User Eigen Space, up to 250)
• Output: [(S + N) × 600] × [600 × 250] = [(S + N) × 250], a dense Users-Features Matrix in eigen space
Eigen Space Operations
• Logistic Regression
• Reconstruct the model coefficients (Inverse Transform)
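The inverse-transform step can be written out as a short derivation. It assumes a standard centred PCA projection and a logistic regression fitted on the projected data; the symbols below are introduced here for illustration and do not appear in the slides.

```latex
% Assumptions (not stated in the slides):
%   x     : 600-dimensional user profile (categories + web domains)
%   \mu   : mean profile of the training users
%   V     : 600 x k matrix whose columns are the eigen profiles, k <= 250
%   w_z, b_z : logistic-regression weights and bias fitted in eigen space
\begin{align*}
z &= V^{\top}(x - \mu) \\
\operatorname{logit}\, p(\text{seed} \mid x)
  &= w_z^{\top} z + b_z \\
  &= w_z^{\top} V^{\top}(x - \mu) + b_z \\
  &= (V w_z)^{\top} x + \bigl(b_z - (V w_z)^{\top}\mu\bigr) \\
\Rightarrow\quad w_x &= V w_z, \qquad b_x = b_z - w_x^{\top}\mu
\end{align*}
```

Under these assumptions each model reduces to one weight vector and one bias in the original feature space, so scoring a sparse profile is a sparse dot product and no projection is needed at serving time.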
Experiments
• Methodology
  – Same Dates
  – Same CPC
  – Same Ads
• Groups
  – Control
  – Look-a-Likes
  – Re-targeting
• Results
Hair Building (magical) Science - Example
• What is the top LAL differentiating Category?
Some Remarks
• Silos
  – No Free meals
  – Complex Engineering
• Example bias in time
  – World Cup
• Eliminate cherry picking
  – All models compete on the same users
  – Remove meta data
• Recap the entire Process X Silos
One more Takeaway (Konstanz Information Miner)
Thank You mgavish@outbrain.com
Backup Slides
Flattening
• ~100 Categories
• Lossless

Category      Page Views   Weight
Science       7            0.7
Basketball    2            0.2
Investing     1            0.1

Sparse Categories Vector
Axis:   Politics, Crime, Celebrities, Television, Soccer, Basketball, Mobile, Science, Careers, Health, Investing, Education, Aging
Values: 0, 0, 0, 0, 0, 0.2, 0, 0.7, 0, 0, 0.1, 0, 0
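The category flattening above amounts to normalizing page views by their total and placing each weight at a fixed position. The sketch below reproduces the slide's example using only the 13 categories visible on the slide as the axis (the real axis has ~100); the function name is illustrative.

```python
from typing import Dict, List

# Axis order copied from the slide; the full axis would hold ~100 categories.
CATEGORIES = [
    "Politics", "Crime", "Celebrities", "Television", "Soccer", "Basketball",
    "Mobile", "Science", "Careers", "Health", "Investing", "Education", "Aging",
]


def flatten_categories(page_views: Dict[str, int]) -> List[float]:
    """Turn {category: page views} into a fixed-order vector of weights."""
    # Assumes every category in the profile appears on the axis, which is
    # what makes this step lossless.
    total = sum(page_views.values())
    if total == 0:
        return [0.0] * len(CATEGORIES)
    return [page_views.get(category, 0) / total for category in CATEGORIES]


vector = flatten_categories({"Science": 7, "Basketball": 2, "Investing": 1})
```

With 10 page views in total, Science, Basketball and Investing get weights 0.7, 0.2 and 0.1 at positions 7, 5 and 10, matching the vector on the slide.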
Flattening (cont.)
• > 1M Websites → Lossy
• Clustering of websites into Web Domains
[Figure: Web Domains Popularity; popularity (0% to 40%) vs. Web Domain Rank (0 to 100), for US-Desktop and Israel-Desktop]

Website                     PVs
www.cnn.com/news            2
www.cnn.com/politics        2
www.espn.com/nfl            2
www.espn.com/mma            1
www.techcrunch.com          1
www.really-good-food.com    1
www.best-invest.com         1

Web Domain            PVs   Weight
www.cnn.com           4     0.4
www.espn.com          3     0.3
www.techcrunch.com    1     0.1
Long Tail Domains     2     0.2

Sparse Web Domains Vector
Axis:   www.cnn.com, www.msn.com, www.foxnews.com, www.espn.com, www.wired.com, www.mtv.com, www.techcrunch.com, www.RateMyProfessors.com, www.ask.com, www.dummies.com, www.historychannel.com, www.vogue.com, Long Tail Domains
Values: 0.4, 0, 0, 0.3, 0, 0, 0.1, 0, 0, 0, 0, 0, 0.2
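The website flattening can be sketched the same way: page views are summed per web domain, and every domain outside the head list is merged into a single "Long Tail Domains" bucket, which is where the loss comes from. The head list below is the 12 domains visible on the slide; how the real head list is chosen is not stated, and the function name is illustrative.

```python
from collections import Counter
from typing import Dict, List

# Head domains in slide order; everything else falls into the long-tail bucket.
HEAD_DOMAINS = [
    "www.cnn.com", "www.msn.com", "www.foxnews.com", "www.espn.com",
    "www.wired.com", "www.mtv.com", "www.techcrunch.com",
    "www.RateMyProfessors.com", "www.ask.com", "www.dummies.com",
    "www.historychannel.com", "www.vogue.com",
]
LONG_TAIL = "Long Tail Domains"


def flatten_domains(page_views: Dict[str, int]) -> List[float]:
    """Cluster websites into web domains and return normalized weights."""
    counts: Counter = Counter()
    for website, pvs in page_views.items():
        # Keep only the host part, e.g. "www.cnn.com/news" -> "www.cnn.com".
        domain = website.split("/", 1)[0]
        counts[domain if domain in HEAD_DOMAINS else LONG_TAIL] += pvs
    total = sum(counts.values())
    axis = HEAD_DOMAINS + [LONG_TAIL]
    return [counts[d] / total if total else 0.0 for d in axis]


vector = flatten_domains({
    "www.cnn.com/news": 2, "www.cnn.com/politics": 2,
    "www.espn.com/nfl": 2, "www.espn.com/mma": 1,
    "www.techcrunch.com": 1,
    "www.really-good-food.com": 1, "www.best-invest.com": 1,
})
```

The two unlisted sites end up sharing the last position with weight 0.2, so their individual identity is no longer recoverable from the vector.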
Features Required vs Data Loss
[Figure: Web Domains Flattening “Lossy-ness”; accumulated weight (0% to 100%) vs. number of features (0 to 500), for US-Desktop and Israel-Desktop]
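The trade-off in this chart can be sketched as a cumulative-coverage calculation: sort domains by page views and count how many are needed before their accumulated share reaches a target. The page-view numbers below are made up for illustration and are not read off the chart; the function name is hypothetical.

```python
from typing import List


def features_needed(domain_page_views: List[int], target_coverage: float) -> int:
    """Number of top domains needed to cover `target_coverage` of all page views."""
    total = sum(domain_page_views)
    covered = 0
    # Most popular domains first, as on the chart's rank axis.
    for n, pvs in enumerate(sorted(domain_page_views, reverse=True), start=1):
        covered += pvs
        if covered >= target_coverage * total:
            return n
    # Only reached for an empty list or a target above 100%.
    return len(domain_page_views)


page_views = [40, 30, 10, 10, 5, 5]  # illustrative counts, not from the slides
n_for_75 = features_needed(page_views, 0.75)
n_for_50 = features_needed(page_views, 0.50)
```

Whatever weight the chosen features leave uncovered is what ends up in the long-tail bucket.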
Classification Results vs. Feature Sets