SLIDE 1

Get on with it!

Recommender system industry challenges move towards real-world, online evaluation

Padova – March 23rd, 2016

Andreas Lommatzsch - TU Berlin, Berlin, Germany
Jonas Seiler - plista, Berlin, Germany
Daniel Kohlsdorf - XING, Hamburg, Germany

CrowdRec - www.crowdrec.eu

SLIDE 2

Andreas Lommatzsch

Andreas.Lommatzsch@tu-berlin.de http://www.dai-lab.de

SLIDE 3

Jonas Seiler

Jonas.Seiler@plista.com http://www.plista.com

SLIDE 4

Daniel Kohlsdorf

Daniel.Kohlsdorf@xing.com http://www.xing.com

SLIDE 5

Where are recommender system challenges headed?

Direction 1: Use information beyond the user-item matrix.
Direction 2: Online evaluation + multiple metrics.

Moving towards real-world evaluation

Flickr credit: rodneycampbell

SLIDE 6

Why evaluate?

<Images showing “our” use cases>

  • Evaluation is crucial for the success of real-life systems
  • How should we evaluate?
    – Precision and Recall
    – Technical complexity
    – Influence on sales
    – Required hardware resources
    – Business models
    – Scalability
    – Diversity of the presented results
    – User satisfaction

SLIDE 7

Traditional Evaluation in IR

“The Cranfield paradigm”

Evaluation Settings

  • A static collection of documents
  • A set of queries
  • A list of relevant documents defined by experts for each query

Advantages

  • Reproducible setting
  • All researchers have exactly the same information
  • Optimized for measuring precision

SLIDE 8

Traditional Evaluation in IR

Weaknesses of traditional IR evaluation

  • High costs for creating the dataset
  • Datasets are not up-to-date
  • Domain-specific documents
  • The expert-defined ground truth does not consider individual user preferences
  • Context-awareness is not considered
  • Technical aspects are ignored

Context is everything

SLIDE 9

Industry and recsys challenges

  • Challenges benefit both industry and academic research.
  • We look at how industry challenges have evolved since the Netflix Prize in 2009.

SLIDE 10

Traditional Evaluation in RecSys

Evaluation Settings

  • Rating prediction on user-item matrices
  • Large, sparse dataset
  • Predict personalized ratings
  • Cross-validation, RMSE

Advantages

  • Reproducible setting
  • Personalization
  • Dataset is based on real user ratings

“The Netflix paradigm”
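
As a concrete illustration of the metric behind this paradigm, a minimal RMSE sketch over held-out ratings (a sketch, not any challenge's official scorer):

    import numpy as np

    def rmse(predicted, actual):
        """Root Mean Squared Error, the metric popularized by the Netflix Prize."""
        predicted = np.asarray(predicted, dtype=float)
        actual = np.asarray(actual, dtype=float)
        return np.sqrt(np.mean((predicted - actual) ** 2))

    # Toy cross-validation fold: predicted vs. held-out ratings
    print(rmse([3.5, 4.0, 2.0], [4, 4, 1]))  # ~0.65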

SLIDE 11

Traditional Evaluation in RecSys

Weaknesses of traditional Recommender evaluation

  • Static data
  • Only one type of data: user ratings
  • User ratings are noisy
  • Temporal aspects tend to be ignored
  • Context-awareness is not considered
  • Technical aspects are ignored
SLIDE 12

Challenges of Developing Applications

Challenges

  • Data streams - continuous changes
  • Big data
  • Combine knowledge from different sources
  • Context-Awareness
  • Users expect personally relevant results
  • Heterogeneous devices
  • Technical complexity, real-time requirements
SLIDE 13

How to address these challenges in the Evaluation?

  • Realistic evaluation setting
    – Heterogeneous data sources
    – Streams
    – Dynamic user feedback
  • Appropriate metrics
    – Precision and user satisfaction
    – Technical complexity
    – Sales and business models
  • Online and offline evaluation

How to set up a better evaluation?

SLIDE 14

Approaches for a better Evaluation

  • News recommendations @ plista
  • Job recommendations @ XING

SLIDE 15

The plista Recommendation Scenario

Setting

  • 250 ms response time
  • 350 million ad impressions (AI) per day
  • In 10 countries

Challenges

  • News change continuously
  • Users do not log in explicitly
  • Seasonality, context-dependent user preferences

SLIDE 16

Evaluation @ plista

Offline

  • Cross-validation
    – Metric Optimization Engine (https://github.com/Yelp/MOE)
    – Integration into Spark
  • How well does it correlate with the online evaluation?
  • Time complexity

Online

  • A/B tests
    – Limited by caching memory and computational resources
    – MOE*

SLIDE 17

Evaluation using MOE

Offline

  • Mean and variance estimation of the parameter space with a Gaussian Process
  • Evaluate the parameters with the highest Expected Improvement (EI), Upper Confidence Bound, …
  • REST API
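
For intuition, a minimal sketch of the Expected Improvement acquisition function such a setup maximizes, assuming the Gaussian Process posterior mean and standard deviation at a candidate parameter point are given:

    import numpy as np
    from scipy.stats import norm

    def expected_improvement(mu, sigma, best_so_far):
        """EI for a maximization problem, computed from the GP posterior
        mean `mu` and std `sigma` at a candidate point."""
        sigma = np.maximum(sigma, 1e-12)   # guard against zero variance
        z = (mu - best_so_far) / sigma
        return (mu - best_so_far) * norm.cdf(z) + sigma * norm.pdf(z)

    # The candidate with the highest EI is the next one to evaluate online.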

SLIDE 18

Evaluation using MOE

Online

  • A/B tests are expensive
  • Model the non-stationarity
  • Integrate out the non-stationarity to get the mean EI

SLIDE 19

The CLEF-NewsREEL Challenge

Provide an API enabling researchers to test their own ideas

  • A challenge in CLEF (Conferences and Labs of the Evaluation Forum)
  • 2 tasks: online and offline evaluation

SLIDE 20

How does the challenge work?

  • Live streams consisting of impressions, requests, and clicks; 5 publishers; approx. 6 million messages per day
  • Technical requirement: 100 ms per request
  • Live evaluation based on CTR

CLEF-NewsREEL Online Task
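
The live ranking metric is simply clicks over delivered recommendations. A minimal sketch (the event field names here are assumptions, not the real NewsREEL message schema):

    from collections import Counter

    def ctr_per_team(events):
        """Click-through rate per participating team, from a stream of
        events shaped like {"team": ..., "type": "impression" | "click"}."""
        shown, clicked = Counter(), Counter()
        for e in events:
            if e["type"] == "impression":
                shown[e["team"]] += 1
            elif e["type"] == "click":
                clicked[e["team"]] += 1
        return {team: clicked[team] / shown[team] for team in shown}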

SLIDE 21

Online vs. Offline Evaluation

  • Technical aspects can be evaluated without user feedback
  • Analyze the required resources and the response time
  • Simulate the online evaluation by replaying a recorded stream

CLEF-NewsREEL Offline Task
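
A minimal sketch of such a replay loop; the recommender.handle interface and the message fields are assumptions, not the actual NewsREEL API:

    import time

    def replay(recorded_stream, recommender, limit_ms=100):
        """Replay a recorded message stream and count how often the
        100 ms response-time constraint is violated."""
        violations = 0
        for message in recorded_stream:        # impressions, requests, clicks
            start = time.perf_counter()
            recommender.handle(message)        # assumed handler interface
            elapsed_ms = (time.perf_counter() - start) * 1000
            if message.get("type") == "request" and elapsed_ms > limit_ms:
                violations += 1
        return violations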

SLIDE 22

Challenge

  • Realistic simulation of streams
  • Reproducible setup of computing environments

Solution

  • A framework simplifying the setup of the evaluation environment
  • The Idomaar framework developed in the CrowdRec project

CLEF-NewsREEL Offline Task

http://rf.crowdrec.eu

SLIDE 23

More Information

  • SIGIR Forum, Dec 2015 (Vol. 49, No. 2), http://sigir.org/files/forum/2015D/p129.pdf: evaluate your algorithm online and offline in NewsREEL
  • Register for the challenge (until 22nd of April): http://crowdrec.eu/2015/11/clef-newsreel-2016/
  • Tutorials and templates are provided at orp.plista.com

CLEF-NewsREEL

SLIDE 24

XING - RecSys Challenge

https://recsys.xing.com/

SLIDE 25

Job Recommendations @ XING

SLIDE 26

XING - Evaluation based on interaction

  • On XING, users can give explicit feedback on recommendations.
  • The amount of explicit feedback is far lower than that of implicit signals.
  • A/B tests focus on click-through rate.
SLIDE 27

XING - RecSys Challenge, Scoring, Space on Page

  • Predict 30 items for each user.
  • Score: weighted combination of the precisions
    – precisionAt(2)
    – precisionAt(4)
    – precisionAt(6)
    – precisionAt(20)

<Figure: the top 6 recommendation slots visible on the page>
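
To make the scoring concrete, a sketch of precisionAt(k) and an illustrative weighted combination; the equal weights are placeholders, since the official challenge weights are not stated here:

    def precision_at(recommended, relevant, k):
        """Fraction of the top-k recommended items the user interacted with."""
        return sum(1 for item in recommended[:k] if item in relevant) / k

    def challenge_score(recommended, relevant, weights=None):
        """Weighted combination of precisionAt(2/4/6/20); the equal
        weights below are illustrative, not the official ones."""
        weights = weights or {2: 1.0, 4: 1.0, 6: 1.0, 20: 1.0}
        return sum(w * precision_at(recommended, relevant, k)
                   for k, w in weights.items())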

SLIDE 28

XING - RecSys Challenge, User Data

  • User ID
  • Job Title
  • Educational Degree
  • Field of Study
  • Location
SLIDE 29

XING - RecSys Challenge, User Data

  • Number of past jobs
  • Years of Experience
  • Current career level
  • Current discipline
  • Current industry
SLIDE 30

XING - RecSys Challenge, Item Data

  • Job title
  • Desired career level
  • Desired discipline
  • Desired industry
SLIDE 31

XING - RecSys Challenge, Interaction Data

  • Timestamp
  • User
  • Job
  • Type:
    – Deletion
    – Click
    – Bookmark
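
A sketch of how one such interaction record could be modeled; the field names and types are assumptions based on the bullets above, not the released data format:

    from dataclasses import dataclass
    from enum import Enum

    class InteractionType(Enum):
        DELETION = "deletion"
        CLICK = "click"
        BOOKMARK = "bookmark"

    @dataclass(frozen=True)
    class Interaction:
        timestamp: int              # assumed Unix-epoch seconds
        user_id: int
        job_id: int
        kind: InteractionType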

SLIDE 32

XING - RecSys Challenge, Anonymization

SLIDE 33

XING - RecSys Challenge, Anonymization

SLIDE 34

XING - RecSys Challenge, Future

  • Live Challenge
    – Participants submit predicted future interactions
    – The solutions are recommended on the platform
    – Participants get points for actual user clicks

<Cycle: release to challenge → work on predictions → collect clicks → score>

SLIDE 35

How to set up a better evaluation

  • Consider different quality criteria (prediction, technical, business models)
  • Aggregate heterogeneous information sources
  • Consider user feedback
  • Use online and offline analyses to understand users and their requirements

Concluding ...

SLIDE 36

Concluding ...

Participate in challenges based on real-life scenarios

  • NewsREEL challenge: http://orp.plista.com
  • RecSys 2016 challenge: http://2016.recsyschallenge.com/

=> Organize a challenge. Focus on real-life data.

SLIDE 37

More Information

  • http://www.crowdrec.eu
  • http://www.clef-newsreel.org
  • http://orp.plista.com
  • http://2016.recsyschallenge.com
  • http://www.xing.com

Thank You