Using Ratings & Posters for Anime & Manga Recommendations - PowerPoint PPT Presentation

Using Ratings & Posters for Anime & Manga Recommendations Jill-Jênn Vie August 31, 2017

Recommendation System Problem ▶ Every user rates few items (1 %) ▶ How to infer missing ratings?

Every supervised machine learning algorithm: fit(X, y), then ŷ = predict(X). Here X holds (user_id, work_id) pairs and y the known rating (favorite, like, dislike); the ratings marked "?" (?liked, ?disliked) are the ŷ to predict for unseen pairs.

Evaluation: Root Mean Squared Error (RMSE) If I predict $\hat{y}_i$ for each of the $n$ user-work pairs to test, while truth is $y^*_i$: $\mathrm{RMSE}(\hat{y}, y^*) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y^*_i)^2}$.
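The RMSE above can be sketched in a few lines of NumPy. The function name `rmse` and the numeric encoding of ratings are assumptions made for illustration; the slides do not say how ratings are mapped to numbers.

```python
import numpy as np

def rmse(y_pred, y_true):
    """Root mean squared error between predicted and true ratings."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    # Mean of squared differences over the n test pairs, then square root.
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))
```

For instance, `rmse([1, 2, 3], [1, 2, 5])` averages the squared errors 0, 0 and 4 over n = 3 pairs before taking the square root.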

Dataset 1: Movielens ▶ 700 users ▶ 9000 movies ▶ 100000 ratings

Dataset 2: Mangaki ▶ Works: anime / manga / OST ▶ Ratings: fav / like / dislike / neutral / willsee / wontsee ▶ 2100 users ▶ 15000 works ▶ 310000 ratings ▶ Users can rate anime or manga ▶ And receive recommendations ▶ Also reorder their watchlist ▶ Code is 100% on GitHub ▶ Awards from Microsoft and Kokusai Kōryū Kikin ▶ Ongoing data challenge on universityofbigdata.net

K-nearest neighbors KNN → measure similarity ▶ $R_u$ represents the row vector of user $u$ in the rating matrix (users × works). ▶ Similarity score between users (cosine): $\mathrm{score}(u, v) = \frac{R_u \cdot R_v}{\|R_u\| \cdot \|R_v\|}$. ▶ Let's identify the $k$-nearest neighbors of user $u$ ▶ And recommend to user $u$ what $u$'s neighbors liked but $u$ didn't see. Hint: If $R'$ is the $N \times M$ matrix of rows $\frac{R_u}{\|R_u\|}$, we can get the $N \times N$ score matrix by computing $R' R'^T$.
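A minimal sketch of the cosine-score hint and the neighbor-based recommendation, assuming ratings are encoded as numbers with 0 meaning "not rated" (for example like = 1, dislike = -1). The function names, the encoding and the toy matrix are illustrative assumptions, not taken from the Mangaki code.

```python
import numpy as np

def cosine_scores(R):
    """N x N cosine similarity between the rows (users) of R."""
    norms = np.linalg.norm(R, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # users without ratings keep an all-zero row
    R_normed = R / norms             # this is R' in the slide
    return R_normed @ R_normed.T     # R' R'^T

def recommend(R, u, k=2, n_items=3):
    """Works that the k nearest neighbors of user u liked and u has not rated."""
    scores = cosine_scores(R)[u].copy()
    scores[u] = -np.inf                          # a user is not their own neighbor
    neighbors = np.argsort(scores)[::-1][:k]     # k highest cosine scores
    neighbor_mean = R[neighbors].mean(axis=0)    # average rating among neighbors
    neighbor_mean[R[u] != 0] = -np.inf           # skip works u already rated
    ranked = np.argsort(neighbor_mean)[::-1]
    return [int(j) for j in ranked[:n_items] if neighbor_mean[j] > 0]

# Toy rating matrix (3 users x 4 works), purely illustrative.
R = np.array([
    [ 1.0,  1.0, 0.0,  0.0],
    [ 1.0,  1.0, 1.0, -1.0],
    [-1.0, -1.0, 0.0,  1.0],
])
suggestions = recommend(R, u=0, k=1)
```

User 0 agrees with user 1 on the two works they both rated, so user 1 is the nearest neighbor and the only suggestion is the work user 1 liked that user 0 has not seen.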

Matrix factorization PCA, SVD → reduce dimension to generalize. $R = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_n \end{pmatrix} = C P$. Each row $R_u$ is a linear combination of profiles $P$. Interpreting Key Profiles: if $P$ has rows $P_1$: adventure, $P_2$: romance, $P_3$: plot twist, and $C_u = (0.2, -0.5, 0.6)$ ⇒ $u$ likes a bit adventure, hates romance, loves plot twists. Singular Value Decomposition: $R = (U \cdot \Sigma) V^T$ where $U$: $N \times r$ and $V$: $M \times r$ are orthogonal and $\Sigma$: $r \times r$ is diagonal.
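A sketch of the decomposition above with NumPy, taking $C = U \Sigma$ as the coefficients and $P = V^T$ as the profiles. It assumes a dense rating matrix (missing ratings would have to be filled in first); the helper name `truncated_svd` and the toy matrix are illustrative.

```python
import numpy as np

def truncated_svd(R, r):
    """Rank-r factorization R ≈ C P with C = U·Σ (N x r) and P = V^T (r x M)."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    C = U[:, :r] * s[:r]   # scale each kept column of U by its singular value
    P = Vt[:r]             # the r profiles, one per row
    return C, P

# Toy dense rating matrix: two groups of users with opposite tastes.
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])
C, P = truncated_svd(R, r=2)
R_hat = C @ P  # each row of R_hat is a linear combination of the 2 profiles
```

Keeping only r = 2 profiles forces users with similar rows to share coefficients, which is the "reduce dimension to generalize" idea.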

Visualizing first two columns of $V_j$ in SVD: closer points mean similar taste

Find your taste by plotting first two columns of $U_i$: you will like movies that are close to you

Variants of Matrix Factorization
$R$ ratings, $C$ coefficients, $P$ profiles ($F$ features). $R = CP = CF^T$ ⇒ $r_{ij} \simeq \hat{r}_{ij} \triangleq C_i \cdot F_j$.
Objective functions (reconstruction error) to minimize:
SVD: $\sum_{i,j} (r_{ij} - C_i \cdot F_j)^2$ (deterministic)
ALS: $\sum_{i,j \text{ known}} (r_{ij} - C_i \cdot F_j)^2$
ALS-WR: $\sum_{i,j \text{ known}} (r_{ij} - C_i \cdot F_j)^2 + \lambda \left( \sum_i N_i \|C_i\|^2 + \sum_j M_j \|F_j\|^2 \right)$
WALS by TensorFlow™: $\sum_{i,j} w_{ij} \cdot (r_{ij} - C_i \cdot F_j)^2 + \lambda \left( \sum_i \|C_i\|^2 + \sum_j \|F_j\|^2 \right)$
Who do you think wins?
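A minimal sketch of alternating least squares over the known entries only. For simplicity it uses a uniform penalty $\lambda(\sum_i \|C_i\|^2 + \sum_j \|F_j\|^2)$, i.e. the ALS-WR objective with all weights $N_i = M_j = 1$; this simplification, the function name `als` and the toy data are assumptions for illustration, not the implementation used in the talk.

```python
import numpy as np

def als(R, known, rank=2, lam=0.1, n_iter=20, seed=0):
    """Fit R ≈ C F^T on the entries where `known` is True."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    C = rng.normal(size=(n_users, rank))
    F = rng.normal(size=(n_items, rank))
    reg = lam * np.eye(rank)
    for _ in range(n_iter):
        # Fix F, solve a small ridge regression for each user row C_i.
        for i in range(n_users):
            js = np.flatnonzero(known[i])
            if js.size:
                Fj = F[js]
                C[i] = np.linalg.solve(Fj.T @ Fj + reg, Fj.T @ R[i, js])
        # Fix C, solve a small ridge regression for each item row F_j.
        for j in range(n_items):
            rows = np.flatnonzero(known[:, j])
            if rows.size:
                Ci = C[rows]
                F[j] = np.linalg.solve(Ci.T @ Ci + reg, Ci.T @ R[rows, j])
    return C, F

# Toy rank-1 ratings with two hidden entries (illustrative data).
R = np.outer([1.0, 2.0, 3.0], [1.0, 2.0, 1.0, 3.0])
known = np.ones(R.shape, dtype=bool)
known[0, 3] = False
known[2, 1] = False
C, F = als(R, known, rank=1, lam=1e-3, n_iter=50)
R_hat = C @ F.T  # includes predictions for the two hidden entries
```

Each half-step is an exact least-squares solve, so the penalized objective never increases from one sweep to the next.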

About the Netflix Prize ▶ On October 2, 2006, Netflix organized an online contest: "The first one who beats our algorithm (Cinematch) by more than 10% will receive 1,000,000 USD." and gave anonymized data ▶ Half of the world AI community suddenly became interested in this problem ▶ October 8, someone beat Cinematch ▶ October 15, 3 teams beat it, notably by 1.06% ▶ June 26, 2009, team 1 beat Cinematch by 10.05% → last call: still one month to win ▶ July 25, 2009, team 2 beat Cinematch by 10.09% ▶ Team 1 does 10.09% also ▶ 20 minutes later team 2 does 10.10% ▶ … Actually, both teams were tied on the validation set ▶ … So the first team to send their results won (team 1, 10.09%)

Privacy concerns ▶ August 2009, Netflix wanted to restart a contest ▶ Meanwhile, in 2007 two researchers from the University of Texas could de-anonymize users by cross-referencing the data with IMDb ▶ (approximate birth year, zip code, watched movies) ▶ In December 2009, 4 Netflix users sued Netflix ▶ March 2010, amicable settlement (enmankaiketsu) → complaint is closed

ALS for feature extraction: $R = CP$ Issue: Item Cold-Start ▶ If no ratings are available for an anime ⇒ no feature will be trained ▶ If anime features are set to 0 ⇒ prediction of ALS will be constant for every unrated anime. But we have posters! ▶ On Mangaki, almost all works have a poster ▶ How to extract information?

Illustration2Vec (Saito and Matsui, 2015) ▶ CNN pretrained on ImageNet, trained on Danbooru (1.5M illustrations with tags) ▶ 502 most frequent tags kept, outputs tag weights

LASSO for explanation of user preferences
$T$: matrix of 15000 works × 502 tags ▶ Each user is described by its preferences $P$ → a sparse row of weights over tags. ▶ Estimate user preferences $P$ such that $R \simeq P T^T$, i.e. $r_{ij} \simeq P_i \cdot T_j$.
Least Absolute Shrinkage and Selection Operator (LASSO): $\frac{1}{2 N_i} \|R_i - P_i T^T\|_2^2 + \alpha \|P_i\|_1$, where $N_i$ is the number of items rated by user $i$.
Interpretation and explanation ▶ You seem to like magical girls but not blonde hair ⇒ Look! All of them are brown hair! Buy now!
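A sketch of how one user's sparse tag preferences $P_i$ could be estimated from the LASSO objective above. The slides do not say which solver is used; this example swaps in ISTA (proximal gradient with soft-thresholding) because it fits in a few lines of NumPy. The names `lasso_ista` and `lasso_objective` and the synthetic 8-tag data are assumptions for illustration.

```python
import numpy as np

def lasso_objective(T, r, p, alpha):
    """1/(2 N_i) * ||r - T p||_2^2 + alpha * ||p||_1 for one user."""
    return float(np.sum((r - T @ p) ** 2) / (2 * len(r)) + alpha * np.sum(np.abs(p)))

def lasso_ista(T, r, alpha, n_iter=2000):
    """Minimize the LASSO objective with ISTA. T holds the tag rows of rated works."""
    n = len(r)
    step = n / (np.linalg.norm(T, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    p = np.zeros(T.shape[1])
    for _ in range(n_iter):
        grad = T.T @ (T @ p - r) / n         # gradient of the squared-error term
        z = p - step * grad
        # Soft-thresholding: proximal operator of the L1 penalty.
        p = np.sign(z) * np.maximum(np.abs(z) - step * alpha, 0.0)
    return p

# Synthetic user who likes tag 0 and dislikes tag 3 (illustrative data).
rng = np.random.default_rng(0)
T = rng.random((50, 8))                      # 50 rated works x 8 tags
p_true = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0])
r = T @ p_true
p_hat = lasso_ista(T, r, alpha=0.01)
```

The largest positive and negative entries of `p_hat` are the tags one would quote in an explanation such as "you seem to like X but not Y".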

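Blending
We would like to do: $\hat{r}^{BALSE}_{ij} = \hat{r}^{ALS}_{ij}$ if item $j$ was rated at least $\gamma$ times, $\hat{r}^{LASSO}_{ij}$ otherwise.
But we can't. Why? A hard threshold is not differentiable with respect to $\gamma$, so $\gamma$ could not be learned by gradient descent. Instead:
$\hat{r}^{BALSE}_{ij} = \sigma(\beta(R_j - \gamma)) \hat{r}^{ALS}_{ij} + (1 - \sigma(\beta(R_j - \gamma))) \hat{r}^{LASSO}_{ij}$
where $R_j$ denotes the number of ratings of item $j$. $\beta$ and $\gamma$ are learned by stochastic gradient descent. We call this gate the Steins;Gate.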
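A minimal sketch of the gate formula, with $\beta$ and $\gamma$ passed in as fixed numbers. Learning them by stochastic gradient descent is left out, and the function names are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def blend(r_als, r_lasso, n_ratings, beta, gamma):
    """Blend ALS and LASSO predictions; n_ratings is R_j, the rating count of item j."""
    gate = sigmoid(beta * (n_ratings - gamma))   # near 1 for well-rated items
    return gate * r_als + (1.0 - gate) * r_lasso
```

An item with exactly $\gamma$ ratings gets the average of both predictions; well above $\gamma$ the ALS prediction dominates, and well below it the LASSO prediction does.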

We call this model BALSE: Blended Alternate Least Squares with Explanation. Diagram: posters → illustration2vec → tags → LASSO; ratings → ALS; both predictions are blended through the gate γ.

Results (RMSE)
Model   Test set   1000 least rated (1.5%)   Cold-start items
ALS     1.157      1.299                     1.493
LASSO   1.446      1.347                     1.358
BALSE   1.150      1.247                     1.316

Thank you! ▶ Read this article: http://jiji.cat/bigdata/balse.pdf (soon on arXiv) ▶ Compete in the Mangaki Data Challenge: research.mangaki.fr (problem + University of Big Data) ▶ Reproduce our results on GitHub: github.com/mangaki ▶ Follow us on Twitter: @MangakiFR
