SLIDE 1

TraininG towards a society of data-saVvy inforMation prOfessionals to enable open leadership INnovation

Multi-Modal Adversarial Autoencoders for Recommendations of Citations and Subject Labels

Lukas Galke, Florian Mai, Iacopo Vagliano, Ansgar Scherp (Kiel University, Germany)

UMAP ’18: 26th Conference on User Modeling, Adaptation and Personalization, July 8–11, 2018, Singapore

www.moving-project.eu UMAP ’18

  • L. Galke, F. Mai, I. Vagliano, A. Scherp

Adversarial Autoencoders for Recommendations [1/26]

SLIDE 2

“Avoid using GANs, if you care for your mental health”

  • Alfredo Canziani

SLIDE 5

Motivation

◮ Adversarial regularization improves autoencoders on images (Makhzani et al. 2015)
◮ Autoencoders enable flexible treatment of multi-modal input (Barbieri et al. 2017)

Research Questions

1. Does adversarial regularization improve autoencoders for recommendation?
2. To what extent do preferable input modalities depend on the task?
3. What is the effect of sparsity?

SLIDE 7

Two Different Tasks

Recommendations for citations (left) and subject labels (right)

◮ Two recommendation tasks on scientific documents
◮ Items are either citations or subject labels
◮ Assumption: test documents unknown

SLIDE 12

Example

◮ A researcher is writing a new paper
◮ The draft already cites 10 other publications (item set)
◮ “Am I missing any relevant related work?”
→ Recommend citation candidates
Important: the draft is unseen by the system (new user)

SLIDE 17

Exploit Multiple Modalities

Use only the citations of the draft? Hmm, there is more...
Use more data of the draft? Yes.
→ Start with the title, but it could be more (condition)

SLIDE 19

Related Work

◮ Document-level citation recommendation: collaborative filtering (McNee et al. 2002), SVD (Caragea et al. 2013)
◮ Subject labeling: MLP for multi-label classification (Galke et al. 2017), but professionals use the predictions only as recommendations

SLIDE 22

Approach

Multi-Modal Adversarial Autoencoder

◮ Train an autoencoder on the item sets
◮ Supply the condition to the decoder (multi-modal)
◮ Jointly train the encoder to produce codes indistinguishable from samples of independent Gaussians (adversarial)
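The interplay of these ingredients can be sketched with toy numpy tensors. Everything below (layer sizes, single-layer stand-ins for the MLPs, random weights) is an illustrative assumption, not the authors' implementation; it only shows how the three training signals are computed:

```python
# Minimal numpy sketch: reconstruction loss of the conditioned autoencoder,
# discriminator loss, and the adversarial loss that pushes the codes towards
# a Gaussian prior. All sizes and parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_items, n_cond, code_dim, batch = 200, 30, 16, 8

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy parameters (a real model would train these, e.g. with Adam)
W_enc = rng.normal(scale=0.05, size=(n_items, code_dim))
W_dec = rng.normal(scale=0.05, size=(code_dim + n_cond, n_items))
W_dis = rng.normal(scale=0.05, size=(code_dim, 1))

x = (rng.random((batch, n_items)) < 0.05).astype(float)  # sparse item sets
c = rng.normal(size=(batch, n_cond))                     # title features (condition)

z = np.maximum(0.0, x @ W_enc)                           # encoder code
y = sigmoid(np.concatenate([z, c], axis=1) @ W_dec)      # decoder sees [code, condition]

# 1) reconstruction: binary cross-entropy over the item vocabulary
rec_loss = -np.mean(x * np.log(y + 1e-9) + (1 - x) * np.log(1 - y + 1e-9))

# 2) discriminator: tell prior samples (label 1) from encoder codes (label 0)
prior = rng.normal(size=z.shape)
d_prior, d_code = sigmoid(prior @ W_dis), sigmoid(z @ W_dis)
dis_loss = -np.mean(np.log(d_prior + 1e-9) + np.log(1 - d_code + 1e-9))

# 3) generator step: the encoder tries to fool the discriminator
gen_loss = -np.mean(np.log(d_code + 1e-9))
```

In training, the reconstruction, discriminator, and generator steps would alternate, with the generator gradient flowing only into the encoder.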

SLIDE 24

Rationale

◮ Recommendation tasks are highly sparse
◮ Good representations (Bengio, Courville, and Vincent 2012) might be helpful, e.g. smoothness: a ≈ b → f(a) ≈ f(b)
◮ Enforce smoothness on the code (adversarial regularization)
◮ Does this lead to a more generalizable reconstruction? → RQ 1

SLIDE 25

Adversarial Autoencoders

Model Overview

SLIDE 26

Multi-Layer Perceptron

Parts used for the Multi-Layer Perceptron (MLP)

SLIDE 27

Undercomplete Autoencoders

Parts used for the Autoencoder (AE)

SLIDE 28

Adversarial Autoencoders

Parts used for the Adversarial Autoencoder (AAE)

SLIDE 29

Experimental Setup

Close to a real-world evaluation:
◮ Train and test split on the time axis → disjoint; “only published resources are citable”
◮ The number of considered items is crucial (Beel et al. 2016) → pruning thresholds as a controlled variable
◮ Title as additional input (as condition) vs. only item sets
◮ Datasets: PubMed for citations, Econ62k for subject labels
◮ Evaluate the mean reciprocal rank (MRR) of one dropped item among the predictions
◮ Re-run three times → 408 experiments
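The time-based split can be illustrated with a tiny sketch; the document records and the cut-off year below are invented for illustration:

```python
# Hypothetical sketch of a train/test split on the time axis: everything
# published before the cut-off is training data (and citable), the rest is test.
docs = [
    {"id": 1, "year": 2008}, {"id": 2, "year": 2011},
    {"id": 3, "year": 2012}, {"id": 4, "year": 2015},
]
CUTOFF = 2012
train = [d for d in docs if d["year"] < CUTOFF]
test = [d for d in docs if d["year"] >= CUTOFF]
citable = {d["id"] for d in train}  # "only published resources are citable"
```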

SLIDE 30

Time Split: PubMed

SLIDE 31

Time Split: Economics

SLIDE 32

Task Definition

Task: given a partial item set x \ {i∗}, find the missing item i∗.
x: row of binary ratings over documents × items
c: condition, the document’s title
y: predicted probabilities for the items, p(y | x, c)
Goal: the missing item on a high rank, i∗ = arg max y
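This protocol (hide one item, rank the rest, measure the reciprocal rank of the hidden one) can be sketched compactly; the scores and array shapes are invented:

```python
# Sketch of the evaluation: hide one item i* per document, rank all items
# not already in the partial set by predicted score, and average 1/rank of i*.
import numpy as np

def mean_reciprocal_rank(scores, hidden, known):
    """scores: (docs, items) predictions; hidden: index of i* per document;
    known: (docs, items) 0/1 mask of the partial item set."""
    rr = []
    for s, i_star, k in zip(scores, hidden, known):
        s = np.where(k.astype(bool), -np.inf, s)  # never re-recommend known items
        rank = 1 + np.sum(s > s[i_star])          # 1-based rank of the hidden item
        rr.append(1.0 / rank)
    return float(np.mean(rr))

scores = np.array([[0.1, 0.3, 0.9, 0.2, 0.0]])
known = np.array([[1, 0, 0, 0, 1]])  # items 0 and 4 are in the partial set
mrr = mean_reciprocal_rank(scores, [2], known)  # hidden item 2 ranks first: 1.0
```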

SLIDE 33

Method Summary

Only item sets
  IC: Item co-occurrence (McNee et al. 2002)
  SVD: Singular value decomposition (Caragea et al. 2013)

Only titles
  MLP: y = MLPdec(c)

Multi-modal
  AE: y = MLPdec(MLPenc(x)[, c])
  AAE: y = MLPdec(MLPenc(x)[, c]), with the encoder MLPenc jointly optimized to fool a discriminator MLPdisc
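The simplest of these methods, the IC baseline, can be sketched as follows. This is our reading of co-occurrence scoring with toy data, not the original implementation:

```python
# Score candidates by how often they co-occurred with the partial item set
# in the training documents (our reading of the item co-occurrence baseline).
import numpy as np

def item_cooccurrence_scores(train_sets, partial_set, n_items):
    cooc = np.zeros((n_items, n_items))
    for items in train_sets:
        for a in items:
            for b in items:
                if a != b:
                    cooc[a, b] += 1
    # candidate score = total co-occurrences with the partial set
    return cooc[list(partial_set)].sum(axis=0)

train = [[0, 1, 2], [1, 2, 3], [2, 3]]
scores = item_cooccurrence_scores(train, {1}, n_items=4)
# item 2 co-occurred with item 1 in two documents, items 0 and 3 in one each
```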

SLIDE 34

Citation Recommendation

PubMed: MRR of methods by pruning threshold on minimum item count

SLIDE 35

Subject Label Recommendation

Economics: MRR of methods by pruning threshold on minimum item count

SLIDE 36

Results

◮ AAE yields consistently higher scores than AE
◮ Multiple input modalities improve both AE and AAE
◮ Surprising: an MLP using only title data yields higher scores than the AAE on subject labels, but lower scores on citations

SLIDE 38

The Semantics of Item Co-Occurrence

What does it mean if two items co-occur in a document?
◮ Citation co-occurrence ≈ relatedness (Small 1973) → partial item set helpful
◮ Subject label co-occurrence ≈ diversity (per labeling guidelines) → partial item set not helpful, rather distracting
Our two tasks are prototypical for each case. What is in between?

SLIDE 44

Some Intuition

1. Manifold Learning
2. Linear interpolations on the code
3. Mixing well between modes

SLIDE 45

Conclusion

Adversarial Autoencoders:
◮ consistently improve over undercomplete autoencoders
◮ are capable of exploiting different input modalities
◮ are as robust to sparsity as the other approaches

Take-home

Consider the semantics of item co-occurrence for the choice of an appropriate model.
Code available at github.com/lgalke/aae-recommender
Contact me via http://lpag.de

SLIDE 46

References I

Barbieri, Julio et al. (2017). “Autoencoders and recommender systems: COFILS approach”. In: Expert Syst. Appl. 89, pp. 81–90.

Beel, Joeran et al. (2016). “Paper recommender systems: a literature survey”. In: International Journal on Digital Libraries 17.4, pp. 305–338.

Bengio, Yoshua, Aaron C. Courville, and Pascal Vincent (2012). “Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives”. In: CoRR abs/1206.5538.

Caragea, Cornelia et al. (2013). “Can’t see the forest for the trees?: a citation recommendation system”. In: JCDL. ACM, pp. 111–114.

Galke, Lukas et al. (2017). “Using Titles vs. Full-text as Source for Automated Semantic Document Annotation”. In: K-CAP. ACM, 20:1–20:4.

SLIDE 47

References II

Makhzani, Alireza et al. (2015). “Adversarial Autoencoders”. In: CoRR abs/1511.05644.

McNee, Sean M. et al. (2002). “On the recommending of citations for research papers”. In: CSCW. ACM, pp. 116–125.

Small, Henry (1973). “Co-citation in the scientific literature: A new measure of the relationship between two documents”. In: JASIS 24.4, pp. 265–269.

SLIDE 48

Hyperparameters

Grid search on PubMed≥50:
◮ Hidden layer sizes between 50 and 1,000: 100
◮ Code sizes between 10 and 500: 50
◮ Drop probabilities between .1 and .5: .2
◮ Stochastic gradient descent or Adam: Adam
◮ Initial learning rates between 0.01 and 0.00005: 0.001
◮ Activation functions Tanh, ReLU, SELU: ReLU
◮ Prior distribution Gaussian, Bernoulli, Multinomial: Gaussian
◮ SVD truncated at the first 1,000 singular values

SLIDE 49

Dataset: PubMed

pruning  cited documents  citations  documents  density
     15           35,664  1,173,568    136,911  0.000240
     20           20,270    878,359    121,374  0.000357
     25           12,881    692,037    105,170  0.000511
     30            8,906    568,563     96,980  0.000658
     35            6,469    478,693     87,498  0.000846
     40            4,939    413,746     79,830  0.001049
     45            3,904    363,870     73,200  0.001273
     50            3,185    324,693     67,703  0.001506
     55            2,643    292,791     62,647  0.001768
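The density column is consistent with the usual nnz / (rows × cols) reading of matrix density; for example, for the first pruning threshold:

```python
# Density of the rating matrix at pruning threshold 15:
# citations / (cited documents * documents).
citations, cited_docs, documents = 1_173_568, 35_664, 136_911
density = citations / (cited_docs * documents)  # ≈ 0.000240, matching the table
```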

SLIDE 50

Dataset: Economics

pruning  labels  assigned labels  documents  density
      1   4,568          323,670     61,104  0.001160
      2   4,103          323,060     61,090  0.001289
      3   3,760          322,199     61,060  0.001403
      4   3,497          321,213     61,039  0.001505
      5   3,259          320,048     60,983  0.001610
     10   2,597          314,738     60,778  0.001994
     15   2,192          309,101     60,524  0.002330
     20   1,924          303,693     60,272  0.002619
