The Quest for the Perfect Image Representation Tinne Tuytelaars KU - - PDF document

the quest for the perfect image representation
SMART_READER_LITE
LIVE PREVIEW

The Quest for the Perfect Image Representation Tinne Tuytelaars KU - - PDF document

9/20/19 The Quest for the Perfect Image Representation Tinne Tuytelaars KU Leuven ECML-PKDD 2019 Whats wrong with current image representations ? Theyre 2D instead of 3D Theyre pixel-focused instead of object-focused


slide-1
SLIDE 1

9/20/19 1

The Quest for the Perfect Image Representation

Tinne Tuytelaars KU Leuven ECML-PKDD 2019

What’s wrong with current image representations ?

  • They’re 2D instead of 3D
  • They’re pixel-focused instead of object-focused
  • They’re more low-level than we realize
  • They do not generalize well
  • They’re performant because they (over)exploit dataset bias
  • Due to research being too dataset-focused ?
  • > qualitative evaluation, network interpretation
  • > embodied learning, online learning, incremental learning
  • R. Geirhos, P. Rubisch, C. Michaelis, M. Bethge, F. A. Wichmann, and W. Brendel, ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness, ICLR 2019
slide-2
SLIDE 2

9/20/19 2

Overview

  • Network interpretation/visualization (ICLR19)
  • Disentangled representations (ICIP19)

Network interpretation

“Interpretability is the degree to which a human can understand the cause of a decision” (Miller, 2017)

  • Interpretation = explaining the model, introspection
  • Explanation = explaining a specific decision, for a given input
slide-3
SLIDE 3

9/20/19 3

Interpretation: related work

Visualizing pre-images Visualizing node activations

But: too many nodes

Link to proxy tasks

But: subjective But: biased

Explanation: related work

  • Natural language explanations

But: typically trained using human explanations / descriptions; causal assumption not satisfied

  • Heatmaps
slide-4
SLIDE 4

9/20/19 4

Heatmap evaluation

But: bias towards human interpretation But: subjective; bias towards human interpretation

Tackling some of these challenges

  • 1. Interpretation: too many neurons
  • 2. Explanation: visualizing and evaluating heatmaps
  • J. Oramas, K. Wang, T. Tuytelaars, Visual Explanation by Interpretation:

Improving Visual Feedback Capabilities of Deep Neural Networks, ICLR2019

slide-5
SLIDE 5

9/20/19 5

  • 1. Interpretation: too many neurons

Identifying relevant features (neurons)

slide-6
SLIDE 6

9/20/19 6

Interpretation visualization: ImageNet

For every identified relevant feature of each class:

  • Select top-k images with highest response
  • Crop every image based on the receptive field
  • Compute average image

Interpretation visualization: Cats

(blurriness and artifacts due to poor alignment of receptive fields)

slide-7
SLIDE 7

9/20/19 7

Interpretation visualization: Fashion 144k

  • 2. Explanation: visualizing heatmaps
slide-8
SLIDE 8

9/20/19 8

Explanation visualization Evaluation of methods for interpretability

  • Evaluation is problematic
  • Interpretability vs. correctness / accuracy
  • Interpretability evaluation needs humans in the loop ?

“Interpretability is the degree to which a human can understand the cause of a decision” (Miller, 2017)

slide-9
SLIDE 9

9/20/19 9

  • 3. Explanation: evaluation
  • 3. Explanation: evaluation
slide-10
SLIDE 10

9/20/19 10

Generated interpretations Overview

  • Network interpretation/visualization (ICLR19)
  • Disentangled representations (ICIP19)
slide-11
SLIDE 11

9/20/19 11

Translating shape, preserving style

Towards Object Shape Translation Through Unsupervised Generative Deep Models, L Bollens, T Tuytelaars, J Oramas-Mogrovejo, ICIP 2019, pp.4220-4224.

Model: step 1: VAE for domain A

slide-12
SLIDE 12

9/20/19 12

Model: step 2: VAE for domain B Model: step 3: GAN losses

slide-13
SLIDE 13

9/20/19 13

Model: step 3: Cycle GAN loss Model: step 3: similarity loss

slide-14
SLIDE 14

9/20/19 14

Results

input Integrated unpaired appearance-preserving shape translation across domains, K Wang, L Ma, J Oramas, L Van Gool, T Tuytelaars, arXiv preprint arXiv:1812.02134, 2019.

slide-15
SLIDE 15

9/20/19 15

Model Conclusions

  • Still lots of room for improvement
  • Embodied learning, online learning, incremental learning
  • 3D instead of 2D
  • Multimodal learning
  • Selfsupervised learning