SLIDE 1

Transformation Equivariance vs. Invariance: Unsupervised Learning of Visual Representations

Guo-Jun Qi guojunq@gmail.com Laboratory for MAchine Perception and LEarning (MAPLE) & Futurewei Technologies (Huawei Research USA)

SLIDE 2

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work
  • Unifying the Transformation Equivariance and Invariance

SLIDE 3

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work

SLIDE 4

Recipe for the Success of CNNs

[Figure: the CNN pipeline maps visual structures to semantic concepts (horse, grass, tree, ...): convolution layers yield a translation-equivariant representation, followed by a fully connected classifier.]

CNN = Translation-Equivariant Representation + Fully-Connected Classifier

SLIDE 5

Transformation Equivariant Representations

  • Beyond translations: equivariant feature maps under transformations

[Figure: samples under various transformations and their corresponding equivariant representations.]

SLIDE 6

Generalize CNNs beyond translations

[Figure: a representation network followed by an FC classifier (horse, grass, tree, ...): transformation equivariance captures spatial structure, transformation invariance captures semantics.]

Transformation Equivariant Representations + Transformation Invariant Classifiers

SLIDE 7

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work

SLIDE 8

Transformation Equivariance

  • Definition of transformation equivariance
  • F -- the representation of a sample
  • t -- a transformation on samples
  • H_t -- the transformation on representations corresponding to t
  • Transformation invariance is the special case of transformation equivariance in which H_t is the identity.

F(t(x)) = H_t[F(x)]

SLIDE 9

Steerability property

  • Steerability: the representation of a transformed sample t(x) can be computed directly from the representation F(x) of the original sample, with no access to x
  • H_t is a function of the transformation t alone, independent of the sample

F(t(x)) = H_t[F(x)]

SLIDE 10

Our Goals

  • Handle general transformations: not necessarily limited to discrete or spatial ones (e.g., recoloring, contrast changes)
  • Allow nonlinear transformations H_t between the representations of transformed and original images
  • Capture complex visual structures from transformed images

F(t(x)) = H_t[F(x)]

SLIDE 11

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work

SLIDE 12

A Big Picture: Stack of AET

[Figure: a family tree relating CNNs, Group Equivariant CNNs, Capsule Nets, and Autoencoders (AED) to the proposed stack of Autoencoding Transformations for learning TER: deterministic AET, probabilistic AVT, and deterministic/probabilistic SAT.]

  • AET: AutoEncoding Transformations
  • AVT: Autoencoding Variational Transformations
  • SAT: (Semi-)Supervised Autoencoding Transformations
  • AET learns a general representation that can be applied everywhere.

SLIDE 13

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work

SLIDE 14

Take a Glance: Autoencoding Transformations rather than Data

[Figure: AutoEncoding Data (AED) reconstructs x from E(x); AutoEncoding Transformations (AET) reconstructs the transformation t from E(x) and E(t(x)).]

Zhang et al., AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data, in CVPR 2019.

SLIDE 15

How does AET work?

  • Generative process:
  • an input image x ~ p(x)
  • a random transformation t ~ p(t)
  • the transformed image t(x)
  • A representation encoder
  • E: x ⟼ E(x), t(x) ⟼ E(t(x))
  • A transformation decoder
  • D: (E(x), E(t(x))) ⟼ t̂, an estimate of t

[Figure: the encoder E is applied to both x and t(x); the decoder D estimates the transformation from the pair of representations E(x) and E(t(x)).]

SLIDE 16

Decoding Transformations

  • A Siamese network encodes each image individually
  • E.g., visual structures and spatial relations among objects
  • The transformation is decoded by comparing the representations before and after the transformation (see the sketch below)

[Figure: a shared encoder E is applied to x and t(x); the decoder D compares E(x) and E(t(x)) to recover t.]
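A minimal PyTorch-style sketch of one AET training step built on these ideas (the tiny encoder/decoder and the 2×3 affine parameterization are illustrative assumptions, not the paper's actual networks):

```python
import torch
import torch.nn as nn

# Hedged sketch of one AET step: a shared (Siamese) encoder E, a transformation
# decoder D on the concatenated features, and an MSE loss on affine parameters.
encoder = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())                                   # E(.)
decoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 6))     # D(E(x), E(t(x)))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

x = torch.randn(8, 3, 32, 32)                                          # a batch of images
theta = torch.eye(2, 3).repeat(8, 1, 1) + 0.1 * torch.randn(8, 2, 3)   # sampled affine t
grid = nn.functional.affine_grid(theta, x.size(), align_corners=False)
tx = nn.functional.grid_sample(x, grid, align_corners=False)           # t(x)

z_x, z_tx = encoder(x), encoder(tx)                                    # Siamese encoding
theta_hat = decoder(torch.cat([z_x, z_tx], dim=1)).view(-1, 2, 3)
loss = 0.5 * ((theta_hat - theta) ** 2).sum(dim=(1, 2)).mean()         # AET loss on parameters
opt.zero_grad(); loss.backward(); opt.step()
```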

SLIDE 17

AET loss for training

  • Parameterized transformations: 𝒯 = { t_θ | θ ~ Θ }
    E.g., affine or projective transformations: ℓ(t_θ, t_θ̂) = (1/2) ‖M(θ) − M(θ̂)‖²₂, where M(θ) is the parameter matrix of t_θ
  • GAN-induced transformations: transformed image G(x, z)
    ℓ(t_z, t_ẑ) = (1/2) ‖z − ẑ‖²₂
  • Non-parametric transformations
    ℓ(t, t̂) = (1/2) 𝔼_{x~X} dist(t(x), t̂(x))

SLIDE 18

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work

SLIDE 19

Revisit: Steerability of TER

  • Obtain the representation z of a transformed sample t(x) from t and z̃, without accessing x
  • Maximize the mutual information between z and (z̃, t)

F(t(x)) = H_t[F(x)]

[Figure: x is transformed by t into t(x); z̃ and z denote their respective representations, and z is steerable from (z̃, t) through H_t.]

SLIDE 20

An Information-Theoretical Insight

  • Train a TER model with parameters θ by maximizing  max_θ I_θ(z; z̃, t)
  • By the chain rule of mutual information,
    I_θ(z; z̃, t) = I_θ(z; z̃, t, x) − I_θ(z; x | z̃, t) ≤ I_θ(z; z̃, t, x)
  • I_θ(z; z̃, t) attains its maximum value I_θ(z; z̃, t, x) (the upper bound) when I_θ(z; x | z̃, t) = 0
  • Nonlinearity of the transformation H_t in representation space

Steerability: given (z̃, t), x contains no more information about z.

[Figure: x is transformed by t into t(x); z̃ and z denote their representations.]

SLIDE 21

AVT: Autoencoding Variational Transformations

  • The mutual information cannot be maximized directly
  • The posterior q_θ(t | z, x) is intractable to evaluate
  • Derive a lower bound by introducing a transformation decoder p_φ(t | z̃, z)
  • Unsupervised loss for learning the AVT:

I_θ(z; z̃, t) ≥ H(t | z̃) + 𝔼_{q_θ(t, z̃, z)} log p_φ(t | z̃, z)

max_{θ,φ} 𝔼_{q_θ(t, z̃, z)} log p_φ(t | z̃, z)

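One way to see where this lower bound comes from (a standard variational argument not spelled out on the slide; as above, z denotes the representation of t(x) and z̃ that of x):

```latex
\begin{aligned}
I_\theta(z; \tilde z, t)
  &\ge I_\theta(z; t \mid \tilde z)
   = H(t \mid \tilde z) - H(t \mid \tilde z, z) \\
  &= H(t \mid \tilde z) + \mathbb{E}_{q_\theta(t, \tilde z, z)} \log q_\theta(t \mid \tilde z, z) \\
  &\ge H(t \mid \tilde z) + \mathbb{E}_{q_\theta(t, \tilde z, z)} \log p_\phi(t \mid \tilde z, z)
\end{aligned}
```

The last step uses the non-negativity of the KL divergence between the posterior q_θ(t | z̃, z) and the decoder p_φ(t | z̃, z); the training objective above keeps only the expectation term.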

Qi, Learning Generalized Transformation Equivariant Representations via Autoencoding Transformations, preprint.

SLIDE 22

AVT: Autoencoding Variational Transformations

  • Generative process
  • given an image x sampled from p(x)
  • sample a transformation t from p(t)
  • apply t to x, resulting in t(x)
  • sample a representation z of t(x) from q_θ(z | x, t)
  • z̃ is sampled by setting t to the identity, i.e., from q_θ(z̃ | x, id)
  • decode the transformation t from p_φ(t | z̃, z)

[Figure: AVT: the probabilistic encoder q_θ produces z̃ from x and z from t(x); the decoder p_φ(t | z̃, z) recovers the transformation.]
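A minimal PyTorch-style sketch of this generative process (a probabilistic encoder with the reparameterization trick and a unit-variance Gaussian transformation decoder are illustrative assumptions; the paper's actual architectures differ):

```python
import torch
import torch.nn as nn

# Hedged sketch of the AVT idea: a probabilistic encoder realized with the
# reparameterization trick, plus a Gaussian transformation decoder.
class ProbEncoder(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.mu, self.logvar = nn.Linear(64, dim), nn.Linear(64, dim)

    def forward(self, img):
        h = self.backbone(img)
        mu, logvar = self.mu(h), self.logvar(h)
        eps = torch.randn_like(mu)                    # reparameterization trick
        return mu + torch.exp(0.5 * logvar) * eps     # one sample from q(. | img)

encoder = ProbEncoder()
decoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 6))

x, tx = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)  # stand-ins for x and t(x)
theta = torch.randn(8, 6)                                      # stand-in parameters of t
z_orig = encoder(x)        # representation of x (t set to the identity)
z_trans = encoder(tx)      # representation of t(x)
theta_hat = decoder(torch.cat([z_orig, z_trans], dim=1))
# With a unit-variance Gaussian decoder, maximizing log p(t | z_orig, z_trans)
# amounts, up to a constant, to minimizing the squared parameter error.
loss = 0.5 * ((theta_hat - theta) ** 2).sum(dim=1).mean()
```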

SLIDE 23

Deterministic vs. Probabilistic Approaches

  • AVT can handle uncertainty in
  • representing images probabilistically: q_θ(z | x, t)
  • decoding transformations with a probabilistic decoder: p_φ(t | z̃, z)

[Figure: probabilistic AVT (encoder q_θ, decoder p_φ) side by side with deterministic AET (encoder E with z = E(t(x)) and z̃ = E(x), decoder D).]

SLIDE 24

Experiments for AET and AVT

  • Unsupervised Representation Learning
  • AET and AVT
  • Evaluation Protocols
  • Test learned representations on downstream tasks
  • Datasets
  • CIFAR10
  • ImageNet
  • Places

[Figure: the stack of Autoencoding Transformations for learning TER (deterministic AET/SAT and probabilistic AVT/SAT).]

SLIDE 25

Transformations in Experiments

  • Affine transformations: random rotation in [-180°, 180°], random translation of up to ±0.2 × height/width, random scaling in [0.7, 1.3] (see the sampling sketch below)
  • Projective transformations: random scaling in [0.8, 1.2], random rotation from {0°, 90°, 180°, 270°}, each corner stretched by up to ±0.125 × height/width
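A small NumPy sketch of drawing one such random affine transformation with the ranges above (the function name and the 2×3 matrix layout are illustrative assumptions):

```python
import numpy as np

# Hedged sketch: sample a random affine transformation with the listed ranges.
def sample_affine(height, width, rng):
    angle = rng.uniform(-np.pi, np.pi)          # rotation in [-180, 180] degrees
    tx = rng.uniform(-0.2, 0.2) * width         # translation up to +-0.2 * W
    ty = rng.uniform(-0.2, 0.2) * height        # translation up to +-0.2 * H
    s = rng.uniform(0.7, 1.3)                   # isotropic scale in [0.7, 1.3]
    c, si = np.cos(angle), np.sin(angle)
    return np.array([[s * c, -s * si, tx],
                     [s * si,  s * c, ty]])     # 2x3 affine matrix M(theta)

M_true = sample_affine(32, 32, np.random.default_rng(0))   # regression target for AET
```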

SLIDE 26

CIFAR10 and ImageNet

Error rates (%) on CIFAR10:

  Method                              Error rate
  Supervised NIN (lower bound)        7.20
  Random Init. + conv (upper bound)   27.50
  Roto-Scat + SVM                     17.7
  ExemplarCNN                         15.7
  DCGAN                               17.2
  Scattering                          15.3
  RotNet + FC                         10.94
  RotNet + conv                       8.84
  (Ours) AET-affine + FC              9.77
  (Ours) AET-affine + conv            8.05
  (Ours) AET-project + FC             9.41
  (Ours) AET-project + conv           7.82
  (Ours) AVT-project + FC             8.96
  (Ours) AVT-project + conv           7.75

Top-1 accuracy (%) on ImageNet:

  Method                          Conv4    Conv5
  ImageNet Labels (upper bound)   59.7     59.7
  Random (lower bound)            27.1     12.0
  Tracking                        38.8     29.8
  Context                         45.6     30.4
  Colorization                    40.7     35.2
  Jigsaw Puzzles                  45.3     34.6
  BiGAN                           41.9     32.2
  NAT                             –        36.0
  DeepCluster                     –        44.0
  RotNet                          50.0     43.8
  (Ours) AET-project              53.2     47.0
  (Ours) AVT-project              54.2     48.4

Projective transformations perform slightly better than affine ones.

SLIDE 27

Model-Free KNN on Representations

  • Evaluate the learned representations directly with a KNN classifier, without training any downstream model (a sketch of this protocol follows the table)
  • Error rates (%) on CIFAR10:

  Method            K=3      K=5      K=10     K=15     K=20
  RotNet baseline   25.67    25.01    24.97    25.85    26.00
  AET-affine        24.88    23.29    23.07    23.34    23.94
  AET-project       23.29    22.40    22.39    23.32    23.75
  AVT-project       22.46    21.62    23.70    22.16    21.51
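A hedged sketch of this protocol (the random arrays below are placeholders for the frozen encoder's features on the CIFAR10 train/test splits):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Model-free evaluation: classify frozen features with K nearest neighbors.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(5000, 64))        # placeholder encoder features
train_labels = rng.integers(0, 10, size=5000)
test_feats = rng.normal(size=(1000, 64))
test_labels = rng.integers(0, 10, size=1000)

for k in (3, 5, 10, 15, 20):
    knn = KNeighborsClassifier(n_neighbors=k).fit(train_feats, train_labels)
    error = 1.0 - knn.score(test_feats, test_labels)
    print(f"K={k}: error rate {100 * error:.2f}%")
```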

SLIDE 28

PASCAL VOC: Object Detection

  • A milestone: the unsupervised pretraining performs even better than its fully supervised counterpart

SLIDE 29

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work

SLIDE 30

Supervised Autoencoding Transformations

  • Beyond Translation Equivariance and CNNs
  • Transformation Equivariant Representations + Transformation Invariant Classifiers

[Figure: a transformation-equivariant representation feeding a transformation-invariant classifier (predicted label: "Dog").]

SLIDE 31

SAT: Semi-Supervised Autoencoding Transformation

  • Add a label decoder p_φ(y | z) to approximate the posterior q_θ(y | x)

[Figure: SAT: the encoder q_θ produces z̃ from x and z from t(x); a transformation decoder p_φ(t | z̃, z) and a label decoder p_φ(y | z) (predicting horse, grass, tree, ...) are trained on top of the shared representations.]

Qi, Learning Generalized Transformation Equivariant Representations via Autoencoding Transformations, in preprint.

SLIDE 32

Training SAT

  • Seek to maximize the mutual information between (y, z) and (z̃, t):  max_θ I_θ(y, z; z̃, t)
  • Assumption: (z̃, t) contains sufficient information to decode both the label y and the representation z of the transformed image
  • Intractable due to the difficulty of evaluating the posterior q_θ(y, t | z̃, z)

SLIDE 33
  • Introducing a label decoder and a transformation decoder gives the variational bound

I_θ(y, z; z̃, t) ≥ 𝔼_{q_θ(y, z̃, z)} log p_φ(y | z̃, z) + 𝔼_{q_θ(t, z̃, z)} log p_φ(t | z̃, z)

  • Jointly maximize over the encoder θ and the decoders φ:

max_{θ,φ}  𝔼_{q_θ(y, z̃, z)} log p_φ(y | z̃, z) + 𝔼_{q_θ(t, z̃, z)} log p_φ(t | z̃, z)
(the first term is the label decoder, the second the transformation decoder)

SLIDE 34

A Simple Case: Deterministic SAT

  • Replace the probabilistic encoder and decoders with deterministic ones
  • i.e., a supervised AET

[Figure: deterministic SAT: the encoder F_θ produces z̃ = F_θ(x) and z = F_θ(t(x)); a label decoder D_φ(z) predicts the class (horse, grass, tree, ...) and a transformation decoder E_φ(z̃, z) predicts t.]

max_{θ,φ}  𝔼_{(x,y)} ℓ_CrossEntropy(y, D_φ(z)) + 𝔼_{(t,x)} ℓ_AET(t, E_φ(z̃, z))

SLIDE 35

Experiments on SAT

  • (Semi-)supervised learning: SAT
  • Evaluation Protocols
  • Follow semi-supervised protocols with varying number of labels
  • Datasets
  • CIFAR10
  • SVHN

[Figure: the stack of Autoencoding Transformations for learning TER (deterministic and probabilistic SAT highlighted).]

SLIDE 36

Error Rate on CIFAR10

Methods               1000 labels     2000 labels     4000 labels     All labels
GAN                   –               –               18.63 ± 2.32    –
Π model               –               –               12.36 ± 0.31    5.56 ± 0.10
Temporal Ensembling   –               –               12.16 ± 0.31    5.60 ± 0.10
VAT                   –               –               10.55           –
Supervised-only       46.43 ± 1.21    33.94 ± 0.73    20.66 ± 0.57    5.81 ± 0.15
Π model               27.36 ± 1.20    18.02 ± 0.60    13.20 ± 0.27    6.06 ± 0.11
Mean Teacher          21.55 ± 1.48    15.73 ± 0.31    12.31 ± 0.28    5.94 ± 0.15
SAT                   14.89 ± 0.38    11.71 ± 0.29    9.85 ± 0.11     4.91 ± 0.13

SLIDE 37

Error Rate on SVHN

Methods               250 labels      500 labels      1000 labels     All labels
GAN                   –               –               18.63 ± 2.32    –
Π model               –               –               12.36 ± 0.31    5.56 ± 0.10
Temporal Ensembling   –               –               12.16 ± 0.31    5.60 ± 0.10
VAT                   –               –               10.55           –
Supervised-only       46.43 ± 1.21    33.94 ± 0.73    20.66 ± 0.57    5.81 ± 0.15
Π model               27.36 ± 1.20    18.02 ± 0.60    13.20 ± 0.27    6.06 ± 0.11
Mean Teacher          21.55 ± 1.48    15.73 ± 0.31    12.31 ± 0.28    5.94 ± 0.15
SAT                   14.89 ± 0.38    11.71 ± 0.29    9.85 ± 0.11     4.91 ± 0.13

SLIDE 38

Contents

  • TER: Transformation Equivariant Representations
  • Definition, Steerability
  • AET: AutoEncoding Transformations
  • Deterministic approach: AET (AutoEncoding Transformations)
  • Probabilistic approach: AVT (Autoencoding Variational Transformations)
  • SAT: (Semi-)supervised Autoencoding Transformations
  • Conclusions and Future Work

SLIDE 39

Future Work

  • Evolve representation learning jointly with various high-level tasks
  • More powerful capture of (non-)linear equivariant and invariant visual structures under transformations

[Figure: representation learning as the foundation (the stack of Autoencoding Transformations for TER) supporting high-level tasks: classification, detection, segmentation, face recognition, pose estimation, super-resolution.]

SLIDE 40

Future Work: Unifying Transformation Equivariance vs. Invariance

[Figure: embeddings of transformed images are pushed apart ("equivariance").]

Transformation equivariance discerns the differences between transformed images, and is thus more advantageous for modeling spatial structures in object detection and semantic segmentation.

SLIDE 41

Future Work: Unifying Transformation Equivariance vs. Invariance (cont'd)

[Figure: embeddings of transformed images are pulled together ("invariance").]

Transformation invariance (e.g., in contrastive learning) pulls together the embeddings of transformed images while discriminating between different instances, and is thus more suitable for image classification tasks.

SLIDE 42

Future Work: Unifying Transformation Equivariance vs. Invariance (cont'd)

  • A naive approach (yet to be done): combine
  • the AET loss for transformation equivariance
  • a contrastive loss for transformation invariance (see the sketch below)
  • Open questions
  • How can transformation equivariance and invariance be reconciled in learning feature embeddings?
  • Will the combined approach excel on both spatially sensitive tasks (object detection, semantic segmentation) and image classification tasks?
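A hedged sketch of this naive combination, an AET regression term for equivariance plus an InfoNCE-style contrastive term for invariance (modules, temperature, and weighting are illustrative assumptions, not a published recipe):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hedged sketch: AET loss (equivariance) + InfoNCE contrastive loss (invariance).
encoder = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())
aet_head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 6))

def combined_loss(x, tx, theta, tau=0.2, lam=1.0):
    z_orig, z_trans = encoder(x), encoder(tx)
    # Equivariance: regress the transformation parameters from the feature pair.
    aet = ((aet_head(torch.cat([z_orig, z_trans], dim=1)) - theta) ** 2).sum(dim=1).mean()
    # Invariance: InfoNCE pulls each (x, t(x)) pair together and pushes
    # different instances in the batch apart.
    z_orig, z_trans = F.normalize(z_orig, dim=1), F.normalize(z_trans, dim=1)
    logits = z_orig @ z_trans.t() / tau            # pairwise similarities
    targets = torch.arange(z_orig.size(0))         # positives on the diagonal
    return F.cross_entropy(logits, targets) + lam * aet

x, tx, theta = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32), torch.randn(8, 6)
loss = combined_loss(x, tx, theta)
```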

Guo-Jun Qi and Jiebo Luo, Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods, arXiv:1903.11260

SLIDE 43

Conclusions

  • Transformation Equivariant Representations (TER)
  • play a key role in the success of CNNs
  • generalize to arbitrary transformations: continuous, discrete, and beyond spatial
  • nonlinear, steerable representations of transformations
  • Unsupervised learning: AutoEncoding Transformations (AET and AVT)
  • (Semi-)supervised learning: SAT, generalizing CNNs beyond translations
  • Transformation-equivariant representations + transformation-invariant classifier
  • with invariance contained in equivariance as a special case

SLIDE 44

References

Preprint

  • Qi, Learning Generalized Transformation Equivariant Representations via Autoencoding Transformations, preprint. (long version)

Conference Publications

  • Zhang et al., AET vs. AED: Unsupervised Representation Learning by Auto-Encoding Transformations rather than Data, in CVPR 2019. (AET)
  • Qi et al., AVT: Unsupervised Learning of Transformation Equivariant Representations by Autoencoding Variational Transformations, in ICCV 2019. (AVT)

SLIDE 45

Thank you! Q&A

SLIDE 46

Recent Progress

Unsupervised Learning of Transformation Equivariant Representations via Auto-Encoding Node-wise Transformations

SLIDE 47

Auto-Encoding Node-wise Transformations

  • Apply transformations to individual nodes of a graph
  • Apply transformations to sampled nodes
  • Study different parts of a graph each time
  • Reduce the transformation parameters
  • Global vs. local transformations
  • Isotropic vs. anisotropic transformations

SLIDE 48

Auto-Encoding Node-wise Transformations

  • Decode the transformations at each node
  • Learn the representation of individual nodes at both local and global scales
  • Global structures are learned by sampling nodes from different parts of the graph
  • Local structures are covered by integrating information through edge convolutions on nearby nodes (see the sketch below)
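A tiny NumPy sketch of the data side of this idea, sampling nodes of a point cloud and applying node-wise translations whose parameters become the decoding targets (sizes and ranges are illustrative assumptions):

```python
import numpy as np

# Hedged sketch: build targets for auto-encoding node-wise transformations.
rng = np.random.default_rng(0)
points = rng.normal(size=(1024, 3))                   # 3D point cloud (graph nodes)

sampled = rng.choice(1024, size=256, replace=False)   # study one part of the graph
offsets = np.zeros_like(points)
offsets[sampled] = rng.uniform(-0.1, 0.1, size=(256, 3))   # node-wise translations
transformed = points + offsets                        # transformed point cloud

# The model described above would encode (points, transformed) node by node
# and decode the per-node transformation parameters below.
targets = offsets[sampled]
```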

SLIDE 49

Results

  • Datasets: 3D point-cloud data
  • ModelNet40 (graph classification): 12,311 meshed CAD models, 40 categories
  • ShapeNet part (graph segmentation): 16,881 3D point clouds, 16 categories
  • Two tasks: graph classification and graph segmentation

SLIDE 50

Results

  • Classification

  • Segmentation

SLIDE 51

Results

  • Compared with both supervised and unsupervised graph segmentation methods, our approach is highly competitive.

SLIDE 52

AETv2: AutoEncoding Transformations by Minimizing Geodesic Distances in Lie Groups of Transformations


Recent Progress

SLIDE 53

Mean-Squared Estimation vs. Geodesic Distance Minimization

  • (Matrix representations of) transformations lie on a curved manifold, a Lie group, rather than in a flat Euclidean space
  • MSE minimizes the Euclidean distance between the predicted and ground-truth transformations
  • It does not capture how one transformation continuously evolves into another within the Lie group
  • Geodesic distance minimization (GDM) measures a "real" distance between transformations

SLIDE 54

Geodesic distance between Transformations

  • Given two transformations t and t̂ (in matrix form), the geodesic distance between them is

ℓ(t, t̂) = (1/2) ‖Log_I(t⁻¹ t̂)‖²_F

where Log_I is the Riemannian logarithm at the identity I, not the matrix logarithm.

  • The Riemannian exponential at I maps a point of the tangent space onto the Lie group
  • The Riemannian logarithm at I maps a group element back to the tangent space

SLIDE 55

Homography Transformations

  • For many spatial transformations, there is no closed-form Riemannian logarithm
  • Homography transformations achieved the SOTA accuracy in AETv1
  • Solution
  • use a subgroup of transformations with a tractable Riemannian logarithm (which reduces to the matrix logarithm)
  • project the transformations onto the subgroup
  • minimize the resulting geodesic distance in the subgroup (together with the projection loss)

SLIDE 56

Projection into SO(3) subgroup

  • The Riemannian logarithm reduces to the matrix logarithm in SO(3)
  • Project t⁻¹ t̂ onto SO(3):  R = Π_SO(3)(t⁻¹ t̂)
  • By Rodrigues' rotation formula, α = arccos[(tr(R) − 1) / 2] is the rotation angle around a unit 3D axis in the direction of log(R)
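A small NumPy sketch of this projection and angle computation (the SVD-based nearest-rotation projection is one standard choice and an assumption here, not necessarily the paper's exact procedure):

```python
import numpy as np

# Hedged sketch: project a matrix onto SO(3) and read off the rotation angle.
def project_to_so3(A):
    U, _, Vt = np.linalg.svd(A)
    R = U @ Vt
    if np.linalg.det(R) < 0:        # enforce det(R) = +1 (a proper rotation)
        U[:, -1] *= -1
        R = U @ Vt
    return R

def rotation_angle(R):
    # Rodrigues: alpha = arccos((tr(R) - 1) / 2), clipped for numerical safety.
    return np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))

A = np.eye(3) + 0.1 * np.random.default_rng(0).normal(size=(3, 3))   # e.g. a t^-1 t_hat stand-in
R = project_to_so3(A)                      # Pi_SO(3)(A)
alpha = rotation_angle(R)                  # geodesic (rotation-angle) term
residual = np.linalg.norm(A - R, 'fro')    # projection residual ||A - R||_F
```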

SLIDE 57

Objective in SO(3) subgroup

  • Given two transformations t and t̂ (in matrix form),

ℓ(t, t̂) = α + λ ‖Δ_Π‖²_F

where α is the rotation angle of the SO(3) projection, λ is a weighting coefficient, and Δ_Π = t⁻¹ t̂ − Π_SO(3)(t⁻¹ t̂) is the projection residual.

SLIDE 58

ImageNet Results

  • Better than AETv1 (MSE)

Laboratory for MAchine Perception and LEarning (MAPLE) 58