Test-Time Training with Self-Supervision for Generalization under Distribution Shifts (PowerPoint PPT Presentation)



SLIDE 1

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei Efros, Moritz Hardt
UC Berkeley, ICML 2020

SLIDE 2

[Figure: train samples (x) and test samples (o) drawn from the same distribution, P = Q]

  • In theory: same distribution for training and testing
SLIDE 3

[Figure: train samples (x) from P, test samples (o) from Q, with P ≠ Q: distribution shift]

  • In theory: same distribution for training and testing
  • In the real world: distribution shifts are everywhere
SLIDE 4

[Figure: images from the original CIFAR-10 test set (2009) and the new test set (2019)]

Hendrycks and Dietterich, 2018; Recht, Roelofs, Schmidt and Shankar, 2019

  • In theory: same distribution for training and testing
  • In the real world: distribution shifts are everywhere
SLIDE 5

Existing paradigms anticipate the shifts with data or math

[Figure: train samples (x) from P, test samples (o) from Q]
SLIDE 6

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution

References: A Theory of Learning from Different Domains (Ben-David, Blitzer, Crammer, Kulesza, Pereira and Vaughan, 2009); Adversarial Discriminative Domain Adaptation (Tzeng, Hoffman, Saenko and Darrell, 2017); Unsupervised Domain Adaptation through Self-Supervision (Sun, Tzeng, Darrell and Efros, 2019)

SLIDE 7

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution (maybe unlabeled)
  • Hard to know the test distribution

SLIDE 8

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution

References: Domain Generalization via Invariant Feature Representation (Muandet, Balduzzi and Scholkopf, 2013); Domain Generalization for Object Recognition with Multi-Task Autoencoders (Ghifary, Bastiaan, Zhang and Balduzzi, 2015); Domain Generalization by Solving Jigsaw Puzzles (Carlucci, D'Innocente, Bucci, Caputo and Tommasi, 2019)

SLIDE 9

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution

[Figure: training sets X1, ..., Xn drawn from P; test set X drawn from Q (distribution shift)]

SLIDE 10

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution

[Figure: a meta distribution M generates the training distributions P1, ..., Pn (yielding X1, ..., Xn) and the test distribution Q (yielding X)]

SLIDE 11

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution
  • Hard to know the meta distribution

[Figure: meta distribution shifts: the training distributions P1, ..., Pn come from meta distribution MP, while the test distribution Q comes from a different meta distribution MQ]

SLIDE 12

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution
  • Hard to know the meta distribution
  • Adversarial robustness
  • Topological structure of the test distribution

References: Certifying Some Distributional Robustness with Principled Adversarial Training (Sinha, Namkoong and Duchi, 2017); Towards Deep Learning Models Resistant to Adversarial Attacks (Madry, Makelov, Schmidt, Tsipras and Vladu, 2017); Adversarially Robust Generalization Requires More Data (Schmidt, Santurkar, Tsipras, Talwar and Madry, 2018)

SLIDE 13

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution
  • Hard to know the meta distribution
  • Adversarial robustness
  • Topological structure of the test distribution

[Figure: the space of distributions, with P marked]

SLIDE 14

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution
  • Hard to know the meta distribution
  • Adversarial robustness
  • Topological structure of the test distribution

[Figure: the space of distributions; a worst-case neighborhood around P contains Q]

SLIDE 15

Existing paradigms anticipate the shifts with data or math

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution
  • Hard to know the meta distribution
  • Adversarial robustness
  • Topological structure of the test distribution
  • Hard to describe, especially in high dimension

[Figure: the space of distributions; a worst-case neighborhood around P contains Q]

SLIDE 16

Existing paradigms anticipate the distribution shifts

  • Domain adaptation
  • Data from the test distribution
  • Hard to know the test distribution
  • Domain generalization
  • Data from the meta distribution
  • Hard to know the meta distribution
  • Adversarial robustness
  • Topological structure of the test distribution
  • Hard to describe, especially in high dimension
SLIDE 17

Test-Time Training (TTT)

  • Does not anticipate the test distribution

SLIDE 18

Test-Time Training (TTT)

  • Does not anticipate the test distribution
  • The test sample x ~ Q gives us a hint about Q

standard test error = E_Q[ ℓ(x, y; θ) ]

SLIDE 19

Test-Time Training (TTT)

  • Does not anticipate the test distribution
  • The test sample x ~ Q gives us a hint about Q
  • No fixed model, but adapt at test time

standard test error = E_Q[ ℓ(x, y; θ) ]
our test error = E_Q[ ℓ(x, y; θ(x)) ]

SLIDE 20

Test-Time Training (TTT)

  • Does not anticipate the test distribution
  • The test sample x ~ Q gives us a hint about Q
  • No fixed model, but adapt at test time
  • One-sample learning problem
  • No label? Self-supervision!

standard test error = E_Q[ ℓ(x, y; θ) ]
our test error = E_Q[ ℓ(x, y; θ(x)) ]

SLIDE 21

Rotation prediction as self-supervision

  • Create labels from unlabeled input

Unsupervised Representation Learning by Predicting Image Rotations (Gidaris, Singh and Komodakis, 2018)

SLIDE 22

Rotation prediction as self-supervision

  • Create labels from unlabeled input
  • Rotate input image by multiples of 90º

[Figure: input x rotated by 0º, 90º, 180º, 270º, each paired with its rotation label ys]

SLIDE 23

Rotation prediction as self-supervision

  • Create labels from unlabeled input
  • Rotate input image by multiples of 90º
  • Produce a four-way classification problem

[Figure: a CNN with parameters θ maps the rotated image x to a label ys in {0º, 90º, 180º, 270º}]

SLIDE 24

Rotation prediction as self-supervision

  • Create labels from unlabeled input
  • Rotate input image by multiples of 90º
  • Produce a four-way classification problem
  • Usually a pre-training step

[Figure: the CNN split into a feature extractor θe followed by a self-supervised head θs]

SLIDE 25

Rotation prediction as self-supervision

  • Create labels from unlabeled input
  • Rotate input image by multiples of 90º
  • Produce a four-way classification problem
  • Usually a pre-training step
  • After training, take the feature extractor θe

SLIDE 26

Rotation prediction as self-supervision

  • Create labels from unlabeled input
  • Rotate input image by multiples of 90º
  • Produce a four-way classification problem
  • Usually a pre-training step
  • After training, take the feature extractor θe
  • Use it for a downstream main task

[Figure: the feature extractor θe followed by a main-task head θm maps x to its label y, e.g. "bird"]
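The rotation task above can be sketched in plain Python. This is a toy stand-in for the image pipeline: images are nested lists rather than tensors, and `rot90_cw` / `rotation_task` are illustrative helper names, not code from the paper.

```python
def rot90_cw(img):
    """Rotate a 2-D image (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def rotation_task(img):
    """Create the four-way self-supervised task from one unlabeled image:
    each rotated copy is paired with the index of the rotation applied
    (0 -> 0 degrees, 1 -> 90, 2 -> 180, 3 -> 270)."""
    samples, rotated = [], img
    for ys in range(4):
        samples.append((rotated, ys))
        rotated = rot90_cw(rotated)
    return samples
```

A classifier trained to recover ys from the rotated image has to learn something about object orientation, which is what makes the task useful as self-supervision.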

SLIDE 27

Algorithm for TTT

Network architecture: a shared feature extractor θe feeds two heads, the main-task head θm and the self-supervised head θs.
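The Y-shaped architecture can be sketched as two heads composed with one shared extractor. The function names and the toy stand-ins for the networks are assumptions for illustration only.

```python
def make_ttt_model(extractor, main_head, ss_head):
    """Share one feature extractor (theta_e) between the main-task head
    (theta_m) and the self-supervised head (theta_s)."""
    def predict_main(x):
        # main branch: x -> features -> main-task prediction
        return main_head(extractor(x))
    def predict_ss(x):
        # self-supervised branch: x -> same features -> rotation prediction
        return ss_head(extractor(x))
    return predict_main, predict_ss
```

Because both heads read the same features, updating the extractor through the self-supervised branch also moves the main-task predictions, which is the coupling TTT exploits.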

SLIDE 28

Algorithm for TTT

Training: [Figure: a labeled image flows through θe and θm to its label "bird"; the self-supervised head θs shares θe]

SLIDE 29

Algorithm for TTT

Training: the labeled image incurs the main-task loss ℓm(x, y; θe, θm) through the shared extractor θe and the main head θm.

SLIDE 30

Algorithm for TTT

Training: alongside the main-task loss ℓm(x, y; θe, θm), rotated copies of the image (0º, 90º, 180º, 270º) feed the self-supervised branch.

SLIDE 31

Algorithm for TTT

Training: the main-task loss ℓm(x, y; θe, θm) plus the self-supervised loss ℓs(x, ys; θe, θs), where ys is the rotation label (0º, 90º, 180º, 270º).

SLIDE 32

Algorithm for TTT

Training objective:

min over θe, θs, θm of E_P[ ℓm(x, y; θe, θm) + ℓs(x, ys; θe, θs) ]
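A scalar caricature of the joint objective: one shared parameter, two losses, plain gradient descent. Everything here is a toy stand-in under that assumption, not the paper's training code.

```python
def joint_train(theta, grad_main, grad_ss, lr=0.1, steps=100):
    """Minimize l_m + l_s jointly over a shared scalar parameter,
    mirroring min_theta E_P[ l_m + l_s ]."""
    for _ in range(steps):
        # gradient of the sum is the sum of gradients
        theta -= lr * (grad_main(theta) + grad_ss(theta))
    return theta
```

With l_m = (θ − 1)² and l_s = (θ − 3)², the joint minimum sits at θ = 2, between the two single-task optima: the shared parameter trades off both losses.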

SLIDE 33

Algorithm for TTT

Training: min over θe, θs, θm of E_P[ ℓm(x, y; θe, θm) + ℓs(x, ys; θe, θs) ]

Testing: [Figure: the trained network (θe, θm, θs) receives an unlabeled test sample]

SLIDE 34

Algorithm for TTT

Training: min over θe, θs, θm of E_P[ ℓm(x, y; θe, θm) + ℓs(x, ys; θe, θs) ]

Testing: [Figure: the test sample is rotated by 0º, 90º, 180º, 270º to form the self-supervised task]

SLIDE 35

Algorithm for TTT

Training: min over θe, θs, θm of E_P[ ℓm(x, y; θe, θm) + ℓs(x, ys; θe, θs) ]

Testing: min over θe, θs of ℓs(x, ys; θe, θs) on the rotated copies (0º, 90º, 180º, 270º) of the test sample

SLIDE 36

Algorithm for TTT

Training: min over θe, θs, θm of E_P[ ℓm(x, y; θe, θm) + ℓs(x, ys; θe, θs) ]

Testing: min over θe, θs of E_Q[ ℓs(x, ys; θe, θs) ]

SLIDE 37

Algorithm for TTT

Training: min over θe, θs, θm of E_P[ ℓm(x, y; θe, θm) + ℓs(x, ys; θe, θs) ]

Testing: min over θe, θs of E_Q[ ℓs(x, ys; θe, θs) ] → θ(x): make prediction on x
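The testing step, sketched in the same scalar toy: at test time only the self-supervised gradient is available, and only the shared parameter (and self-supervised head) moves while the main head stays frozen. The quadratic loss and the learning-rate/step defaults are assumptions for illustration.

```python
def ttt_adapt(theta_e, grad_ss, lr=0.1, steps=10):
    """Test-time training on one sample: descend the self-supervised
    loss l_s starting from the jointly trained parameters.
    theta_m is untouched; only the shared extractor adapts."""
    for _ in range(steps):
        theta_e -= lr * grad_ss(theta_e)
    return theta_e
```

The adapted parameters θ(x) are then used once, to predict on that same x.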

SLIDE 38

Algorithm for TTT

Training: min over θe, θs, θm of E_P[ ℓm(x, y; θe, θm) + ℓs(x, ys; θe, θs) ]

Testing: min over θe, θs of E_Q[ ℓs(x, ys; θe, θs) ] → θ(x): make prediction on x

[Figure: the likelihood of the correct class ("elephant") increases over the test-time gradient steps]

SLIDE 39

Algorithm for TTT

Multiple test samples x1, ..., xT

θ0: parameters after joint training

[Diagram: θ0 → θ1, ..., θT]

SLIDE 40

Algorithm for TTT

Multiple test samples x1, ..., xT

θ0: parameters after joint training

Standard version: no assumption on the test samples; each θt is adapted from θ0.

SLIDE 41

Algorithm for TTT

Multiple test samples x1, ..., xT

θ0: parameters after joint training

Standard version: no assumption on the test samples; each θt is adapted from θ0.

Online version: x1, ..., xT come from the same Q, or from smoothly changing Q1, ..., QT; each θt is adapted from θt-1.
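The two versions differ only in which parameters each test sample starts from. A toy sketch, where `adapt` stands in for the gradient steps on ℓs:

```python
def ttt_standard(theta0, samples, adapt):
    """Standard version: every sample is adapted independently from
    theta0; nothing ties the test samples together."""
    return [adapt(theta0, x) for x in samples]

def ttt_online(theta0, samples, adapt):
    """Online version: carry the adapted parameters across samples,
    assuming x1..xT come from the same (or smoothly changing) Q."""
    theta, out = theta0, []
    for x in samples:
        theta = adapt(theta, x)  # start from the previous state
        out.append(theta)
    return out
```

With a toy `adapt` that just accumulates, the online version visibly compounds information across samples while the standard version resets each time.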
SLIDE 42

Results

SLIDE 43

Object recognition with corruptions

  • 15 corruptions
  • CIFAR-10: 10 classes
  • ImageNet: 1000 classes
  • No knowledge of the corruptions during training

Benchmarking Neural Network Robustness to Common Corruptions and Perturbations (Hendrycks and Dietterich, 2018)
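For intuition, one corruption type (Gaussian noise) can be mimicked in a few lines. This is a toy stand-in, not the benchmark's actual corruption code; the per-severity noise scale of 0.04 is an invented illustration value.

```python
import random

def gaussian_noise(img, severity=1, seed=0):
    """Add clipped Gaussian pixel noise to an image with values in
    [0, 1] (a list of rows); higher severity means a larger sigma."""
    rng = random.Random(seed)
    sigma = 0.04 * severity
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]
```

The benchmark's point is that models never see these corruptions during training; they only appear at test time.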

SLIDE 44

Results on CIFAR-10-C

[Chart: object recognition task only vs. joint training (Hendrycks et al. 2019) vs. TTT standard version vs. TTT online version]

Joint training reported here is our improved implementation of their method. Please see our paper for clarification, and their paper for their original results.

Using Self-Supervised Learning Can Improve Model Robustness and Uncertainty (Hendrycks, Mazeika, Kadavath and Song, 2019)

SLIDE 45

Results on ImageNet-C

[Chart: object recognition task only vs. joint training (Hendrycks et al. 2019) vs. TTT standard version vs. TTT online version]

Joint training reported here is our improved implementation of their method. Please see our paper for clarification, and their paper for their original results.

SLIDE 46

The online version on ImageNet-C

SLIDE 47

From still images to videos

  • Videos of objects in motion
  • 7 classes from CIFAR-10
  • 30 classes from ImageNet
  • Train on CIFAR-10 / ImageNet
  • Test on video frames

Classes include: car, bird, dog, cat, horse, ship, airplane

A Systematic Framework for Natural Perturbations from Videos (Shankar, Dave, Roelofs, Ramanan, Recht and Schmidt, 2019)

SLIDE 48

Results

Method                                 | CIFAR-10 accuracy (%) | ImageNet accuracy (%)
Object recognition task only           | 41.4                  | 62.7
Joint training (Hendrycks et al. 2019) | 42.4                  | 63.5
TTT standard                           | 45.2                  | 63.8
TTT online                             | 45.4                  | 64.3

Positive examples: Joint training: dog, TTT: elephant. Joint training: dog, TTT: cattle. Joint training: car, TTT: bus.

SLIDE 49

Results

Method                                 | CIFAR-10 accuracy (%) | ImageNet accuracy (%)
Object recognition task only           | 41.4                  | 62.7
Joint training (Hendrycks et al. 2019) | 42.4                  | 63.5
TTT standard                           | 45.2                  | 63.8
TTT online                             | 45.4                  | 64.3

Negative examples: Joint training: hamster, TTT: cat. Joint training: snake, TTT: lizard. Joint training: turtle, TTT: lizard.

SLIDE 50

Results

Method                                 | CIFAR-10 accuracy (%) | ImageNet accuracy (%)
Object recognition task only           | 41.4                  | 62.7
Joint training (Hendrycks et al. 2019) | 42.4                  | 63.5
TTT standard                           | 45.2                  | 63.8
TTT online                             | 45.4                  | 64.3

Negative examples: Joint training: airplane, TTT: bird. Joint training: airplane, TTT: watercraft.

Rotation prediction is quite limiting!

SLIDE 51

Results on CIFAR-10.1

  • New test set for CIFAR-10
  • Cannot notice the distribution shifts
  • Still an open problem

Method                                 | Error (%)
Object recognition task only           | 17.4
Joint training (Hendrycks et al. 2019) | 16.7
TTT standard                           | 15.9

[Figure: sample images from CIFAR-10 (2009) and CIFAR-10.1 (2019)]

Do CIFAR-10 Classifiers Generalize to CIFAR-10? (Recht, Roelofs, Schmidt and Shankar, 2019)

SLIDE 52

Conclusion

  • Boundary between labeled and unlabeled samples
  • Broken down by self-supervision
  • Boundary between training and testing
  • We are trying to break this down

Xiaolong Wang, Zhuang Liu, John Miller, Alyosha Efros, Moritz Hardt