

SLIDE 1

Learning What and Where to Transfer

Yunhun Jang*1,2, Hankook Lee*1, Sung Ju Hwang3,4,5, Jinwoo Shin1,4,5

* Equal contribution

1 School of Electrical Engineering, KAIST 2 OMNIOUS 3 School of Computing, KAIST 4 Graduate School of AI, KAIST 5 AITRICS

SLIDE 4

Transfer Learning

  • DNNs require large labeled datasets to train
  • Transfer learning is a popular method to mitigate the lack of samples
  • It improves the performance of a model on a new task by utilizing the knowledge of pre-trained source models
  • Limitations of previous methods
  • Require the same architecture for the source and target models (e.g., fine-tuning)
  • Require exhaustive hand-crafted tuning (e.g., attention transfer [1], Jacobian matching [2])

[Figure: pre-training on ImageNet and fine-tuning on a new task, vs. attention transfer / Jacobian matching between a source and a target network]

[1] Zagoruyko, S. and Komodakis, N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In ICLR, 2017. [2] Srinivas, S. and Fleuret, F. Knowledge transfer with Jacobian matching. In ICML, 2018.

SLIDE 7

Learning What/Where to Transfer

  • Propose meta-networks 𝑔 and ℎ: learning what/where to transfer (L2T-ww)
  • Learn the learning rules to transfer the source knowledge

Where to transfer

  • A meta-network ℎ decides useful pairs of source/target layers to transfer

What to transfer

  • A meta-network 𝑔 decides useful channels to transfer

[Figure: previous methods vs. Learning What and Where to Transfer (L2T-ww)]

SLIDE 8

L2T-ww: Learning What to Transfer

  • Transfer by making target features similar to those of the source [3]

\mathcal{L}^{m,n}_{\mathrm{fm}}(\theta \mid x) = \frac{1}{CHW} \sum_{c,i,j} \left( r_\theta\big(T^{n}_{\theta}(x)\big)_{c,i,j} - S^{m}(x)_{c,i,j} \right)^{2}

where T^{n}_{\theta}(x) is the n-th target feature map, S^{m}(x) is the m-th source feature map, and r_\theta is a transformation for channel-dimension matching (e.g., a 1×1 convolution).

[3] Romero, A., Ballas, N., Kahou, S. E., Chassang, A., Gatta, C., and Bengio, Y. FitNets: Hints for thin deep nets. In ICLR, 2015.
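As a concrete illustration, here is a minimal NumPy sketch of this feature-matching loss on toy tensors. The shapes, variable names, and the modelling of r_θ as a channel-mixing matrix (a 1×1 convolution) are illustrative assumptions, not the paper's code:

```python
import numpy as np

def feature_matching_loss(target_feat, source_feat, r_weight):
    """L_fm^{m,n}: squared distance between the transformed target
    feature map r_theta(T_theta^n(x)) and the source map S^m(x),
    averaged over channels and spatial positions (the 1/CHW factor)."""
    # A 1x1 convolution is a channel-mixing matrix applied at
    # every spatial location.
    transformed = np.einsum('ck,khw->chw', r_weight, target_feat)
    C, H, W = source_feat.shape
    return ((transformed - source_feat) ** 2).sum() / (C * H * W)

rng = np.random.default_rng(0)
target = rng.normal(size=(8, 4, 4))    # T_theta^n(x): C_t = 8 channels
source = rng.normal(size=(16, 4, 4))   # S^m(x): C = 16 channels
r = rng.normal(size=(16, 8))           # r_theta: maps 8 -> 16 channels
loss = feature_matching_loss(target, source, r)
```

With `r` set to the identity and identical feature maps the loss is zero, so minimizing it pulls the transformed target features toward the source features.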

SLIDE 10

L2T-ww: Learning What to Transfer

  • Learn what to transfer

\mathcal{L}^{m,n}_{\mathrm{wfm}}(\theta \mid x, w^{m,n}) = \frac{1}{HW} \sum_{c} w^{m,n}_{c} \sum_{i,j} \left( r_\theta\big(T^{n}_{\theta}(x)\big)_{c,i,j} - S^{m}(x)_{c,i,j} \right)^{2}, \qquad w^{m,n}_{c} \ge 0, \quad \sum_{c} w^{m,n}_{c} = 1

Choose important channels for learning a target task
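A sketch of the weighted variant, with the channel weights w^{m,n} treated as given. In the method they are produced by the meta-network 𝑔; the softmax over random logits here is only a stand-in that satisfies the nonnegativity and sum-to-one constraints:

```python
import numpy as np

def weighted_fm_loss(target_feat, source_feat, r_weight, w):
    """L_wfm^{m,n}: feature matching with per-channel weights w,
    where w_c >= 0 and sum_c w_c = 1."""
    transformed = np.einsum('ck,khw->chw', r_weight, target_feat)
    C, H, W = source_feat.shape
    # Sum the squared error per channel, then weight each channel by w_c.
    per_channel = ((transformed - source_feat) ** 2).sum(axis=(1, 2))
    return float(w @ per_channel) / (H * W)

rng = np.random.default_rng(0)
target = rng.normal(size=(8, 4, 4))
source = rng.normal(size=(16, 4, 4))
r = rng.normal(size=(16, 8))

# Stand-in for the meta-network g: softmax over channel logits.
logits = rng.normal(size=16)
w = np.exp(logits - logits.max())
w /= w.sum()
loss = weighted_fm_loss(target, source, r, w)
```

With uniform weights w_c = 1/C this reduces exactly to the unweighted 1/(CHW) loss above, so the unweighted loss is a special case.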

SLIDE 12

L2T-ww: Learning Where to Transfer

  • Learn where to transfer
  • Meta-networks choose important matching pairs to transfer
  • Given all possible candidate matching pairs 𝒞

\lambda^{m,n}: importance weight for the candidate pair (m, n)
SLIDE 14

L2T-ww: Learning Where to Transfer

  • Learn where to transfer
  • Choose pairs of feature-matched layers among all the possible pairs

\mathcal{L}_{\mathrm{wfm}}(\theta \mid x, \phi) = \sum_{(m,n) \in \mathcal{C}} \lambda^{m,n} \, \mathcal{L}^{m,n}_{\mathrm{wfm}}(\theta \mid x, w^{m,n})
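The aggregation over candidate pairs can be sketched as follows. The candidate set, the per-pair loss values, and the softplus parameterization of λ^{m,n} are toy assumptions for illustration; in the method λ^{m,n} is produced by the meta-network ℎ:

```python
import numpy as np

rng = np.random.default_rng(1)

# Candidate set C: every (source layer m, target layer n) pair,
# e.g. 3 source layers x 3 target layers.
candidates = [(m, n) for m in range(3) for n in range(3)]

# Stand-ins for the per-pair weighted feature-matching losses L_wfm^{m,n}.
pair_losses = {p: float(rng.uniform(0.1, 2.0)) for p in candidates}

# Stand-in for the meta-network h: a nonnegative weight per pair
# (softplus keeps lambda^{m,n} >= 0).
raw = rng.normal(size=len(candidates))
lambdas = dict(zip(candidates, np.log1p(np.exp(raw))))

# L_wfm(theta | x, phi) = sum over (m, n) in C of lambda^{m,n} * L_wfm^{m,n}
transfer_loss = sum(lambdas[p] * pair_losses[p] for p in candidates)
```

Pairs whose λ^{m,n} is driven toward zero effectively drop out of the transfer objective, which is how the meta-network selects where to transfer.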
SLIDE 19

L2T-ww: Training Meta-Networks

  • Total loss for the target model:

\mathcal{L}_{\mathrm{total}}(\theta \mid x, y, \phi) = \mathcal{L}_{\mathrm{org}}(\theta \mid x, y) + \beta \, \mathcal{L}_{\mathrm{wfm}}(\theta \mid x, \phi)

  • A popular bilevel scheme [4,5] for training the meta-parameters \phi:
  • 1. Training simulation: for t = 1, \dots, T,

\theta_{t+1} = \theta_{t} - \alpha \nabla_{\theta} \mathcal{L}_{\mathrm{total}}(\theta_{t} \mid x_{t}, y_{t}, \phi)

  • 2. Evaluation:

\mathcal{L}_{\mathrm{meta}}(\phi) = \mathcal{L}_{\mathrm{org}}(\theta_{T+1} \mid x_{\mathrm{val}}, y_{\mathrm{val}})

  • 3. Update \phi based on \nabla_{\phi} \mathcal{L}_{\mathrm{meta}}(\phi) using second-order gradients
  • The transfer loss acts as a regularization, so a large number of simulation steps T is required to obtain meaningful gradients, which is time-consuming

[4] Colson, B., Marcotte, P., and Savard, G. An overview of bilevel optimization. Annals of Operations Research, 2007. [5] Franceschi, L., Frasconi, P., Salzo, S., Grazzi, R., and Pontil, M. Bilevel programming for hyperparameter optimization and meta-learning. In ICML, 2018.
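To make the bilevel structure concrete, here is a self-contained toy with one scalar parameter θ and one meta-parameter φ. The quadratic losses and the finite-difference meta-gradient are illustrative stand-ins; the actual method differentiates through the simulation with second-order gradients:

```python
def l_org(theta):
    # Stand-in for the original task loss L_org; optimum at theta = 1.
    return (theta - 1.0) ** 2

def inner_train(phi, theta0=0.0, lr=0.1, T=20):
    # 1. Training simulation: T gradient steps on
    #    L_total = L_org + L_wfm, with the toy transfer loss
    #    L_wfm(theta, phi) = phi * theta**2.
    theta = theta0
    for _ in range(T):
        grad = 2.0 * (theta - 1.0) + 2.0 * phi * theta
        theta -= lr * grad
    return theta

def meta_loss(phi):
    # 2. Evaluation: judge the simulated model by the task loss alone.
    return l_org(inner_train(phi))

# 3. Update phi; a central finite difference replaces the
#    second-order gradient used in practice.
phi, eps, meta_lr = 0.5, 1e-4, 0.5
for _ in range(30):
    g = (meta_loss(phi + eps) - meta_loss(phi - eps)) / (2 * eps)
    phi -= meta_lr * g
```

In this toy the transfer term only hurts the task, so the meta-updates drive φ toward 0. Note that every meta-step reruns the entire T-step simulation, which is exactly the cost the slide calls time-consuming.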

SLIDE 22

L2T-ww: Training Meta-Networks

  • Total loss for the target model:

\mathcal{L}_{\mathrm{total}}(\theta \mid x, y, \phi) = \mathcal{L}_{\mathrm{org}}(\theta \mid x, y) + \beta \, \mathcal{L}_{\mathrm{wfm}}(\theta \mid x, \phi)

  • The proposed bilevel scheme for training the meta-parameters \phi:
  • 1. Knowledge transfer: for t = 1, \dots, T,

\theta_{t+1} = \theta_{t} - \alpha \nabla_{\theta} \mathcal{L}_{\mathrm{wfm}}(\theta_{t} \mid x, \phi)

  • 2. One-step adaptation:

\theta_{T+2} = \theta_{T+1} - \alpha \nabla_{\theta} \mathcal{L}_{\mathrm{org}}(\theta_{T+1} \mid x, y)
slide-23
SLIDE 23

L2T-ww: Training Meta-Networks

  • Total loss for target model:

L_total(θ | x, y, φ) = L_org(θ | x, y) + β · L_wfm(θ | x, φ)

  • The proposed bilevel scheme for training meta-parameters φ:
  • 1. Knowledge transfer: for t = 1, …, T,

θ_{t+1} = θ_t − α ∇_θ L_wfm(θ_t | x, φ)

  • 2. One-step adaptation:

θ_{T+2} = θ_{T+1} − α ∇_θ L_org(θ_{T+1} | x, y)

  • 3. Evaluation:

L_meta(φ) = L_org(θ_{T+2} | x, y)

  • 4. Update φ based on ∇_φ L_meta(φ) using second-order gradients
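The four-step bilevel scheme can be sketched end to end on a toy problem. Everything below is illustrative: θ and φ are scalars, quadratics stand in for L_org and L_wfm, and finite differences stand in for the second-order gradients the paper obtains by differentiating through the unrolled updates.

```python
# Toy sketch of the bilevel scheme (illustration only, not the paper's losses).

ALPHA = 0.1   # inner-loop learning rate (alpha in the slides)
EPS = 1e-5    # finite-difference step

def l_org(theta):
    # target-task loss (stand-in): optimum at theta = 3
    return (theta - 3.0) ** 2

def l_wfm(theta, phi):
    # weighted feature-matching loss (stand-in): phi scales how strongly
    # the source pulls theta toward its own optimum at 1
    return phi * (theta - 1.0) ** 2

def unroll(theta, phi, T=2):
    """Steps 1-3: knowledge transfer, one-step adaptation, evaluation."""
    # 1. Knowledge transfer: T inner gradient steps on L_wfm
    for _ in range(T):
        g = (l_wfm(theta + EPS, phi) - l_wfm(theta - EPS, phi)) / (2 * EPS)
        theta = theta - ALPHA * g
    # 2. One-step adaptation on the original loss L_org
    g = (l_org(theta + EPS) - l_org(theta - EPS)) / (2 * EPS)
    theta = theta - ALPHA * g
    # 3. Evaluation: L_meta(phi) = L_org after adaptation
    return l_org(theta)

def meta_step(phi, lr=0.5, theta0=0.0):
    """Step 4: descend L_meta w.r.t. phi; finite differences replace the
    second-order gradients through the unrolled inner updates."""
    g = (unroll(theta0, phi + 1e-4) - unroll(theta0, phi - 1e-4)) / 2e-4
    return phi - lr * g

phi = 1.0
for _ in range(50):
    phi = meta_step(phi)
```

In this toy setup the source optimum (1) lies between the initial θ (0) and the target optimum (3), so transferring helps, and meta-training drives φ up until the meta-loss after adaptation is lower than with the initial φ.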
slide-24
SLIDE 24

L2T-ww: Training Meta-Networks

9

  • Total loss for target model:
  • The proposed bilevel scheme for training meta-parameters 𝜚:
  • Ours is effective for learning 𝜚 with a small number of steps 𝑈
  • Ours learns 𝜄 and 𝜚 jointly without separate meta-learning phase
  • 1. Knowledge transfer: for 𝑢 = 1, … , 𝑈,
  • 2. One-step adaption:
  • 3. Evaluation:
  • 4. Update 𝜚 based on using second-order gradients

rφLmeta(φ)

<latexit sha1_base64="kHE1cbe6e7GA4SbWM8iyuthH+8=">AEbnicfVPbhMxEN1uA5RwaQsSLwhEVKRFplWyQUqRKfaAPBTRm1QnK8dxElPvRevZXGT8hXwBfwE8wguzu0nbNAFLlsZzPOecGcudWEkNjcb3FXe1dOfuvbX75QcPHz1e39h8cqjNOHihEcqSs47TAslQ3ECEpQ4jxPBgo4SZ53Lgw/G4pEyg8hksWgHrh7InOQNM+Ztud4sGDAacKfPRtk1QD61vKAZ9QJbpTAQwL6O6MCqpEmob2EceNZc3hmCdVp4BuOAYgx5H5MRzF+ac1oxoaoLW8VN2X9i6Vz16pJpfJ2Opx2xTyxXFcq2F1Pash2+Qz8mW5WarW3s1o/6W5gFGFY+mytmr7+Xod/LOqbxQObt5r6rSFujMrwqOrBzPX9IhAhvaGQu7FKFAlucLBrKU76B15tTg+wTZmKB4yGDNvwi+wCL0TA1DWnD/UJ7ryDJX0G+XBztLkARkl/RuSbY3SCdDkwRAWkvYpr5ZknJCL/EymX/Y1KY6eRL7IYeNOg4kzXkb/xi3YjngYiBK6Y1hdeI4aWYQlIroQt01SLGF+V9cUFhiELhG6Z/Cks2cJMl/SiBHcIJM/erDAs0HoSdPBm5lrfxrLkMuwihd67lpFhnIeSHUSxWOn2R/i3RlIjioCQaMJxK9Ej5g+FsAf+AcU3coYz1PS5sZ0Pybo9kMTjd3fH2dnY/vansv5+Oa8157rxyqo7nvHX2nUPnyDlxuPvN/en+dv+s/ig9K70ovSyuivTmqfO3CpV/wIbNYJU</latexit>

Lmeta(φ) = Lorg(θT +2|x, y).

<latexit sha1_base64="gz4O2xprL+sjrYqECZuo3uCkOMc=">AGb3icpVRdb9MwFM1GV0b52uCByRkMQ01WlY1HRIadKkPbAHobYlzS3keO6rZnzocTZWhn/Q/4AP4O9wgO2k2VrGybELEW6udf3nHOPnfgxoylvt38sLN6rLdXvLz9oPHz0+MnTldVnR2mUJZgc4ohFyYmPUsJoSA45YycxAlBgc/IsX+2q+vH5yRJaRQe8ElMugEahnRAMeIq5a3WyDoMEB9hxMQn2ROBE0pPQM7BxSCQTchHhKNvY+ciL9lgG8BgrBwpdg7lgCmWeAJrAJOxtzoET5D+EyKiys0VZWN9Xwndb5KOLWtmWg+TSObBz2R0+evY9tW3Y7uAZvgi8LTuauU3eto2L9xztUgU7b0U9sOVumej131cQwHlEzrtHdVLA2pGHZtCunZv6YEBLe4NAqZCVDXpt3VgkyKU/wDVduFy98EyIWjxAMkRrDy7NzuDziF1jetyZqMdMUDFnYMw1e25YpQMr4A8caCUKDhTOFcMCraMNXIhSiGB21gqNMwodiaF32/ArZL0ThtsAOhX2VBxgK3yFiSkr86nqkXexfobd8AwCINYypgtj2V5LKXHnZLSGP5ftNqjW0lnUhN1fne6GB35jzSthlreylq71TYLzAduEaxZxdr3Vi5hP8JZQEKOGUrTU7cd865ACaeYEdmAWUpi9VmjITlVYgCknaFYZgXWX6YBAl6gk5MNmbHQIFaToJfLVT5nO1nSyqna8cH7rqBhnHES4pxokDF1m4H+uYI+TQjmbKIChBOqtAI8Qup3ydUveAqpf07jtFA9zmVrk9xZS+aDo07L3Wp1Pr9d2/lQ2LVsvbReW03Ltd5ZO9aetW8dWrj2vXZ+1X7vfSz/qL+qg7yrYsLRc9za2rV7T/ckUnB</latexit>

Ltotal(θ|x, y, φ) = Lorg(θ|x, y) + βLwfm(θ|x, φ)

<latexit sha1_base64="WtNJ/Z5PgEnVWi9Guy7jysdH7YA=">AGdHicpVRdb9MwFM1GW0b52uARHiyqolbLqZDAh4mTexlDwgNsS9pbiPHdtszocSp2tk/CN5G/wAK/YTpatbRiIWYp0c6/vOeOHZC6sas2/2+snqvUq3dX3tQf/jo8ZOn6xvPjuMgiTA5wgENolMHxYS6PjliLqPkNIwI8hxKTpyLPVU/mZIodgP/kKUh6Xto7LsjFyMmU/ZG5bwJPcQmGFH+UQy4Z/rC5pAxcDnyRAuyCWHo68y8zEptsAPgKEKYW4LvnwgA48SzOZYBIzOm9XCHInwh+OUVmqyKejPb6ZrnAs5ta0WKT9GI1uGAZ/TZ6zdlt2m6gFb4IvEU7mrVHvQU7B/4ryufULT9ANiOCQSmeGaMC3zW294Xr0sqFhOH1xFp6SyK3oesXTXtibuwSGqVFlJktWV/pSadsjnbtMRO/sK2IKLhBEfyUHtLuEywKG6DWmzcxUPnqIklE9bGu7iwVg2h8BWTzQ6lEwunCVDJI2CJWyLkoiQRuY6n/TbCZ5o6/BrcqUjvbYBNAp8yFk9YnISIDOXKesQdzH+xiHQDFwjFioWyzNRyCoc7hWU2u7/olUW3Uq6kErFXc9FT/wjT6cul73e6Ha6eoHlwMqDhpGvA3v9BxwGOPGIzBFcXxmdUPW5yhiLqZE1GESk1D+92hMzmToI4/Efa6ZBWjKzBCMgkg+PgM6e7ODIy+OU8+RO9WU8WJNJctqZwkbvetz1w8TRnycEY0SKk8zUDcsGLoRwYymMkA4cqVWgCdI3plM3sNzSMOpG8a56lkmW5lkLVqyHBz3OtZ2p/f5TWP3fW7XmvHCeGW0DMt4a+wa+8aBcWTgyrfKr6pRXan+rL2sNWrNbOvqSt7z3Jhbtc5vRk5KJA=</latexit>

θt+1 = θt αrθLwfm(θt|x, φ)

<latexit sha1_base64="10awWVMlxi4PK8ototjnx3oaAT4=">AHPXicrVXLbtNAFHULCaU82sKSzYiqKFHdKE6QEiVKrpAqEi+pLqxBpPJs3Q8UP2OI01zI/wM7CFP+AD2CEWbNgyM3adJnErRDuSpet75z7pmR7YaUxKzZ/D43f+t2pXpn4e7ivfsPHi4trzw6iIMkQngfBTSIjlwY0p8vM8Io/gojD0XIoP3dNtVT8c4igmgb/H0hB3PHjikz5BkMmUs1Jpr9keZAMEKX8jutwzfeFwmzFw1vdEzWYDzODHkXmWlepgE9j9CJuCb5zKIAdJ57DkQwYHjGth7sUolPBz87RZFUsrmU7iflB2BPbapHiUzSitflGX32OqrXZbepesAGeC/xVO48Ve+2FOxlnOPaWzhMX9MEC25T6UwPdnbOsN49HLhrbDAdETa+k1iVy3iV80bYuJsUtolBZRSpLVZv2VmnTK4WzdEpv5C9uwIQ0H0PahHNTJsjO4LGCQjEdZqby0UOUjOpi3V1c6YRCfnQA7fk0oknC4MJYOELWKFnIuSOAqlhINU4rNLf8GbhSktpZB+vAdstsKDnD4ipEuCePp6xDXMf5C7dAM3CNWKiYLo9EIauwuFVQar/i1ZdCXpVCoV170YLfGPI0bu9gOm7raN3ds+tTGNJcem7O82mw09QKzgZUHq0a+dp3l3YvQImHfYojONjqxmyDocRI4hisWgnMQ7lBwye4GMZ+tDcYdragHWZKYH+kEkH58Bnb3YwaEXx6nyp1qwHi6pJlteOE9V92OPHDhGEfZUT9hEq7gfpVgB6JMGI0lQFEZFaARpA+fFn8ocygdQbkjDOVY8y2coka9qS2eCg1bDajda756tbr3K7FownxlOjZljGC2PL2DF2jX0DVT5VvlS+Vr5VP1d/VH9Wf2Vb5+fynsfGxKr+QvSlZ2M</latexit>

θT +2 = θT +1 αrθLorg(θT +1|x, y)

<latexit sha1_base64="8vWYW0uwoE5ZLUcQBH1ct0xRL0=">AHxXictVdT9swFA1stKz7ALbHvVgplaEqimTNk1CQuNhPEwTbHxJuI0c16UezocSpzTyrP2w/ZL9jf2COU5IaRsQGsxSpJt7XPOPdtnIDRiLdav+fmHz1eqFQXn9SePnv+Yml5eVx5MchJkfYZ3546qCIMOqRI045I6dBSJDrMHLiXOym9ZMhCSPqe4c8CUjHRece7VOMuErZKwu/1qGL+AjJj7LrnBNT9oCcg4u+6sQz4gHP0YmZdZqQG2AeyHCAtLir0TCWAUu7bAKuBkxLUe4TCEL6S4vEJTVlbz3ZS87uE9vqYcqX0sj6YVdk9NnrqNFQp830DNgE3xRemrtKNbrtFPYmznHtCxomH1lMpIBMOdNDXbFlbukN49bLmobBgOqOtfS6Qm5A6hWHduVE2yU0qRZSpLVZv1VmnTKFnzDktv5C9+EiAUDBD2kGrWz7Awu9zliY0ybm4l6dBMlrbraYl3dnin64fkVkC0OlRIFpwtDxaBgizhFzkUpJHAbS4mGKcVmklv+BtwqKd3ZABsAOmU2lMywuAoh6anxlJ2Q93H+2i3QDEIjFiqmyNZyCosbheU2u9/ok0tupV0KpXI+16MtrwjT/PBLrbNZ672g81Nj23Mc+Pc/t/Y8l/b3Uy1l9dazZeYDaw8mDNyNe+vfwH9nwcu8TjmKEoOrNaAe8IFHKGZE1GEckUH+k6JycqdBDLok6QpNKsK4yPdD3Q/V4HOjs9RMCuVGUuI7amXYcTdfSZFntLOb9x1BvSDmxMZUT9mauwg/WSBHg0J5ixRAcIhVoBHiD1EeLqwzaB1BvSIMpVjzLZNWSNW3JbHDcblpbzfbB27WdD7ldi8ZrY9WoG5bxztgx9ox948jAldXKXuWg8rX6qepWeXWYbZ2fy8+8MiZW9edfoGfQFQ=</latexit>
slide-25
SLIDE 25
  • Learning what and where to transfer gives consistent improvements
  • The proposed method works well across various tasks and architectures

Experiments

[1] Zagoruyko, S. and Komodakis, N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In ICLR, 2017.
[3] Romero, A., Ballas, N., Kahou, S. E., Chassang, A., Gatta, C., and Bengio, Y. FitNets: Hints for thin deep nets. In ICLR, 2015.
[6] Li, Z. and Hoiem, D. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.

slide-26
SLIDE 26
  • Learning what and where to transfer gives consistent improvements
  • The proposed method works well across various tasks and architectures
  • Learning what to transfer (channel importance) improves all the baselines

Experiments

Up to +15% relative improvement

slide-27
SLIDE 27
  • Learning what and where to transfer gives consistent improvements
  • The proposed method works well across various tasks and architectures
  • Learning what to transfer (channel importance) improves all the baselines
  • Learning where to transfer (pair importance) gives more improvements

Experiments

Up to +15% relative improvement (learning what)
Up to +25% relative improvement (learning what and where)
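The two knobs above can be made concrete with a small sketch: per-channel weights ("what" to transfer) and per-pair weights ("where" to transfer) inside a feature-matching loss. Names and shapes here are illustrative stand-ins, not the paper's implementation, which produces these weights with meta-networks.

```python
# Illustrative sketch: a feature-matching loss with per-channel weights w
# ("what" to transfer) and per-pair weights lam ("where" to transfer).
# Toy scalar channel activations stand in for real feature maps.

def wfm_loss(pairs, w, lam):
    """pairs[p] = (target_feats, source_feats), each a list of C channel
    activations; w[p][c] weights channel c of pair p; lam[p] weights pair p."""
    total = 0.0
    for p, (ft, fs) in enumerate(pairs):
        # per-channel squared distance between target and source features
        dists = [(a - b) ** 2 for a, b in zip(ft, fs)]
        total += lam[p] * sum(wc * d for wc, d in zip(w[p], dists))
    return total

# One candidate (target-layer, source-layer) pair with two channels;
# the second channel disagrees strongly (distances 0 and 25).
pairs = [([1.0, 0.0], [1.0, 5.0])]

full = wfm_loss(pairs, w=[[1.0, 1.0]], lam=[1.0])    # penalize everything: 25.0
masked = wfm_loss(pairs, w=[[1.0, 0.0]], lam=[1.0])  # drop channel 2: 0.0
```

Driving a channel weight (or a whole pair weight lam[p] for an unhelpful layer pair) toward zero is the kind of selective behavior the meta-networks are trained to produce from the meta-loss.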

slide-28
SLIDE 28
  • Multi-source experiments

Experiments

11

[1] Zagoruyko, S. and Komodakis, N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In ICLR 2017 [3] Romero, A., Ballas, N., Kahou, S. E., Chassang, A., Gatta, C., and Bengio, Y. Fitnets: Hints for thin deep nets. In ICLR, 2015. [6] Li, Z. and Hoiem, D. Learning without forgetting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.

slide-29
SLIDE 29
  • Multi-source experiments: different architectures

Experiments

+2.45% relative improvement

slide-30
SLIDE 30
  • Multi-source experiments: different architectures, initialization

Experiments

+2.45% relative improvement
+2.72% relative improvement

slide-31
SLIDE 31
  • Multi-source experiments: different architectures, initialization and datasets

Experiments

+2.45% relative improvement
+2.72% relative improvement
+4.37% relative improvement

slide-32
SLIDE 32
  • Multi-source experiments
  • Limited-data regime experiments
  • The smaller the target dataset → the larger the relative gain of our method
  • Ours efficiently boosts the performance of a target model

Experiments

slide-34
SLIDE 34

Conclusion

  • Meta-learning-based transfer method
  • Selective transfer depending on the relation between source and target tasks
  • Effective training scheme that learns meta-networks and the target model jointly
  • Applicable across heterogeneous and/or multiple network architectures and tasks

Poster #186 Thursday Jun 13th 6:30 – 9:00 PM @ Pacific Ballroom
