
Learning What and Where to Transfer


  1. Learning What and Where to Transfer
     Yunhun Jang* (1,2), Hankook Lee* (1), Sung Ju Hwang (3,4,5), Jinwoo Shin (1,4,5)
     (1) School of Electrical Engineering, KAIST
     (2) OMNIOUS
     (3) School of Computing, KAIST
     (4) Graduate School of AI, KAIST
     (5) AITRICS
     * Equal contribution

  2. Transfer Learning
     • DNNs require large labeled datasets to train
     • Transfer learning is a popular method to mitigate the lack of samples
       • Improve the performance of a model on a new task
       • By utilizing the knowledge of pre-trained source models

  3. Transfer Learning
     • Limitations of previous methods
       • Require the same architecture for the source and target models (e.g., fine-tuning)
     [Figure: Pre-train and fine-tuning (labels: ImageNet, New task)]

  4. Transfer Learning
     • Limitations of previous methods
       • Require exhaustive hand-crafted tuning (e.g., attention transfer [1], Jacobian matching [2])
     [Figure: Pre-train and fine-tuning compared with attention transfer/Jacobian matching (labels: ImageNet, New task, True labels)]
     [1] Zagoruyko, S. and Komodakis, N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. In ICLR, 2017.
     [2] Srinivas, S. and Fleuret, F. Knowledge transfer with Jacobian matching. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.

  5. Learning What/Where to Transfer
     • Propose meta-networks 𝑔 and ℎ
       • Learn the learning rules to transfer the source knowledge

  6. Learning What/Where to Transfer
     • Where to transfer
       • A meta-network ℎ decides useful pairs of source/target layers to transfer
     [Figure: Previous methods compared with Learning What and Where to Transfer (L2T-ww)]

  7. Learning What/Where to Transfer
     • Propose meta-networks 𝑔 and ℎ: Learning what/where to transfer (L2T-ww)
     • What to transfer
       • A meta-network 𝑔 decides useful channels to transfer

  8. L2T-ww: Learning What to Transfer
     • Transfer by making target features similar to those of source [3]
     • Feature matching loss:
       $$\mathcal{L}^{m,n}_{\mathrm{fm}}(\theta \mid x) = \frac{1}{CHW} \sum_{c,i,j} \left( r_\theta(T^n_\theta(x))_{c,i,j} - S^m(x)_{c,i,j} \right)^2$$
     • $r_\theta$: transformation for channel dimension matching (e.g., 1x1 conv)
     [3] Romero, A., Ballas, N., Kahou, S. E., Chassang, A., Gatta, C., and Bengio, Y. Fitnets: Hints for thin deep nets. In ICLR, 2015.
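As a rough illustration of the feature matching loss on this slide, the following framework-free Python sketch reads $S^m(x)$ as a source feature map and $T^n_\theta(x)$ as a target feature map, both stored as nested lists of shape [C][H][W]. The helper names `conv1x1` and `feature_matching_loss`, and the plain weight matrix standing in for $r_\theta$, are assumptions made for illustration and do not come from the slides.

```python
def conv1x1(feat, weight):
    """Apply a 1x1 convolution: mix channels independently at each pixel.

    feat:   nested lists of shape [C_in][H][W]
    weight: nested lists of shape [C_out][C_in] (hypothetical stand-in for r_theta)
    """
    c_in, height, width = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [
            [sum(weight[o][c] * feat[c][i][j] for c in range(c_in))
             for j in range(width)]
            for i in range(height)
        ]
        for o in range(len(weight))
    ]


def feature_matching_loss(target_feat, source_feat, weight):
    """Mean squared difference between r(T(x)) and S(x) over all C*H*W entries."""
    mapped = conv1x1(target_feat, weight)  # now has the source's channel count
    channels = len(source_feat)
    height, width = len(source_feat[0]), len(source_feat[0][0])
    total = 0.0
    for c in range(channels):
        for i in range(height):
            for j in range(width):
                diff = mapped[c][i][j] - source_feat[c][i][j]
                total += diff * diff
    return total / (channels * height * width)
```

In a real training setup the same computation would normally be expressed with a deep learning framework so that gradients with respect to the target parameters are available; the nested-list version only makes the indexing explicit.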

  9. L2T-ww: Learning What to Transfer
     • Learn what to transfer
       $$\mathcal{L}^{m,n}_{\mathrm{wfm}}(\theta \mid x, w^{m,n}) = \frac{1}{HW} \sum_{c} w^{m,n}_{c} \sum_{i,j} \left( r_\theta(T^n_\theta(x))_{c,i,j} - S^m(x)_{c,i,j} \right)^2$$

  10. L2T-ww: Learning What to Transfer
     • Learn what to transfer
       • Constraints on the channel weights: $w^{m,n}_{c} \ge 0$, $\sum_{c} w^{m,n}_{c} = 1$

  11. L2T-ww: Learning What to Transfer
     • Learn what to transfer
       • Choose important channels for learning a target task
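The weighted loss above can be sketched in plain Python as follows. The slides only state that the channel weights are non-negative and sum to 1; producing them with a softmax over raw scores is an assumption made here because it satisfies both constraints. The function names, and the choice to pass in the already transformed target features $r_\theta(T^n_\theta(x))$ as `mapped_feat`, are likewise hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are non-negative and sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def weighted_feature_matching_loss(mapped_feat, source_feat, channel_weights):
    """Channel-weighted squared error, normalized by H*W.

    mapped_feat, source_feat: nested lists of shape [C][H][W]
    channel_weights:          list of length C (assumed >= 0 and summing to 1)
    """
    height, width = len(source_feat[0]), len(source_feat[0][0])
    total = 0.0
    for c, weight_c in enumerate(channel_weights):
        squared_error = sum(
            (mapped_feat[c][i][j] - source_feat[c][i][j]) ** 2
            for i in range(height)
            for j in range(width)
        )
        total += weight_c * squared_error
    return total / (height * width)
```

Putting all the weight on one channel reduces the loss to that channel's error alone, which is the sense in which the weights choose the important channels.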

  12. L2T-ww: Learning Where to Transfer
     • Learn where to transfer: $\lambda^{m,n}$
       • Meta-networks choose important matching pairs to transfer
       • Given all possible candidate matching pairs $\mathcal{C}$

  13. L2T-ww: Learning Where to Transfer
     • Learn where to transfer
       $$\mathcal{L}_{\mathrm{wfm}}(\theta \mid x, \phi) = \sum_{(m,n) \in \mathcal{C}} \lambda^{m,n} \, \mathcal{L}^{m,n}_{\mathrm{wfm}}(\theta \mid x, w^{m,n})$$
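The combined loss on this slide is a weighted sum of the per-pair losses over the candidate (source layer, target layer) pairs. A minimal sketch, assuming the per-pair losses and the pair weights $\lambda^{m,n}$ have already been computed and are passed in as dictionaries keyed by `(m, n)`; this interface, the function name, and the non-negativity check are assumptions for illustration, not something the slides specify.

```python
def combined_transfer_loss(pair_losses, pair_weights):
    """Sum lambda^{m,n} * L_wfm^{m,n} over all candidate pairs (m, n).

    pair_losses:  dict mapping (m, n) -> weighted feature matching loss
    pair_weights: dict mapping (m, n) -> lambda^{m,n} (assumed non-negative)
    """
    total = 0.0
    for pair, loss in pair_losses.items():
        lam = pair_weights[pair]
        if lam < 0:
            raise ValueError(f"pair weight for {pair} must be non-negative")
        total += lam * loss
    return total
```

A pair whose weight is 0 contributes nothing, so the weights act as a soft selection of which layer pairs to match.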
