Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning


  1. Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning (AAAI-2020). Reading Group, Dec. 11, 2019. Suman Saha (postdoc), Computer Vision Lab @ ETH Zurich.

  2. Motivation: DNN Activation Functions
     ● Learning activation functions to improve deep neural networks (DNNs) [1]
     ● Parameters in the linear components (W and b) are learned from data, while the nonlinearities are predefined, e.g. sigmoid, tanh or ReLU
     ● Assumption: an arbitrarily complex function can be approximated using any of these common nonlinear functions
     ● In practice, the choice of nonlinearity affects:
       → the learning dynamics
       → the network's expressive power
     [1] Agostinelli, Forest, et al. "Learning activation functions to improve deep neural networks." arXiv preprint arXiv:1412.6830 (2014).

  3. Motivation: Choice of Nonlinearity
     ● Active research area – designing activation functions that enable fast training of DNNs
     ● Vanishing gradient problem: the derivative of a sigmoid function ranges between 0 and 0.25
     ● Weight update: for a DNN with more layers, the gradients tend to vanish more in the lower layers
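As a quick check on the 0–0.25 range quoted above, the sigmoid derivative can be worked out directly (a standard identity, not shown on the slide):

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\sigma'(x) = \sigma(x)\bigl(1 - \sigma(x)\bigr) \in \left(0, \tfrac{1}{4}\right]
```

The maximum 1/4 is attained at x = 0; backpropagation through many sigmoid layers multiplies many such factors, which is why the gradients shrink most in the lower layers.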

  4. Motivation: Choice of Nonlinearity
     ● The rectified linear activation function (ReLU) does not saturate like sigmoidal functions, which helps to overcome the vanishing gradient problem
     Other recent activation functions
     ● Maxout activation (Goodfellow et al., 2013) – computes the maximum of a set of linear functions
     ● Springenberg & Riedmiller (2013) replaced the max function
     ● Gulcehre et al. (2014) explored an activation function that replaces the max function with an Lp norm
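For reference, maxout's "maximum of a set of linear functions" can be written as follows (this is the standard formulation from Goodfellow et al., 2013; the indexing is theirs, not the slide's):

```latex
h_i(x) = \max_{j \in \{1,\dots,k\}} z_{ij}, \qquad z_{ij} = x^{\top} W_{\cdot ij} + b_{ij}
```

Each output unit thus picks, per input, the largest of k learned affine functions.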

  5. Motivation
     ● The type of activation function can have a significant impact on learning
     ● One way to explore the space of possible functions is to learn the activation function during training (Agostinelli et al., 2014)

  6. Adaptive Piecewise Linear (APL) units
     ● Activation functions are expressed as a sum of hinge-shaped functions, resulting in a piecewise linear activation function
     ● S (the number of hinges) is a hyperparameter set in advance
     ● The a and b variables are the learnable parameters, where
       – the a variables control the slopes of the linear segments
       – the b variables determine the locations of the hinges
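The equation itself did not survive extraction; for completeness, the APL unit as defined in Agostinelli et al. [1] (the Eq. (1) referenced on the next slide) is:

```latex
h_i(x) = \max(0, x) + \sum_{s=1}^{S} a_i^{s}\,\max\bigl(0,\ -x + b_i^{s}\bigr) \qquad (1)
```

The first term is a plain ReLU; each a_i^s sets the slope of one additional linear segment and each b_i^s sets the location of its hinge.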

  7. Adaptive Piecewise Linear (APL) units
     ● Fig. 1 shows example APL functions for S = 1
     ● For large enough S, APL can approximate arbitrarily complex continuous functions
     ● The first term in Eq. (1) is ReLU; when x < 0 the derivative of ReLU is 0, resulting in dead neurons
     ● Leaky ReLUs address the dead-neuron problem, e.g. a leaky ReLU may have y = 0.01x when x < 0
     [Figure 1: Sample activation functions obtained from changing the parameters. Notice that figure (b) shows that the activation function can also be non-convex.]
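A minimal NumPy sketch of Eq. (1) for a single unit with S hinges; the function and argument names (apl, a, b) are illustrative, not taken from the slides or the paper's code:

```python
import numpy as np

def apl(x, a, b):
    """Adaptive Piecewise Linear unit, Eq. (1) of Agostinelli et al. (2014).

    x -- array of pre-activations
    a -- length-S sequence of learned slope parameters
    b -- length-S sequence of learned hinge locations
    """
    out = np.maximum(0.0, x)                           # first term: plain ReLU
    for a_s, b_s in zip(a, b):
        out = out + a_s * np.maximum(0.0, -x + b_s)    # one hinge-shaped term per s
    return out

# Example with S = 1: a non-zero slope below the hinge keeps a gradient
# flowing for x < 0, avoiding the dead-neuron regime of plain ReLU.
x = np.linspace(-3.0, 3.0, 7)
print(apl(x, a=[0.2], b=[0.0]))
```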

  8. TAAN (Task Adaptive Activation Network)
     [Figure: Categories of deep-learning MTL: (a) Hard-sharing; (b) Soft-sharing; (c) Task Adaptive Activation Network (proposed model); (d) Inner structure of the Adaptive Activation Layer.]
     ● Proposed approach = hard-sharing + learnable task-specific activation functions
     ● All tasks can share their weights and biases on the hidden layers
     ● More scalable than the soft-sharing methods, where the number of network components is proportional to the number of tasks

  9. TAAN
     ● For a task t, given the input from either the previous layer or the data input, the output of the l-th AAL (Adaptive Activation Layer) is defined by applying the task-specific activation to the shared affine transform of the input
     ● The weight and bias parameters are shared across tasks
     ● The task-specific activation function for task t and layer l is defined as a combination of basis (hinge) functions
     ● Recall from slides 6 and 7: M (the number of hinges) is a hyperparameter set in advance
     ● The task-specific coefficients denote the coordinates of the basis functions
     [Figure: (c) Task Adaptive Activation Network (proposed model); (d) Inner structure of the Adaptive Activation Layer.]
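The two equations on this slide were also lost in extraction. Below is a minimal PyTorch-style sketch of the layer as described above — one shared linear map plus a task-specific combination of M hinge basis functions. The class name, the shared hinge locations, and the zero initialization of the coefficients are illustrative assumptions, not the paper's actual parameterization or code:

```python
import torch
import torch.nn as nn

class AdaptiveActivationLayer(nn.Module):
    """Sketch of an AAL: shared weight/bias, task-specific activation.

    Assumption: the task-t activation is ReLU plus a combination of M hinge
    basis functions, with per-task coefficients (the "coordinates").
    """
    def __init__(self, d_in, d_out, n_tasks, M=3):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)                       # shared across all tasks
        self.hinges = nn.Parameter(torch.linspace(-1.0, 1.0, M))   # basis hinge locations
        self.coords = nn.Parameter(torch.zeros(n_tasks, M))        # per-task coordinates

    def forward(self, x, task):
        z = self.linear(x)                                         # shared affine transform
        out = torch.relu(z)                                        # ReLU term, as in APL
        for m in range(self.hinges.shape[0]):
            out = out + self.coords[task, m] * torch.relu(-z + self.hinges[m])
        return out

# Usage: two tasks share one hidden layer and differ only in their activations.
layer = AdaptiveActivationLayer(d_in=16, d_out=32, n_tasks=2)
x = torch.randn(4, 16)
h_task0 = layer(x, task=0)
h_task1 = layer(x, task=1)
```

With the coordinates initialized to zero, every task starts from a plain ReLU and only diverges from the other tasks as its coefficients are learned.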
