SLIDE 1
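Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting
Xilai Li 1*, Yingbo Zhou 2*, Tianfu Wu 1, Richard Socher 2, and Caiming Xiong 2
North Carolina State University 1, Salesforce Research 2
[Figure: a sequence of tasks (Task 1, Task 2, ..., Task i-1, Task i), each paired with a model (Model 1, Model 2, ..., Model i-1, Model i)]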
SLIDE 2
SLIDE 3
- Fixed structure: will eventually be limited by its capacity
- Manual growth is sub-optimal
[Figure: two networks with layers S1 to S4 and task-specific layers TA and TB. Legend: input, reused weight, new weight.]
Fixed structure (regularization-based methods, e.g. EWC, Kirkpatrick et al., 2016)
Uniform growth for each new task (Progressive Nets, Rusu et al., 2016)
Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting Poster #89
SLIDE 4
[Figure: three networks side by side. Layers S1 to S4 (fixed structure, uniform growth) and C1 to C4 (Learn-to-Grow), with task-specific layers TA and TB. Legend: input, reused weight, new weight, adapter, task-specific layer (previous), task-specific layer (current).]
Fixed structure (regularization-based methods, e.g. EWC, Kirkpatrick et al., 2016)
Uniform growth for each new task (Progressive Nets, Rusu et al., 2016)
Learn-to-Grow: structure search over the options “reuse”, “adaptation”, and “new”
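The per-layer choice among “reuse”, “adaptation”, and “new” can be illustrated with a minimal sketch. The slides do not say how the search is carried out, so this assumes each layer holds one learnable score per option and that the option with the highest softmax probability is kept. The function names and the scores are hypothetical, made-up values for illustration only.

```python
import math

# The three options named on the slide.
OPTIONS = ("reuse", "adaptation", "new")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_structure(layer_logits):
    """Pick the most probable option for each layer.

    layer_logits: one list of three scores per layer, ordered as OPTIONS.
    This argmax-over-softmax rule is an assumption, not the slide's method.
    """
    chosen = []
    for logits in layer_logits:
        probs = softmax(logits)
        best = max(range(len(OPTIONS)), key=lambda i: probs[i])
        chosen.append(OPTIONS[best])
    return chosen


# Hypothetical scores for a three-layer network.
layer_logits = [
    [0.1, 0.2, 1.5],   # layer 0
    [2.0, 0.3, -0.5],  # layer 1
    [1.2, 0.9, 0.1],   # layer 2
]
structure = select_structure(layer_logits)
print(structure)
```

With these made-up scores the first layer gets “new” and the remaining layers get “reuse”.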
SLIDE 5
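[Figure: timeline over tasks t-1, t, t+1; the training data $D_t^{\text{train}}$ arrives at task t]
- Super net: $\Theta_{t-1}$
- Structure optimization: $\theta_t = f(D_t^{\text{train}}, \Theta_{t-1})$
- Parameter optimization
- Transfer knowledge from previously learned tasks (super net $\Theta_{t-1}$)
- Store new knowledge: $\Theta_t = \Theta_{t-1} \cup \theta_t$
- Update “reused” knowledge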
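The update $\Theta_t = \Theta_{t-1} \cup \theta_t$ can be sketched as bookkeeping over named parameter blocks. This is only an illustration under stated assumptions: the super net is modeled as a dict from layer index to a list of block names, and `grow_super_net`, the block-naming scheme, and the example structure are all hypothetical. No actual weights or training are involved.

```python
def grow_super_net(super_net, structure, task_id):
    """Return (new_super_net, task_path) for one new task.

    super_net: dict mapping layer index -> list of parameter-block names
    structure: one (option, source_block) pair per layer, where option is
               "reuse", "adaptation", or "new"
    The input super net is left unmodified.
    """
    new_net = {layer: list(blocks) for layer, blocks in super_net.items()}
    path = []
    for layer, (option, source) in enumerate(structure):
        if option == "reuse":
            # Nothing is added; the task points at an existing block.
            path.append(source)
        elif option == "adaptation":
            # Keep the existing block and store a small task-specific adapter.
            adapter = f"{source}+adapter_t{task_id}"
            new_net.setdefault(layer, []).append(adapter)
            path.append(adapter)
        elif option == "new":
            # Store a fresh block for this layer.
            block = f"layer{layer}_t{task_id}"
            new_net.setdefault(layer, []).append(block)
            path.append(block)
        else:
            raise ValueError(f"unknown option: {option!r}")
    return new_net, path


# Hypothetical super net after task 1, and a searched structure for task 2.
theta_prev = {0: ["layer0_t1"], 1: ["layer1_t1"], 2: ["layer2_t1"]}
structure = [("new", None), ("reuse", "layer1_t1"), ("adaptation", "layer2_t1")]
theta_t, path = grow_super_net(theta_prev, structure, task_id=2)
print(theta_t)
print(path)
```

Only the “adaptation” and “new” choices enlarge the super net; a “reuse” choice adds nothing, which is what keeps growth from being uniform across layers.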
SLIDE 6
- The structure optimization results in “new” in the first layer and “reuse” for the rest.
- Ablation experiments validate the search results.
Qualitative analysis of the searched structure on Task 2 (Task 1: ImageNet)
➢ Learned structure is sensible
○ Similar tasks tend to share more structure and parameters
○ Distant tasks share less
SLIDE 7
Comparison between “tune reuse” and “fix reuse”
- “Tune” scoring higher than “fix” at a given task indicates “positive forward transfer”
- The “tune” curve going up indicates “positive backward transfer”
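The two readings above can be expressed as simple checks on accuracy values. This sketch assumes the “tune” and “fix” results are per-task accuracies, and that a curve is the accuracy on one earlier task measured after each later task is learned. The slides do not define the axes, so both assumptions, the function names, and the numbers are hypothetical.

```python
def forward_transfer_tasks(tune_acc, fix_acc):
    """Return the 1-based task indices where "tune" beats "fix"."""
    return [
        i
        for i, (tune, fix) in enumerate(zip(tune_acc, fix_acc), start=1)
        if tune > fix
    ]


def shows_backward_transfer(curve):
    """True if accuracy on an earlier task ends higher than it started.

    curve: accuracy on one fixed task, measured after each later task.
    """
    return curve[-1] > curve[0]


# Made-up accuracies for illustration only.
tune_acc = [0.90, 0.93, 0.91]
fix_acc = [0.90, 0.91, 0.92]
tune_curve_task1 = [0.90, 0.91, 0.92]

ft_tasks = forward_transfer_tasks(tune_acc, fix_acc)
bt = shows_backward_transfer(tune_curve_task1)
print(ft_tasks, bt)
```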
SLIDE 8
[Figure: two panels, Permuted-MNIST and Split-CIFAR100]
SLIDE 9