Multi-Task Learning: Models, Optimization and Applications
Linli Xu
University of Science and Technology of China

Outline
• Introduction to multi-task learning (MTL): problem and models
• Multi-task learning with task-feature co-clusters
• Low-rank optimization in multi-task learning
• Multi-task learning applied to trajectory regression
2016/11/5

Multiple Tasks
Examination Scores Prediction¹ (Argyriou et al., 2008)

School | Student id | Birth year | Previous score | … | School ranking | … | Exam score
School 1 - Alverno High School | 72981 | 1985 | 95 | … | 83% | … | ?
School 138 - Jefferson Intermediate School | 31256 | 1986 | 87 | … | 72% | … | ?
School 139 - Rosemead High School | 12381 | 1986 | 83 | … | 77% | … | ?

The input columns are partly student-dependent and partly school-dependent.
¹ The Inner London Education Authority (ILEA)

Learning Multiple Tasks
Learning each task independently
[Figure: the score tables of School 1 (1st task), School 138 (138th task) and School 139 (139th task), each used on its own to predict that school's missing exam score]

Learning Multiple Tasks
Learning multiple tasks simultaneously
[Figure: the score tables of School 1 (1st task), …, School 138 (138th task) and School 139 (139th task), used together to predict the missing exam scores]
• Learn tasks simultaneously
• Model the task relationships

Multi-Task Learning
• Different from single task learning
[Figure: Single Task Learning. The training data of Task 1, Task 2, …, Task m are each trained separately, giving one model per task]
• Training multiple tasks simultaneously to exploit task relationships
[Figure: Multi-Task Learning. The training data of Task 1, Task 2, …, Task m go through one joint training step, giving one model per task]

Exploiting Task Relationships
Key challenge in multi-task learning: exploiting (statistical) relationships between the tasks so as to improve individual and/or overall predictive accuracy (in comparison to training individual models).

How Are Tasks Related?
• All tasks are related
– Models of all tasks are close to each other;
– Models of all tasks share a common set of features;
– Models share the same low-rank subspace
• Structure in tasks
– clusters / graphs / trees
• Learning with outlier tasks

Regularization-based Multi-Task Learning
[Figure: for each task 1, …, m, a feature matrix $X_i$ ($n_i$ samples × $d$ dimensions) and a target vector $Y_i$ ($n_i$ samples) are used to learn the model matrix $W$ ($d$ dimensions × $m$ tasks)]
We focus on linear models: $Y_i \sim X_i \mathbf{w}_i$, with $X_i \in \mathbb{R}^{n_i \times d}$, $Y_i \in \mathbb{R}^{n_i \times 1}$, $W = [\mathbf{w}_1, \mathbf{w}_2, \dots, \mathbf{w}_m] \in \mathbb{R}^{d \times m}$
Generic framework:
$$\min_W \sum_i \mathrm{Loss}(W, X_i, Y_i) + \lambda\, \mathrm{Reg}(W)$$
Impose various types of relations on tasks with $\mathrm{Reg}(W)$
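The generic framework can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the slide does not fix a loss, so a squared loss is assumed here, and the function names (`squared_loss`, `mtl_objective`, `frobenius_sq`) are hypothetical. Column i of the d × m model matrix `W` is taken to be the weight vector of task i.

```python
import numpy as np

def squared_loss(W, Xs, Ys):
    """Sum of per-task squared errors (an assumed choice of loss).

    W  : (d, m) model matrix, column i is the model of task i
    Xs : list of m feature matrices, Xs[i] has shape (n_i, d)
    Ys : list of m target vectors, Ys[i] has shape (n_i,)
    """
    return float(sum(0.5 * np.sum((X @ W[:, i] - y) ** 2)
                     for i, (X, y) in enumerate(zip(Xs, Ys))))

def frobenius_sq(W):
    """A placeholder regularizer: squared Frobenius norm of W."""
    return float(np.sum(W ** 2))

def mtl_objective(W, Xs, Ys, lam, reg):
    """Loss over all tasks plus lam times a task-coupling regularizer reg(W)."""
    return squared_loss(W, Xs, Ys) + lam * reg(W)
```

Swapping `reg` for a different function is what distinguishes the individual regularization-based methods; the loss term stays the same.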

MTL Methods: Mean-Regularized MTL
Evgeniou & Pontil, 2004 KDD
Assumption: model parameters of all tasks are close to each other.
– Advantage: simple, intuitive, easy to implement
– Disadvantage: may be too simple to capture richer task relationships
Regularization
– Penalizes the deviation of each task from the mean
$$\min_W \mathrm{Loss}(W) + \lambda \sum_{i=1}^{m} \left\| W_i - \frac{1}{m} \sum_{s=1}^{m} W_s \right\|_2^2$$
where $W_i$ denotes the model (column) of task $i$.
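As a minimal sketch of this penalty, assuming the models are stored as the columns of a d × m NumPy array, the regularizer and its gradient can be written as below. The function names are hypothetical and not taken from the cited paper.

```python
import numpy as np

def mean_regularizer(W):
    """Sum over tasks of the squared distance between each column of W
    and the average column (the mean model)."""
    mean_model = W.mean(axis=1, keepdims=True)  # (d, 1) average over tasks
    return float(np.sum((W - mean_model) ** 2))

def mean_regularizer_grad(W):
    """Gradient of mean_regularizer with respect to W.

    The penalty equals ||W C||_F^2 with C the (symmetric, idempotent)
    centering matrix, so the gradient is 2 W C = 2 (W - mean model).
    """
    return 2.0 * (W - W.mean(axis=1, keepdims=True))
```

The penalty is zero exactly when all task models coincide, which is why it pulls every task toward the shared mean.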

MTL Methods: Joint Feature Learning
Evgeniou et al. 2006 NIPS, Obozinski et al. 2009 Stat Comput, Liu et al. 2010 Technical Report
Assumption: models of all tasks share a common set of features
– Using group sparsity: $\ell_{1,q}$-norm regularization
[Figure: model matrix $W$ with rows Feature 1, …, Feature d and columns Task 1, Task 2, …, Task m]
Regularization
– $\|W\|_{1,q} = \sum_{i=1}^{d} \|\mathbf{w}^i\|_q$, where $\mathbf{w}^i$ is the $i$-th row of $W$
– When $q > 1$ we have group sparsity
$$\min_W \mathrm{Loss}(W) + \lambda \|W\|_{1,q}$$
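To make the group-sparsity effect concrete, here is a small sketch assuming a d × m model matrix whose rows are features and whose columns are tasks. It computes the row-wise norm and, for the special case q = 2, the standard row-wise shrinkage (proximal) step. The names `l1q_norm` and `prox_rowwise_l2` are hypothetical, and the choice of a proximal step is an illustration rather than the optimizer used in the cited papers.

```python
import numpy as np

def l1q_norm(W, q=2):
    """Sum over features (rows of W) of the q-norm of each row."""
    return float(np.sum(np.linalg.norm(W, ord=q, axis=1)))

def prox_rowwise_l2(W, t):
    """Proximal step for t * (sum of row 2-norms), i.e. the q = 2 case.

    Each row is shrunk toward zero by t in norm; rows whose norm is
    at most t are set to zero for every task at once.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(row_norms, 1e-12))
    return scale * W
```

Because a row is either kept or zeroed as a whole, a feature is selected or discarded jointly for all tasks, which matches the shared-feature assumption.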

MTL Methods: Low-Rank MTL
Ji et al. 2009 ICML
Assumption: in high-dimensional feature space, the linear models share the same low-rank subspace
Regularization
– Rank minimization formulation
$$\min_W \mathrm{Loss}(W) + \lambda \cdot \mathrm{rank}(W)$$
– Rank minimization is NP-hard for general loss functions
• Convex relaxation: nuclear norm minimization
$$\min_W \mathrm{Loss}(W) + \lambda \|W\|_*$$
($\|W\|_*$: sum of singular values of $W$)
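The nuclear norm and the way it encourages low rank can be illustrated with a short NumPy sketch. Singular value thresholding, shown here, is the standard proximal step for the nuclear norm; using it is an assumption for illustration and not necessarily the solver of the cited paper, and the function names are hypothetical.

```python
import numpy as np

def nuclear_norm(W):
    """Sum of the singular values of W."""
    return float(np.linalg.svd(W, compute_uv=False).sum())

def singular_value_threshold(W, t):
    """Shrink every singular value of W by t and clip at zero.

    Singular values below t vanish, so the result has lower rank.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt  # scales column j of U by s_j
```

For example, thresholding a diagonal matrix with entries 3 and 1 at t = 1 leaves a single non-zero singular value, so the rank drops from 2 to 1.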

MTL Methods: Clustered MTL
Zhou et al. 2011 NIPS
Assumption: cluster structure in tasks; the models of tasks from the same group are closer to each other than those from a different group
Regularization
– Capture clustered structures
$$\min_{W, F:\, F^T F = I_k} \mathrm{Loss}(W) + \alpha \left( \mathrm{tr}(W^T W) - \mathrm{tr}(F^T W^T W F) \right) + \beta\, \mathrm{tr}(W^T W)$$
– The $\alpha$ term captures cluster structures; the $\beta$ term improves generalization performance
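A small sketch may help show why the first trace term measures clustering quality. It assumes the task models are the columns of a d × m array and builds one admissible orthonormal matrix from hard cluster labels (a normalized indicator matrix); for that choice the trace difference equals the within-cluster sum of squares from k-means. The helper names are hypothetical, and restricting to hard assignments is a simplification of the constraint, which allows any matrix with orthonormal columns.

```python
import numpy as np

def cluster_indicator(labels, k):
    """Normalized m x k cluster indicator with orthonormal columns.

    Assumes every one of the k clusters contains at least one task.
    """
    labels = np.asarray(labels)
    F = np.zeros((labels.size, k))
    F[np.arange(labels.size), labels] = 1.0
    return F / np.sqrt(F.sum(axis=0, keepdims=True))

def clustered_penalty(W, F, alpha, beta):
    """alpha * (tr(W^T W) - tr(F^T W^T W F)) + beta * tr(W^T W)."""
    gram = W.T @ W  # m x m task-by-task inner products
    within = np.trace(gram) - np.trace(F.T @ gram @ F)
    return float(alpha * within + beta * np.trace(gram))
```

With three tasks whose models are (1, 0), (3, 0) and (10, 0) and clusters {task 1, task 2} and {task 3}, the trace difference is 2, the squared distances of the first two models to their cluster mean (2, 0).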

Regularization-based MTL: Decomposition Framework
• In practice, it is too restrictive to constrain all tasks to share a single structure.
• Assumption: the model is the sum of two components, $W = P + Q$
– A shared low-dimensional subspace and a task-specific component (Ando and Zhang, 2005, JMLR)
– A group-sparse component and a task-specific sparse component (Jalali et al., 2010, NIPS)
– A low-rank structure among relevant tasks + outlier tasks (Gong et al., 2011, KDD)

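MTL Methods: Robust MTL
Chen et al. 2011 KDD
Assumption: models share the same low-rank subspace + outlier tasks
[Figure: the model matrix $W$ (features × tasks) split as $W = P + Q$, with $P$ low rank and $Q$ column-sparse; the non-zero columns of $Q$ are the outlier tasks]
Regularization
– $\|P\|_*$: nuclear norm
– $\|Q\|_{2,1} = \sum_{j=1}^{m} \|\mathbf{q}_{:,j}\|_2$
$$\min_W \mathrm{Loss}(W) + \alpha \|P\|_* + \beta \|Q\|_{2,1}$$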
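The two penalty terms can be evaluated with a short NumPy sketch, assuming the low-rank part and the outlier part are given as two d × m arrays `P` and `Q` (tasks as columns). The function names are hypothetical; only the penalty is computed here, not the optimization over the two components.

```python
import numpy as np

def l21_norm_columns(Q):
    """Sum over tasks (columns of Q) of the 2-norm of each column."""
    return float(np.sum(np.linalg.norm(Q, axis=0)))

def robust_mtl_penalty(P, Q, alpha, beta):
    """alpha * nuclear norm of P plus beta * column-wise 2,1-norm of Q."""
    nuclear = np.linalg.svd(P, compute_uv=False).sum()
    return float(alpha * nuclear + beta * l21_norm_columns(Q))
```

A column of `Q` contributes to the penalty only if it is non-zero, so the second term favors solutions in which just a few tasks (the outliers) deviate from the shared low-rank part.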

Summary So Far…
• All multi-task learning formulations discussed above can fit into the $W = P + Q$ schema.
– Component $P$: shared structure
– Component $Q$: information not captured by the shared structure

Outline
• Introduction to multi-task learning (MTL): problem and models
• Multi-task learning with task-feature co-clusters
• Low-rank optimization in multi-task learning
• Multi-task learning applied to trajectory regression

Recap: How Are Tasks Related?
• All tasks are related
– Models of all tasks are close to each other;
– Models of all tasks share a common set of features;
– Models share the same low-rank subspace
• Structure in tasks
– clusters / graphs / trees
• Learning with outlier tasks
All of these relationships are defined at the task level.

How Tasks Are Related
• Existing methods consider the structure at a general task level
• This is a restrictive assumption in practice:
– In document classification: different tasks may be relevant to different sets of words
– In a recommender system: two users with similar tastes on one feature subset may have totally different preferences on another subset

CoCMTL: MTL with Task-Feature Co-Clusters
[Xu et al., AAAI 2015]
• Motivation: feature-level groups
[Figure: clustering on the bipartite graph of features and tasks]
• Impose task-feature co-clustering structure with $\mathrm{Reg}(W)$

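CoCMTL: Model
• Decomposition model: $W = P + Q$
$$\min_W \mathrm{Loss}(W) + \lambda_1 \Omega_1(P) + \lambda_2 \Omega_2(Q)$$
• $\Omega_2(Q) = \sum_{i=k+1}^{\min(d,m)} \sigma_i^2(Q)$ (non-convex)
$$\min_W \mathrm{Loss}(W) + \lambda_1\, \mathrm{tr}(P L P^T) + \lambda_2 \sum_{i=k+1}^{\min(d,m)} \sigma_i^2(Q)$$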
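The second regularizer, the sum of the squared singular values beyond the k largest, is straightforward to evaluate numerically. The sketch below assumes the component it is applied to is a d × m NumPy array; the function name `trailing_singular_energy` is hypothetical. By the Eckart-Young theorem this quantity equals the squared Frobenius distance from the matrix to its best rank-k approximation, so it vanishes exactly when the rank is at most k.

```python
import numpy as np

def trailing_singular_energy(Q, k):
    """Sum of squared singular values of Q from index k+1 to min(d, m)."""
    s = np.linalg.svd(Q, compute_uv=False)  # returned in descending order
    return float(np.sum(s[k:] ** 2))
```

For a diagonal matrix with entries 3, 2 and 1 and k = 1, the two trailing singular values are 2 and 1, giving 4 + 1 = 5.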
