Task Understanding From Confusing Multi-task Data
Xin SU Tsinghua University Yizhou JIANG Tsinghua University Shangqi GUO Tsinghua University Feng CHEN Tsinghua University
Motivation: From Narrow AI to AGI
Narrow AI: A specific task in a predetermined environment.
Multi-Task Learning: Comprehensive problems in different semantic spaces.
[Figure: fruit samples, each with a task annotation and a label annotation. Task 1 (Color): “Red”, “Green”, “Yellow”. Task 2 (Name): “Lemon”, “Apple”, “Apple”. Task 3 (Taste): “Sweet”, “Sour”, “Sour”. Example sample: “Yellow”, “Banana”, “Sweet”.]
AGI Problem: How can we learn task concepts from raw data?
Confusing Supervised Learning (CSL)
Confusing Data: “Red”, “Green”, “Yellow”, “Apple”, “Apple”, “Banana”, “Lemon”, “Sweet”, “Sour”, “Sour” (labels from all tasks mixed together).
Data De-confuse: the same labels are separated into task groups.
Task Understanding (Deconfusing Function) and Multi-Task Learning (Mapping Function).
Without task annotation: mapping conflicts arise between multiple tasks, e.g. the labels “Apple”, “Red”, and “Sweet”.
CSL: Learning task concepts by reducing mapping conflicts.
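Mapping Net and Deconfusing Net
Mapping Net objective: $\min_{g_k} L_{map}(g_k) = \sum_{i=1}^{m_k} \| y_i^k - g_k(x_i^k) \|^2, \quad k = 1, \dots, n$
Deconfusing Net objective: $\min_{h} L_{dec}(h) = \sum_{i=1}^{m} \| \bar{h}(x_i, y_i) - h(x_i, y_i) \|^2$, where $\bar{h}(x_i, y_i)$ denotes the temporary ground truth.
[Figure: Mapping-Net Training. The Deconfusing-Net takes $(x_i, y_i)$ and produces the sample assignment; the Mapping Net takes $x_i$ and produces the multi-task outputs $g_1(x_i), \dots, g_n(x_i)$, which enter $L_{map}$ together with $y_i$.]
[Figure: Deconfusing-Net Training. The Mapping Net takes $x_i$ and produces $g_1(x_i), \dots, g_n(x_i)$; an $\arg\min_k$ over these outputs gives the temporary ground truth, which enters $L_{dec}$ together with the Deconfusing-Net output $h(x_i, y_i)$.]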
Narrow AI: A specific task in a predetermined environment.
AI Success: Exceeded human-level performance on various problems.
Multi-Task Learning: Comprehensive problems in different semantic spaces.
[Figure: fruit samples, each with a task annotation and a label annotation. Task 1 (Color): “Red”, “Green”, “Yellow”. Task 2 (Fruit): “Lemon”, “Apple”, “Apple”. Task 3 (Taste): “Sweet”, “Sour”, “Sour”. Example sample: “Yellow”, “Banana”, “Sweet”.]
AGI Problem: How can we learn task concepts from raw data?
Confusing Data: Multi-task data without task annotation.
[Figure: mixed labels “Red”, “Green”, “Yellow”, “Apple”, “Apple”, “Banana”, “Lemon”, “Sweet”, “Sour”, “Sour”; example labels “Apple”, “Red”, “Sweet”.]
Multiple tasks cannot be represented by a single mapping function.
Task understanding is vital for multi-task learning.
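As a rough illustration of what confusing data looks like, the sketch below drops the task annotation from a small multi-task dataset and lists the inputs that end up with conflicting labels. The sample names, the dictionary layout, and the helper functions `drop_task_annotation` and `conflicting_inputs` are hypothetical and chosen only for this example.

```python
import random

# Hypothetical toy dataset: each sample has one label per task.
MULTI_TASK_DATA = {
    "apple_1": {"color": "Red", "name": "Apple", "taste": "Sweet"},
    "apple_2": {"color": "Green", "name": "Apple", "taste": "Sour"},
    "banana": {"color": "Yellow", "name": "Banana", "taste": "Sweet"},
    "lemon": {"color": "Yellow", "name": "Lemon", "taste": "Sour"},
}


def drop_task_annotation(data, seed=0):
    """Return shuffled (input, label) pairs with the task name removed."""
    pairs = [(x, label) for x, labels in data.items() for label in labels.values()]
    random.Random(seed).shuffle(pairs)
    return pairs


def conflicting_inputs(pairs):
    """Map each input to its label set, keeping only inputs with more than one label."""
    seen = {}
    for x, y in pairs:
        seen.setdefault(x, set()).add(y)
    return {x: ys for x, ys in seen.items() if len(ys) > 1}


confusing_pairs = drop_task_annotation(MULTI_TASK_DATA)
conflicts = conflicting_inputs(confusing_pairs)
for x in sorted(conflicts):
    print(x, sorted(conflicts[x]))
```

Every input here maps to three different labels, which is exactly the situation a single mapping function cannot represent.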
Supervised Learning & Latent Variable Learning: Mapping confusion.
Multi-Task Learning: Task annotation is needed.
Multi-Label Learning: Multiple labels are allocated.
Confusing Supervised Learning: No task annotation or sample allocation.
A novel learning problem!
Confusing Supervised Learning
Confusing Data: “Red”, “Green”, “Yellow”, “Apple”, “Apple”, “Banana”, “Lemon”, “Sweet”, “Sour”, “Sour” (labels from all tasks mixed together).
Data De-confuse: the same labels are separated into Task 1 (Color), Task 2 (Fruit), and Task 3 (Taste).
Task Understanding (Deconfusing Function) and Multi-Task Learning (Mapping Function).
Without task annotation: mapping conflicts arise between multiple tasks, e.g. the labels “Apple”, “Red”, and “Sweet”.
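Traditional Supervised Learning vs. Confusing Supervised Learning
Model: Mapping $g(x)$; Deconfusing $h(x, f, g)$.
Risk Functional Solution: Traditional Supervised Learning: $\min R(g^*) > 0$. Confusing Supervised Learning: $\min R(g^*, h^*) = 0$.
[Figure: X-Y plots of samples generated by function 1 and function 2.]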
Wrong allocation of confusing samples leads to unavoidable loss.
[Figure: two X-Y plots of samples and confusing samples with function 1, function 2, and function 3; panels labeled Loss > 0 and Loss ≈ 0.]
Task concept driven by global loss: the empirical risk should go towards 0!
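The gap between Loss > 0 and Loss ≈ 0 can be checked numerically. The following is a minimal sketch under assumed conditions: two hypothetical linear tasks, noise-free labels, and a least-squares line fit standing in for a mapping function. None of these choices come from the slides.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares line fit; returns (coefficients, sum of squared residuals)."""
    A = np.stack([x, np.ones_like(x)], axis=1)
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    return coef, float(np.sum((y - A @ coef) ** 2))


rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
task = rng.integers(0, 2, size=200)  # hidden task id (hypothetical toy setup)
y = np.where(task == 0, 2.0 * x + 1.0, -2.0 * x - 1.0)

# One mapping function for all samples: conflicting targets cannot both be fit.
single_loss = fit_line(x, y)[1] / len(x)

# One mapping function per task, using the correct sample allocation.
split_loss = sum(fit_line(x[task == k], y[task == k])[1] for k in (0, 1)) / len(x)

print(f"single function: {single_loss:.3f}, two functions: {split_loss:.3e}")
```

With a single function the mean squared error stays well above zero, while the correct allocation lets each function fit its own samples almost exactly.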
Optimization Target:
Expected Result:
Constraint: The output of the Deconfusing-Net is one-hot!
Difficulty: Approximating the one-hot output with Softmax leads to a trivial solution. Joint backpropagation (BP) is not available.
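Training of Mapping Net and Training of Deconfusing Net
Training of Mapping Net: $\min_{g_k} L_{map}(g_k) = \sum_{i=1}^{m_k} \| y_i^k - g_k(x_i^k) \|^2, \quad k = 1, \dots, n$
Training of Deconfusing Net: $\min_{h} L_{dec}(h) = \sum_{i=1}^{m} \| \bar{h}(x_i, y_i) - h(x_i, y_i) \|^2$, where $\bar{h}(x_i, y_i)$ denotes the temporary ground truth.
[Figure: Mapping-Net Training. The Deconfusing-Net takes $(x_i, y_i)$ and produces the sample assignment; the Mapping Net takes $x_i$ and produces the multi-task outputs $g_1(x_i), \dots, g_n(x_i)$, which enter $L_{map}$ together with $y_i$.]
[Figure: Deconfusing-Net Training. The Mapping Net takes $x_i$ and produces $g_1(x_i), \dots, g_n(x_i)$; an $\arg\min_k$ over these outputs gives the temporary ground truth, which enters $L_{dec}$ together with the Deconfusing-Net output $h(x_i, y_i)$.]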
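The alternating scheme can be sketched in a heavily simplified form. The code below is not the CSL-Net itself: it assumes linear mapping functions fitted by least squares, takes the arg min over k of the squared error as the temporary ground truth, and uses that assignment directly instead of training a parametric Deconfusing-Net. The names `design` and `alternate_fit` and the toy data are hypothetical.

```python
import numpy as np


def design(x):
    """Features [x, 1] for a linear mapping function g_k(x) = a_k * x + b_k."""
    return np.stack([x, np.ones_like(x)], axis=1)


def alternate_fit(x, y, n_tasks=2, n_iters=20, seed=0):
    """Alternate between sample assignment and refitting each mapping function."""
    rng = np.random.default_rng(seed)
    A = design(x)
    params = rng.normal(size=(n_tasks, 2))  # one (a_k, b_k) row per task
    losses = []
    for _ in range(n_iters):
        # Temporary ground truth (assumed form): argmin_k of |y_i - g_k(x_i)|^2.
        errors = (y[:, None] - A @ params.T) ** 2  # shape (m, n_tasks)
        assign = errors.argmin(axis=1)
        losses.append(float(errors[np.arange(len(x)), assign].mean()))
        # Mapping step: refit each g_k on the samples currently assigned to it.
        for k in range(n_tasks):
            mask = assign == k
            if mask.any():
                params[k] = np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
    return params, assign, losses


# Hypothetical confusing data: two linear tasks, task id hidden from the learner.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=200)
hidden_task = rng.integers(0, 2, size=200)
y = np.where(hidden_task == 0, 2.0 * x + 1.0, -2.0 * x - 1.0)

params, assign, losses = alternate_fit(x, y)
print("first loss:", losses[0], "last loss:", losses[-1])
```

In this simplified setting the recorded loss cannot increase from one iteration to the next, because both the assignment step and the refit step minimize the same squared error. Whether it reaches zero depends on the initialization.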
Results in the training process:
Supervised learning fails to fit multiple functions.
An incorrect number of tasks leads to confused fitting results.
CSL-Net learns reasonable task concepts and a complete multi-task mapping.
Each sample represents the classification result of only one task.
Two Learning Goals:
Two Evaluation Metrics:
Color: “Red”, “Green”, “Yellow”
Name: “Lemon”, “Apple”, “Banana”
Taste: “Sweet”, “Sour”, “Spicy”
Results on two confusing supervised datasets.
[Figure: “Before” and “After” panels for each of the two datasets.]
Feature Visualization of the Deconfusing Net.
The Deconfusing Net separates confusing samples into reasonable task groups.
A novel learning problem for general raw data: mapping without manual task annotation.
A novel learning paradigm: Confusing Supervised Learning
A novel network: CSL-Net
A novel application: a learning system towards general intelligence.