A short overview on “Reducing model bias in a deep learning classifier using domain adversarial neural networks in the MINERvA experiment”
Anushree Ghosh, UTFSM, Chile
Fermilab
Date: 2018-11-07
Outline
MINERvA detector and the
[Figure: event display marking the true vertex and the reconstructed vertex. Axes: plane number, strip number. Labels: targets 1 to 5, segments 0 to 10.]
Fiducial: within 85 cm apothem of beam spot
Target 1: fiducial mass Fe 323 kg, Pb 264 kg
Target 2: fiducial mass Fe 323 kg, Pb 266 kg
Target 3: fiducial mass C 166 kg, Fe 169 kg, Pb 121 kg
Target 4: fiducial mass Pb 228 kg
Target 5: fiducial mass Fe 161 kg, Pb 135 kg
Water target: fiducial mass 625 kg H2O
Helium target: fiducial mass 0.25 tons
4 tracker modules between each target
Active tracker
Materials: CH, carbon, iron, lead
[Diagram: the x view, u view and v view each pass through a convolutional unit; the outputs feed the label predictor.]
Network architecture (the x, u and v views pass through identical towers)
Data: hits-x, height 127, width 50; hits-u, height 127, width 25; hits-v, height 127, width 25
Convolution unit 1: Convolution (12 outputs, kernel size (8,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Convolution unit 2: Convolution (20 outputs, kernel size (7,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Convolution unit 3: Convolution (28 outputs, kernel size (7,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Convolution unit 4: Convolution (36 outputs, kernel size (7,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Fully connected, per view: InnerProduct (196 outputs), ReLU, Dropout
Fully connected, combined: InnerProduct (128 outputs), ReLU, Dropout, InnerProduct (11 outputs)
Loss: Softmax w/ Loss
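To make the layer list above easier to check, the following sketch traces the feature-map shape of one view through the four convolution units. It is not taken from the slides: it assumes single-channel inputs, stride-1 convolutions without padding, and a selectable pooling rounding rule (rounding up is the default here because the layer names resemble Caffe's, whose pooling rounds up). The names conv_out, pool_out and tower_shape are hypothetical helpers.

```python
# Shape trace for one view tower; kernel sizes and output counts come from the slide.
# Assumptions (not stated in the slides): stride-1 convolutions, no padding,
# one input channel, pooling rounding controlled by ceil_mode.

CONV_UNITS = [(12, (8, 3)), (20, (7, 3)), (28, (7, 3)), (36, (7, 3))]
POOL_KERNEL = (2, 1)
POOL_STRIDE = (2, 1)


def conv_out(size, kernel, stride=1, pad=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, ceil_mode=True):
    """Output length of a pooling layer along one axis."""
    span = size - kernel
    # -(-a // b) is integer ceiling division.
    steps = -(-span // stride) if ceil_mode else span // stride
    return steps + 1


def tower_shape(height, width, ceil_mode=True):
    """Return (channels, height, width) after the four convolution units."""
    channels = 1
    for channels, (kh, kw) in CONV_UNITS:
        height, width = conv_out(height, kh), conv_out(width, kw)
        height = pool_out(height, POOL_KERNEL[0], POOL_STRIDE[0], ceil_mode)
        width = pool_out(width, POOL_KERNEL[1], POOL_STRIDE[1], ceil_mode)
    return channels, height, width


if __name__ == "__main__":
    for view, width in (("x", 50), ("u", 25), ("v", 25)):
        print(view, tower_shape(127, width))
```

Under these assumptions the height shrinks from 127 to a handful of rows while the width loses only two columns per convolution, which is consistent with pooling acting along the height only (kernel and stride of 1 along the width).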
[Figure: four panels of reconstructed z-segment against true z-segment: row normalized (Tracking), log10 row normalized (Tracking), row normalized (DNN), log10 row normalized (DNN).]
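As a reminder of what "row normalized" means for these panels, here is a minimal sketch of the normalization. It assumes that each row of the count matrix corresponds to one true z-segment and each column to one reconstructed z-segment; the slides do not say which axis is the row, and the exact scaling behind the log10 panels is not recoverable, so log10_nonzero simply takes log10 of the normalized fractions. Both function names are hypothetical.

```python
import math


def row_normalize(counts):
    """Divide each row by its sum so the entries of a row are fractions.

    Assumption: counts[i][j] is the number of events with true segment i
    that were reconstructed in segment j. Empty rows are left as zeros.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


def log10_nonzero(matrix):
    """log10 of each positive entry; zero entries become None (no log)."""
    return [[math.log10(v) if v > 0 else None for v in row] for row in matrix]


if __name__ == "__main__":
    toy_counts = [[8, 2], [1, 3]]  # made-up numbers, for illustration only
    fractions = row_normalize(toy_counts)
    print(fractions)
    print(log10_nonzero(fractions))
```

With this convention the diagonal of the normalized matrix is the fraction of events in each true segment that is assigned to the correct segment, and the log scale makes the small off-diagonal entries visible.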
http://adsabs.harvard.edu/cgi-bin/bib_query?arXiv:1505.07818
[Diagram: the X view, U view and V view each pass through a convolutional unit, followed by an inner product layer that feeds both the label predictor and the domain classifier.]
DANN architecture (the x, u and v views pass through identical towers)
Data: hits-x, height 127, width 50; hits-u, height 127, width 25; hits-v, height 127, width 25
Convolution unit 1: Convolution (12 outputs, kernel size (8,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Convolution unit 2: Convolution (20 outputs, kernel size (7,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Convolution unit 3: Convolution (28 outputs, kernel size (7,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Convolution unit 4: Convolution (36 outputs, kernel size (7,3)), ReLU, MaxPooling (kernel size (2,1), stride (2,1))
Fully connected, per view: InnerProduct (196 outputs), ReLU, Dropout
Fully connected, combined: InnerProduct (128 outputs), ReLU, Dropout, InnerProduct (11 outputs)
Domain classifier: Gradient Reversal, InnerProduct (1024 outputs), ReLU, Dropout, InnerProduct (1024 outputs), ReLU, Dropout, InnerProduct (1 output), Sigmoid Cross Entropy Loss
Label predictor: Split (target features, source features), Silence, InnerProduct (11 outputs), Softmax w/ Loss
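The gradient reversal step in the domain classifier branch can be illustrated with a small sketch. It is the identity in the forward pass and multiplies the incoming gradient by minus lambda in the backward pass, so the shared features are pushed to make the domains harder to tell apart. The lambda schedule shown is the one proposed in the DANN paper linked earlier (arXiv:1505.07818); whether this training used that schedule is an assumption, and the function names are hypothetical.

```python
import math


def grl_forward(features):
    """Forward pass: gradient reversal leaves the features unchanged."""
    return list(features)


def grl_backward(upstream_grad, lam):
    """Backward pass: flip the sign of the gradient and scale it by lam."""
    return [-lam * g for g in upstream_grad]


def lambda_schedule(progress, gamma=10.0):
    """Ramp lam from 0 to nearly 1 as training progress goes from 0 to 1.

    Schedule from the DANN paper; assumed here, not confirmed by the slides.
    """
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


if __name__ == "__main__":
    feats = [0.3, -1.2]
    grad_from_domain_loss = [0.5, -1.0]
    lam = lambda_schedule(0.5)
    print(grl_forward(feats))
    print(grl_backward(grad_from_domain_loss, lam))
```

Starting lambda near zero lets the label predictor settle before the adversarial signal from the domain classifier becomes strong.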
Sample configurations
N/A, FSI on: out of domain
N/A, FSI on: in domain
FSI off (1.2M), FSI on: out of domain with in-domain DANN partner
FSI off (0.6M), FSI off (0.6M), FSI on: out of domain with in-domain DANN partner (half sample)
Expectations
The CNN trained in domain will perform better than the CNN trained out of domain.
Although the DANN model is trained out of domain, it should perform similarly to the in-domain CNN, because its DANN partner is in domain.
Upstream of target 1
Between targets 1 and 2
Between targets 2 and 3
Between targets 3 and 4
Between targets 4 and 5
Downstream of target 5