Full Stack Deep Learning
Troubleshooting Deep Neural Networks Josh Tobin, Sergey Karayev, Pieter Abbeel
Full Stack Deep Learning (March 2019) Pieter Abbeel, Sergey Karayev, Josh Tobin L6: Troubleshooting
Lifecycle of an ML project. Per-project activities: Planning & project setup → Data collection & labeling → Training & debugging → Deploying & testing. Cross-project infrastructure: Team & hiring; Infra & tooling.
XKCD, https://xkcd.com/1838/
He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
Andrej Karpathy, CS231n course notes He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet classification." Proceedings of the IEEE international conference on computer vision. 2015.
Why is model performance poor? Competing sources: implementation bugs, hyperparameter choices, data/model fit, and dataset construction.
[Slide: "Amount of lost sleep over..." (PhD vs. Tesla), from Andrej Karpathy's talk "Building the Software 2.0 Stack" at TrainAI 2018, 5/10/2018]
The troubleshooting loop: Start simple → Implement & debug → Evaluate → Tune hyper-parameters / Improve model & data → repeat until the model meets requirements.
Overview
- Start simple (e.g., LeNet on a subset of your data)
- Implement & debug (reproduce a known result)
- Evaluate (decide what to do next)
- Tune hyper-parameters
- Improve model/data (more data, or regularize, if you overfit)
Choose a target for your model, e.g., from required performance, published results, previous baselines, etc.

Running example: pedestrian classification, labels 0 (no pedestrian) and 1 (yes pedestrian). Goal: 99% classification accuracy.
Start simple: steps
a. Choose a simple architecture
b. Use sensible defaults
c. Normalize inputs
d. Simplify the problem
Choosing an architecture:
- Images: start with a LeNet-like architecture; consider using ResNet later.
- Sequences: start with an LSTM with one hidden layer (or temporal convs); consider an attention model or a WaveNet-like model later.
- Other: start with a fully connected neural net with one hidden layer; later choices are problem-dependent.
Example with multiple input modalities: an image input goes through a ConvNet and is flattened; a text input ("This" "is" "a" "cat") goes through an LSTM; the three branch outputs (72-dim, 48-dim, and 64-dim) are concatenated into a 184-dim vector and passed through two FC layers (256-dim, then 128-dim) to a T/F output.
c. Normalize inputs (e.g., by dividing by 255 for images). [Careful, make sure your library doesn't do it for you!]
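As a minimal sketch (assuming numpy and a hypothetical uint8 image batch), scaling to [0, 1] is one line, and the comment flags the double-normalization pitfall:

```python
import numpy as np

# Hypothetical uint8 image batch; real data would come from your loader.
images = np.array([[0, 128], [255, 64]], dtype=np.uint8)

# Scale to [0, 1]. Careful: if your data-loading library already rescales,
# dividing by 255 again silently shrinks all inputs by another factor of 255.
x = images.astype(np.float32) / 255.0
print(x.min(), x.max())  # 0.0 1.0
```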
d. Simplify the problem (e.g., a smaller dataset, a smaller image size, etc.).

Running example, simplified: 0 (no pedestrian) vs. 1 (yes pedestrian); a subset of images for training, 1,000 for val, and 500 for test; sigmoid cross-entropy loss; goal: 99% classification accuracy.
Start simple: summary
- Choose a simple architecture (e.g., LeNet-like or Fully Connected)
- Use sensible defaults (e.g., no regularization to start)
- Normalize inputs (subtract the mean and divide by std, or just divide by 255 for images)
- Simplify the problem (e.g., start with a smaller version of your problem, such as a smaller dataset)
Implement & debug: steps
a. Get your model to run
b. Overfit a single batch
c. Compare to a known result
The most common DL bugs:
- Incorrect tensor shapes: can fail silently! E.g., accidental broadcasting: x.shape = (None,), y.shape = (None, 1), (x + y).shape = (None, None)
- Pre-processing inputs incorrectly: e.g., forgetting to normalize, or too much pre-processing
- Incorrect input to the loss function: e.g., softmaxed outputs to a loss that expects logits
- Forgetting to set up train mode correctly: e.g., toggling train/eval, controlling batch norm dependencies
- Numerical instability: often stems from using an exp, log, or div operation
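The accidental-broadcasting bug above can be reproduced in a few lines of numpy (shapes chosen to mirror the (None,) / (None, 1) example):

```python
import numpy as np

# Adding a rank-1 array to a column vector does not error:
# broadcasting silently produces a full matrix instead.
x = np.zeros(3)        # shape (3,)   -- stands in for (None,)
y = np.zeros((3, 1))   # shape (3, 1) -- stands in for (None, 1)
print((x + y).shape)   # (3, 3) -- not (3,) and not (3, 1)
```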
General advice for implementing your model:
- Lightweight implementation: keep new code minimal for v1 (tested infrastructure components are fine)
- Use off-the-shelf components, e.g., a built-in dense layer instead of tf.nn.relu(tf.matmul(W, x)), and a built-in cross-entropy loss instead of writing out the exp yourself
- Build complicated data pipelines later: start with a dataset you can load into memory
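A sketch of why a built-in, logits-based loss matters (the stable binary cross-entropy formulation here is standard; the numbers are illustrative): feeding already-squashed outputs to a loss that expects logits silently gives the wrong value.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_from_logits(z, y):
    # Numerically stable binary cross-entropy on raw logits:
    # max(z, 0) - z*y + log(1 + exp(-|z|))
    return np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))

z, y = 3.0, 1.0
good = bce_from_logits(z, y)           # pass logits, as the loss expects
bad = bce_from_logits(sigmoid(z), y)   # bug: outputs squashed twice
print(round(good, 3), round(bad, 3))   # 0.049 0.326
```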
a. Get your model to run. Common issues and recommended resolutions:
- Shape mismatch or casting issue: step through model creation and inference in a debugger
- OOM: scale back memory-intensive operations
- Other: standard debugging toolkit (Stack Overflow + interactive debugger)
Debugging in TensorFlow:
- Option 1: step through graph creation in a debugger.
- Option 2: step into the training loop and evaluate tensors using sess.run(…).
- Option 3: use tfdbg, which stops execution at each sess.run(…) and lets you inspect tensors:
  python -m tensorflow.python.debug.examples.debug_mnist --debug
Shape mismatch. Common issues: undefined shapes and incorrect shapes. Most common causes: reducing (sum/mean/softmax) over the wrong dimension; extra singleton dimensions left in place (e.g., when a shape is (None, 1, 1, 4)); data stored on disk in a different dtype than loaded (e.g., stored a float64 numpy array, and loaded it as a float32); confusing static shapes from tensor.get_shape() with dynamic shapes (e.g., when loading data from a file).
Casting issue. Common issue: data not in float32. Most common causes: image data left as uint8, or numpy data generated in float64, with no cast to float32.
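A minimal numpy illustration of the uint8 pitfall (hypothetical values): integer arithmetic wraps around silently instead of raising an error.

```python
import numpy as np

# uint8 arithmetic wraps modulo 256 instead of erroring:
img = np.array([200, 100], dtype=np.uint8)
print(img + img)                      # [144 200] -- 400 wrapped around
print(img.astype(np.float32) + img)   # [400. 200.] -- correct after the cast
```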
Out of memory (OOM). Common issues and most common causes:
- Too big a tensor (e.g., too large a batch size or layers)
- Too much data (e.g., loading the whole dataset into memory rather than using an input queue, or allocating too much during evaluation)
- Duplicating operations (e.g., creating multiple models in the same session, or creating an op inside a function that gets called over and over again)
- Other processes also using GPU memory
Other common errors: miscellaneous bugs, often from setup steps missed before training.
b. Overfit a single batch. Common issues and most common causes:
- Error goes up: e.g., a flipped sign on the loss function or gradient
- Error explodes: e.g., a numerical issue, or a learning rate that is too high
- Error oscillates: e.g., a learning rate that is too high, or corrupted data/labels (e.g., zeroed, incorrectly shuffled, or preprocessed incorrectly)
- Error plateaus: e.g., a learning rate that is too low, gradients not flowing through the whole model, too much regularization, or incorrect input to the loss function (e.g., softmax instead of logits, accidentally add ReLU on output)
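The check itself can be sketched in a few lines (a hypothetical stand-in model: logistic regression trained with plain gradient descent on one memorizable batch; your real model and optimizer go in its place). A working training loop should drive the loss on one small batch toward zero.

```python
import numpy as np

# One small, memorizable batch: label is the sign of the first feature.
X = np.array([[ 1.0, 0, 0, 0], [-1.0, 0, 0, 0],
              [ 2.0, 1, 0, 0], [-2.0, 1, 0, 0],
              [ 1.5, 0, 1, 0], [-1.5, 0, 1, 0],
              [ 1.0, 0, 0, 1], [-1.0, 0, 0, 1]])
y = (X[:, 0] > 0).astype(float)

w, b, lr = np.zeros(4), 0.0, 1.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # forward pass (sigmoid)
    grad = (p - y) / len(y)                  # d(mean BCE)/d(logits)
    w -= lr * (X.T @ grad)                   # backward + update
    b -= lr * grad.sum()

loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
print(loss < 0.1)  # True; if the loss refuses to approach 0, go bug hunting
```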
c. Compare to a known result. Kinds of known results, from more useful to less useful:
- An official model implementation evaluated on a similar dataset to yours: you can walk through the code and ensure you have the same output
- An official model implementation evaluated on a benchmark dataset (e.g., MNIST)
- An unofficial model implementation: same, but with lower confidence
- Results from a paper (with no code): you can check that performance is in line with expectations
- Results from your model on a benchmark dataset (e.g., MNIST): you can make sure your model works in a simpler setting
- Results from a similar model on a similar dataset: a general sense of what performance can be expected
- Super simple baselines (e.g., linear regression): you can make sure your model is learning anything at all
Implement & debug: summary
a. Get your model to run: look for shape, casting, and OOM errors
b. Overfit a single batch: look for corrupted data or labels, learning rate issues, regularization, broadcasting errors
c. Compare to a known result: make sure performance is up to expectations
[Chart: breakdown of test error by source. Irreducible error: 25; + avoidable bias (i.e., under-fitting): 2 → train error: 27; + variance (i.e., over-fitting): 5 → val error: 32; + val set over-fitting: 2 → test error: 34.]
Test error = irreducible error + bias + variance + val overfitting. This assumes train, val, and test all come from the same distribution. What if they don't?
When train and test data come from different distributions, use two val sets: one sampled from the training distribution (a "train-val" set) and one from the test distribution (a "test-val" set).
[Chart: breakdown of test error by source, with distribution shift. Irreducible error: 25; + avoidable bias (i.e., under-fitting): 2 → train error: 27; + variance: 2 → train-val error: 29; + distribution shift: 3 → test-val error: 32; + val over-fitting: 2 → test error: 34.]
Running example (goal: 99% classification accuracy):

Error source       Value
Goal performance   1%
Train error        20%
Validation error   27%
Test error         28%

- Train - goal = 19% (under-fitting)
- Val - train = 7% (over-fitting)
- Test - val = 1% (looks good!)
Test error = irreducible error + bias + variance + distribution shift + val overfitting
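With the chart's illustrative numbers, this decomposition is simple arithmetic:

```python
# Illustrative error values (in %) from the breakdown chart above.
irreducible, train, train_val, test_val, test = 25, 27, 29, 32, 34

avoidable_bias  = train - irreducible       # under-fitting:       2
variance        = train_val - train         # over-fitting:        2
dist_shift      = test_val - train_val      # distribution shift:  3
val_overfitting = test - test_val           # val set overfitting: 2

total = irreducible + avoidable_bias + variance + dist_shift + val_overfitting
print(total == test)  # True: the terms sum back to the test error
```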
Improve model/data: steps
a. Address under-fitting
b. Address over-fitting
c. Address distribution shift
d. Re-balance datasets (if applicable)
a. Address under-fitting (try first → try later):
A. Make your model bigger (i.e., add layers or use more units per layer)
B. Reduce regularization
C. Error analysis
D. Choose a different (closer to state-of-the-art) model architecture (e.g., move from LeNet to ResNet)
E. Tune hyper-parameters (e.g., learning rate)
F. Add features
Running example (goal: 99% classification accuracy, i.e., 1% error), addressing under-fitting:

Error source       Baseline   +More ConvNet layers   +Switch to ResNet-101   +Tune learning rate
Goal performance   1%         1%                     1%                      1%
Train error        20%        7%                     3%                      0.8%
Validation error   27%        19%                    10%                     12%
Test error         28%        20%                    10%                     12%
b. Address over-fitting (try first → try later; the last three are not recommended!):
A. Add more training data (if possible!)
B. Add normalization (e.g., batch norm, layer norm)
C. Add data augmentation
D. Increase regularization (e.g., dropout, L2, weight decay)
E. Error analysis
F. Choose a different (closer to state-of-the-art) model architecture
H. Early stopping
I. Remove features
J. Reduce model size
Running example (goal: 99% classification accuracy), addressing over-fitting:

Error source       Baseline   +250,000 examples   +Weight decay   +Data augmentation   +Tuning*
Goal performance   1%         1%                  1%              1%                   1%
Train error        0.8%       1.5%                1.7%            2%                   0.6%
Validation error   12%        5%                  4%              2.5%                 0.9%
Test error         12%        6%                  4%              2.6%                 1.0%

* Tune num layers, optimizer params, weight initialization, kernel size, weight decay.
c. Address distribution shift (try first → try later):
A. Analyze test-val set errors & collect more training data to compensate
B. Analyze test-val set errors & synthesize more training data to compensate
C. Apply domain adaptation techniques to training & test distributions
Error analysis: compare test-val set errors with train-val set errors (here, cases where no pedestrian was detected):
- Error type 1: hard-to-see pedestrians
- Error type 2: reflections
- Error type 3 (test-val only): night scenes
Error type                Error % (train-val)   Error % (test-val)   Potential solutions   Priority
Hard-to-see pedestrians   0.1%                  0.1%                 …                     Low
Reflections               0.3%                  0.3%                 …                     Medium
Night scenes              0.1%                  1%                   …                     High
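One way to read the table programmatically (a sketch using the table's numbers): rank error types by how much they widen the gap between train-val and test-val error, since that gap is what distribution-shift fixes can recover.

```python
# Error rates per type, as (train-val %, test-val %), from the table above.
errors = {
    "hard-to-see pedestrians": (0.1, 0.1),
    "reflections": (0.3, 0.3),
    "night scenes": (0.1, 1.0),
}

# Rank by contribution to the train-val -> test-val gap.
ranked = sorted(errors, key=lambda k: errors[k][1] - errors[k][0], reverse=True)
print(ranked[0])  # night scenes: the only type concentrated in test-val
```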
Domain adaptation. What is it? Techniques to train on a "source" distribution and generalize to another "target" distribution using only unlabeled data or limited labeled data. When should you consider using it? When access to labeled data from the test distribution is limited, and access to relatively similar data is plentiful.
Types of domain adaptation:

Type            Use case                                                  Example techniques
Supervised      You have limited data from the target domain             Fine-tune a pre-trained model; add target data to the train set
Un-supervised   You have lots of un-labeled data from the target domain  Correlation alignment (CORAL), and related techniques
d. Re-balance datasets (if applicable): if test-val performance looks significantly better than test performance, you overfit to the val set. This can happen with small val sets or lots of hyper-parameter tuning.
Hyper-parameter optimization. Running example model & optimizer choices: network: ResNet; optimizer: Adam; regularization: …
Choosing hyper-parameters to tune (e.g., if you are using all-zeros weight initialization or vanilla SGD, changing to the defaults will make a big difference):

Hyperparameter                              Approximate sensitivity
Learning rate                               High
Learning rate schedule                      High
Optimizer choice                            Low
Other optimizer params (e.g., Adam beta1)   Low
Batch size                                  Low
Weight initialization                       Medium
Loss function                               High
Model depth                                 Medium
Layer size                                  High
Layer params (e.g., kernel size)            Medium
Weight of regularization                    Medium
Nonlinearity                                Low
Method: manual hyper-parameter optimization. How it works: understand how each hyper-parameter affects training (e.g., a higher learning rate can mean faster but less stable training), then train, evaluate, and guess a better value; it can be combined with other methods (e.g., manually select parameter ranges to search over). Advantages: for a skilled practitioner, may require the least computation to get a good result. Disadvantages: requires a detailed understanding of the algorithm, and is time-consuming.
Method: grid search (e.g., over hyperparameter 1: batch size; hyperparameter 2: learning rate). How it works: evaluate all cross-combos of hyper-parameter values on a grid. Advantages: simple, and covers the grid exhaustively. Disadvantages: inefficient, since it must train on all cross-combos of hyper-parameters, and may require prior knowledge about the parameters to get good results.
Method: random search (over the same ranges). How it works: sample hyper-parameter combinations at random. Advantages: often finds good values with fewer runs than grid search. Disadvantages: less interpretable, and may require prior knowledge about the parameters to get good results.
Method: coarse-to-fine search. How it works: run a random search over a wide range, find the best performers, narrow the ranges in around them, and repeat. Advantages: can zero in on very high-performing hyperparameters. Disadvantages: a somewhat manual process.
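Random and coarse-to-fine search can be sketched as follows (hypothetical ranges; train_and_eval is a stand-in for a real training run that pretends the optimum sits near learning rate 10**-3.5):

```python
import math
import random

random.seed(0)

def sample_config(lr_range=(-5.0, -2.0)):
    # Sample each hyper-parameter independently; learning rate log-uniformly.
    return {
        "learning_rate": 10 ** random.uniform(*lr_range),
        "batch_size": random.choice([32, 64, 128, 256]),
    }

def train_and_eval(cfg):
    # Hypothetical stand-in for a real training run, returning a "val error".
    return abs(math.log10(cfg["learning_rate"]) + 3.5)

# Random search: evaluate 20 independently sampled configs, keep the best.
coarse = [sample_config() for _ in range(20)]
best = min(coarse, key=train_and_eval)

# Coarse-to-fine: narrow the learning-rate range in around the best performer.
center = math.log10(best["learning_rate"])
fine = [sample_config((center - 0.5, center + 0.5)) for _ in range(20)]
best = min(coarse + fine, key=train_and_eval)
print(1e-5 <= best["learning_rate"] <= 1e-2)  # True
```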
Method: Bayesian hyper-parameter optimization. How it works (at a high level): start with a prior estimate of parameter distributions; maintain a probabilistic model of the relationship between hyper-parameter values and model performance; alternate between training with the hyper-parameter values that maximize the expected improvement and updating the probabilistic model with the results. Advantages: generally the most efficient hands-off way to choose hyperparameters. Disadvantages: difficult to implement from scratch, and can be hard to integrate with off-the-shelf tools.
https://towardsdatascience.com/a-conceptual-explanation-of-bayesian-model-based-hyperparameter-optimization-for-machine-learning-b8172278050f
More on tools to do this automatically in the infrastructure & tooling lecture!
Conclusion: DL troubleshooting is hard because of many competing sources of error. Treat building your model as an iterative process: the steps in this lecture make your life easier and catch errors as early as possible.
Overview, recapped:
- Start simple (e.g., LeNet on a subset of your data)
- Implement & debug (reproduce a known result)
- Evaluate (decide what to do next)
- Tune hyper-parameters
- Improve model/data (more data, or regularize, if you overfit)