15-388/688 - Practical Data Science: Deep learning
- J. Zico Kolter
Carnegie Mellon University Fall 2019
Outline
Recent history in machine learning
Machine learning with neural networks
Training neural networks
Specialized neural network
Kilimanjaro is 19,710 feet of the mountain covered with snow, and it is said that the highest mountain in [...]
the Maasai language, has been referred to as the house of God. The top close to the west, there is a dry, frozen carcass [...]
what the demand at that altitude, there is no that nobody explained.

Kilimanjaro is a mountain of 19,710 feet covered with snow and is said to be the highest mountain in Africa. The summit of the west is called “Ngaje Ngai” in Masai, the house of [...]
is a dry and frozen dead body of [...]
what leopard wanted at that altitude.
https://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html
$\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$
$\sigma(x) = \frac{1}{1 + e^{-x}}$
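A minimal sketch of these two activation functions, assuming the fractions above are the hyperbolic tangent and the logistic sigmoid. The function names are illustrative, and the direct formulas are only suitable for moderate inputs (for large |x| the exponentials overflow, so `math.tanh` is the safer choice in practice).

```python
import math


def tanh(x: float) -> float:
    """Hyperbolic tangent written as (e^{2x} - 1) / (e^{2x} + 1)."""
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)


def sigmoid(x: float) -> float:
    """Logistic sigmoid written as 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # Print both activations at a few sample points.
    for x in (-2.0, 0.0, 2.0):
        print(x, tanh(x), sigmoid(x))
```

The two functions are rescalings of each other: tanh(x) = 2 * sigmoid(2x) - 1, so tanh ranges over (-1, 1) while the sigmoid ranges over (0, 1).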
[Figure: two-layer neural network with inputs x1, x2, ..., xn, hidden units z1, z2, ..., zk, output y, and parameters W1, b1 and W2, b2]
[Figure: deep network with layers z1 = x, z2, z3, z4, z5 = hθ(x) and parameters (W1, b1), (W2, b2), (W3, b3), (W4, b4)]
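The layered structure in the figure can be sketched as a forward pass. This assumes the usual feedforward rule z_{i+1} = f_i(W_i z_i + b_i) with z_1 = x and the last layer taken as hθ(x); the rule itself, the ReLU choice, the layer widths, and all function names are assumptions for illustration and are not stated in the slide text.

```python
import random


def affine(W, b, z):
    """Return W z + b for a matrix W (list of rows), bias b, and vector z."""
    return [sum(w * v for w, v in zip(row, z)) + bi for row, bi in zip(W, b)]


def relu(v):
    """Elementwise max(0, a)."""
    return [max(0.0, a) for a in v]


def identity(v):
    """Linear output layer: return the vector unchanged."""
    return list(v)


def forward(x, params, activations):
    """Start from z_1 = x and apply z_{i+1} = f_i(W_i z_i + b_i) for each layer."""
    z = list(x)
    for (W, b), f in zip(params, activations):
        z = f(affine(W, b, z))
    return z


def init_params(sizes, seed=0):
    """Small random weights and zero biases for consecutive layer widths."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        params.append((W, [0.0] * n_out))
    return params


if __name__ == "__main__":
    sizes = [3, 4, 4, 4, 2]  # hypothetical widths of z_1 .. z_5
    params = init_params(sizes)
    activations = [relu, relu, relu, identity]
    print(forward([1.0, -2.0, 0.5], params, activations))
```

Five layers of units need four (W, b) pairs, which matches the W1, b1 through W4, b4 labels in the figure.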
Better models and algorithms
Lots of data
Lots of computing power
[Equations not recoverable; surviving symbols: $\theta$ and sums over $i = 1, \ldots, m$]
[Figure: recurrent network unrolled over time steps t = 1, 2, 3, with layers z_1^{(t)}, z_2^{(t)}, z_3^{(t)}, feedforward weights W_1 and W_2, and recurrent weights W_1^h linking consecutive time steps]
$z_{i+1}^{(t)} = f_i\left(W_i x^{(t)} + W_i^h z^{(t-1)} + b_i\right)$
$h_\theta(x^{(t)}) = z_k^{(t)}$
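A minimal sketch of the recurrence for a single recurrent layer, assuming the update reads z^(t) = f(W x^(t) + W^h z^(t-1) + b) with f = tanh and a zero initial state. The choice of tanh, the zero initial state, and the names `rnn_forward` and `matvec` are assumptions made for illustration.

```python
import math


def matvec(W, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def rnn_forward(xs, W, Wh, b, z0=None):
    """Apply z^(t) = tanh(W x^(t) + Wh z^(t-1) + b) along xs; return every z^(t)."""
    z = list(z0) if z0 is not None else [0.0] * len(b)
    states = []
    for x in xs:
        pre = [a + c + d for a, c, d in zip(matvec(W, x), matvec(Wh, z), b)]
        z = [math.tanh(p) for p in pre]
        states.append(z)
    return states


if __name__ == "__main__":
    # One-dimensional input and hidden state, three time steps.
    print(rnn_forward([[1.0], [0.0], [0.0]], W=[[1.0]], Wh=[[0.5]], b=[0.0]))
```

The same W, Wh, and b are reused at every time step, so the hidden state is the only thing that carries information from earlier inputs forward.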
Figure from (Jozefowicz et al., 2015)
Unsolvable problems (50%)
Solvable problems (50%)
Problems that need, e.g. deep learning (5%)
Problems that can use “simple” machine learning (45%)
Unsolvable problems (50%)
Solvable problems (50%)
Problems that need, e.g. new deep learning (5%)
Problems that can use “simple” machine learning (45%)
The ReLU activation function is not shown for brevity.
ConvNet Configuration
A          A-LRN      B          C          D          E
11 weight  11 weight  13 weight  16 weight  16 weight  19 weight
layers     layers     layers     layers     layers     layers
input (224 × 224 RGB image)
conv3-64   conv3-64   conv3-64   conv3-64   conv3-64   conv3-64
-          LRN        conv3-64   conv3-64   conv3-64   conv3-64
maxpool
conv3-128  conv3-128  conv3-128  conv3-128  conv3-128  conv3-128
-          -          conv3-128  conv3-128  conv3-128  conv3-128
maxpool
conv3-256  conv3-256  conv3-256  conv3-256  conv3-256  conv3-256
conv3-256  conv3-256  conv3-256  conv3-256  conv3-256  conv3-256
-          -          -          conv1-256  conv3-256  conv3-256
-          -          -          -          -          conv3-256
maxpool
conv3-512  conv3-512  conv3-512  conv3-512  conv3-512  conv3-512
conv3-512  conv3-512  conv3-512  conv3-512  conv3-512  conv3-512
-          -          -          conv1-512  conv3-512  conv3-512
-          -          -          -          -          conv3-512
maxpool
conv3-512  conv3-512  conv3-512  conv3-512  conv3-512  conv3-512
conv3-512  conv3-512  conv3-512  conv3-512  conv3-512  conv3-512
-          -          -          conv1-512  conv3-512  conv3-512
-          -          -          -          -          conv3-512
maxpool
FC-4096
FC-4096
FC-1000
soft-max
Figure from Simonyan and Zisserman, 2015
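As a sanity check on the configuration table, the sketch below transcribes each column as five convolutional blocks and counts the weight layers (convolutional plus fully connected; max-pooling and LRN carry no weights). The helper names `conv`, `CONFIGS`, and `weight_layers` are illustrative, and the per-column layout is my reading of the flattened table.

```python
FC_LAYERS = ["FC-4096", "FC-4096", "FC-1000"]  # shared by all six configurations


def conv(kernel, channels, count=1):
    """Return `count` layer names such as 'conv3-64'."""
    return [f"conv{kernel}-{channels}"] * count


# One list per block; a max-pooling layer follows every block in the table.
CONFIGS = {
    "A": [conv(3, 64), conv(3, 128), conv(3, 256, 2), conv(3, 512, 2), conv(3, 512, 2)],
    "A-LRN": [conv(3, 64) + ["LRN"], conv(3, 128), conv(3, 256, 2),
              conv(3, 512, 2), conv(3, 512, 2)],
    "B": [conv(3, 64, 2), conv(3, 128, 2), conv(3, 256, 2), conv(3, 512, 2), conv(3, 512, 2)],
    "C": [conv(3, 64, 2), conv(3, 128, 2), conv(3, 256, 2) + conv(1, 256),
          conv(3, 512, 2) + conv(1, 512), conv(3, 512, 2) + conv(1, 512)],
    "D": [conv(3, 64, 2), conv(3, 128, 2), conv(3, 256, 3), conv(3, 512, 3), conv(3, 512, 3)],
    "E": [conv(3, 64, 2), conv(3, 128, 2), conv(3, 256, 4), conv(3, 512, 4), conv(3, 512, 4)],
}


def weight_layers(blocks):
    """Count convolutional layers in all blocks, plus the fully connected layers."""
    convs = sum(1 for blk in blocks for name in blk if name.startswith("conv"))
    return convs + len(FC_LAYERS)


if __name__ == "__main__":
    for name, blocks in CONFIGS.items():
        print(name, weight_layers(blocks))
```

The counts reproduce the "weight layers" header row of the table, which is a useful check that the flattened columns were read correctly.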
[Figure: Skip-gram model architecture, with INPUT w(t), a PROJECTION layer, and OUTPUT w(t-2), w(t-1), w(t+1), w(t+2)]
Figure from Mikolov, et al., 2013
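Assuming the figure shows the Skip-gram architecture, in which one input word w(t) is used to predict its neighbours w(t-2) through w(t+2), the training examples can be sketched as (center, context) pairs. The function name `skipgram_pairs` and the default window of 2 are illustrative choices; the sketch covers only pair generation, not the model or its training objective.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every word within `window` positions."""
    pairs = []
    for t, center in enumerate(tokens):
        for offset in range(-window, window + 1):
            j = t + offset
            # Skip the center word itself and positions outside the sentence.
            if offset != 0 and 0 <= j < len(tokens):
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    sentence = ["the", "quick", "brown", "fox", "jumps"]
    for center, context in skipgram_pairs(sentence):
        print(center, "->", context)
```

Words near the ends of a sentence produce fewer pairs because part of their window falls outside the sequence.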
Figure from Devlin, et al., 2018