2018.8/7 @ High Energy Astrophysics Workshop 2018
Deep Learning:
Review of the Current Status and Possible Applications to Astrophysics and Astronomy
(Deep Learning: Review and Possible Applications)
Masato Taki
RIKEN, iTHEMS
Deep Learning ⊂ Machine Learning ⊂ AI
AI: many approaches (or any approach); Machine Learning: an approach; Deep Learning: a set of concrete methods.
Big progress over the past 6 years:
‘the time is ripe for them’
Deep Learning = a kind of Machine Learning
Machine Learning: improving a computer program's (= machine's) ability to solve tasks through experience / data.
Learning/Training?
x, y ∼ P(x, y): input–output pairs drawn from a joint distribution.
Cat = (0, 0, 0, 1, 0, 0, …)
Supervised Learning
x, y ∼ P(x, y): predict the output y from the input x.
Example pair: "This is a pen." ↔ "これはペンです。"
Model
Which model should we employ?
Brain … ??
Modeling the brain!? → the (Artificial) Neural Network
Neural Network / Deep Learning
Model = Directed Graph
[Figure: a directed graph carrying the inputs x1, x2 to the outputs y1, y2]
・We can solve various problems by designing a 'good graph'.
・Intuitive (= geometric) design of the network.
Deep Learning = Geometrization of Machine Learning !
c.f. General Relativity = Geometrization of Gravity
Details on Neural Network
McCulloch-Pitts's Artificial Neuron (1943)
A neuron collects the inputs x1, x2, … from other neurons, forms the full input u = Σ_i x_i, and sends the output a(u) to other neurons.
Example: inputs 9 and 17 give u = 9 + 17 = 26, so the neuron outputs a(26).
Activation function a(u): e.g. the ReLU a(u) = max(0, u), or a step function with a threshold.
Example (ReLU): inputs 9 and 17 give u = 26 and output a(26) = 26; inputs 9 and −17 give u = −8 and output a(−8) = 0.
With u = Σ_i x_i and a threshold activation a(u), such neurons can realize any logical circuit.
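A minimal Python sketch of a McCulloch-Pitts unit realizing logic gates; the threshold values below are illustrative choices, not from the slides:

```python
# McCulloch-Pitts neuron: sum the inputs and fire iff the sum reaches a
# threshold. Thresholds chosen here to realize AND and OR (assumptions).
def mp_neuron(inputs, threshold):
    u = sum(inputs)                    # full input u = sum_i x_i
    return 1 if u >= threshold else 0  # step activation a(u)

AND = lambda x1, x2: mp_neuron([x1, x2], threshold=2)
OR = lambda x1, x2: mp_neuron([x1, x2], threshold=1)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, AND(x1, x2), OR(x1, x2))
```

Chaining such gates gives any logical circuit, which is the sense of the claim above.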
Rosenblatt's Perceptron (1957)
Tunable parameters: u = Σ_i w_i x_i, output a(u), where the weight w_i ~ the connection strength between a pair of neurons.
Multi-layer (Artificial) Neural Network
Stacking perceptrons gives a multi-layer neural net with weights w1, …, w6.
Fit = train these parameters (fitting = training / learning).
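A toy sketch of a Rosenblatt-style perceptron trained with the classic perceptron update rule; the AND-like dataset, learning rate, and epoch count are assumptions for illustration:

```python
# Perceptron: weighted sum u = sum_i w_i x_i + b with a step activation;
# weights are fit from labeled examples (toy AND dataset, an assumption).
def predict(w, b, x):
    u = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if u + b > 0 else 0

def train(data, epochs=20, lr=0.1):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            err = y - predict(w, b, x)  # classic perceptron update rule
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # [0, 0, 0, 1]
```

On linearly separable data like this the perceptron is guaranteed to converge in finitely many updates.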
ŷ(x; w): the model's prediction for the input x, as a function of the weights w.
Supervised Learning
Observed Data Set: (x1, y1), (x2, y2), …, (xN, yN)
Prediction Error Measure:
E(w) = (1/N) Σ_{n=1}^{N} (ŷ(x_n; w) − y_n)²   (e.g. the Mean Square Error)
w* = argmin_w E(w)
In practice a 'pseudo'-optimization is performed (regularization, modified optimization algorithms, …).
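The mean-square-error fit w* = argmin_w E(w) can be illustrated with plain gradient descent on a one-parameter model ŷ(x; w) = w·x; the toy data (true slope 2) and learning rate are assumptions:

```python
# Fit y_hat(x; w) = w * x by gradient descent on the mean square error
# E(w) = (1/N) sum_n (y_hat(x_n; w) - y_n)^2. Toy data: true slope 2.0.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

def mse(w):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

w = 0.0
lr = 0.05
for _ in range(200):
    # dE/dw = (1/N) sum_n 2 * (w * x_n - y_n) * x_n
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # gradient-descent step

print(w, mse(w))  # w approaches argmin_w E(w) = 2.0
```

Real deep-learning training uses stochastic variants of exactly this update, applied to millions of weights.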
Neural Network / Deep Learning
Why such high performance? Still an open question.
Achievements in Image Recognition
ILSVRC (ImageNet Large Scale Visual Recognition Challenge)
An image-recognition competition (2010-2017) using the ImageNet dataset: 14 million images, 1000 classes.
Error Rate (Top-5 Error):
2010: 28%    (pre-DL)
2011: 26%    (pre-DL)
2012: 16%    Toronto Univ (AlexNet, 8 layers)
2013: 11.7%  Clarifai (ZF-Net, 8 layers)
2014: 6.6%   Google (GoogLeNet → Inception, 22 layers); also Oxford, 7.4% (VGG16/19)
2015: 3.57%  Microsoft (ResNet, 152 layers)
2016: 2.99%  Trimps-Soushen
2017: 2.25%  momenta.ai
For comparison, human error: 5.1%.
Well-used models for research: AlexNet, VGG16/19, GoogLeNet/Inception.
https://www.wired.com/2013/03/google_hinton/
https://clarifai.com/about
Fun experiments that demonstrate DL's high generalization ability
Caption Generator (Image2Text) [Google, 2014]
Text2Image [Mansimov et al, 2015]
Generative Adversarial Net (GAN)
GAN [Goodfellow et al., 2014]~
DCGAN [Radford-Metz-Chintala, 2015]
Analogy: a counterfeiter (the generator) vs. a policeman (the discriminator). The generator turns random noise into fake data; the discriminator judges whether data is true or fake. The two networks compete with each other:
min_G max_D V(D, G)
V(D, G) = E_{x∼P_data}[log D(x)] + E_{z∼P_noise}[log(1 − D(G(z)))]
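The value V(D, G) is defined by expectations, so it can be estimated by Monte-Carlo sampling. A toy 1-D sketch with hand-made (untrained) D and G; every function and distribution here is an illustrative assumption, not the paper's setup:

```python
import math
import random

# Monte-Carlo estimate of the GAN value
#   V(D, G) = E_{x~P_data}[log D(x)] + E_{z~P_noise}[log(1 - D(G(z)))]
# with fixed toy choices of D, G, P_data, and P_noise (all assumptions).
random.seed(0)

def D(x):
    """Discriminator: probability that x is real (a simple sigmoid)."""
    return 1.0 / (1.0 + math.exp(-x))

def G(z):
    """Generator: deterministic map from noise z to fake data."""
    return 0.5 * z - 1.0

N = 10000
real = [random.gauss(2.0, 1.0) for _ in range(N)]   # P_data = N(2, 1)
noise = [random.gauss(0.0, 1.0) for _ in range(N)]  # P_noise = N(0, 1)

V = (sum(math.log(D(x)) for x in real) / N
     + sum(math.log(1.0 - D(G(z))) for z in noise) / N)
print(V)  # D is trained to raise this value, G to lower it
```

In actual GAN training, D and G are neural networks and these two expectations become their (adversarial) loss functions.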
GAN [Goodfellow et al., 2014]~
The generator is a deterministic NN mapping noise to fake data, which can be used to lighten Monte-Carlo computations.
(Bias inherited from the training data is unavoidable.)
A simple experiment that can be run even on a laptop (using the FashionMNIST dataset).
pix2pix: image-to-image translation [Isola et al., 2017]
Example: translate a Google Maps image into a 'satellite photo' (original image → translated image), with the discriminator judging image T/F.
Experiment on a processed version of Oxford Univ's flower dataset: fewer than 1000 images, trained on 4 GPUs.
*With more careful model tuning (rather than this rough run), the results would improve further.
Neural Conversation Model
Seq2Seq = encoder + decoder
The source sentence is encoded into a context vector c, which is then decoded into the target sentence.
If you train it on conversation data, you get a "chat machine": "How are you?" → "I am fine."
Neural Conversation Model [Vinyals-Le, ’15]
Human: hello ! Machine: hello ! Human: how are you ? Machine: i ’m good . Human: what ’s your name ? Machine: i ’m julia . Human: when were you born ? Machine: july 20th . Human:what year were you born ? Machine: 1977 .
The training data was taken from OpenSubtitles, a dataset of movie scripts.
Human: where are you from ? Machine: i ’m out in the boonies. Human: are you a man or a woman ? Machine: i ’m a woman . Human: why are we here ? Machine: i ’m not sure . Human: okay, bye ! Machine: bye . Human: see you later . Machine: bye .
The machine 'introduces itself' as Julia.
Neural Conversation Model [Vinyals-Le, ’15]
A philosophical discussion with Julia:
Human: what is the purpose of life ? Machine: to serve the greater good . Human: what is the purpose of living ? Machine: to live forever . Human: what is the purpose of existence ? Machine: to find out what happens when we get to the planet earth . Human: where are you now ? Machine: i ’m in the middle of nowhere . Human: what is the purpose of dying ? Machine: to have a life . Human: what is the purpose of being intelligent ? Machine: to find out what it is . Human: what is the purpose of emotions ? Machine: i don ’t know .
Neural Style Transfer
Painting style ('neural style')
Neural Style Transfer [Gatys et al., 2015]
[Figure: high-frequency vs. low-frequency components; combining Image1 with the style (style 1, style 2) of Image2 yields the stylized Image2']
Deep Q-Network [DeepMind,`16]
DeepMind: a venture led by neuroscientist D. Hassabis et al. (acquired by Google in 2014 for roughly 50 billion yen).
https://www.youtube.com/watch?v=iqXKQf2BOSE
The Curse of Dimensionality & Lazy Learning
Lazy learning: without learning, just record the whole data set. To find patterns / classify a new sample, measure the similarity (distance) between the stored data and the new data point (e.g. k-nearest neighbors). [Figure: 'near' vs. 'far' neighbors of a new sample]
Volume of a radius-R ball in D dimensions: V_D(R) = R^D × V_D(1)
Ratio of the volume of the thin skin (0.99 ≤ r ≤ 1) in a radius-1 ball:
(V_D(1) − V_D(0.99)) / V_D(1) = 1 − (0.99)^D
1 − (0.99)^D:
D = 1:    0.01   =  1%
D = 2:    0.0199 =  2%
D = 3:    0.0297 =  3%
D = 10:   0.0956 = 10%
D = 1000: 0.9999… ≈ 99.99%
The thin skin dominates the volume!! (opposite to intuition) — the data points become maximally separated.
In higher dimensions, data is too sparse to extract a geometric structure even in a 'big data' situation.
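The shell fractions above follow directly from V_D(R) = R^D × V_D(1); a quick check in Python:

```python
# Fraction of a unit D-ball's volume lying in the thin shell 0.99 <= r <= 1.
# Since V_D(R) = R**D * V_D(1), the fraction is 1 - 0.99**D.
for D in (1, 2, 3, 10, 1000):
    print(D, 1 - 0.99 ** D)
```

At D = 1000 the shell already holds more than 99.99% of the volume, which is the 'maximally separated' regime described above.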
Representation Learning & Dim. Compression
A good information representation enables clustering in lower dimensions and anomaly detection.
How to make a good representation?
Reinforcement Learning
The agent (a program) observes a state s, takes an action a ∼ π(a|s) under a policy π, and receives a reward r (a score).
Action value function Q^π(s, a): the total reward obtained under the policy π.
Q-Learning
The agent (a program) learns the action value from (state s, action a, reward r) and derives its policy from it.
Deep Q-Learning
A deep network takes the state s and outputs the action values Q^π(s, a1), Q^π(s, a2), Q^π(s, a3), …; the action is chosen by optimizing over them. We know deep learning is a powerful learner.
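A tabular Q-learning sketch on a tiny chain environment; the environment, learning rates, and episode count are illustrative assumptions (Deep Q-Learning replaces the table Q[s][a] with a deep network):

```python
import random

# Tabular Q-learning on a toy 1-D chain: states 0..3, actions left/right,
# reward 1 only upon reaching the goal state 3 (all choices are assumptions).
random.seed(0)
N_STATES, GOAL = 4, 3
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[s][a]; a=0: left, a=1: right
alpha, gamma, eps = 0.5, 0.9, 0.1

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0)

for _ in range(500):  # episodes
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda a: Q[s][a])
        s2, r = step(s, a)
        # Q-learning update toward r + gamma * max_a' Q(s', a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print([max((0, 1), key=lambda a: Q[s][a]) for s in range(GOAL)])
```

After training, the greedy policy moves right in every state, i.e. straight toward the reward.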
Monte Carlo + Reinforcement Learning
Object detection: YOLO
The language is basically Python.
Many libraries for DL are Python-based. Coding is not the main purpose: analyzing data and doing machine learning is (basically) our job.
Libraries for DL
e.g. TensorFlow (by Google), PyTorch (by Facebook), Chainer (by Preferred Networks), Keras (by a Googler)
The libraries I have used include these, plus Theano etc. Which is the best? → Whichever you like.
http://www.timqian.com/star-history/#tensorflow/tensorflow&BVLC/caffe&caffe2/caffe2&Microsoft/CNTK&apache/incubator-mxnet&torch/torch7&pytorch/pytorch&deeplearning4j/deeplearning4j&Theano/Theano&amzn/amazon-dsstne&chainer/chainer
A criterion for the usual user:
https://towardsdatascience.com/battle-of-the-deep-learning-frameworks-part-i-cff0e3841750
Keras is my recommendation.
New Model for segmentation
[M.T & Murata, TBA]
Classes: Normal / Benign / In situ carcinoma / Invasive carcinoma
Segmentation = pixel-wise classification.
Application to Segmentation of Breast Cancers
Performance of our new model
Comparison: DL's predictions (our model and U-Net) vs. the doctor's diagnosis (segmentation).
Stable performance without fine tuning.
Applications of our new model: retina images, colon cancer, 3D cancer CT images, etc.
Generator of GAN architecture
Applications
Scientific data (collider, astro, material, …); simulated data (lightening Monte-Carlo computations with NNs, …); other practical domains (medical, drug design, …).
What you physicists can do: a suggestion
You physicists can handle math and code, and you have numerical sense, …
Study of DL itself:
Improve DL's algorithms. Solve the mysteries of DL.
(Don't apply physics directly…. Solve the proper problems.)