

slide-1
SLIDE 1

2018/8/7 @ High Energy Astrophysics Workshop 2018

Deep Learning:

A Review of the Current State and Possible Applications to Astrophysics and Astronomy

(Deep Learning: Review and Possible Applications)

Masato Taki

RIKEN, iTHEMS

slide-2
SLIDE 2

Deep Learning / Machine Learning / AI

slide-3
SLIDE 3

Deep Learning / Machine Learning / AI

AI: many approaches (or any approach). Machine Learning: one such approach. Deep Learning: a set of concrete methods.

slide-4
SLIDE 4

Deep Learning / Machine Learning / AI: big progress over the past six years

‘the time is ripe for them’

slide-5
SLIDE 5

Deep Learning = a form of Machine Learning

Improving a computer program's (= machine's) ability to solve tasks through experience/data

What does 'learning/training' mean?

slide-6
SLIDE 6
  • 1. Deep Learning
slide-7
SLIDE 7

(x, y) ∼ P(x, y)

Cat = (0, 0, 0, 1, 0, 0, …)

Supervised Learning
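The "Cat = (0, 0, 0, 1, 0, 0, …)" label above is a one-hot encoding. A minimal sketch in Python (the class count and index below are illustrative assumptions):

```python
def one_hot(index, num_classes):
    """Return a one-hot label vector: all zeros except a 1 at `index`."""
    vec = [0] * num_classes
    vec[index] = 1
    return vec

# e.g. "Cat" as the 4th of 7 classes, as on the slide:
print(one_hot(3, 7))  # -> [0, 0, 0, 1, 0, 0, 0]
```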

slide-8
SLIDE 8

(x, y) ∼ P(x, y)

This is a pen. これはペンです。

Supervised Learning

slide-9
SLIDE 9

(x, y) ∼ P(x, y)

Predict output from input Supervised Learning

x

ŷ

Model

slide-10
SLIDE 10

Supervised Learning

x

ŷ

Model

Which model should we employ?

slide-11
SLIDE 11

Supervised Learning

x

ŷ

Model

Which model should we employ?

slide-12
SLIDE 12

Brain … ??

slide-13
SLIDE 13

?

Modeling Brain!?

slide-14
SLIDE 14

?

Network =

Modeling Brain!?

slide-15
SLIDE 15

= Network

Modeling Brain!?

slide-16
SLIDE 16

(Artificial) Neural Network

Modeling Brain!?

slide-17
SLIDE 17

Neural Network / Deep Learning

x

ŷ

slide-18
SLIDE 18

Neural Network / Deep Learning

y

x

Model = Directed Graph

x1 x2 y2 y1

input

output
slide-19
SLIDE 19

Neural Network / Deep Learning

  • We can solve various problems by designing a 'good graph'.
  • Intuitive (= geometric) design of the network.

Model = Directed Graph

slide-20
SLIDE 20

Neural Network / Deep Learning

Deep Learning = Geometrization of Machine Learning !

c.f. General Relativity = Geometrization of Gravity

slide-21
SLIDE 21

Details on Neural Network

slide-22
SLIDE 22

u = Σᵢ xᵢ

Inputs from neuron 1 and neuron 2 (x1, x2); full input u; output a(u) to other neurons

McCulloch-Pitts's Artificial Neuron (1943)

slide-23
SLIDE 23

9

17

= 9 + 17

= 26

u

McCulloch-Pitts’s Artificial Neuron (1943)

slide-24
SLIDE 24

a(26)

9

17

= 9 + 17

= 26

u

McCulloch-Pitts’s Artificial Neuron (1943)

slide-25
SLIDE 25

u = Σᵢ xᵢ

Activation function a(u): e.g. threshold (step function) or ReLU

McCulloch-Pitts’s Artificial Neuron (1943)

slide-26
SLIDE 26

9

17

26 26

McCulloch-Pitts’s Artificial Neuron (1943)

slide-27
SLIDE 27

9 + (−17) = −8

McCulloch-Pitts’s Artificial Neuron (1943)

slide-28
SLIDE 28

u = Σᵢ xᵢ, output a(u)

Any logical circuit can be built from such neurons

McCulloch-Pitts's Artificial Neuron (1943)
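As a concrete illustration of the threshold neuron, here is a minimal sketch in Python; the AND/OR thresholds are standard textbook choices, not taken from the slides:

```python
def mp_neuron(inputs, threshold):
    """McCulloch-Pitts neuron: u = sum of inputs; fire (1) iff u >= threshold."""
    u = sum(inputs)
    return 1 if u >= threshold else 0

# Logic gates from a single neuron (binary inputs):
print(mp_neuron([1, 1], 2))  # AND fires -> 1
print(mp_neuron([1, 0], 2))  # AND -> 0
print(mp_neuron([1, 0], 1))  # OR fires -> 1
```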

slide-29
SLIDE 29

Rosenblatt's Perceptron (1957): tunable parameters w1, w2

u = Σᵢ wᵢ xᵢ   (wᵢ ~ connection strength between a pair of neurons)

a(u)
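A minimal sketch of the weighted-sum neuron in Python; the step activation and the example weights/thresholds are illustrative assumptions:

```python
def perceptron(inputs, weights, threshold=0.0):
    """Rosenblatt perceptron: u = sum_i w_i * x_i, then a step activation."""
    u = sum(w * x for w, x in zip(weights, inputs))
    return 1 if u >= threshold else 0

# Tuning the weights/threshold changes the behavior of the same circuit:
print(perceptron([1, 1], [1.0, 1.0], threshold=1.5))  # acts as AND -> 1
print(perceptron([1, 0], [1.0, 1.0], threshold=0.5))  # acts as OR  -> 1
```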

slide-30
SLIDE 30
  • Rough network design.
  • Learning tunes the behavior of the net.

Multi-layer (Artificial) Neural Network Rosenblatt’s Perceptron (1957)

slide-31
SLIDE 31

Multi-layer Neural Net

  • Layer structure is key to performance!
  • Many layers=Deep

Rosenblatt’s Perceptron (1957)

slide-32
SLIDE 32

Fit = train these parameters

w1 w2 w3 w4 w5 w6

Rosenblatt's Perceptron (1957)

(fitting = training/learning)

ŷ(x; w)

x

slide-33
SLIDE 33

Supervised Learning

(x1, y1), (x2, y2), …, (xN, yN)

Observed Data Set

slide-34
SLIDE 34

Supervised Learning Observed Data Set Prediction Error Measure

slide-35
SLIDE 35

Supervised Learning Observed Data Set Prediction Error Measure

E(w) = (1/N) Σ_{n=1}^{N} ( ŷ(x_n; w) − y_n )²

E.g. Mean Squared Error

w* = argmin_w E(w)

‘pseudo’-optimization (regularization, modified optimization algorithms, …)
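The training loop this defines can be sketched in a few lines of Python; the linear model ŷ(x; w) = w·x, the toy data, and the learning rate are illustrative assumptions:

```python
def mse(w, data):
    """E(w) = (1/N) * sum_n (y_hat(x_n; w) - y_n)^2, with y_hat(x; w) = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

data = [(x, 2.0 * x) for x in [1.0, 2.0, 3.0]]  # generated with true w = 2
w = 0.0
lr = 0.05
for _ in range(200):
    # dE/dw = (2/N) * sum_n (w*x_n - y_n) * x_n
    grad = 2.0 * sum((w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad  # gradient-descent step toward w* = argmin_w E(w)

print(round(w, 4))  # -> 2.0
```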

slide-36
SLIDE 36

Neural Network / Deep Learning

Why such high performance? Still an open question.

slide-37
SLIDE 37
  • 2. Progress in DL
slide-38
SLIDE 38

Achievement in Image Recognition

slide-39
SLIDE 39

ILSVRC (ImageNet Large Scale Visual Recognition Challenge): 14 million images, 1000 classes

Image recognition competition by using ImageNet dataset (2010-2017)

slide-40
SLIDE 40

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26%  (Error Rate: Top-5 Error)

slide-41
SLIDE 41

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26%  (pre-DL)

slide-42
SLIDE 42

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16%

https://www.wired.com/2013/03/google_hinton/
slide-43
SLIDE 43

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16%; 2013: Clarifai, 11.7%

https://clarifai.com/about
slide-44
SLIDE 44

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16%; 2013: Clarifai, 11.7%; 2014: Google, 6.6%

slide-45
SLIDE 45

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16%; 2013: Clarifai, 11.7%; 2014: Google, 6.6%; 2015: Microsoft, 3.57%

slide-46
SLIDE 46

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16%; 2013: Clarifai, 11.7%; 2014: Google, 6.6%; 2015: Microsoft, 3.57%; 2016: Trimps-Soushen, 2.99%

slide-47
SLIDE 47

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16%; 2013: Clarifai, 11.7%; 2014: Google, 6.6%; 2015: Microsoft, 3.57%; 2016: Trimps-Soushen, 2.99%; 2017: momenta.ai, 2.25%

slide-48
SLIDE 48

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16%; 2013: Clarifai, 11.7%; 2014: Google, 6.6%; 2015: Microsoft, 3.57%; 2016: Trimps-Soushen, 2.99%; 2017: momenta.ai, 2.25%. Human error: 5.1%

slide-49
SLIDE 49

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16% (8 layers); 2013: Clarifai, 11.7% (8 layers); 2014: Google, 6.6% (22 layers); 2015: Microsoft, 3.57% (152 layers); 2016: Trimps-Soushen, 2.99%; 2017: momenta.ai, 2.25%. Human error: 5.1%

slide-50
SLIDE 50

ILSVRC (ImageNet Large Scale Visual Recognition Challenge)

2010: 28%, 2011: 26% (pre-DL); 2012: Toronto Univ, 16% (AlexNet); 2013: Clarifai, 11.7% (ZF-Net); 2014: Oxford, 7.4% (VGG16/19); 2014: Google, 6.6% (GoogLeNet → Inception). Well-used models for research.

slide-51
SLIDE 51

Fun experiments that demonstrate DL's high generalization ability

slide-52
SLIDE 52

Image → Caption

slide-53
SLIDE 53

Caption Generator (Image2Text) [Google, 2014]

slide-54
SLIDE 54

Caption → Image

slide-55
SLIDE 55

Text2Image [Mansimov et al, 2015]

slide-56
SLIDE 56

Generative Adversarial Net (GAN)

slide-57
SLIDE 57

GAN [Goodfellow et al., 2014]~

DCGAN [Radford-Metz-Chintala, 2015]

slide-58
SLIDE 58

counterfeiter = generator: random noise → (fake) data

vs.

policeman = discriminator: data → true/fake

min_G max_D V(D, G)

V(D, G) = E_{x∼P_data}[ log D(x) ] + E_{z∼P_noise}[ log( 1 − D(G(z)) ) ]

The two networks compete with each other.

GAN [Goodfellow et al., 2014]~
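To make the value function concrete, here is a Monte-Carlo estimate of V(D, G) in Python; the toy data distribution, discriminator, and generator below are all illustrative assumptions, not the trained networks from the paper:

```python
import math
import random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def gan_value(D, G, n=10000, seed=0):
    """Monte-Carlo estimate of
    V(D, G) = E_{x~P_data}[log D(x)] + E_{z~P_noise}[log(1 - D(G(z)))]."""
    rng = random.Random(seed)
    data_term = sum(math.log(D(rng.gauss(2.0, 1.0))) for _ in range(n)) / n
    noise_term = sum(math.log(1.0 - D(G(rng.gauss(0.0, 1.0)))) for _ in range(n)) / n
    return data_term + noise_term

# Toy players: "real" data is N(2, 1); D scores samples near 2 as real;
# G shifts standard-normal noise toward the data distribution.
D = lambda x: sigmoid(x - 1.0)
G = lambda z: z + 2.0
v = gan_value(D, G)
print(v < 0)  # both expectations are of log-probabilities, so V is negative here
```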

slide-59
SLIDE 59

Deterministic NN

GAN [Goodfellow et al., 2014]~

Noise → fake data: can be used to speed up ('lighten') Monte Carlo computations

(Bias originating from the training data is unavoidable.)

slide-60
SLIDE 60

A simple experiment (using FashionMNIST) at a level that can be run even on a laptop

GAN [Goodfellow et al., 2014]~

slide-61
SLIDE 61

pix2pix: image-to-image translation [Goodfellow et al., 2014]~

Original image: 'satellite photo' → Translated image: Google Maps style; a discriminator judges the image true/fake

slide-62
SLIDE 62

pix2pix: image-to-image translation [Goodfellow et al., 2014]~

slide-63
SLIDE 63

pix2pix: image-to-image translation [Goodfellow et al., 2014]~

Processed from Oxford Univ's flower dataset; fewer than 1000 images

slide-64
SLIDE 64

pix2pix: image-to-image translation [Goodfellow et al., 2014]~

Used 4 GPUs. *With more careful model tuning (rather than this rough attempt), results would improve further.

slide-65
SLIDE 65

Neural Conversation Model

slide-66
SLIDE 66

Seq2Seq = encoder + decoder

source sentence context target sentence

~ c

encode decode

If you apply it to conversation, you get a 'chat machine': "How are you?" → "I am fine."

slide-67
SLIDE 67

Neural Conversation Model [Vinyals-Le, ’15]

Human: hello ! Machine: hello ! Human: how are you ? Machine: i ’m good . Human: what ’s your name ? Machine: i ’m julia . Human: when were you born ? Machine: july 20th . Human:what year were you born ? Machine: 1977 .

A dataset taken from movie scripts, OpenSubtitles, was used.

Human: where are you from ? Machine: i ’m out in the boonies. Human: are you a man or a woman ? Machine: i ’m a woman . Human: why are we here ? Machine: i ’m not sure . Human: okay, bye ! Machine: bye . Human: see you later . Machine: bye .

The machine 'introduces itself' as Julia.

slide-68
SLIDE 68

Neural Conversation Model [Vinyals-Le, ’15]

A philosophical discussion with Julia

Human: what is the purpose of life ? Machine: to serve the greater good . Human: what is the purpose of living ? Machine: to live forever . Human: what is the purpose of existence ? Machine: to find out what happens when we get to the planet earth . Human: where are you now ? Machine: i ’m in the middle of nowhere . Human: what is the purpose of dying ? Machine: to have a life . Human: what is the purpose of being intelligent ? Machine: to find out what it is . Human: what is the purpose of emotions ? Machine: i don ’t know .

slide-69
SLIDE 69

Neural Style Transfer

slide-70
SLIDE 70

Artistic style (neural style)

Neural Style Transfer [Gatys et al., 2015]

High-frequency components vs. low-frequency components

slide-71
SLIDE 71

Image1 (style 1), Image2 (style 2) → Image2' (Image2 redrawn in style 1)

Neural Style Transfer [Gatys et al., 2015]

slide-72
SLIDE 72

Neural Style Transfer [Gatys et al., 2015]

slide-73
SLIDE 73

Neural Style Transfer [Gatys et al., 2015]

slide-74
SLIDE 74

Neural Style Transfer [Gatys et al., 2015]

slide-75
SLIDE 75

Neural Style Transfer [Gatys et al., 2015]

slide-76
SLIDE 76

‘Solving’ Games

slide-77
SLIDE 77

Deep Q-Network [DeepMind,`16]

DeepMind: a startup led by neuroscientist D. Hassabis and colleagues (acquired by Google in 2014 for roughly 50 billion yen)

slide-78
SLIDE 78

Deep Q-Network [DeepMind,`16]

https://www.youtube.com/watch?v=iqXKQf2BOSE

slide-79
SLIDE 79
  • 3. Useful methods

for Basic Science

slide-80
SLIDE 80
  • 1. Curse of Dimensionality


slide-81
SLIDE 81

Curse of Dimensionality & Lazy Learning

Near Far

Without learning: just record the whole data set. To find patterns / classify new data, measure the similarity (distance) between the stored data and the new data point (e.g. k-means clustering).

new sample

slide-82
SLIDE 82

V_D(R) = R^D × V_D(1)

Volume of a radius-R ball in D dimensions

Curse of Dimensionality & Lazy Learning

slide-83
SLIDE 83

( V_D(1) − V_D(0.99) ) / V_D(1) = 1 − (0.99)^D

V_D(R) = R^D × V_D(1)

Ratio of the volume of the thin skin (0.99 ≤ r ≤ 1) in a ball of radius one

Curse of Dimensionality & Lazy Learning

slide-84
SLIDE 84

( V_D(1) − V_D(0.99) ) / V_D(1) = 1 − (0.99)^D: ratio of the thin-skin volume (0.99 ≤ r ≤ 1) in a ball of radius one

D = 1: 0.01 = 1%
D = 2: 0.0199 ≈ 2%
D = 3: 0.0297 ≈ 3%
D = 10: 0.0956 ≈ 10%
…

Curse of Dimensionality & Lazy Learning

slide-85
SLIDE 85

Curse of Dimensionality & Lazy Learning

Thin skin dominates the volume!! Opposite to intuition.

1 − (0.99)^D:
D = 1: 0.01 = 1%
D = 2: 0.0199 ≈ 2%
D = 3: 0.0297 ≈ 3%
D = 10: 0.0956 ≈ 10%
…
D = 1000: 0.9999… ≈ 99.99% (maximally separated)

In higher dimensions, data is too sparse to extract a geometric structure even in a 'big data' situation.
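The table above is easy to reproduce; a minimal Python sketch using V_D(R) = R^D × V_D(1):

```python
def shell_fraction(D, eps=0.01):
    """Fraction of a unit D-ball's volume in the thin shell 1-eps <= r <= 1.
    Since V_D(R) = R**D * V_D(1), the fraction is 1 - (1 - eps)**D."""
    return 1.0 - (1.0 - eps) ** D

for D in [1, 2, 3, 10, 1000]:
    print(D, shell_fraction(D))
# D = 1 gives 0.01 (1%), D = 10 gives ~0.096 (10%), D = 1000 gives ~0.99996:
# in high dimensions almost all the volume sits in the thin skin.
```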

slide-86
SLIDE 86

Representation Learning & Dim. Compression

Good information representation → clustering in lower dimensions, anomaly detection

slide-87
SLIDE 87

How to make a good representation

  • 1. Unsupervised: Auto-Encoder (x → good rep. → x)
  • 2. Supervised (x → good rep. → y)

slide-88
SLIDE 88
  • 2. Search


slide-89
SLIDE 89

Reinforcement Learning

State s, Action a, Reward r (score)

Agent (program)

slide-90
SLIDE 90

Q-Learning

Q^π(s, a): action-value function = total reward under the policy π, where actions are drawn as a ∼ π(a|s)

State s, Action a, Reward r (score); the Agent (program) follows the policy π

slide-91
SLIDE 91

Deep Q-Learning

A deep network maps the state s to action values Q^π(s, a1), Q^π(s, a2), Q^π(s, a3), …, and we optimize over actions. We know deep learning is a powerful learner.

Monte Carlo + Reinforcement Learning
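As a concrete (tabular, not deep) illustration of the Q-learning update behind DQN, here is a minimal sketch in Python; the toy one-step MDP and its rewards are illustrative assumptions:

```python
import random

def q_learning(episodes=2000, alpha=0.5, eps=0.1, seed=0):
    """Tabular Q-learning on a toy one-step MDP: in state 0,
    action 1 yields reward 1 and action 0 yields reward 0.
    Update rule: Q(s,a) += alpha * (target - Q(s,a))."""
    rng = random.Random(seed)
    Q = {(0, 0): 0.0, (0, 1): 0.0}
    for _ in range(episodes):
        s = 0
        # epsilon-greedy policy: mostly exploit, sometimes explore
        if rng.random() < eps:
            a = rng.choice([0, 1])
        else:
            a = max([0, 1], key=lambda act: Q[(s, act)])
        r = 1.0 if a == 1 else 0.0
        # the episode ends after one step, so the target is just r
        Q[(s, a)] += alpha * (r - Q[(s, a)])
    return Q

Q = q_learning()
print(Q[(0, 1)] > Q[(0, 0)])  # the agent learns that action 1 is better
```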

slide-92
SLIDE 92
  • 3. Detection


slide-93
SLIDE 93

YOLO (You Only Look Once: real-time object detection)

slide-94
SLIDE 94
  • 4. Libraries
slide-95
SLIDE 95

The language is basically Python

Many libraries for DL are Python-based. Coding is not the main purpose; analyzing data and doing machine learning is (basically) our job.

slide-96
SLIDE 96

Libraries for DL

by Google, by Facebook, by Preferred Networks, by a Googler

Libraries I have used include, for instance, these plus Theano, etc. Which is the best? → Whichever you like.

slide-97
SLIDE 97

http://www.timqian.com/star-history/#tensorflow/tensorflow&BVLC/caffe&caffe2/caffe2&Microsoft/CNTK&apache/incubator-mxnet&torch/torch7&pytorch/pytorch&deeplearning4j/deeplearning4j&Theano/Theano&amzn/amazon-dsstne&chainer/chainer

A criterion for the typical user

Libraries for DL

slide-98
SLIDE 98

https://towardsdatascience.com/battle-of-the-deep-learning-frameworks-part-i-cff0e3841750

Keras is my recommendation

Libraries for DL

slide-99
SLIDE 99

  • 5. Scientific Application

breast cancer segmentation

slide-100
SLIDE 100

New Model for segmentation

[M.T & Murata, TBA]

slide-101
SLIDE 101

Normal (正常), Benign (良性), In situ carcinoma (上皮内がん), Invasive carcinoma (浸潤がん)

segmentation

= pixel-wise classification

Application to Segmentation of Breast Cancers

slide-102
SLIDE 102

Performance of our new model

DL’s prediction

U-Net

doctor’s diagnosis (segmentation)

(Slides 103-110 repeat the same comparison for further example images: DL's prediction, U-Net, doctor's diagnosis (segmentation).)

slide-111
SLIDE 111

Stable performance without fine-tuning; applications to the retina, colon cancer, 3D cancer CT images, etc.

Application of our new model

Generator of a GAN architecture

slide-112
SLIDE 112

  • 6. Conclusion

slide-113
SLIDE 113

Applications

  • scientific data (collider, astro, material, …)
  • simulated data (speeding up MC with NNs, …)
  • other practical domains (medical, drug design, …)

What you physicists can do: suggestions

You physicists can handle math and code, and have numerical sense, …

Study of DL

  • Improve DL's algorithms.
  • Solve the mysteries of DL.

(Don't apply physics directly…. Solve the proper problems.)