SLIDE 1

Machine learning for bounce calculation

Ryusuke Jinno (IBS-CTPU)

Based on 1805.12153
2018/12/10 @ ICTP workshop on machine learning landscape

SLIDE 2

SELF INTRODUCTION

Ryusuke (隆介) Jinno (神野)

  • 2016/3 : Ph.D. @ Univ. of Tokyo
  • 2016/4-8 : KEK, Japan
  • 2016/9- : IBS-CTPU, Korea
  • 2019/4- : DESY, Germany (planned)

Research interests & recent works

  • Machine learning : application of machine learning to the QFT tunneling problem
  • Gravitational waves : analytic approach to GW production in phase transitions
  • Inflation : Hillclimbing inflation (an inflationary attractor);
    Hillclimbing Higgs inflation (a new realization of Higgs inflation within the hillclimbing scheme)
  • (P)reheating : preheating in Higgs inflation (discovery of the “spike preheating” phenomenon)
SLIDE 4

TALK PLAN

  • 1. Machine learning : lightning introduction
  • 2. Machine learning meets tunneling problem in QFT
  • 3. Data taking / Machine setup / Training process / Results
  • 4. Summary
SLIDE 5

MACHINE LEARNING: LIGHTNING INTRODUCTION

[ https://www.slideshare.net/awahid/big-data-and-machine-learning-for-businesses ]

SLIDE 6

MACHINE LEARNING: LIGHTNING INTRODUCTION

Can I apply this technique to problems in high-energy physics?

[ https://www.slideshare.net/awahid/big-data-and-machine-learning-for-businesses ]

SLIDE 7

MACHINE LEARNING: LIGHTNING INTRODUCTION

Terminology?

Note : the speaker’s major is not machine learning!!

  • Artificial intelligence (AI) : machines that can perform tasks that are characteristic of human intelligence [ J. McCarthy ]
  • Machine learning (ML) : a way of achieving AI, learning without being explicitly programmed
  • Neural network (NN) : machine learning with artificial neurons (→ later)
  • Deep neural network / deep learning : neural network with deep (= many) layers of neurons

[figure: a single artificial neuron with inputs x1, ..., xn and weights w1, ..., wn]

[ https://medium.com/iotforall/the-difference-between-artificial-intelligence-machine-learning-and-deep-learning-3aa67bff5991 ]
[ https://en.wikipedia.org/wiki/Artificial_intelligence ]

SLIDE 8

LINEAR VS. NONLINEAR: SPAM EMAIL EXAMPLE

Linear problem

x1 : # of "discount" in the email
x2 : # of "luxury" in the email

Your data : spam / not spam

Question : find a, b such that x2 = a x1 + b is the boundary of spam & not spam

SLIDE 9

LINEAR VS. NONLINEAR: SPAM EMAIL EXAMPLE

Linear problem

x1 : # of "discount" in the email
x2 : # of "luxury" in the email

Question : find a, b such that x2 = a x1 + b is the boundary of spam & not spam

Answer : "Linear problem" → a good solution is found easily
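For concreteness, the boundary-finding step can be sketched in a few lines. This is a toy illustration, not from the talk: the word counts are invented, and scikit-learn's LogisticRegression stands in for whatever linear method one prefers.

```python
# Toy linear spam classifier: learn w1, w2, c by logistic regression;
# the decision boundary w1*x1 + w2*x2 + c = 0 is the line x2 = a*x1 + b.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
spam     = rng.poisson(lam=(6.0, 5.0), size=(100, 2))  # many "discount"/"luxury"
not_spam = rng.poisson(lam=(1.0, 1.0), size=(100, 2))  # few of either
X = np.vstack([spam, not_spam])
y = np.array([1] * 100 + [0] * 100)                    # 1 = spam, 0 = not spam

clf = LogisticRegression().fit(X, y)
(w1, w2), (c,) = clf.coef_[0], clf.intercept_
a, b = -w1 / w2, -c / w2                               # rewrite boundary as x2 = a*x1 + b
print(f"boundary: x2 = {a:.2f} * x1 + {b:.2f}")
```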

SLIDE 10

LINEAR VS. NONLINEAR: SPAM EMAIL EXAMPLE

Nonlinear problem

x1 : # of "discount" in the email
x2 : # of "luxury" in the email

Question : find a, b such that x2 = a x1 + b is the boundary of spam & not spam

SLIDE 13

LINEAR VS. NONLINEAR: SPAM EMAIL EXAMPLE

Nonlinear problem

x1 : # of "discount" in the email
x2 : # of "luxury" in the email

Question : find a, b such that x2 = a x1 + b is the boundary of spam & not spam

Answer : "Nonlinear problem" → no good solution

SLIDE 14

LINEAR VS. NONLINEAR: SPAM EMAIL EXAMPLE

Nonlinear problem

With some effort, you may find $r$ & $\theta$ useful:

$r = \sqrt{x_1^2 + x_2^2}, \qquad \theta = \arctan(x_2/x_1)$

SLIDE 15

LINEAR VS. NONLINEAR: SPAM EMAIL EXAMPLE

Nonlinear problem

With some effort, you may find $r$ & $\theta$ useful:

$r = \sqrt{x_1^2 + x_2^2}, \qquad \theta = \arctan(x_2/x_1)$

"Feature engineering" (i.e. hand-crafting quantities to capture nonlinearity)

  • A good solution is found, but...
  • In this specific case you are successful
  • But you cannot always find such good quantities
  • Any good strategy for this kind of problem?
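As a sketch, the polar-coordinate feature engineering above is a two-line transformation; `arctan2` is used instead of $\arctan(x_2/x_1)$ only to avoid dividing by zero:

```python
# Hand-crafted features: map (x1, x2) to (r, theta) so that a ring-shaped
# class boundary becomes the straight line r = const.
import numpy as np

def polar_features(x1, x2):
    r = np.sqrt(x1**2 + x2**2)
    theta = np.arctan2(x2, x1)   # = arctan(x2/x1), safe at x1 = 0
    return r, theta
```

After this transformation the linear method of the previous slides applies unchanged, which is exactly why the trick works when it works.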

SLIDE 16

NEURAL NETWORK?

Biological neuron

  • 1. Each neuron collects electric signals through synapses
  • 2. When the total signal exceeds a threshold, an electric signal is sent to the next neuron through the axon

[figure: biological neuron with synapses and axon;
https://medium.com/autonomous-agents/mathematical-foundation-for-activation-functions-in-artificial-neural-networks-a51c9dd7c089 ]

SLIDE 17

NEURAL NETWORK?

Artificial neuron mimics biological neuron

Equation : $z = f\left( \sum_i x_i w_i + b \right)$

  • $x_i$ : input / $w_i$ : weight / $b$ : bias / $z$ : output
  • $f$ : nonlinear function, e.g. ReLU (rectified linear unit)

[figure: diagrammatic notation, inputs x1, ..., xn weighted by w1, ..., wn, summed and passed through f]
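The equation above transcribes directly into code; a minimal sketch with arbitrary numbers:

```python
# A single artificial neuron: z = f(sum_i x_i w_i + b) with f = ReLU.
import numpy as np

def relu(y):
    return np.maximum(y, 0.0)

def neuron(x, w, b):
    return relu(np.dot(x, w) + b)

x = np.array([0.5, -1.0, 2.0])   # inputs  x_i
w = np.array([0.1,  0.4, 0.3])   # weights w_i
b = -0.2                         # bias
print(neuron(x, w, b))           # nonzero only if the weighted sum exceeds -b
```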

SLIDE 18

NEURAL NETWORK?

Neural network = network of artificial neurons

[figure: a single neuron vs. a neural network built from many neurons]

SLIDE 19

NEURAL NETWORK?

Neural network = network of artificial neurons

$x_1 = f(W_1 x_{\rm in} + b_1)$
$x_n = f(W_n x_{n-1} + b_n) \quad (2 \le n \le N)$
$x_{\rm out} = W_{\rm out} x_N + b_{\rm out}$

$f$ acts componentwise: $f(y) = (f(y_1), f(y_2), \cdots, f(y_n))$

Note 1 : $W_n$ = matrix / $x_n$, $b_n$ = vectors
Note 2 : $W_n x_{n-1} + b_n$ stands for $(W_n)_{ij} (x_{n-1})_j + (b_n)_i$
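These three equations are all there is to a forward pass. A NumPy sketch with randomly initialized weights (the layer sizes here are illustrative):

```python
# Forward pass: x_n = f(W_n x_{n-1} + b_n) through the hidden layers,
# then a linear read-out x_out = W_out x_N + b_out.
import numpy as np

def relu(y):
    return np.maximum(y, 0.0)  # applied componentwise, as in the note

def forward(x_in, weights, biases, W_out, b_out):
    x = x_in
    for W, b in zip(weights, biases):   # n = 1, ..., N
        x = relu(W @ x + b)
    return W_out @ x + b_out            # no nonlinearity on the output

rng = np.random.default_rng(0)
sizes = [49, 20, 20]                    # input dim and two hidden layers (illustrative)
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases  = [np.zeros(m) for m in sizes[1:]]
W_out, b_out = rng.normal(size=(1, sizes[-1])), np.zeros(1)
print(forward(rng.normal(size=sizes[0]), weights, biases, W_out, b_out))
```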

SLIDE 20

NEURAL NETWORK: SUPERVISED LEARNING

How to train the neural network with "supervised learning"

  • Suppose we have many data pairs $(x_{\rm in}, x_{\rm out}^{\rm (true)})$
  • Then we can define "how poorly the machine predicts", e.g. the error function

    $E = \sum_{\rm data} \sum_{i:\,{\rm component}} \left[ (x_{\rm out})_i - (x_{\rm out}^{\rm (true)})_i \right]^2$

  • Training of the neural network = update of the weights $W$ and biases $b$ using $E$:

    $W \to W - \alpha \frac{\partial E}{\partial W}, \qquad b \to b - \alpha \frac{\partial E}{\partial b} \qquad$ ($\alpha$ : constant)

Note : there are more sophisticated algorithms, e.g. AdaGrad, Adam, ...
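As a sketch of the plain update rule above, here is one gradient-descent step for the simplest case, a single linear layer with squared error; for a deep network the derivatives would come from backpropagation, which libraries handle automatically:

```python
# One update W -> W - alpha dE/dW, b -> b - alpha dE/db for a linear layer
# with E = sum_i (W x + b - x_true)_i^2 (illustrative only).
import numpy as np

def sgd_step(W, b, x, x_true, alpha=0.01):
    x_out = W @ x + b              # prediction
    err = x_out - x_true           # (x_out)_i - (x_out^(true))_i
    dE_dW = 2.0 * np.outer(err, x) # dE/dW_ij = 2 err_i x_j
    dE_db = 2.0 * err
    return W - alpha * dE_dW, b - alpha * dE_db
```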

SLIDE 21

NEURAL NETWORK: HOW POWERFUL?

Ability of neural network to capture nonlinearity

  • My data : 100 points sampled from the nonlinear function
    $(x_{\rm in}, x_{\rm out}^{\rm (true)}) = (x_{\rm in},\, x_{\rm in}(x_{\rm in} - 0.3)(x_{\rm in} - 0.6)(x_{\rm in} - 0.9))$
  • My neural network : 2 layers, 20 neurons each, trained for 10 sec

[figure: machine prediction (Pred) vs. true answer (Ans) over 0 ≤ x ≤ 1]

SLIDE 22

NEURAL NETWORK: HOW POWERFUL?

(same setup and plots as the previous slide)

Neural network is extremely useful in capturing nonlinearity
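The demo is easy to reproduce in spirit. The sketch below uses scikit-learn's MLPRegressor rather than the talk's TensorFlow setup, with the same target function and a comparable network size:

```python
# Fit 100 samples of x(x-0.3)(x-0.6)(x-0.9) with a small 2-layer network.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(100, 1))
y = (x * (x - 0.3) * (x - 0.6) * (x - 0.9)).ravel()

net = MLPRegressor(hidden_layer_sizes=(20, 20), activation="relu",
                   max_iter=20000, tol=1e-9, random_state=0)
net.fit(x, y)

x_test = np.linspace(0.0, 1.0, 5).reshape(-1, 1)
print(net.predict(x_test))                                                   # prediction
print((x_test * (x_test - 0.3) * (x_test - 0.6) * (x_test - 0.9)).ravel())   # truth
```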

SLIDE 23

NEURAL NETWORK: IMAGE RECOGNITION

Image classifier : nonlinear relation between input (= image) and output (= label)

[figure: machine maps cat and dog images to the labels "cat" and "dog"]

SLIDE 24

NEURAL NETWORK: IMAGE RECOGNITION

Image classifier : nonlinear relation between input (= image) and output (= label)

[figure: input layer (image pixels) to output layer (label values, e.g. 0.8, 0.1, 0.7)]

Note : precisely, the output layer is the log-odds log [P(cat)/P(dog)]
Note : actual image recognition is not this simple, e.g. CNN

SLIDE 25

TALK PLAN

  • 1. Machine learning : lightning introduction
  • 2. Machine learning meets tunneling problem in QFT
  • 3. Data taking / Machine setup / Training process / Results
  • 4. Summary

SLIDE 26

TUNNELING PROBLEM IN QFT

[figure: inverted potential −V(φ)]

SLIDE 27

WHY TUNNELING PROBLEM IN QFT?

Gravitational waves (GWs) from BH & NS binaries have been detected

Tunneling & bubble dynamics in the early Universe also produce GWs
Such GWs may be detected in the (near) future

[figure: GW detector frequency bands, pulsar timing arrays (~10⁻⁸ Hz), space (0.01-1 Hz), ground (~10² Hz);
http://rhcole.com/apps/GWplotter/ ]


SLIDE 29

TUNNELING PROBLEM IN QFT

Quantum tunneling in vacuum in 1+3 dim. [ Coleman '77 ]

  • Nucleation rate $\Gamma$ is dominantly determined by the Euclidean action of the "bounce configuration" $\bar\phi$:

    $\Gamma \propto e^{-S_E[\bar\phi]}, \qquad S_E[\bar\phi] = \int dt_E \int d^3x \left[ \frac{1}{2}(\partial_E \bar\phi)^2 + V(\bar\phi) \right]$

  • Bounce configuration : solution of the EOM with the inverted potential $-V$,

    $\frac{d^2\bar\phi}{dr^2} + \frac{3}{r}\frac{d\bar\phi}{dr} - \frac{dV}{d\bar\phi} = 0$

    with boundary conditions

    $\frac{d\bar\phi}{dr}(r = 0) = 0, \qquad \bar\phi(r = \infty) = 0$

[figure: inverted potential −V(φ), with the bounce rolling from r = 0 to r = ∞]

SLIDE 30

TIME-CONSUMING CALCULATION

Calculation of $\bar\phi$ requires many iterations (overshoot / undershoot)

[figure: inverted potential −V(φ) with overshooting and undershooting trajectories]

Every time we have a new particle physics setup, we re-calculate the EOM.
Isn't it nonsense? Can we use a machine learning technique?

Note : there are many approaches, e.g.
[ Duncan et al. '92, Dutta et al. '12, Guada et al. '18 : Piecewise linear bounce ] [ Kusenko '95, Moreno et al. '98 : Improved action ] [ Konstandin et al. '06 : Damping injection ] [ Cline et al. '99, Wainwright '11 : Path deformation ] [ Espinosa '18 : Auxiliary potential ] [ Masoumi et al. '16 : Multiple shooting ]
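For reference, the traditional method being replaced looks roughly like this: bisect on the release point φ(0) until the solution of the bounce EOM neither overshoots past the false vacuum nor rolls back. A SciPy sketch, with a toy stand-in potential and a bracket assumed to straddle the critical release point:

```python
# Overshoot/undershoot (shooting) sketch for the bounce equation
#   phi'' + (3/r) phi' - dV/dphi = 0,  phi'(0) = 0,  phi(inf) = 0.
import numpy as np
from scipy.integrate import solve_ivp

def dV(phi):                        # toy dV/dphi: false vacuum at phi = 0,
    return phi * (phi - 0.4) * (phi - 1.0)   # barrier at 0.4, true vacuum at 1

def shoot(phi0, r_max=100.0):
    def rhs(r, y):
        phi, dphi = y
        return [dphi, dV(phi) - 3.0 / r * dphi]
    sol = solve_ivp(rhs, (1e-6, r_max), [phi0, 0.0], rtol=1e-10, atol=1e-12)
    return +1 if np.any(sol.y[0] < 0.0) else -1   # +1 overshoot, -1 undershoot

lo, hi = 0.5, 1.0 - 1e-12           # bracket for the release point phi(0)
for _ in range(60):                 # bisection: many iterations of shooting
    mid = 0.5 * (lo + hi)
    if shoot(mid) < 0:
        lo = mid                    # undershoot: release closer to the true vacuum
    else:
        hi = mid                    # overshoot: release farther away
print("release point phi(0) ~", lo)
```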

SLIDE 31

MAIN IDEA

Potential is an image:

[figure: V(φ) rendered as an image]

Then, the problem becomes image recognition

SLIDE 33

MACHINE LEARNING MEETS TUNNELING IN QFT

Machine-learning approach

  • Can we construct a machine which gives $S_E$ for an input potential $V$?
  • Such a machine does not have to solve the EOM:
    a cat-dog classifier does not have to recognize them as humans do
  • Advantages : 1. faster than any other method / 2. we can share the trained machine
SLIDE 34

TALK PLAN

  • 1. Machine learning : lightning introduction
  • 2. Machine learning meets tunneling problem in QFT
  • 3. Data taking / Machine setup / Training process / Results
  • 4. Summary

(✔ : parts 1 & 2 done)

SLIDE 35

DATA TAKING PROCESS

We use 3 classes of potentials C1-C3:

Class 1 (C1) : $V(\phi) = \sum_{n=1}^{7} a^{(1)}_n \phi^{n+1}$
Class 2 (C2) : $V(\phi) = \sum_{n=1}^{7} a^{(2)}_n \phi^{2n}$
Class 3 (C3) : $V(\phi) = a^{(3)}_1 \phi^2 + \sum_{n=2}^{7} a^{(3)}_n \phi^{2n-1}$

  • Coefficients $a^{(i)}_n$ generated randomly (→ backup)
  • Each class contains 10,000 pairs of potential and bounce action
  • Bounce action is calculated with the traditional overshoot/undershoot method
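A sketch of this data-taking step for class C1: random coefficients define the potential, and a hypothetical `bounce_action` solver (e.g. the shooting sketch earlier) supplies the label. The coefficient range here is a placeholder; the actual generating procedure is in the backup slides.

```python
# Data taking for class C1: V(phi) = sum_{n=1}^{7} a_n^(1) phi^(n+1),
# each paired with its bounce action S4.
import numpy as np

rng = np.random.default_rng(0)

def sample_c1():
    a = rng.uniform(-1.0, 1.0, size=7)                  # placeholder range (see backup)
    V = lambda phi: sum(a[n] * phi**(n + 2) for n in range(7))
    return a, V

data = []
for _ in range(3):                                      # 10,000 pairs in the actual run
    a, V = sample_c1()
    # S4 = bounce_action(V)                             # hypothetical solver, as sketched earlier
    data.append(a)
```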

SLIDE 36

TRAINING & TEST & APPLICATION DATASET

We construct training & test & application datasets

  • Training dataset : used for training (→ next slide)
  • Test dataset : used to check that there is no overfitting
  • Application dataset : the machine is finally applied to this

e.g. Training : 8,000 data from C1 / Test : 2,000 data from C1 / Application : 10,000 data from C1
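A minimal sketch of the split step (array names are illustrative, not from the talk):

```python
# Shuffle one class's 10,000 (input, ln S4) pairs and split them
# 8,000 / 2,000 into training and test sets.
import numpy as np

def split(inputs, actions, n_train=8000, seed=0):
    idx = np.random.default_rng(seed).permutation(len(inputs))
    tr, te = idx[:n_train], idx[n_train:]
    return (inputs[tr], actions[tr]), (inputs[te], actions[te])
```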

SLIDE 37

TRAINING & TEST & APPLICATION DATASET (same bullets as above)

e.g. Training : 8,000 data from C2 / Test : 2,000 data from C2 / Application : 10,000 data from C2

SLIDE 38

TRAINING & TEST & APPLICATION DATASET (same bullets as above)

e.g. Training : 8,000 data from C3 / Test : 2,000 data from C3 / Application : 10,000 data from C3

SLIDE 39

TRAINING & TEST & APPLICATION DATASET (same bullets as above)

e.g. Training : 24,000 data from C1+C2+C3 / Test : 6,000 data from C1+C2+C3 / Application : 30,000 data from C1+C2+C3

SLIDE 40

TRAINING & TEST & APPLICATION DATASET (same bullets as above)

e.g. Training : 16,000 data from C2+C3 / Test : 4,000 data from C2+C3 / Application : 10,000 data from C1

SLIDE 41

TRAINING & TEST & APPLICATION DATASET (same bullets as above)

e.g. Training : 16,000 data from C3+C1 / Test : 4,000 data from C3+C1 / Application : 10,000 data from C2

SLIDE 42

TRAINING & TEST & APPLICATION DATASET (same bullets as above)

e.g. Training : 16,000 data from C1+C2 / Test : 4,000 data from C1+C2 / Application : 10,000 data from C3

SLIDE 43

MACHINE SETUP

We try a simple machine : N = 2 layers

[figure: network diagram of the machine M]

SLIDE 44

MACHINE SETUP

Input : sampled values of the potential & its derivatives

$x_{\rm in} = \{ V(\phi_{\rm sample}) \}_{\phi_{\rm sample} = 1/16, \cdots, 15/16} \cup \{ V'(\phi_{\rm sample}) \}_{\phi_{\rm sample} = 1/16, \cdots, 15/16} \cup \{ V''(\phi_{\rm sample}) \}_{\phi_{\rm sample} = 0/16, \cdots, 16/16}$

Output : predicted value of the logarithmic bounce action

$x_{\rm out} = \ln S_4^{\rm (pred)}$

Note : implicit rescaling of input & output,

$(x_{\rm in})_i \to \frac{(x_{\rm in})_i - \langle (x_{\rm in})_i \rangle}{\sigma_{(x_{\rm in})_i}}, \qquad x_{\rm out} \to \frac{x_{\rm out} - \langle x_{\rm out} \rangle}{\sigma_{x_{\rm out}}}$

$\langle \cdot \rangle$ & $\sigma$ : mean & variance calculated over the training & test datasets

  • In the following, $x_{\rm in}$ & $x_{\rm out}$ are understood as rescaled
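Put together, the input construction and rescaling are a few lines. A sketch, assuming `V`, `dV`, `d2V` are callables evaluating the potential and its derivatives:

```python
# Build the 47-component input vector: V and V' on phi = 1/16, ..., 15/16
# and V'' on phi = 0/16, ..., 16/16, then the standardization quoted above.
import numpy as np

def make_input(V, dV, d2V):
    phi    = np.arange(1, 16) / 16.0     # 1/16, ..., 15/16
    phi_pp = np.arange(0, 17) / 16.0     # 0/16, ..., 16/16
    return np.concatenate([V(phi), dV(phi), d2V(phi_pp)])

def rescale(x, mean, sigma):
    return (x - mean) / sigma            # applied componentwise to x_in, and to x_out
```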

SLIDE 45

TRAINING PROCESS

Error function = how poorly the machine predicts:

$E = \frac{1}{(\#\ {\rm of\ data\ passed\ to\ the\ machine})} \sum_{\rm data} \left[ x_{\rm out} - x_{\rm out}^{\rm (true)} \right]^2$

$x_{\rm out} = \ln S_4^{\rm (pred)}$ : predicted value of the logarithmic bounce action
$x_{\rm out}^{\rm (true)} = \ln S_4^{\rm (true)}$ : true value of the logarithmic bounce action

Training = update of the weights and biases using the error function:

$W \to W - \alpha \frac{\partial E}{\partial W}, \qquad b \to b - \alpha \frac{\partial E}{\partial b}$

Note : in the actual training we use a slightly more sophisticated algorithm, Adam

SLIDE 46

DETAILS OF TRAINING PROCESS

Mini-batch training

  • We feed the machine with 1/10 of the training data (= mini-batch) at a time
  • 10 such updates use the whole training data = 1 epoch
  • We train the machine for 10,000 epochs

Implementation

  • The above process is implemented with TensorFlow (r1.7), roughly as sketched below

[figure: training dataset split into mini-batches, fed to the machine one update at a time]
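A sketch of this loop in the TensorFlow 1.x API: two hidden layers, the squared-error E of the previous slide, and Adam updates on mini-batches of one tenth of the training data. The layer sizes, learning rate, and dummy data are illustrative, not the talk's actual configuration:

```python
# Mini-batch Adam training of a small regression network (TF 1.x style).
import numpy as np
import tensorflow as tf  # 1.x API

rng = np.random.default_rng(0)
train_x = rng.normal(size=(8000, 47)).astype(np.float32)  # dummy stand-in data
train_y = rng.normal(size=(8000, 1)).astype(np.float32)

x_in   = tf.placeholder(tf.float32, [None, 47])
x_true = tf.placeholder(tf.float32, [None, 1])
h = tf.layers.dense(x_in, 20, activation=tf.nn.relu)
h = tf.layers.dense(h, 20, activation=tf.nn.relu)
x_out = tf.layers.dense(h, 1)                    # predicted ln S4

E = tf.reduce_mean(tf.square(x_out - x_true))    # error function
step = tf.train.AdamOptimizer(1e-3).minimize(E)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch = len(train_x) // 10                   # mini-batch = 1/10 of training data
    for epoch in range(100):                     # 10,000 epochs in the talk
        idx = np.random.permutation(len(train_x))
        for k in range(10):                      # 10 updates = 1 epoch
            sel = idx[k * batch:(k + 1) * batch]
            sess.run(step, {x_in: train_x[sel], x_true: train_y[sel]})
```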


SLIDE 51

RESULTS

Case A : 1 class for training & test & application

Training : 8,000 data from C1 / Test : 2,000 data from C1 / Application : 10,000 data from C1

[figure: scatter plot of the machine's performance, average of 10 trials]

SLIDE 52

Case A (cont.) : Training : 8,000 data from C2 / Test : 2,000 data from C2 / Application : 10,000 data from C2

SLIDE 53

Case A (cont.) : Training : 8,000 data from C3 / Test : 2,000 data from C3 / Application : 10,000 data from C3

SLIDE 54

Case B : mixture of 3 classes

Training : 24,000 data from C1+C2+C3 / Test : 6,000 data from C1+C2+C3 / Application : 30,000 data from C1+C2+C3

SLIDE 55

Case C : training & test over 2 classes / application to the other class

Training : 16,000 data from C2+C3 / Test : 4,000 data from C2+C3 / Application : 10,000 data from C1

SLIDE 56

Case C (cont.) : Training : 16,000 data from C3+C1 / Test : 4,000 data from C3+C1 / Application : 10,000 data from C2

SLIDE 57

Case C (cont.) : Training : 16,000 data from C1+C2 / Test : 4,000 data from C1+C2 / Application : 10,000 data from C3

(each slide : scatter plot of the machine's performance, average of 10 trials)

SLIDE 58

DISCUSSION

How much precision can we expect in practical use? How much is the speedup?

  • Potential shapes in particle physics are not that many
    → if we train with such potentials, the resulting precision will be C1+C2+C3 or better
  • Overshoot/undershoot typically takes O(1-10) sec in my code
  • Other approaches take e.g. O(10⁻²) sec [ Guada et al. '18, "Polygonal bounces" (private communication) ]
  • Our machine takes O(10) sec for training, while after training it takes O(10⁻⁴) sec to calculate the bounce

SLIDE 59

DISCUSSION

Generalizations?

  • Different spacetime dimensions → trivial
  • Multidimensional transitions → needs good ideas, e.g. (idea 1 is sketched below)
    1) 2 dim. : convolutional neural network (CNN) may help
    2) ML may be used for the 1-dim. part in existing multidimensional public codes
    3) ML may also be used as an "initial position suggestor" in such public codes, by identifying the output as the initial position
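For idea 1), treating a two-field potential as a genuine 2D image V(φ1, φ2), a CNN front end could replace the dense input layer. A speculative sketch in the same TensorFlow 1.x style (all sizes are invented, not from the talk):

```python
# CNN regression from a sampled 2D potential V(phi1, phi2) to ln S4.
import tensorflow as tf  # 1.x API

V_img = tf.placeholder(tf.float32, [None, 16, 16, 1])    # V on a 16x16 grid
h = tf.layers.conv2d(V_img, 8, 3, activation=tf.nn.relu) # local features of V
h = tf.layers.conv2d(h, 8, 3, activation=tf.nn.relu)
h = tf.layers.flatten(h)
x_out = tf.layers.dense(h, 1)                            # predicted ln S4
```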


SLIDE 63

SUMMARY

  • Calculation of quantities from a scalar potential can be regarded as an image recognition problem
  • We proposed using machine learning techniques for such calculations, and demonstrated their usefulness in one-dim. transitions
  • We explained possible ideas for generalization to multi-dimensional transitions

SLIDE 64

Backup

SLIDE 65

DETAILS ABOUT POTENTIAL GENERATING PROCESS

Random seed generation $(V_{\rm max}, \phi_0, \phi_{1-}, \phi_{1+}, \phi_2)$

  • $V_{\rm max}$ is sampled from $10^{-2} \le V_{\rm max} \le 10^{-0.5}$ (flat distribution in log space)
  • 4 numbers are generated in $[0, 1]$, and identified with
    $\phi_{1+} < \phi_0 < \phi_2 < \phi_{1-}$ or $\phi_{1+} < \phi_2 < \phi_0 < \phi_{1-}$ (probability 0.5 for each)

Coefficients $a^{(i)}_n$ are determined so that $V$ takes prescribed local minima/maxima: a local maximum $V_{\rm max}$ @ $\phi = \phi_0$, a local minimum @ $\phi = \phi_2$, and local minima or maxima @ $\phi = \phi_{1+}$, $\phi_{1-}$. The potential is added to the data if there is no local maximum/minimum other than these.