

SLIDE 1

June 2, 2018 @ Osaka

Deep learning to diagnose the neutron star

Kenji Fukushima

The University of Tokyo

Based on work with Yuki Fujimoto, Koichi Murase — Deep Learning and Physics 2018 —

SLIDE 2

Disclaimer


I am a “user” of deep learning…

SLIDE 3

From the point of view of physics users…

Useful? Advantageous?

Sounding fancy is not enough…

SLIDE 4

Conventional Physics Approach


[Diagram] MODEL (Theory, very uncertain) → Input Data → Nonlinear Mapping (forward: easy, inverse: hard) → Output Data ↔ EXPERIMENT (limited information): compare… exclude?

SLIDE 5

One model after another, to infinity… so what?

SLIDE 6

“Model Independent” Analysis


[Diagram] Input Data → Nonlinear Mapping (forward: easy, inverse: hard) → Output Data ↔ EXPERIMENT (limited information)

Not unique… What is the “most likely” one?

Inverse Problem

SLIDE 7

SLIDE 8

Neutron Star EoS


[Figure: mass-radius relations for EoS models AP4, AP3, ENG, MPA1, GM3, GS1, PAL6, FSU, SQM3, SQM1, PAL1, MS0, MS2, MS1, with excluded regions from GR, causality, rotation, and P < ∞; mass bands for J1614-2230, J1903+0327, J1909-3744, and double neutron star systems. Axes: Radius 7-15 km, Mass 0-2.5 M⊙.]

Demorest et al. (2010): precise determination of NS mass using Shapiro delay: 1.928(17) Msun

(slightly changed in 2016)

(J1614-2230)

2.01(4) Msun (PSR J0348+0432), Antoniadis et al. (2013)

SLIDE 9

Neutron Star EoS


Equation of State ↔ M-R Relation
Pressure: p, mass density: ρ

p = p(ρ)

NS mass: M, NS radius: R

Tolman-Oppenheimer-Volkoff (TOV) Eqs.

Mathematically one-to-one correspondence

(hydrostatic balance: gravity vs. pressure difference)

(Energy density: ε = ρc²)

M = M(ρmax), R = R(ρmax)

Input Data → Nonlinear Mapping → Output Data
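The forward (easy) direction of this mapping can be sketched numerically. Below is a minimal TOV integrator for a toy Γ = 2 polytrope, not the EoS families discussed in the talk; units are G = c = M⊙ = 1, and the constants K, Γ, and the central pressure are illustrative choices.

```python
import numpy as np

def tov_mass_radius(p_c, K=100.0, Gamma=2.0, dr=1e-3):
    """Integrate the TOV equations outward for a polytrope p = K * rho^Gamma
    (geometric units G = c = M_sun = 1). Returns (M, R) in code units
    (multiply R by ~1.477 km to get a physical radius)."""
    def eos_eps(p):
        # energy density = rest-mass density + internal energy for the polytrope
        rho = (p / K) ** (1.0 / Gamma)
        return rho + p / (Gamma - 1.0)

    r, m, p = dr, 0.0, p_c
    while p > 1e-12 * p_c:          # surface condition: p(R) = 0
        eps = eos_eps(p)
        dm = 4.0 * np.pi * r**2 * eps
        dp = -(eps + p) * (m + 4.0 * np.pi * r**3 * p) / (r * (r - 2.0 * m))
        m += dm * dr                # simple Euler step, enough for a sketch
        p += dp * dr
        r += dr
    return m, r
```

For the standard test value p_c ≈ 1.64e-4 this yields a star of roughly 1.4 M⊙, illustrating how one central density gives one (M, R) point, i.e. M = M(ρmax), R = R(ρmax).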

SLIDE 10

Neutron Star EoS


Lindblom (1992)

Brute-force solution of the inverse problem

[Figure: test EoS data put in by hand → solve TOV → reconstructed EoS; panels show pressure vs. density and mass vs. radius. Thanks to Y. Fujimoto]

The answer exists!

SLIDE 11

No magic box… Only a “solvable” problem can be solved…

SLIDE 12

Neutron Star EoS (Side Remark)


R is fixed by TOV with p(R)=0 (“surface” condition)

dp/dr(r = R) = 0,  d²p/dr²(r = R) ∝ M²/R²

The determination of R is therefore numerically delicate, and on top of that, R itself is anyway very uncertain…

People do not worry about this, assuming NS mass > 1.2 Msun

Very uncertain “by definition”

SLIDE 13

Model Independent Analysis


Bayesian Analysis

P(A|B)P(B) = P(B|A)P(A)

(Bayes’ theorem)

B: M-R observation, A: EoS parameters

P(A|B): what we want to know. P(B|A): likelihood, calculable by TOV. P(A): prior. P(B): normalization.
A model must be assumed: an EoS parametrization must be introduced, and the integration measure in parameter space must be defined.

With infinitely many observations, the prior dependence should disappear.
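The fading prior dependence can be illustrated with a toy one-parameter example. The linear “EoS parameter → radius” model, the deliberately biased prior, and all numbers below are invented for illustration only.

```python
import numpy as np

# Hypothetical 1-parameter "EoS": a stiffness theta maps to a predicted radius.
def predicted_radius(theta):
    return 10.0 + 4.0 * theta  # km; illustrative linear model, NOT real physics

theta_grid = np.linspace(0.0, 1.0, 201)
prior = np.exp(-((theta_grid - 0.2) / 0.1) ** 2)   # deliberately biased prior
prior /= prior.sum()

rng = np.random.default_rng(0)
true_theta, sigma_R = 0.6, 0.5                      # truth and obs. error (km)
posterior = prior.copy()
for _ in range(50):                                 # 50 independent observations
    R_obs = predicted_radius(true_theta) + rng.normal(0.0, sigma_R)
    like = np.exp(-0.5 * ((R_obs - predicted_radius(theta_grid)) / sigma_R) ** 2)
    posterior *= like                               # Bayes' theorem ...
    posterior /= posterior.sum()                    # ... with normalization

# With many observations the posterior peaks near the truth despite the prior.
theta_map = theta_grid[np.argmax(posterior)]
```

After 50 observations the posterior maximum sits close to the true value 0.6 even though the prior was centered at 0.2.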

SLIDE 14

Model Independent Analysis


Raithel-Ozel-Psaltis (2017): mock data (SLy + noise). [Figure: prior dependence; black curve = true EoS, magenta curve = guessed EoS, gray band = 68% credibility.]

SLIDE 15

Model Independent Analysis


Several M-R observation points with errors → several parameters to characterize the EoS

Nonlinear Mapping: {Mi, Ri} → {Pi}

{Pi} = F({Mi, Ri})

{Mi, Ri}: ~15 points (observations) → {Pi}: ~5 parameters

Too precise a parametrization of the EoS is useless (beyond the uncertainty from observations)

Bayesian Analysis ↔ Supervised Learning

SLIDE 16

Deep Learning


[Figure: fully connected network; inputs x_1 = x_1^{(0)}, …, x_N = x_N^{(0)}, hidden layers x^{(2)}, …, outputs x_1^{(L)} = y_1, …, x_M^{(L)} = y_M]

x_i^{(k+1)} = σ^{(k+1)} ( Σ_{j=1}^{N_k} W_{ij}^{(k+1)} x_j^{(k)} + a_i^{(k+1)} )

{Mi, Ri} (~15) → {Pi} (~5)

Parameters to be tuned: the weights W and the offsets a

Typical activation functions include the sigmoid σ(x) = 1/(e^{-x} + 1), the ReLU σ(x) = max{0, x}, and σ(x) = tanh(x).

Backpropagation
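The layer update rule is a few lines of NumPy. As a sketch, the layer sizes below follow the 30 → 60 → 40 → 40 → 5 design quoted on the next slide, with random (untrained) weights; the initialization scale is an arbitrary choice.

```python
import numpy as np

# Activation functions named on the slide
def sigmoid(x):           # listed for reference; not used in this design
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def forward(x, weights, biases, activations):
    """x^{(k+1)} = sigma^{(k+1)}(W^{(k+1)} x^{(k)} + a^{(k+1)}), layer by layer."""
    for W, a, sigma in zip(weights, biases, activations):
        x = sigma(W @ x + a)
    return x

rng = np.random.default_rng(1)
sizes = [30, 60, 40, 40, 5]            # ~15 (M, R) pairs in -> 5 EoS params out
weights = [rng.normal(0.0, 0.1, (n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
acts = [relu, relu, relu, np.tanh]     # ReLU hidden layers, tanh output

y = forward(rng.normal(size=30), weights, biases, acts)
```

Backpropagation then tunes W and a by gradient descent on a loss; the forward pass above is the piece the equation on this slide describes.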

SLIDE 17

Deep Learning


Layer index | Nodes | Activation
1           | 30    | N/A
2           | 60    | ReLU
3           | 40    | ReLU
4           | 40    | ReLU
5           | 5     | tanh

Our Neural Network Design. Probably we don’t need so many hidden layers and so many nodes… anyway, this is one working example…

SLIDE 18

Deep Learning


For good learning, the “textbook” choice is important…

Training data (200000 sets in total):
  • Randomly generate 5 sound velocities → EoS, × 2000 sets
  • Solve TOV to identify the corresponding M-R curve
  • Randomly pick 15 observation points, × (ns = 100) sets
  • The machine learns that the M-R data have error fluctuations

Validation data (200 sets):
  • Generated independently of the training data

(with ∆M = 0.1 M⊙, ∆R = 0.5 km)
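The data-generation recipe can be sketched as below. Here `mr_curve` is a stand-in for the TOV solve (NOT real physics), and the set counts are scaled down, but the structure, random EoS parameters, random observation points, and Gaussian noise at the quoted scales, matches the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
dM, dR = 0.1, 0.5          # noise scales from the slide (M_sun, km)

def random_eos():
    """Randomly draw 5 squared sound velocities in (0, 1): the EoS parameters."""
    return rng.uniform(0.0, 1.0, size=5)

def mr_curve(cs2):
    """Placeholder for solving TOV; a smooth fake M-R curve stands in here."""
    R = np.linspace(10.0, 14.0, 15)
    M = 1.0 + cs2.mean() * (R - 10.0) / 4.0   # illustration only, NOT physics
    return M, R

def mock_observations(cs2, n_points=15):
    """Pick 15 points off the M-R curve and add observational noise."""
    M, R = mr_curve(cs2)
    idx = rng.integers(0, len(R), size=n_points)
    return (M[idx] + rng.normal(0.0, dM, n_points),
            R[idx] + rng.normal(0.0, dR, n_points))

# 2000 EoS x (ns = 100) noisy samplings each -> 200000 pairs; scaled down here.
train = [(mock_observations(eos), eos) for eos in (random_eos() for _ in range(20))]
```

Each training pair maps 15 noisy (M, R) points to the 5 EoS parameters that generated them, which is exactly the supervised-learning setup of the talk.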

SLIDE 19

Deep Learning


With fluctuations in the training data, the learning goes quickly

“Loss function” = deviation from the true answers (msle). It monotonically decreases for the training data, but not necessarily for the validation data.

Once over-fitting occurs, the model becomes more stupid…

[Plot: loss function (msle), 0.02-0.18, vs. epochs, 1-10000 (log scale), for ns = 100 and ns = 1.]
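A minimal implementation of the msle loss, plus the early-stopping rule implied by the over-fitting remark; both are sketches, not the authors' code.

```python
import numpy as np

def msle(y_true, y_pred):
    """Mean squared logarithmic error: mean((log(1+pred) - log(1+true))^2)."""
    return float(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))

def best_epoch(val_losses):
    """Early stopping: keep the epoch where the VALIDATION loss is minimal,
    since training loss keeps falling even after over-fitting sets in."""
    return int(np.argmin(val_losses))
```

Monitoring `msle` on the validation set and stopping at `best_epoch` is the standard guard against the over-fitting described above.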

SLIDE 20

Deep Learning


Test with the validation data
 (parameters not optimized to fit the validation data)

Fujimoto-Fukushima-Murase (2017)

Two typical examples (not a biased choice). [Legend: randomly generated original EoS; reconstructed EoS and the associated M-R curve.]

[Plots: p/ρc² (10⁻²-10⁰) vs. ρc² (0.1-1 GeV/fm³, log scale), and M/M⊙ (1-3) vs. R (10-16 km).]

SLIDE 21

Deep Learning


Overall performance test

Mass (M⊙) | 0.6  | 0.8  | 1.0  | 1.2   | 1.4  | 1.6  | 1.8
RMS (km)  | 0.16 | 0.12 | 0.10 | 0.099 | 0.11 | 0.11 | 0.12

(with ∆M = 0.1 M⊙, ∆R = 0.5 km)
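The figures in the table are root-mean-square radius errors at fixed mass; as a trivial sketch (the function name and sample values are illustrative, not from the talk):

```python
import numpy as np

def rms_radius_error(R_true, R_pred):
    """Root-mean-square radius error in km, as quoted per mass bin in the table."""
    R_true, R_pred = np.asarray(R_true), np.asarray(R_pred)
    return float(np.sqrt(np.mean((R_pred - R_true) ** 2)))
```

An RMS of ~0.1 km against an input noise of ∆R = 0.5 km is what makes the result promising.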

Very promising!

Credibility estimate has not been done for simplicity, but
 it can be included in the learning process.

SLIDE 22

Usefulness confirmed, and implementation is easy, but is it advantageous?

Bayesian or NN, which to choose?

SLIDE 23

Bayesian vs NN


EoS parameters: θ := {c²_{s,i}};  observations: D = {(Mi, Ri)};  likelihood Pr(D|θ) for each EoS.

Bayesian: estimate the posterior Pr(θ|D); the MAP estimator is

f_MAP(D) = arg max_θ [ Pr(θ) Pr(D|θ) ]

NN: minimizes the expected loss

⟨ℓ[f]⟩ = ∫ dθ dD Pr(θ) Pr(D|θ) ℓ(θ, f(D))

The trained NN approximates / estimates the Bayesian estimator.

NN allows for a more general choice of loss functions; Bayesian assumes parametrized likelihood functions.

SLIDE 24

Conclusion


Yes, useful! Maybe less biased?

To do: develop a toolkit for real data (not discrete data points with errors but regions of credibility), and error analysis (credibility estimation) on the output side.