SLIDE 1

Strong Gravitational Lensing and ML: generative models for galaxies

Adam Coogan

Dark Machines workshop, ICTP, 8–12 April 2019

SLIDE 2

[Diagram: observed galaxy x → generative model → Bayesian inference → posteriors p(θsrc|x), p(θlens|x), p(θsub|x)]

Model physics when possible, use machine learning for the rest.

SLIDES 3–8

[Diagram: observed galaxy image (http://great3.jb.man.ac.uk/) → generative model → Bayesian inference → posteriors p(θsrc|x), p(θlens|x), p(θsub|x)]

Machine learning for generatively modeling the source light:

  • Galaxies have diverse, complex morphologies (especially at z ≳ 2)
  • Complex source → more accurate lens parameter inference

SLIDES 9–15

Source model

  • Low-dimensional representation of the data that:
    ➡ Captures the range of galaxy morphologies
    ➡ Has a latent space compatible with Bayesian inference

Variational autoencoder (Kingma & Welling 2013, Rezende et al 2014):

[Diagram: data x → encoder qϕ(z|x) → latent space z, with prior p(z) = N(0, I) → decoder pθ(x|z) → x]

Train the encoder and decoder by maximizing a lower bound on p(data).
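Concretely, the encoder maps an image to the mean and variance of a Gaussian qϕ(z|x), and the decoder maps a latent vector back to image space. A minimal PyTorch sketch of that structure, assuming 64×64 single-band images and a 32-dimensional latent space (the talk's actual architecture and dimensions are not given here):

```python
import torch
import torch.nn as nn

class GalaxyVAE(nn.Module):
    """Minimal VAE: encoder q_phi(z|x), decoder p_theta(x|z), prior p(z) = N(0, I)."""

    def __init__(self, latent_dim=32):
        super().__init__()
        # Encoder: image -> (mu, log sigma^2) of the Gaussian q_phi(z|x)
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 2 * latent_dim),  # mu and logvar, concatenated
        )
        # Decoder: latent vector -> image mean of p_theta(x|z)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),              # 32 -> 64
        )

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=1)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar
```

The reparameterization step keeps sampling differentiable, which is what makes maximizing the lower bound on p(data) by gradient descent possible.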

SLIDES 16–19

Galaxy VAE

  • Dataset: ~56,000 galaxies at redshift ~ 1 (http://great3.jb.man.ac.uk/)
  • This talk: train on ~10,000 images with S/N = 15–50
  • Encoder and decoder: deep convolutional neural networks, e.g. a DCGAN-style decoder (Radford et al 2015); a sketch follows below

[Example galaxy images at S/N < 10, S/N ~ 20, and S/N > 100]
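The slide cites Radford et al 2015 for the decoder. For illustration, a typical DCGAN-style decoder projects the latent vector to a small feature map and upsamples with strided transposed convolutions plus batch norm and ReLU; the filter counts below are placeholders, not the talk's values:

```python
import torch.nn as nn

# DCGAN-style decoder (after Radford et al 2015): expand a latent vector
# shaped (N, latent_dim, 1, 1) up to a 64x64 image.
def dcgan_decoder(latent_dim=32):
    return nn.Sequential(
        nn.ConvTranspose2d(latent_dim, 256, 4, stride=1, padding=0),  # 1 -> 4
        nn.BatchNorm2d(256), nn.ReLU(),
        nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1),  # 4 -> 8
        nn.BatchNorm2d(128), nn.ReLU(),
        nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),   # 8 -> 16
        nn.BatchNorm2d(64), nn.ReLU(),
        nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),     # 16 -> 32
        nn.ConvTranspose2d(1, 1, 4, stride=2, padding=1),      # 32 -> 64
    )
```

Usage: `img = dcgan_decoder()(z.view(-1, 32, 1, 1))` for a batch of latent vectors z.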

SLIDES 20–26

Galaxy VAE: reconstructions

x → z ∼ qϕ(z|x) → x′

[Image grid: original vs. reconstruction pairs, with numbered callouts]

Rezende & Viola 2018, Zhao et al 2017
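The x → z ∼ qϕ(z|x) → x′ pipeline on these slides, written out against the hypothetical GalaxyVAE sketched earlier (a random tensor stands in for a real galaxy image):

```python
import torch

# Reconstruction pass with the GalaxyVAE sketched above (hypothetical).
vae = GalaxyVAE(latent_dim=32).eval()
x = torch.randn(1, 1, 64, 64)  # stand-in for an observed galaxy image
with torch.no_grad():
    mu, logvar = vae.encoder(x).chunk(2, dim=1)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # z ~ q_phi(z|x)
    x_prime = vae.decoder(z)   # reconstruction x'
```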

SLIDES 27–32

Galaxy VAE: samples

z ∼ p(z) = N(0, I)

[Plot: the latent (z) distribution of the training data ≠ N(0, I) → an open issue with VAEs! (Hoffman & Johnson 2016, Alemi et al 2018)]

Our approach: sample z from the training data's latent distribution to generate better galaxies.

SLIDES 33–36

Normalizing flows (Rezende & Mohamed 2015, Kingma et al 2016, …)

  • Compose invertible transformations with simple Jacobians to reshape distributions: z ∼ N(0, I) → z′ = f_T ∘ … ∘ f_2 ∘ f_1(z)
  • Parametrize the transformations with neural networks
  • For our purposes: inverse autoregressive flows (IAFs), which enable efficient sampling of the latent variable (see the sketch after this list)
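A minimal sketch of one IAF-style step, assuming a single masked linear layer as the conditioner (the full IAF of Kingma et al 2016 uses a deeper autoregressive network). Each step computes z′_i = μ_i(z_{<i}) + σ_i(z_{<i}) · z_i, so the Jacobian is triangular, log|det J| is just Σ_i log σ_i, and drawing a sample takes a single forward pass:

```python
import torch
import torch.nn as nn

class AffineIAFStep(nn.Module):
    """One IAF-style step: z'_i = mu_i(z_{<i}) + sigma_i(z_{<i}) * z_i.
    The Jacobian is triangular, so log|det J| = sum_i log sigma_i."""

    def __init__(self, dim):
        super().__init__()
        self.mu = nn.Linear(dim, dim)
        self.log_sigma = nn.Linear(dim, dim)
        # Strictly lower-triangular mask -> output i depends only on z_{<i}
        self.register_buffer("mask", torch.tril(torch.ones(dim, dim), -1))

    def forward(self, z):
        mu = nn.functional.linear(z, self.mu.weight * self.mask, self.mu.bias)
        log_sigma = nn.functional.linear(
            z, self.log_sigma.weight * self.mask, self.log_sigma.bias)
        z_new = mu + torch.exp(log_sigma) * z
        return z_new, log_sigma.sum(dim=1)  # transformed z and log|det J|

# Compose f_T o ... o f_2 o f_1 and push N(0, I) samples through.
dim, steps = 32, 4
flow = nn.ModuleList([AffineIAFStep(dim) for _ in range(steps)])
z = torch.randn(64, dim)  # z ~ N(0, I)
log_det = torch.zeros(64)
for f in flow:
    z, ld = f(z)
    log_det = log_det + ld
# Change of variables: log q(z') = log N(z; 0, I) - log_det
```

Fitting such a flow to the encoded training z's and decoding its samples is the "sample z from here" strategy of slide 32.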

SLIDES 37–38

Galaxy VAE: samples

[Plots: z distribution for the training data vs. z samples from the IAF fit]

z ∼ IAF(z) → x: generated galaxies

SLIDES 39–41

Lensing galaxies*

[Images: true source, observation, best-fit source]

True Einstein radius: 2.3; best-fit value: 2.29

*Very preliminary, simplified analysis

SLIDES 42–43

Conclusions

  • Integrate the galaxy VAE with the full analysis pipeline
  • Improve the prior/latent distribution mismatch:
    • Fully incorporate flows into the VAE (Tomczak & Welling 2018)
  • Fix blurriness:
    • A more flexible encoder? β-VAE/conditional VAE, …? (Higgins et al 2017)
  • An example of "differentiable programming" for physics + ML

Thanks!

SLIDES 44–46

Lensing MNIST digits (outputs from a simplified analysis)

  • Simplified lens with one parameter, r_Ein
  • Poissonian observation noise

[Images: true source, observation, best-fit source from VAE]

Lens parameter inference: [plot of the posterior p(r_Ein | obs) over r_Ein ∈ [1.63, 1.67], comparing the true r_Ein with the HMC and SVI results]
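The slide compares HMC and SVI posteriors for r_Ein under Poissonian noise. A hedged sketch of what such an inference could look like in Pyro, where render_lens is a toy stand-in for the real (differentiable) lensing simulator, which the slides do not show:

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import MCMC, NUTS

def render_lens(r_ein, source):
    """Hypothetical stand-in for a differentiable lensing simulator:
    maps (Einstein radius, source image) to expected photon counts."""
    return torch.clamp(r_ein * source, min=1e-3)

source = torch.rand(28, 28) * 50.0  # toy "true source" image
obs = torch.poisson(render_lens(torch.tensor(1.65), source))

def model(obs):
    r_ein = pyro.sample("r_ein", dist.Uniform(1.0, 3.0))
    rate = render_lens(r_ein, source)
    # Poissonian observation noise, as on the slide
    pyro.sample("obs", dist.Poisson(rate).to_event(2), obs=obs)

mcmc = MCMC(NUTS(model), num_samples=500, warmup_steps=200)
mcmc.run(obs)
print(mcmc.get_samples()["r_ein"].mean())  # posterior mean of r_Ein
```

An SVI run would swap the MCMC object for pyro.infer.SVI with a parametric guide over r_ein, which is the comparison shown in the slide's plot.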

SLIDES 47–50

Training VAEs

  • Maximize a lower bound on p(x⁽ⁱ⁾), the ELBO:

ELBO(x⁽ⁱ⁾) = 𝔼_qϕ(z|x⁽ⁱ⁾)[ log pθ(x⁽ⁱ⁾|z) ] − KL[ qϕ(z|x⁽ⁱ⁾) ‖ p(z) ]

[Figure: encoded means for the MNIST source digits]
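For the Gaussian encoder and prior on the earlier slides, the KL term has a closed form. A sketch of the ELBO as a training objective, assuming (this is not stated in the talk) a unit-variance Gaussian likelihood for the reconstruction term:

```python
import torch

def elbo(x, x_recon, mu, logvar):
    """ELBO(x) = E_q[log p_theta(x|z)] - KL[q_phi(z|x) || N(0, I)],
    with a unit-variance Gaussian likelihood (an assumption here)."""
    # Reconstruction term: log N(x; x_recon, I), up to an additive constant
    log_px = -0.5 * ((x - x_recon) ** 2).sum(dim=(1, 2, 3))
    # Closed-form KL between N(mu, sigma^2) and N(0, I)
    kl = 0.5 * (mu ** 2 + logvar.exp() - 1.0 - logvar).sum(dim=1)
    return (log_px - kl).mean()

# Training maximizes the ELBO, i.e. minimizes loss = -elbo(x, x_recon, mu, logvar)
```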