Time-domain Astrophysics in the Era of Big Data - V. Ashley Villar


SLIDE 1

Time-domain Astrophysics in the Era of Big Data

  • V. Ashley Villar

Center for Astrophysics | Harvard & Smithsonian Ford Foundation Dissertation Fellow

ML @ Ringberg 2019

SLIDE 2

Transients connect to all branches of astrophysics

How does the zoo of observed transients connect with the underlying (astro)physics?

Figure credits: Scolnic+ 2018; Soares-Santos+ 2017; L. Singer; HST

SLIDE 3

SLIDE 4

Figure: Type Ia Supernova; x-axis: Time w/ offset (Days)

SLIDE 5

ashleyvillar.com/dlps

SNe powered by 56Ni Radioactive Decay
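
The slide names the power source but gives no formula. As a rough illustration, the heating rate from the 56Ni -> 56Co -> 56Fe decay chain can be written down directly. The constants below are commonly quoted values, not numbers from the talk. The sketch assumes all decay energy is deposited and ignores photon diffusion, so it gives the input heating rate rather than an observed light curve.

```python
import math

# Assumed, commonly quoted constants (not taken from the slides).
EPS_NI = 3.9e10     # erg/s/g, specific heating rate of 56Ni decay
EPS_CO = 6.8e9      # erg/s/g, specific heating rate of 56Co decay
TAU_NI = 8.8        # days, e-folding time of 56Ni
TAU_CO = 111.3      # days, e-folding time of 56Co
M_SUN_G = 1.989e33  # grams per solar mass


def ni56_heating_rate(t_days, m_ni_msun):
    """Heating rate (erg/s) from the 56Ni -> 56Co -> 56Fe chain at time t."""
    m_ni = m_ni_msun * M_SUN_G
    ni_term = (EPS_NI - EPS_CO) * math.exp(-t_days / TAU_NI)
    co_term = EPS_CO * math.exp(-t_days / TAU_CO)
    return m_ni * (ni_term + co_term)


if __name__ == "__main__":
    # 0.6 solar masses of 56Ni is an illustrative choice.
    for t in (0.0, 20.0, 100.0):
        print(f"t = {t:5.1f} d  L = {ni56_heating_rate(t, 0.6):.2e} erg/s")
```

The early-time rate is set by 56Ni and the late-time tail by 56Co, which is why the nickel mass is such a useful physical parameter to extract from a light curve.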

SLIDE 6

VAV+ 2017a

SLIDE 7

The Large Synoptic Survey Telescope

LSST

SLIDE 8

Data from OSC; Guillochon+ 2018

LSST

2023: ~10^6

LSST will discover >1 million supernovae annually!

SLIDE 9

Data from OSC; Guillochon+ 2018

2023

We will follow up some of these supernovae

~1000s of spectra

SLIDE 10

The LSST Needles & the Haystack

  • ~Million SNe / year
  • ~1000s / year with spec. classification
  • ~100 SNe we actively follow with other resources
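
To make the scale of this funnel concrete, the implied fractions can be computed directly. The values below are illustrative stand-ins for the slide's round numbers. In particular, 3000 is an assumed value for "~1000s".

```python
# Illustrative round numbers per year; 3000 stands in for "~1000s" (assumption).
discovered = 1_000_000
spec_classified = 3_000
actively_followed = 100


def fraction(part, whole):
    """Fraction of `whole` represented by `part`."""
    return part / whole


spec_frac = fraction(spec_classified, discovered)
follow_frac = fraction(actively_followed, discovered)
print(f"spectroscopically classified: {spec_frac:.2%}")
print(f"actively followed:            {follow_frac:.3%}")
```

Under these assumptions, well under one percent of discovered SNe would ever receive a spectroscopic label, which is the motivation for photometric classification.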

SLIDE 11

A Christmas list for SN classification:

  • 1. Meaningful feature extraction which can handle noisy, sparse data

  • 2. Feature extraction which can utilize unclassified data
  • 3. Classification which can work on incoming data
  • 4. A method which can search for needles in real time
SLIDE 12

A Christmas list for SN classification:

  • 1. Meaningful feature extraction which can handle noisy, sparse data

  • 2. Feature extraction which can utilize unclassified data
  • 3. Classification which can work on incoming data
  • 4. A method which can search for needles in real time

Recurrent neuron-based autoencoder

SLIDE 13
  • ~5200 SN-like transients in PS1 MDS (Jones+2017)
  • ~3200 SNe have host redshift measurements
  • ~520 SNe are spectroscopically classified with host redshift measurements

Pan-STARRS Medium Deep Survey is a milliLSST

Chambers+ 2016

SLIDE 14

A semi-supervised method to encode/classify SNe

Input (T x 4 x 3: time, flux, error) -> Encoder -> 1x10 encoded LC -> Decoder

VAV+ in prep.

SLIDE 15

Use a GP to deal with uneven sampling in filters
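
A minimal sketch of this step, assuming an independent, zero-mean GP per filter with a squared-exponential kernel and fixed hyperparameters. The kernel, the hyperparameter values, the toy light curve, and the function names are all assumptions for illustration; the talk does not specify them. Each filter is observed at its own epochs, and the GP is used to evaluate all four filters on one shared time grid.

```python
import numpy as np


def sq_exp_kernel(t1, t2, amp, length):
    """Squared-exponential covariance between two sets of times (days)."""
    dt = t1[:, None] - t2[None, :]
    return amp**2 * np.exp(-0.5 * (dt / length) ** 2)


def gp_predict(t_obs, f_obs, f_err, t_new, amp=1.0, length=10.0):
    """Zero-mean GP posterior mean and std at t_new, given noisy fluxes."""
    k_oo = sq_exp_kernel(t_obs, t_obs, amp, length) + np.diag(f_err**2)
    k_no = sq_exp_kernel(t_new, t_obs, amp, length)
    mean = k_no @ np.linalg.solve(k_oo, f_obs)
    k_nn = sq_exp_kernel(t_new, t_new, amp, length)
    cov = k_nn - k_no @ np.linalg.solve(k_oo, k_no.T)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


# Toy light curve: each filter is observed at different epochs.
rng = np.random.default_rng(0)
t_grid = np.linspace(0.0, 60.0, 13)  # shared output epochs
light_curve = {}
for band in "griz":
    t_obs = np.sort(rng.uniform(0.0, 60.0, size=8))
    f_err = np.full(t_obs.shape, 0.05)
    f_true = np.exp(-0.5 * ((t_obs - 20.0) / 10.0) ** 2)
    f_obs = f_true + rng.normal(0.0, 0.05, t_obs.size)
    light_curve[band] = gp_predict(t_obs, f_obs, f_err, t_grid)

# One row per epoch: [T, Fg, Fr, Fi, Fz, sig_g, sig_r, sig_i, sig_z]
means = [light_curve[b][0] for b in "griz"]
stds = [light_curve[b][1] for b in "griz"]
rnn_input = np.column_stack([t_grid, *means, *stds])
print(rnn_input.shape)
```

The GP standard deviation grows in the gaps between observations, so the interpolated errors carry the sampling information forward to whatever model consumes the rows.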

SLIDE 16

Input: [T, Fg, Fr, Fi, Fz, σg, σr, σi, σz]; h: encoding state

Recurrent neurons update the encoded light curve
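
The recurrent update can be sketched as follows. This assumes a plain tanh recurrent cell with untrained random weights; the actual cell type and weights used in the work are not given on the slides. The names `init_encoder` and `encode` are hypothetical. Each new row of the light curve updates the encoding state h, so an encoding is available after every epoch.

```python
import numpy as np


def init_encoder(n_in=9, n_hidden=10, seed=0):
    """Random weights for a plain tanh recurrent cell (untrained)."""
    rng = np.random.default_rng(seed)
    return {
        "W": rng.normal(0.0, 0.1, (n_hidden, n_in)),      # input -> hidden
        "U": rng.normal(0.0, 0.1, (n_hidden, n_hidden)),  # hidden -> hidden
        "b": np.zeros(n_hidden),
    }


def encode(light_curve, params):
    """Run the cell over the rows of a (T, 9) light curve.

    Returns the encoding state after every epoch, shape (T, n_hidden);
    the last row is the encoding of the full light curve.
    """
    h = np.zeros(params["b"].shape)
    states = []
    for x in light_curve:
        h = np.tanh(params["W"] @ x + params["U"] @ h + params["b"])
        states.append(h)
    return np.array(states)


# Toy input: 5 epochs of [T, Fg, Fr, Fi, Fz, sig_g, sig_r, sig_i, sig_z]
toy = np.random.default_rng(1).normal(size=(5, 9))
states = encode(toy, init_encoder())
print(states.shape)
```

Because the state after epoch k depends only on the first k rows, the same encoder can be run on a partial light curve as data arrive.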

SLIDE 17

Repeat encoded LC with a new set of times

1x10 encoded LC + times t1, t2, t3, t4, ..., tn -> Decoder

VAV+ in prep.
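
The "repeat" step can be sketched as follows. This assumes the decoder is a small per-epoch dense network with untrained random weights; the real decoder architecture is not specified on the slides. The layer sizes and the names `decode` and `params` are hypothetical. The encoding is copied once per requested time, each copy is paired with its time, and the network maps each pair to four fluxes.

```python
import numpy as np


def decode(encoding, t_new, params):
    """Decode a 10-element encoding into griz fluxes at each requested time."""
    # Repeat the encoding once per requested epoch and append that epoch's time.
    repeated = np.tile(encoding, (t_new.size, 1))            # (n, 10)
    dec_in = np.column_stack([repeated, t_new])              # (n, 11)
    hidden = np.tanh(dec_in @ params["W1"] + params["b1"])   # (n, n_hidden)
    return hidden @ params["W2"] + params["b2"]              # (n, 4)


rng = np.random.default_rng(0)
n_code, n_hidden, n_bands = 10, 16, 4
params = {
    "W1": rng.normal(0.0, 0.1, (n_code + 1, n_hidden)),
    "b1": np.zeros(n_hidden),
    "W2": rng.normal(0.0, 0.1, (n_hidden, n_bands)),
    "b2": np.zeros(n_bands),
}

encoding = rng.normal(size=n_code)   # stand-in for the 1x10 encoded LC
t_new = np.linspace(0.0, 50.0, 6)    # times t1 ... tn to reconstruct
fluxes = decode(encoding, t_new, params)
print(fluxes.shape)
```

Since the times are an input rather than fixed by the data, the decoder can be asked for fluxes at epochs that have not been observed yet.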

SLIDE 18

VAV+ in prep.

Decoded light curve updated with new data

SLIDE 23

Why use an RNN autoencoder?

  • Semi-supervised methods allow us to use information from the full dataset
  • We can extract unique, nonlinear features directly from the light curves
  • It actively makes forecasting predictions, which may be used to hunt for anomalies (aka the needles)
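
One way the forecasting idea could be turned into an anomaly score is to compare the forecast fluxes with the next observations, in units of the measurement error. The scoring function, the threshold, and the toy numbers below are assumptions for illustration, not the method from the talk.

```python
import numpy as np


def anomaly_score(f_pred, f_obs, f_err):
    """Mean squared residual between forecast and observed fluxes, in sigma^2."""
    resid = (np.asarray(f_obs) - np.asarray(f_pred)) / np.asarray(f_err)
    return float(np.mean(resid**2))


# Toy example: forecast fluxes for the next three epochs in one filter.
f_pred = np.array([1.00, 0.90, 0.80])
f_err = np.array([0.05, 0.05, 0.05])

typical = anomaly_score(f_pred, f_pred + np.array([0.03, -0.04, 0.02]), f_err)
unusual = anomaly_score(f_pred, f_pred + np.array([0.30, 0.45, 0.60]), f_err)

THRESHOLD = 9.0  # hypothetical cut, roughly a 3-sigma mean deviation
for name, score in (("typical", typical), ("unusual", unusual)):
    flag = "flag for follow-up" if score > THRESHOLD else "ok"
    print(f"{name}: score = {score:.1f} -> {flag}")
```

A light curve that keeps departing from its own forecast by many sigma is a candidate needle; one that stays within the errors is consistent with what the model has already learned.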

SLIDE 24

Using a random forest classifier, we classify the full sample of 3200 SNe

VAV+ in prep.

SLIDE 25

Time-domain Astrophysics in the Era of Big Data

  • LSST will bring TDA into a new era of big data, thanks to both a deep and wide survey strategy
  • LSST light curves will be noisy and sparse, but simple features correlate with underlying physics
  • RNN-based AEs are a promising strategy to classify SNe in real time
  • RNN-based AEs may be a promising strategy for real-time anomaly detection

SLIDE 26

CLASSIFICATION

Figure labels: High Ia purity!; Image-based CNN; Online learning; SNPCC; Wavelet decomposition; PLAsTiCC; RAPID; avocado; PELICAN

https://tinyurl.com/transienttable

SLIDE 27

Do we have a suitable training set for classification?

Real datastream: gr(i) filters, depth ~21 mag

PLAsTiCC: simulated dataset, LSST filters/cadence

see e.g., Bellm+ 2019; Kessler+ 2019