SLIDE 1

Finding Periodicities in Astronomical Light Curves using Information Theoretic Learning

Pablo Huijse H.
Department of Electrical Engineering, Universidad de Chile

Joint work with:
Pavlos Protopapas, Harvard University
José Príncipe, University of Florida
Pablo Estévez, Universidad de Chile (PhD Advisor)
Pablo Zegers, Universidad de los Andes

December 13, 2011

SLIDE 2

Introduction Methods Results Conclusions

Introduction

Statement of the problem
To find periodic light curves automatically in large astronomical databases:
- Find the period of a light curve
- Discriminate whether it is truly periodic
- ... in reasonable computational time

Relevance
The fundamental period of a light curve can be used for:
- Stellar classification
- Stellar parameter estimation
- Extrasolar planet detection

Pablo Huijse, Universidad de Chile Finding periodicities in astronomical light curves using ITL

SLIDE 3

Statement of the problem

Challenges
- Light curves are unevenly sampled and noisy
- Astronomical databases are huge
- Current situation: period detection schemes rely too much on visual inspection

Goals
To develop a fully automated, efficient and robust method for period detection and estimation based on information theoretic learning

SLIDE 4

ITL and Renyi’s quadratic entropy

Information theoretic learning
- Information theoretic concepts of entropy and mutual information applied to machine learning.
- Replace conventional second-order metrics (variance, correlation) with IT metrics estimated directly from samples.

Renyi’s quadratic entropy (RQE)
- Entropy quantifies the uncertainty of a system.
- Using Parzen windows, the RQE (and the PDF) is estimated directly from the sample data:

$$\hat{H}_{R_2}(X) = -\log \int_{-\infty}^{+\infty} p^2(x)\,dx = -\log\left(IP(X)\right)$$

$$IP(X) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} G_\sigma(x_i - x_j)$$
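The estimator above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the kernel size σ is passed in by hand rather than chosen from the data.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated on differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def information_potential(x, sigma):
    """IP(X): mean of G_sigma(x_i - x_j) over all N^2 sample pairs."""
    x = np.asarray(x, dtype=float)
    diffs = x[:, None] - x[None, :]          # N x N matrix of pairwise differences
    return gaussian_kernel(diffs, sigma).mean()

def renyi_quadratic_entropy(x, sigma):
    """Estimated RQE: minus the log of the information potential."""
    return -np.log(information_potential(x, sigma))
```

Tightly clustered samples give a large information potential and therefore a low entropy; spread-out samples do the opposite.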

SLIDE 5

Correntropy

- Correntropy is an ITL metric that takes into account the time structure of random processes.
- It is a generalization of correlation.
- It measures similarities in a kernel space between samples separated by a time lag τ in the input space.

The autocorrentropy function:

$$\hat{V}(\tau) = \frac{1}{N - \tau + 1} \sum_{n=\tau}^{N-1} G_\sigma(x_n - x_{n-\tau})$$

Translation-invariant Gaussian kernel with kernel size σ:

$$G_\sigma(x - y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\|x - y\|^2}{2\sigma^2}\right).$$

σ controls the width of the kernel and is usually selected with respect to the properties of the data.
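As a rough sketch of the autocorrentropy for an evenly sampled series, the code below follows the sum and the 1/(N − τ + 1) normalisation exactly as written on the slide. The function names and the choice of σ are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated on differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def autocorrentropy(x, max_lag, sigma):
    """Autocorrentropy V(tau) for tau = 0..max_lag on an evenly sampled series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    v = np.empty(max_lag + 1)
    for tau in range(max_lag + 1):
        diffs = x[tau:] - x[:n - tau]        # x_n - x_{n-tau} for n = tau..N-1
        v[tau] = gaussian_kernel(diffs, sigma).sum() / (n - tau + 1)
    return v

# Example: a sinusoid with a period of 20 samples peaks again at lag 20.
samples = np.sin(2.0 * np.pi * np.arange(200) / 20.0)
v = autocorrentropy(samples, max_lag=30, sigma=0.5)
```

Lags at which the series repeats itself yield small differences and hence large kernel values, which is what makes the function usable for period estimation.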

SLIDE 6

Period Estimator: Slotted Correntropy

A correntropy estimator for unevenly sampled time series using the slotting technique (Edelson & Krolik, Mayo). The slot for time lag k is defined as the interval

$$k\Delta\tau = [(k - 0.5)\Delta\tau,\ (k + 0.5)\Delta\tau].$$

$$\hat{V}(k\Delta\tau) = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} G_\sigma(x_i - x_j) \cdot B_{k\Delta\tau}(t_i, t_j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} B_{k\Delta\tau}(t_i, t_j)},$$

where B_{k∆τ}(ti, tj) = 1 if (ti − tj) falls in slotted lag k, and 0 otherwise.

- The bin size ∆τ has to be set carefully to avoid undefined (empty) slots.
- Fourier transform of the slotted correntropy: slotted correntropy spectrum.
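A minimal sketch of the slotted estimator, under two assumptions that are not stated on the slide: absolute time differences |ti − tj| are slotted (the Gaussian kernel is symmetric), and empty slots are returned as NaN. Function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated on differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def slotted_correntropy(t, x, delta_tau, max_slot, sigma):
    """Slotted correntropy V(k * delta_tau) for k = 0..max_slot."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    lag = np.abs(t[:, None] - t[None, :])                    # assumption: |t_i - t_j|
    slot = np.floor(lag / delta_tau + 0.5).astype(int)       # slot index of every pair
    g = gaussian_kernel(x[:, None] - x[None, :], sigma)
    v = np.full(max_slot + 1, np.nan)                        # NaN marks an undefined slot
    for k in range(max_slot + 1):
        in_slot = slot == k                                  # B_{k delta_tau}(t_i, t_j)
        if in_slot.any():
            v[k] = g[in_slot].mean()
    return v

# Example: unevenly sampled sinusoid with a period of 5 time units.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 300))
x = np.sin(2.0 * np.pi * t / 5.0)
v = slotted_correntropy(t, x, delta_tau=0.25, max_slot=40, sigma=0.5)
```

With ∆τ = 0.25, slot 20 corresponds to a lag of one full period and slot 10 to half a period, so `v[20]` should be clearly larger than `v[10]`.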

SLIDE 7

Previous Work

- Results of this investigation were published in IEEE Signal Processing Letters (IEEE SPL)
- Period estimation in light curves from the MACHO survey
- Gold standard provided by the Harvard TSC
- Slotted correntropy was compared with the Lomb-Scargle (LS) periodogram, analysis of variance (AoV), String Length and slotted correlation
- The slotted correntropy outperformed the other methods on eclipsing binary (EB) period estimation, and performed equally well on RR Lyrae (RRL)/Cepheid period estimation

SLIDE 8

New ITL based metric for period detection

Include the time structure in the kernel function

Spatio-temporal kernel function:
- Gaussian kernel to evaluate ∆x
- Periodic kernel to evaluate ∆t, no folding required
- The product of Mercer kernels is also a Mercer kernel

A periodogram based on the centered correntropy with a spatio-temporal kernel function:

$$H(P_t) = \sum_{i=1}^{N} \sum_{j=1}^{N} \left[G_{\sigma_m}(\Delta x_{ij}) - IP\right] \cdot G_{\sigma_t;P_t}(\Delta t_{ij})$$

- By maximizing H with respect to Pt we obtain the period associated with the most similar set of sample pairs.
- The H periodogram has two free parameters: σm and σt. The kernel sizes control the observation window in which similarity is assessed.
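The periodogram can be sketched as below. The slide does not give the form of the periodic kernel, so the sin²-based kernel used here is an assumption, as are the function names, the kernel sizes and the trial-period grid. IP is taken to be the information potential of the magnitudes, i.e. the mean of G_σm(∆x_ij).

```python
import numpy as np

def gaussian_kernel(d, sigma):
    """Gaussian kernel G_sigma evaluated on differences d."""
    return np.exp(-d**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def periodic_kernel(dt, period, sigma_t):
    """Assumed periodic kernel: maximal when dt is an integer multiple of period."""
    arg = -2.0 * np.sin(np.pi * dt / period) ** 2 / sigma_t**2
    return np.exp(arg) / (np.sqrt(2.0 * np.pi) * sigma_t)

def h_periodogram(t, x, trial_periods, sigma_m, sigma_t):
    """H(P_t) evaluated on a grid of trial periods."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    dx = x[:, None] - x[None, :]
    dt = t[:, None] - t[None, :]
    g_m = gaussian_kernel(dx, sigma_m)
    centered = g_m - g_m.mean()              # G_sigma_m(dx_ij) - IP
    return np.array([np.sum(centered * periodic_kernel(dt, p, sigma_t))
                     for p in trial_periods])

# Example: noisy, unevenly sampled sinusoid with a true period of 2.5 days.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 100.0, 200))
x = np.sin(2.0 * np.pi * t / 2.5) + 0.05 * rng.standard_normal(t.size)
periods = np.linspace(1.0, 4.0, 601)
h = h_periodogram(t, x, periods, sigma_m=0.3, sigma_t=0.3)
best_period = periods[np.argmax(h)]
```

No folding is needed: the periodic kernel simply up-weights sample pairs whose time difference is close to a multiple of the trial period, and H is large when those pairs also have similar magnitudes.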

SLIDE 9

Results on periodic versus non-periodic discriminator

- Automatic periodic light curve discrimination based on H
- Tests on light curves from the MACHO and EROS surveys
- We need a training dataset (EROS), and we have to build one:
  1. Choose a field of the survey
  2. Obtain sets of trial periods using the H, LS and AoV periodograms, etc.
  3. Visually check the folded light curves
  4. Come up with a clean training set: future generations will be grateful
- Then we can run on a bigger dataset

- False positive rate: below 0.1%
- Careful with spurious periodicities: sidereal day, moon phase, ...
- Computational efficiency: 0.1 s per light curve

SLIDE 10

ROC curve on MACHO subset

Figure: Periodic light curve discrimination using the H metric on a MACHO subset. ROC curve; 966 periodic, 775 non-periodic light curves, 510 non-variables; α: significance level of the periodicity test

False positives: spurious day and moon-phase periods, and mislabeled light curves

SLIDE 11

ROC curve on EROS subset

Figure: Periodic light curve discrimination using H metric. Preliminary results on EROS subset, 819 periodic (field of 72k light curves), 4000 non periodic light curves, θ: periodogram threshold

Training dataset: False False Negatives + False False Positives
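For reference, the points of a ROC curve like the ones on these two slides can be computed by sweeping a threshold θ over a per-light-curve score. The sketch below assumes the score is the maximum of the H periodogram and that labels come from the visually checked training set; both the function name and the toy numbers are hypothetical.

```python
import numpy as np

def roc_points(scores, is_periodic, thresholds):
    """Return (false positive rate, true positive rate) for each threshold."""
    scores = np.asarray(scores, dtype=float)
    is_periodic = np.asarray(is_periodic, dtype=bool)
    points = []
    for theta in thresholds:
        flagged = scores >= theta                         # declared periodic
        tpr = np.sum(flagged & is_periodic) / np.sum(is_periodic)
        fpr = np.sum(flagged & ~is_periodic) / np.sum(~is_periodic)
        points.append((float(fpr), float(tpr)))
    return points

# Toy example with made-up scores (not survey data).
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [True, True, False, False]
curve = roc_points(toy_scores, toy_labels, thresholds=[0.25, 0.5])
```

Lowering θ moves along the curve toward higher true positive rates at the cost of more false positives.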

SLIDE 12

Efficient computation and scalability

EROS: 20 million light curves
Training field: 71937 light curves, 600 samples per light curve

Description                        Time
One light curve (CPU)              36 s
Using desktop GPU (480 cores)      0.76 s
Full training dataset (with GPU)   17 h
On full EROS (with GPU)            176 days!
Full EROS on GPU cluster (32)      5.5 days

- χ2 variability filter: even with a very low threshold (100% TPR and a large FPR), times would be reduced by half
- Trial period selection: LS, AoV periodogram, correntropy, etc.
- Code optimizations: maximum GPU occupancy
- Reduce complexity: FGT (fast Gauss transform), Cholesky decomposition
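The scaling figures in the table follow from the per-light-curve times by simple multiplication. The short check below uses only numbers from the slide and assumes perfect scaling across the 32 GPUs; the variable names are illustrative.

```python
cpu_seconds_per_curve = 36.0
gpu_seconds_per_curve = 0.76
training_curves = 71937
eros_curves = 20_000_000
cluster_gpus = 32

gpu_speedup = cpu_seconds_per_curve / gpu_seconds_per_curve          # roughly 47x
training_hours = training_curves * gpu_seconds_per_curve / 3600.0    # roughly 15 h
eros_days = eros_curves * gpu_seconds_per_curve / 86400.0            # roughly 176 days
cluster_days = eros_days / cluster_gpus                              # roughly 5.5 days
```

The full-survey and cluster figures reproduce the 176 days and 5.5 days in the table; the training-field estimate of about 15 h is close to, though slightly below, the 17 h quoted.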

SLIDE 13

Conclusions

- From a signal processing/machine learning viewpoint: an interesting, relevant and challenging problem
- Contribution:
  - New information theoretic criteria for periodicity detection
  - Not used in the astronomy field before
  - Working on fully automated and efficient analysis of large databases
- Preliminary results are promising

Questions?

SLIDE 14

There is always a period
But most of the time it is something like this:

SLIDE 15

Preliminary results

Eclipsing binary star, MACHO 1.3449.948, P = 14.0055 days

[Figure panel: Power Spectral Density (PSD) versus frequency [1/days], with the true period marked. Labeled peaks: (1) 7.0024 days, (2) 3.5012 days.]

[Figure panel: Correntropy Spectral Density (CSD) versus frequency [1/days], with the true period marked. Labeled peaks: (1) 14.0056 days, (2) 7.0024 days, (3) 3.5012 days.]

SLIDE 16

Preliminary results

Influence of the higher-order moments included in the slotted correntropy, which are estimated through the Gaussian kernel:

$$E\left[G_\sigma(x - y)\right] = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{k=0}^{\infty} \frac{(-1)^k}{2^k \sigma^{2k} k!}\, E\left[\|x - y\|^{2k}\right]$$

Even moments included   Hits     Multiples   Misses
0 to 2                  49.22%   48.70%      2.07%
0 to 4                  61.66%   36.27%      2.07%
0 to 6                  62.18%   35.75%      2.07%
0 to 8                  64.25%   34.72%      1.04%
0 to 10                 67.36%   31.61%      1.04%
0 to ∞                  73.06%   26.42%      0.52%
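The rows of the table correspond to truncating the series at a given even moment. The sketch below evaluates the truncated kernel for a single difference d = x − y (averaging it over sample pairs would give the expectation); the function names are hypothetical.

```python
import math

def truncated_gaussian_kernel(d, sigma, max_moment):
    """Taylor series of the Gaussian kernel keeping even moments 0..max_moment."""
    u = d * d / (2.0 * sigma**2)
    partial_sum = sum((-1) ** k * u**k / math.factorial(k)
                      for k in range(max_moment // 2 + 1))
    return partial_sum / (math.sqrt(2.0 * math.pi) * sigma)

def gaussian_kernel(d, sigma):
    """Full Gaussian kernel, i.e. the series with all even moments."""
    return math.exp(-d * d / (2.0 * sigma**2)) / (math.sqrt(2.0 * math.pi) * sigma)

# Truncation error for d = 0.5 and sigma = 1 with moments up to 2 and up to 10.
error_2 = abs(truncated_gaussian_kernel(0.5, 1.0, 2) - gaussian_kernel(0.5, 1.0))
error_10 = abs(truncated_gaussian_kernel(0.5, 1.0, 10) - gaussian_kernel(0.5, 1.0))
```

Keeping only moments 0 to 2 reduces the kernel to a constant minus a scaled squared difference, which carries only second-order information; the higher terms are what the full kernel adds.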

SLIDE 17

Preliminary results

Comparison with established methods on a subset of 200 periodic light curves of eclipsing binary stars drawn from the MACHO survey.

Period estimation method    Hits [%]   Multiples [%]   Misses [%]
Slotted correntropy + IP    74.0       25.5            0.5
Slotted correlation + IP    50.0       48.5            1.5
VarTools LS                 11.0       89.0            0.0
VarTools LS + IP            18.0       82.0            0.0
VarTools AoV                39.5       60.5            0.0
SigSpec                     11.0       88.5            0.5
SLLK                        42.5       54.5            3.0
SLLK + IP                   65.0       34.5            0.5
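The slides do not define how an estimate is scored, so the following is only one plausible reading of the three categories: a hit matches the reference period within a relative tolerance, a multiple matches an integer multiple or sub-multiple of it, and anything else is a miss. The function name, the 1% tolerance and the maximum factor are assumptions.

```python
def classify_period(estimated, reference, rel_tol=0.01, max_factor=10):
    """Label an estimated period as 'hit', 'multiple' or 'miss' (assumed criteria)."""
    def close(a, b):
        return abs(a - b) <= rel_tol * b

    if close(estimated, reference):
        return "hit"
    for m in range(2, max_factor + 1):
        # Integer multiple (m * P) or sub-multiple (P / m) of the reference period.
        if close(estimated, m * reference) or close(estimated, reference / m):
            return "multiple"
    return "miss"

# Example with the MACHO 1.3449.948 values quoted earlier in the deck.
labels = [classify_period(p, 14.0055) for p in (14.0056, 7.0024, 3.5012)]
```

Under these assumptions the three peaks quoted for the eclipsing binary example are labeled hit, multiple and multiple respectively.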
