Hubble Catalog of Variables - Presentation @ Napoli Observatory


Presentation, October 2016. Source: https://www.researchgate.net/publication/309634589


slide-1
SLIDE 1

Author: Panagiotis Gavras (RHEA Group for ESA). Uploaded by Panagiotis Gavras on 03 November 2016.
slide-2
SLIDE 2

Hubble Catalog of Variables

Panagiotis Gavras

On behalf of the HCV team

Napoli, 19 Oct 2016

slide-3
SLIDE 3

Outline

  • The HCV project
  • Hubble Source Catalog
  • Variability detection
  • Pipeline
  • First Results
slide-4
SLIDE 4

Hubble Catalog of Variables

  • 3-year ESA-funded project (2015-2018).
  • The goal of the project is to develop a set of algorithms that will identify candidate variables among the sources included in the Hubble Source Catalog (HSC).
  • At the end of the project we will produce the first version of the Hubble Catalog of Variables (HCV). This catalog will be ingested into the MAST portal and the ESA Science Archives.
  • Finally, the pipeline will be deployed at STScI and run regularly to produce updated versions of the HCV.

slide-5
SLIDE 5

Hubble Source Catalog

  • The Hubble Source Catalog (Whitmore et al., 2016) is a catalog containing the majority of all objects ever observed by the Hubble Space Telescope (HST).
  • It is developed and maintained by the Space Telescope Science Institute (STScI).
  • The HSC is designed to optimise science from the Hubble Space Telescope by combining the tens of thousands of visit-based source lists in the Hubble Legacy Archive (HLA) into a single master catalog.

slide-6
SLIDE 6

HSC : What’s inside

  • HSC v2.0 released 29 Sep 2016.
  • 90 million sources and 383 million detections.
  • Photometry from WFPC2, ACS/WFC, WFC3/UVIS, and WFC3/IR.
  • In total there are 112 filter/detector combinations.
slide-7
SLIDE 7

HSC : What’s inside

  • The mean photometric accuracy is better than 0.10 mag (and may go down to 0.02 mag).
  • The absolute astrometric accuracy is better than 0.1 arcsec.
  • 91% of the HSC has coverage from Pan-STARRS, SDSS, or 2MASS.

slide-8
SLIDE 8

HSC: Things one should know

  • Coverage can be very non-uniform (unlike surveys such as SDSS).
  • Current WFPC2, ACS/WFC and WFC3 source lists are of variable quality.
  • The default is to show all HSC objects. This may include a large number of artifacts. Requesting Numimages > 1 (or more) should filter out many artifacts.
  • Doubling: there are occasionally cases where not all the detections of the same source are matched together into a single object.
  • Bad images: images taken when Hubble has lost lock on guide stars (generally after an Earth occultation) are the primary cause of bad images.
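
As a rough illustration of the Numimages criterion mentioned above, the following sketch filters a small, made-up list of catalog rows. The row layout and the key names `MatchID` and `NumImages` are assumptions chosen for the example; they are not taken from the slides.

```python
# Hypothetical rows, standing in for the result of an HSC query.
# The keys "MatchID" and "NumImages" are illustrative assumptions.
sources = [
    {"MatchID": 1, "NumImages": 1},
    {"MatchID": 2, "NumImages": 5},
    {"MatchID": 3, "NumImages": 2},
]


def filter_by_num_images(rows, min_images=2):
    """Keep rows seen in at least `min_images` images (default: NumImages > 1)."""
    return [row for row in rows if row["NumImages"] >= min_images]


kept = filter_by_num_images(sources)
print([row["MatchID"] for row in kept])
```

Raising `min_images` trades completeness for a cleaner sample, which matters for variability work where several epochs are needed anyway.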

slide-9
SLIDE 9

HSC: Access

  • MAST Discovery portal: https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html
  • Online form: https://archive.stsci.edu/hst/hsc_sum/search.php
  • CasJobs: http://mastweb.stsci.edu/hcasjobs/home.aspx

slide-10
SLIDE 10

Variability

slide-11
SLIDE 11

Variability detection methods

  • Direct image comparison (transient detection)

Figure: SN 1987A, before and after.

slide-12
SLIDE 12

Variability detection methods

  • Direct image comparison (transient detection)
  • Periodicity search
slide-13
SLIDE 13

Variability detection methods

  • Direct image comparison (transient detection)
  • Periodicity search
  • Lightcurve analysis using variability indexes
slide-14
SLIDE 14

Variability Indexes (VI)

  • They are numerical parameters characterizing the degree of variability of an object.
  • Different VIs are sensitive to different types of variability.
  • One expects a variable to have a significantly different value in some VI than non-variables.
  • 2 types of variability indexes:
      • Scatter-based
      • Correlation-based
slide-15
SLIDE 15
Scatter-based Variability Indexes

  • Reduced χ² test
  • Standard deviation σ
  • Median absolute deviation (MAD)
  • Interquartile range (IQR)
  • Robust Median Statistics (RoMS)
  • Normalised excess variance σ²_NXS
  • Peak-to-peak variability v

$$\mathrm{RoMS} = (N-1)^{-1} \sum_{i=1}^{N} \frac{|m_i - \mathrm{median}(m_i)|}{\sigma_i}$$

For a non-variable object, the expected value …

$$\sigma^2_{\mathrm{NXS}} = \frac{1}{N \bar{m}^2} \sum_{i=1}^{N} \left[ (m_i - \bar{m})^2 - \sigma_i^2 \right]$$

Here we use the symbol for the n…

$$v = \frac{(m_i - \sigma_i)_{\max} - (m_i + \sigma_i)_{\min}}{(m_i - \sigma_i)_{\max} + (m_i + \sigma_i)_{\min}}$$

where $(m_i - \sigma_i)_{\max}$ and $(m_i + \sigma_i)_{\min}$ …

See more in Sokolovsky et al., 2016
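
The scatter-based indexes on this slide can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the HCV pipeline code: the function names are made up for the example, and the reduced χ² is computed about the inverse-variance weighted mean with N − 1 degrees of freedom, which is an assumption here.

```python
import math
import statistics


def weighted_mean(m, err):
    """Inverse-variance weighted mean magnitude."""
    w = [1.0 / e ** 2 for e in err]
    return sum(wi * mi for wi, mi in zip(w, m)) / sum(w)


def reduced_chi2(m, err):
    """Reduced chi-squared about the weighted mean (assumed N - 1 d.o.f.)."""
    mbar = weighted_mean(m, err)
    return sum(((mi - mbar) / e) ** 2 for mi, e in zip(m, err)) / (len(m) - 1)


def mad(m):
    """Median absolute deviation from the median."""
    med = statistics.median(m)
    return statistics.median([abs(mi - med) for mi in m])


def roms(m, err):
    """Robust median statistic: sum |m_i - median| / sigma_i, over (N - 1)."""
    med = statistics.median(m)
    return sum(abs(mi - med) / e for mi, e in zip(m, err)) / (len(m) - 1)


def normalised_excess_variance(m, err):
    """sigma^2_NXS = sum[(m_i - mean)^2 - sigma_i^2] / (N * mean^2)."""
    mbar = statistics.fmean(m)
    total = sum((mi - mbar) ** 2 - e ** 2 for mi, e in zip(m, err))
    return total / (len(m) * mbar ** 2)


def peak_to_peak(m, err):
    """v = (hi - lo) / (hi + lo), hi = max(m_i - s_i), lo = min(m_i + s_i)."""
    hi = max(mi - e for mi, e in zip(m, err))
    lo = min(mi + e for mi, e in zip(m, err))
    return (hi - lo) / (hi + lo)


# Made-up light curve: magnitudes and their uncertainties.
mags = [10.0, 10.2, 9.8, 10.0, 10.5]
errs = [0.1] * 5
print(reduced_chi2(mags, errs), mad(mags), roms(mags, errs))
print(normalised_excess_variance(mags, errs), peak_to_peak(mags, errs))
```

MAD and RoMS rely on the median, so a single outlying point shifts them far less than it shifts σ or χ², which is the usual reason for carrying both families of indexes.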

slide-16
SLIDE 16
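Correlation-based Variability Indexes

  • Welch-Stetson I
  • Stetson's indexes J, K, L
  • and variations: time-weighted, magnitude-limited
  • Consecutive same-sign deviations from the mean magnitude (CSSD)

For measurements obtained in two filters, b and v:

$$I = \sqrt{\frac{1}{n(n-1)}} \sum_{i=1}^{n} \left( \frac{b_i - \bar{b}}{\sigma_{b_i}} \right) \left( \frac{v_i - \bar{v}}{\sigma_{v_i}} \right)$$

where $b_i$ ($v_i$) are the measured magnitudes …

$$J = \frac{\sum_{k=1}^{n} w_k \, \mathrm{sgn}(P_k) \sqrt{|P_k|}}{\sum_{k=1}^{n} w_k}$$

where sgn is the sign function …

$$K = \frac{\frac{1}{N} \sum_{i=1}^{N} \left| \sqrt{\frac{n_v}{n_v - 1}} \, \frac{v_i - \bar{v}}{\sigma_{v_i}} \right|}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \sqrt{\frac{n_v}{n_v - 1}} \, \frac{v_i - \bar{v}}{\sigma_{v_i}} \right)^2}}$$

For a Gaussian magnitude distribution …

$$L = \sqrt{\pi/2} \; J K \left( \frac{\sum w}{w_{\mathrm{all}}} \right)$$

where $\sum w / w_{\mathrm{all}}$ is the ratio …

See more in Sokolovsky et al., 2016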
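
A minimal sketch of the I, J and K indexes from this slide follows. Several choices are assumptions made for the example rather than statements about the HCV pipeline: plain (unweighted) means are used, the pairing of observations for J is supplied by the caller as index pairs, every pair gets unit weight by default, and unpaired observations are ignored.

```python
import math
from statistics import fmean


def welch_stetson_i(b, b_err, v, v_err):
    """Welch-Stetson I for simultaneous measurements in two filters."""
    n = len(b)
    bbar, vbar = fmean(b), fmean(v)  # assumption: unweighted means
    total = sum(((bi - bbar) / be) * ((vi - vbar) / ve)
                for bi, be, vi, ve in zip(b, b_err, v, v_err))
    return math.sqrt(1.0 / (n * (n - 1))) * total


def scaled_residuals(v, v_err):
    """delta_i = sqrt(n / (n - 1)) * (v_i - mean) / sigma_i."""
    n = len(v)
    vbar = fmean(v)
    scale = math.sqrt(n / (n - 1))
    return [scale * (vi - vbar) / e for vi, e in zip(v, v_err)]


def stetson_j(v, v_err, pairs, weights=None):
    """Stetson J over caller-supplied index pairs (hypothetical pairing)."""
    delta = scaled_residuals(v, v_err)
    if weights is None:
        weights = [1.0] * len(pairs)
    total = 0.0
    for w, (i, j) in zip(weights, pairs):
        p = delta[i] * delta[j]  # P_k for a pair of observations
        total += w * math.copysign(math.sqrt(abs(p)), p)
    return total / sum(weights)


def stetson_k(v, v_err):
    """Stetson K: mean |delta| divided by the root-mean-square of delta."""
    delta = scaled_residuals(v, v_err)
    n = len(delta)
    mean_abs = sum(abs(d) for d in delta) / n
    rms = math.sqrt(sum(d * d for d in delta) / n)
    return mean_abs / rms


# Made-up example: two slowly varying, correlated light curves.
b = [1.0, 1.0, -1.0, -1.0]
v = [1.0, 1.0, -1.0, -1.0]
unit = [1.0] * 4
print(welch_stetson_i(b, unit, v, unit))
print(stetson_j(v, unit, pairs=[(0, 1), (2, 3)]))
print(stetson_k(v, unit))
```

Residuals that share a sign within a pair push I and J positive, while uncorrelated noise averages towards zero; that is the property these indexes exploit.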

slide-17
SLIDE 17
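Correlation-based Variability Indexes

  • Excursions, Ex
  • Von Neumann ratio, η
  • Excess Abbe value, $\mathcal{E}_\mathcal{A}$
  • $S_B$ variability detection statistic

$$\eta = \frac{\delta^2}{\sigma^2} = \frac{\sum_{i=1}^{N-1} (m_{i+1} - m_i)^2 / (N-1)}{\sum_{i=1}^{N} (m_i - \bar{m})^2 / (N-1)}$$

The $S_B$ variability detection statistic is defined as

$$S_B = \frac{1}{N M} \sum_{i=1}^{M} \left( \frac{r_{i,1}}{\sigma_{i,1}} + \frac{r_{i,2}}{\sigma_{i,2}} + \dots + \frac{r_{i,k_i}}{\sigma_{i,k_i}} \right)^2$$

where N represents the total number of data points …

The excess Abbe value is

$$\mathcal{E}_\mathcal{A} \equiv \mathcal{A}_{\mathrm{sub}} - \mathcal{A}$$

where …

The index Ex is computed according to the equation:

$$Ex = \frac{2}{N_{\mathrm{scan}} (N_{\mathrm{scan}} - 1)} \sum_{i=1}^{N_{\mathrm{scan}} - 1} \sum_{j>i}^{N_{\mathrm{scan}}} \frac{|\mathrm{median}_i - \mathrm{median}_j|}{\sqrt{\sigma_i^2 + \sigma_j^2}}$$

where $N_{\mathrm{scan}}$ is the number of scans, $N_{\mathrm{scan}} (N_{\mathrm{scan}} - 1)/2$ …

See more in Sokolovsky et al., 2016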
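
As an illustration of the von Neumann ratio η given on this slide, the sketch below computes it for a time-ordered light curve. The function name and the test light curves are made up for the example.

```python
def von_neumann_ratio(m):
    """eta = mean squared successive difference / variance.

    `m` must be ordered in time. Both terms are divided by (N - 1),
    following the formula on the slide.
    """
    n = len(m)
    mbar = sum(m) / n
    delta2 = sum((m[i + 1] - m[i]) ** 2 for i in range(n - 1)) / (n - 1)
    sigma2 = sum((mi - mbar) ** 2 for mi in m) / (n - 1)
    return delta2 / sigma2


smooth = [1.0, 2.0, 3.0, 4.0, 5.0]      # slow trend: small eta
jumpy = [1.0, -1.0, 1.0, -1.0]          # point-to-point flips: large eta
print(von_neumann_ratio(smooth), von_neumann_ratio(jumpy))
```

Smooth variability gives a small η, so 1/η is large for such objects; that is why the performance plots later in the talk use 1/η as the index.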

slide-18
SLIDE 18

OK, but how do you define which source is variable?

  • We have more than 90 million sources divided into groups (targets).
  • Fewer than 10% of the sources are variable.
  • General idea: we have a sea of constant sources and a few variables that should stand out in some variability indexes.
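
One simple way to express the "sea of constant sources" idea is a robust cut: estimate the typical value and spread of a variability index from the bulk of the sources, then flag the objects that lie several σ above it. The sketch below is an assumption made for illustration, not necessarily the selection used by the HCV team; the median/MAD estimator and the `n_sigma` parameter are choices of this example.

```python
import statistics


def robust_centre_and_sigma(values):
    """Median and MAD-based sigma (1.4826 * MAD, the Gaussian scaling)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return med, 1.4826 * mad


def select_candidates(vi_values, n_sigma=5.0):
    """Indices of sources whose index value exceeds median + n_sigma * sigma."""
    med, sigma = robust_centre_and_sigma(vi_values)
    return [k for k, v in enumerate(vi_values) if v > med + n_sigma * sigma]


# Made-up index values for seven sources; the last one stands out.
vi = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 5.0]
print(select_candidates(vi))
```

Because the centre and spread come from medians, the few true variables barely affect the threshold they are compared against.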

slide-24
SLIDE 24

How does a source behave in different VIs?

slide-25
SLIDE 25

How does a source behave in different VIs?

Remember: different VIs are sensitive to different types of variability.

slide-26
SLIDE 26

How does a source behave in different VIs? ... and in different filters

slide-27
SLIDE 27

Selection of Candidates (1st method)

slide-32
SLIDE 32

VI Performance

  • Using this selection method, we evaluated the performance of each variability index:
      • Completeness
      • Purity
      • F-score

… (2014), we compute the completeness C and purity P:

$$C = \frac{\text{Number of selected variables}}{\text{Total number of confirmed variables}}, \qquad P = \frac{\text{Number of selected variables}}{\text{Total number of selected candidates}}$$

as well as the fidelity F-score, which is the harmonic mean of the two:

$$F_1 = \frac{2 \, (C \times P)}{C + P}$$

F reaches a maximum …
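
The three scores can be computed directly from sets of object identifiers. In this sketch the function names and the identifiers are hypothetical; "selected variables" is read as the confirmed variables that pass the selection.

```python
def f1_score(c, p):
    """Harmonic mean of completeness c and purity p (0 if both are 0)."""
    return 2.0 * c * p / (c + p) if (c + p) > 0 else 0.0


def selection_scores(selected, confirmed):
    """Return (completeness, purity, F1) for a candidate selection."""
    selected, confirmed = set(selected), set(confirmed)
    recovered = len(selected & confirmed)  # confirmed variables that were selected
    c = recovered / len(confirmed) if confirmed else 0.0
    p = recovered / len(selected) if selected else 0.0
    return c, p, f1_score(c, p)


# Made-up identifiers: 4 candidates selected, 3 confirmed variables.
print(selection_scores(selected={1, 2, 3, 4}, confirmed={2, 3, 5}))

# The (C, P) pairs quoted in the talk for chi2_red, sigma_w and 1/eta.
for c, p in [(0.706, 0.740), (0.569, 0.899), (0.821, 0.825)]:
    print(round(f1_score(c, p), 3))
```

Feeding the quoted completeness and purity values through `f1_score` reproduces the Fmax figures shown on the performance plots (0.723, 0.697 and 0.823).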

slide-33
SLIDE 33

VI Performance

Figure: completeness (C), purity (P) and F-score (F) as a function of the cut-off in σ, for three indexes:

  • χ²_red: C(Fmax)=0.706, P(Fmax)=0.740, Fmax=0.723 at 10.8σ
  • σ_w: C(Fmax)=0.569, P(Fmax)=0.899, Fmax=0.697 at 6.0σ
  • 1/η: C(Fmax)=0.821, P(Fmax)=0.825, Fmax=0.823 at 16.6σ

slide-34
SLIDE 34

VI Performance

Figure: F1 max versus N for the σ, IQR and 1/η indexes.

slide-35
SLIDE 35

Selection of Candidates (2nd method)

  • Principal Component Analysis (PCA) on the normalized variability indexes.
  • Principal component analysis (Pearson 1901) linearly and orthogonally transforms a dataset onto a new set of uncorrelated axes (the eigenvectors of the variance-covariance matrix of the data), along which the data variance is highlighted.
  • These eigenvectors are called the principal components (PCs). Each observation $x_j$ of the original data, composed of m variables, is expressed as

$$x_j = \sum_{i=1}^{m} a_{j,i} \cdot \mathrm{PC}_i$$

where $a_{j,i}$ is the admixture coefficient of the principal component $\mathrm{PC}_i$. The coefficients $a_{j,i}$ are the data coordinates in the new axes. There are at most m PCs.
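
To make the decomposition concrete, here is a small standard-library sketch that normalizes a table of variability indexes, builds the variance-covariance matrix, and extracts only the first principal component by power iteration. Power iteration is a stand-in chosen to avoid external dependencies (a full eigen-decomposition from a numerical library would be the usual route), and the input table is made up.

```python
import math
import statistics


def standardise(rows):
    """Scale each column to zero mean and unit (population) standard deviation.

    Assumes no column is constant, otherwise the division fails.
    """
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(x - mu) / sd for x, mu, sd in zip(row, means, stds)]
            for row in rows]


def covariance_matrix(rows):
    """Variance-covariance matrix of already centred rows."""
    n, m = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(m)]
            for a in range(m)]


def first_principal_component(cov, iterations=200):
    """Leading eigenvector of `cov` by power iteration.

    The start vector (1, 2, 3, ...) is arbitrary; it only needs to be
    non-orthogonal to the leading eigenvector.
    """
    m = len(cov)
    vec = [float(k + 1) for k in range(m)]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(m)) for a in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec


def project(rows, pc):
    """Admixture coefficients a_{j,1}: coordinates of each row along PC1."""
    return [sum(x * p for x, p in zip(row, pc)) for row in rows]


# Made-up table: one row per source, one column per variability index.
table = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [10.0, 20.5]]
z = standardise(table)
pc1 = first_principal_component(covariance_matrix(z))
a1 = project(z, pc1)
print(pc1, a1)
```

Sources with an extreme coordinate along PC1 are the ones that stand out in several correlated indexes at once.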

slide-36
SLIDE 36

PCA on the normalized indexes

Figure: scree plot of the variances of the 15 most significant principal components, as applied to the M31 Halo11 field.

Figure: PC1 and PC2 for the M31 Halo11 PCA implementation in two filters (F606W, F814W).

Moretti et al., in prep

slide-37
SLIDE 37

Selection of Candidates (2nd method)

Moretti et al., in prep

Figure legend: RR Lyrae, RR Lyrae candidates, eclipsing binaries, dwarf Cepheids, LPV/semiregulars, anomalous Cepheids, post-AGB stars.

slide-38
SLIDE 38

The System

  • 2 VMs with Apache Hadoop's file system to serve a distributed file system over the two physical machines.
  • Apache Spark to split a computation task into subtasks and perform them over several nodes.
  • Apache Mesos to schedule and orchestrate the computation processes over the Spark nodes.
slide-39
SLIDE 39

The Pipeline: Detection Algorithm

slide-41
SLIDE 41

The Pipeline: Validation Algorithm

slide-42
SLIDE 42

First Results

  • Principal Components Analysis and variability search: a promising combination - Moretti et al., in prep
  • Stellar variability in the Key project galaxy NGC 4535 - Zoi Spetsieris
  • Identification of Active Galactic Nuclei in GOODS South through optical variability - Ektoras Pouliasis
slide-43
SLIDE 43

Principal Components Analysis and variability search: a promising combination

  • Application and evaluation of PCA in variability search
  • Study of M31 fields:
      • Halo11
      • Stream
      • Disk
  • Create data for the HCV control sample

Figure: DSS image of the M31 fields (Disk, Stream, Halo11). Axes: RA (deg), Dec (deg).

slide-44
SLIDE 44

Principal Components Analysis and variability search: a promising combination

Figure: a1, a2 plot for the Halo11 sources.

Figure: CMD of sources in M31, Halo11.

Legend: RR Lyrae, RR Lyrae candidates, eclipsing binaries, dwarf Cepheids, LPV/semiregulars, anomalous Cepheids, post-AGB stars; known variables from Brown et al., 2004; PCA confirmed variables; PCA candidate variables.

slide-45
SLIDE 45

Principal Components Analysis and variability search: a promising combination

Figure: a1, a2 plot for the Halo11 sources.

Figure: CMD of sources in M31, Halo11.

Legend: RR Lyrae, RR Lyrae candidates, eclipsing binaries, dwarf Cepheids, LPV/semiregulars, anomalous Cepheids, post-AGB stars; known variables from Brown et al., 2004; PCA confirmed variables; PCA candidate variables.

89% recovery, 58 new candidates.

slide-46
SLIDE 46

Stellar variability in the Key project galaxy NGC 4535

  • Re-analysis of NGC 4535 (WFPC2).
  • PSF photometry with Dolphot.
  • Recover the published variables (50 Cepheids, Macri et al., 1999) and detect new variables.
  • Investigate massive star variability in this Virgo Cluster galaxy and re-derive the period-luminosity relationship.

slide-47
SLIDE 47

Stellar variability in the Key project galaxy NGC 4535

Phased light curves for the Cepheids C1 and C2 with published periods by Macri et al. 1999.
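
A phased light curve is obtained by folding the observation times on the period. The sketch below shows the folding step; the epoch `t0`, the period and the data points are made up for the example and are not the published values for C1 or C2.

```python
def fold(times, period, t0=0.0):
    """Phase in [0, 1) of each observation time for a given period and epoch."""
    return [((t - t0) / period) % 1.0 for t in times]


def phased_light_curve(times, mags, period, t0=0.0):
    """(phase, magnitude) pairs sorted by phase, ready for plotting."""
    return sorted(zip(fold(times, period, t0), mags))


# Made-up observations (days, magnitudes) folded on a 10-day period.
times = [0.0, 2.5, 5.0, 12.5]
mags = [25.1, 25.4, 25.9, 25.5]
for phase, mag in phased_light_curve(times, mags, period=10.0):
    print(f"{phase:.2f} {mag:.1f}")
```

When the assumed period is right, points from different cycles line up on a single smooth curve; a wrong period scatters them.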

slide-48
SLIDE 48

The Key project galaxy NGC4535

New Variables

slide-49
SLIDE 49

Identification of AGN in GOODS South through optical variability

  • Re-analysis of ACS/WFC data with SExtractor.
  • Add a field with extended sources to our control sample.
  • Point-like/extended source separation: CI ~ 1.24.
  • Total sample: 11862 sources with more than 3 epochs.
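
The two cuts on this slide (a concentration-index boundary near 1.24 and a minimum number of epochs) can be sketched as simple filters. The dictionary keys and the sample rows are hypothetical, and treating sources with CI above the boundary as extended is an assumption of this example.

```python
CI_BOUNDARY = 1.24  # separation value quoted on the slide


def classify_morphology(ci, boundary=CI_BOUNDARY):
    """Assumption: CI above the boundary means extended, otherwise point-like."""
    return "extended" if ci > boundary else "point-like"


def build_sample(sources, min_epochs_exclusive=3):
    """Keep sources with more than 3 epochs and attach a morphology label."""
    sample = []
    for src in sources:
        if src["n_epochs"] > min_epochs_exclusive:
            sample.append({**src, "morphology": classify_morphology(src["ci"])})
    return sample


# Made-up sources; "ci" and "n_epochs" are illustrative key names.
sources = [
    {"id": "a", "ci": 1.05, "n_epochs": 5},
    {"id": "b", "ci": 1.60, "n_epochs": 4},
    {"id": "c", "ci": 1.80, "n_epochs": 3},
]
for row in build_sample(sources):
    print(row["id"], row["morphology"])
```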
slide-50
SLIDE 50

Identification of Active Galactic Nuclei through optical variability selection
slide-51
SLIDE 51

Identification of Active Galactic Nuclei through optical variability selection

  • 150 candidates with known redshift after false-positive removal.
  • Concentration Index: 93% of the variable candidates are extended, indicating AGN activity.

slide-52
SLIDE 52

Identification of Active Galactic Nuclei through optical variability selection
slide-53
SLIDE 53

Future work

  • Include Principal Component Analysis in the pipeline
  • Test other Machine Learning methods
  • Investigate other fields
  • Deliver the 1st version of HCV
slide-54
SLIDE 54

Grazie / Ευχαριστώ (Thank you)
