


SLIDE 1

Noisy-input classification of Fermi-LAT unidentified point-like sources

Bryan Zaldivar (IFT/UAM, Madrid), with Javier Coronado-Blázquez, Viviana Gammaldi, Miguel A. Sánchez-Conde

Machine Learning Group

work in progress with: Carlos Villacampa-Calvo, Eduardo Garrido Merchán, Daniel Hernández-Lobato

SLIDE 2

Motivation and Data origin

The 4FGL Fermi catalog contains circa 5000 point-like sources, out of which ~1500 are unidentified (unID). Can we classify those unIDs à la supervised learning?

[Figure: spectrum of a particular blazar, with a log-parabola fit]

Available data contains 3 known classes (pulsars, quasars, blazars) and 4 features, among them the improvement in fitting a log-parabola vs. a power-law, and the significance of detection. What if some of the unIDs are better classified as dark matter? Include the dark matter class into the β-plane!


SLIDE 3

Data visualization

credits to Javier Coronado-Blázquez

[Plot: feature space, with the dark matter class highlighted]


SLIDE 4

Data visualization

[Scatter plot: identified vs. unidentified sources; point size encodes the feature value]

  • unIDs seem to be distributed similarly to the IDs
  • error bars on the inputs are partially correlated with the input values themselves

credits to Javier Coronado-Blázquez


SLIDE 5

Machine Learning procedure, Step #1

SLIDE 6

Standard classification without input uncertainties

Warm-up: we want to know what the simplest approach can give us. Considered classifiers:

  • Naive Bayes
  • Logistic regression
  • Random Forest

Work in progress, but conceptually trivial... (see the sketch below)
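As an illustration of this warm-up step (not the authors' actual pipeline), the three classifiers can be run with scikit-learn; the feature and label arrays below are random placeholders standing in for the catalog data.

```python
# Warm-up sketch: three standard classifiers on the catalog features,
# ignoring the error bars entirely. X and y are random placeholders for
# the 4 features and 3 known classes (pulsar/quasar/blazar) described above.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))        # placeholder: 4 catalog features
y = rng.integers(0, 3, size=1000)     # placeholder: 3 known classes

for clf in (GaussianNB(),
            LogisticRegression(max_iter=1000),
            RandomForestClassifier(n_estimators=200)):
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{type(clf).__name__}: mean CV accuracy = {scores.mean():.3f}")
```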

SLIDE 7

Next steps:

  • search for an out-of-the-box classifier dealing with noisy inputs
  • search for a paper addressing classification with noisy inputs
  • call your ML-expert colleagues, ask them for references!
  • do it ourselves!!
SLIDE 8

Machine Learning procedure, Step #2: incorporating input uncertainties

SLIDE 9

Bayesian classification with parametric models

In the Bayesian approach, we build the predictive distribution for a new point x_*, marginalizing over the model parameters.

Parametric models assume a specific form for the likelihood of the data (the softmax function, whose negative log is the cross-entropy) and a specific form for the function f(x, θ) (e.g. a neural network); y_i denotes the one-hot encoding of the class of point i.
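In standard notation (a reconstruction sketch; the slide's own symbols were lost in extraction), the predictive distribution and the softmax likelihood read:

```latex
% Predictive distribution for a new point x_*, marginalizing the parameters
p(y_* \mid x_*, \mathcal{D}) = \int p\big(y_* \mid f(x_*, \theta)\big)\, p(\theta \mid \mathcal{D})\, \mathrm{d}\theta
% Softmax likelihood; its negative log over the data is the cross-entropy
p(y = k \mid f(x, \theta)) = \frac{\exp f_k(x, \theta)}{\sum_{k'} \exp f_{k'}(x, \theta)}
```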


SLIDE 10

Gaussian Process

The GP approach is non-parametric: there is no predefined form for the function f. Instead you have a Gaussian distribution over functions (in the case of regression).

  • Rasmussen & Williams, 2006
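A minimal numpy sketch of "a Gaussian distribution over functions": samples drawn from a zero-mean GP prior with an RBF kernel (the grid and hyperparameters are arbitrary illustration choices).

```python
# Draw sample functions from a zero-mean GP prior with an RBF kernel.
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0, variance=1.0):
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

x = np.linspace(-5.0, 5.0, 200)
K = rbf_kernel(x, x) + 1e-8 * np.eye(len(x))   # jitter for numerical stability
rng = np.random.default_rng(0)
f_samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)  # 3 draws
```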


SLIDE 11

Classification with noisy input using Gaussian Processes

  • As usual: introduce one output latent variable per point i and per class k
  • NEW: introduce one input latent variable per point i

The predictive distribution for a class at a test point is intractable: the non-Gaussian likelihood makes the posterior non-Gaussian, so we resort to Variational Inference; since this is costly, we use a Sparse GP. The joint model factorizes into the usual GP term and a new input-noise term, as sketched below.
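Schematically (a reconstruction under the stated assumptions, since the slide's equation was lost): the observed input x̃_i is treated as a noisy realization of the latent input x_i, and the joint augments the usual GP term with an input-noise term.

```latex
% Input-noise term: observed inputs are noisy realizations of latent ones,
% with \Sigma_i built from the catalog error bars
p(\tilde{x}_i \mid x_i) = \mathcal{N}(\tilde{x}_i \mid x_i, \Sigma_i)
% Joint: "usual term" (GP likelihood and prior) times the "new term"
p(\mathbf{y}, \mathbf{f}, X, \tilde{X}) =
  \underbrace{\prod_i p(y_i \mid \mathbf{f}_i)\; p(\mathbf{f} \mid X)}_{\text{usual term}}
  \;\underbrace{\prod_i p(\tilde{x}_i \mid x_i)\, p(x_i)}_{\text{new term}}
```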

SLIDE 12

Sparse Gaussian Process

If the inducing values were sufficient statistics for the latent function values, we would be left with an equivalent but cheaper inference problem.

  • here inspired by Titsias (2009)

Exact inference involves inverting an N×N matrix, at cost O(N³). The idea is to make inference on a smaller set of M function points, which approximately represent the entire posterior over the N function points. Cost: O(NM²).
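A numpy sketch of the inducing-point idea (a Nyström-style illustration of the cost saving, not the variational construction itself): M ≪ N inducing inputs summarize the N×N kernel, replacing the O(N³) inverse with O(NM²) work.

```python
# Nystrom-style illustration of the sparse-GP cost saving: the N x N
# kernel is approximated through M << N inducing inputs Z.
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)

N, M = 2000, 50
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-5.0, 5.0, N))         # training inputs
Z = np.linspace(-5.0, 5.0, M)                  # inducing inputs

Kmm = rbf_kernel(Z, Z) + 1e-8 * np.eye(M)      # M x M: cheap to handle
Knm = rbf_kernel(X, Z)                         # N x M
K_approx = Knm @ np.linalg.solve(Kmm, Knm.T)   # approximates the N x N kernel
```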

SLIDE 13

Variational Inference

The idea is to approximate the exact posterior distribution by an easier one (e.g. a Gaussian), according to the variational principle: minimize the Kullback-Leibler divergence between the approximation and the exact posterior w.r.t. the variational parameters.

  • Jordan, Ghahramani, Jaakkola & Saul, 1999
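In formulas (standard variational inference, matching the cited reference rather than the lost slide equation): minimizing the KL divergence to the exact posterior is equivalent to maximizing a lower bound on the evidence.

```latex
% KL divergence between the approximation q and the exact posterior
\mathrm{KL}\big[q(f)\,\|\,p(f \mid y)\big]
  = \mathbb{E}_{q}\!\left[\log \frac{q(f)}{p(f \mid y)}\right] \;\ge\; 0
% Equivalent evidence lower bound (ELBO) that is maximized in practice
\log p(y) \;\ge\; \mathbb{E}_{q(f)}\big[\log p(y \mid f)\big]
  - \mathrm{KL}\big[q(f)\,\|\,p(f)\big]
```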

SLIDE 14

Likelihood of the model

Common form in parametric models: the "generalized Bernoulli" (categorical) likelihood, the −log of which is the cross-entropy; e.g. with 3 classes, a predicted probability vector might be (0.05, 0.80, 0.15).

Here, instead, the labelling rule is that the label at point i is the class whose latent function value is largest, and the likelihood for the label is an indicator of that rule (noiseless). "Misclassification noise" is included through the prior for the latent functions, and labelling noise (with probability ε) is also included.

  • D. Hernández-Lobato, J.M. Hernández-Lobato & P. Dupont, 2011
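A sketch of the likelihood in the spirit of that reference (reconstructed, since the slide's formula was lost): with probability ε the label escapes the argmax labelling rule and is assigned to one of the other C − 1 classes uniformly.

```latex
% Noiseless labelling rule: the label is the class with the largest latent value
p(y_i \mid \mathbf{f}_i) = \mathbb{I}\!\left[\, y_i = \arg\max_k f_i^k \,\right]
% With labelling noise \epsilon (C classes)
p(y_i \mid \mathbf{f}_i) = (1 - \epsilon)\, \mathbb{I}\!\left[\, y_i = \arg\max_k f_i^k \,\right]
  + \frac{\epsilon}{C - 1} \left(1 - \mathbb{I}\!\left[\, y_i = \arg\max_k f_i^k \,\right]\right)
```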


SLIDE 15

Results (Python + TensorFlow)

SLIDE 16

Results on toy data

Err. rate, mean over the synthetic datasets (spread in parentheses):

Input noise level | Noiseless model | Rasmussen-like | This work
0.10 | 0.113 (0.76) | 0.109 (0.321) | 0.108 (0.259)
0.25 | 0.164 (1.54) | 0.158 (0.53) | 0.158 (0.37)
0.50 | 0.218 (0.77) | 0.209 (0.50) | 0.210 (1.14)

  • Found no published model against which to compare!
  • we compare with a standard GP classifier without input noise
  • we modify an existing GP noise model for regression (McHutchon & Rasmussen, 2011)

We generate a set (~100) of synthetic datasets to evaluate average performance (a sketch of the protocol follows below).

[Plot: example of a synthetic dataset]
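A sketch of that evaluation protocol; the generator here (three Gaussian blobs, the noise levels, and the sizes) is a hypothetical stand-in for the actual toy data.

```python
# Evaluation-protocol sketch: ~100 synthetic multiclass datasets whose
# inputs are corrupted with Gaussian noise of a chosen level; each model's
# test error rate is then averaged over the datasets.
import numpy as np

def make_dataset(rng, n=300, noise_level=0.1):
    centers = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]])   # 3 classes
    y = rng.integers(0, 3, size=n)
    x_true = centers[y] + rng.normal(scale=0.3, size=(n, 2))   # clean inputs
    x_noisy = x_true + rng.normal(scale=noise_level, size=(n, 2))
    return x_noisy, y

rng = np.random.default_rng(0)
datasets = [make_dataset(rng, noise_level=0.25) for _ in range(100)]
```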


SLIDE 17

Conclusions/work in progress

  • Unidentified point-like sources can be classified among predefined known classes (including the potential dark matter class)
  • Interestingly, including the dark matter class into the well-known β-plane for point-like sources results in reasonably good separability
  • The only non-straightforward issue with this problem: inputs come with their own error bars, surprisingly not yet explicitly addressed in the context of ML classification!
  • A warm-up classification exercise w/o error bars is being conducted
  • Error bars are incorporated in a Gaussian Process model for multiclass classification, by treating the input as a noisy realization of extra latent variables to be learned
  • Very satisfactory preliminary results with synthetic data
  • Time to apply it to real Fermi-LAT data!

Thank you!

SLIDE 18

Backup

SLIDE 19

Classification with error bars in the input (parametric approach)

Suppose you have data {(x̃_i, y_i)}, where y_i is the one-hot encoding of the class: e.g. y_i = (0, 1, 0) if point i is in class 2. The observed x̃_i are noisy samples from unknown means x_i; assume x̃_i ~ N(x_i, Σ_i). Then the (−log) joint likelihood of the data can be written as the cross-entropy evaluated at the latent means plus the Gaussian noise term, with f e.g. a linear model or a NN model.
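A minimal TensorFlow sketch of this parametric approach under the assumptions above (the network size, noise scale, and data are placeholders, not the slide's actual setup): the latent means are free variables, and the loss is the cross-entropy at the latent means plus the Gaussian term tying them to the observed inputs.

```python
# -log joint likelihood = cross-entropy of the classifier evaluated at the
# latent means x_latent + Gaussian penalty tying x_latent to the observed
# noisy inputs x_obs (per-point error bars sigma). All data are placeholders.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x_obs = rng.normal(size=(500, 4)).astype("float32")   # observed noisy inputs
sigma = np.full_like(x_obs, 0.1)                      # error bars (placeholder)
y = rng.integers(0, 3, size=500)                      # class labels

x_latent = tf.Variable(x_obs)                         # unknown true means
model = tf.keras.Sequential([tf.keras.layers.Dense(16, activation="relu"),
                             tf.keras.layers.Dense(3)])  # logits f(x)
ce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
opt = tf.keras.optimizers.Adam(1e-2)

for step in range(500):
    with tf.GradientTape() as tape:
        loss = ce(y, model(x_latent))                 # classification term
        loss += 0.5 * tf.reduce_mean(                 # Gaussian noise term
            tf.reduce_sum(((x_obs - x_latent) / sigma) ** 2, axis=1))
    variables = model.trainable_variables + [x_latent]
    opt.apply_gradients(zip(tape.gradient(loss, variables), variables))
```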