Guiding New Physics Searches with Unsupervised Learning


  1. IML Working Group, CERN, 2018-10-12
 Guiding New Physics Searches with Unsupervised Learning
 [DS, Jacques - 1807.06038]
 Andrea De Simone, andrea.desimone@sissa.it

  2. > New Physics?
 Searches for New Physics beyond the Standard Model have been negative so far… MAYBE:
 1. New Physics (NP) is not accessible by the LHC:
    new particles are too light/heavy, or interacting too weakly.
 2. We have not explored all the possibilities:
    new physics may be buried under large backgrounds, or hiding behind unusual signatures.

  3. > New Physics?
 “Don’t want to miss a thing” (in data):
 take a closer look at current data, and get ready for the upcoming data from the next run.
 Model-independent search:
 searches for specific models may be
 - time-consuming
 - insensitive to unexpected/unknown processes

  4. > New Statistical Test
 Want a statistical test for NP which is:
 1. model-independent:
    no assumption about the underlying physical model used to interpret the data; more general.
 2. non-parametric:
    compare the two samples as a whole (not just their means, etc.); fewer assumptions, no maximum-likelihood estimation.
 3. un-binned:
    the high-dimensional feature space is partitioned without rectangular bins; retain the full multi-dimensional information of the data.

  5. > Outline
 1. Statistical test of dataset compatibility
    • Nearest-Neighbors Two-Sample Test
    • Identify Discrepancies
    • Include Uncertainties
 2. Applications to High-Energy Physics

  6. > Outline
 1. Statistical test of dataset compatibility
    • Nearest-Neighbors Two-Sample Test
    • Identify Discrepancies
    • Include Uncertainties
 2. Applications to High-Energy Physics

  7. > Two-sample Test [a.k.a. “homogeneity test”]
 Two sets, with x_i, x'_i ∈ R^D:
    Trial:     T = {x_1, …, x_{N_T}} ~ p_T   (iid)
    Benchmark: B = {x'_1, …, x'_{N_B}} ~ p_B  (iid)
 The probability distributions p_B, p_T are unknown.
 e.g.: B = simulated SM background, T = real measured data

  8. > Two-sample Test
 Same two sets as before: T ~ p_T, B ~ p_B (iid), with p_B, p_T unknown.
 Are B, T drawn from the same probability distribution? easy…

  9. > Two-sample Test
 Same two sets as before: T ~ p_T, B ~ p_B (iid), with p_B, p_T unknown.
 Are B, T drawn from the same probability distribution? … hard!

  10. > Two-sample Test
 RECIPE:
 1. Density Estimator: reconstruct the PDFs from the samples.
 2. Test Statistic (TS): measure the “distance” between the PDFs.
 3. TS distribution: associate probabilities to the TS, under the null hypothesis H_0: p_B = p_T.
 4. p-value: accept/reject H_0.

  11. > 1. Density Estimator
 Divide the space into square bins?
 ✓ easy
 ✓ can use simple statistics (e.g. χ²)
 ✘ hard/slow/impossible in high D
 Need an un-binned, multivariate approach.
 Find PDF estimators p̂_B(x), p̂_T(x), e.g. based on the local density of points:
    p̂_{B,T}(x) = ρ_{B,T}(x) / N_{B,T}
 Nearest Neighbors! [Schilling - 1986] [Henze - 1988] [Wang et al. - 2005, 2006] [Dasu et al. - 2006] [Perez-Cruz - 2008] [Sugiyama et al. - 2011] [Kremer et al. - 2015]

  12. > 1. Density Estimator
 • Fix an integer K.
 • Choose a query point x_j in T and draw it in B.

  13. > 1. Density Estimator
 • Fix an integer K.
 • Choose a query point x_j in T and draw it in B.
 • Find the distance r_{j,B} of the K-th NN of x_j in B.

  14. > 1. Density Estimator
 • Fix an integer K.
 • Choose a query point x_j in T and draw it in B.
 • Find the distance r_{j,B} of the K-th NN of x_j in B.
 • Find the distance r_{j,T} of the K-th NN of x_j in T.

  15. > 1. Density Estimator
 • Fix an integer K.
 • Choose a query point x_j in T and draw it in B.
 • Find the distance r_{j,B} of the K-th NN of x_j in B.
 • Find the distance r_{j,T} of the K-th NN of x_j in T.
 • Estimate the PDFs (with ω_D the volume of the unit ball in R^D):
    p̂_B(x_j) = (1/N_B) · K / (ω_D r_{j,B}^D)
    p̂_T(x_j) = (1/(N_T − 1)) · K / (ω_D r_{j,T}^D)
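 To make the estimator concrete, here is a minimal Python sketch of the K-NN density estimates above, using scipy's KD-tree; the function and variable names (knn_densities, r_jB, …) are illustrative and not the API of the NN2ST package linked later:

    import numpy as np
    from scipy.spatial import cKDTree
    from scipy.special import gamma

    def unit_ball_volume(D):
        # omega_D: volume of the unit ball in D dimensions
        return np.pi ** (D / 2) / gamma(D / 2 + 1)

    def knn_densities(B, T, K=5):
        """K-NN estimates of p_B and p_T at the query points x_j in T."""
        N_B, D = B.shape
        N_T = T.shape[0]
        omega_D = unit_ball_volume(D)
        # distance of the K-th nearest neighbor of x_j in B
        r_jB = cKDTree(B).query(T, k=[K])[0][:, 0]
        # distance of the K-th NN of x_j in T, excluding x_j itself
        # (the first neighbor of x_j in its own sample is x_j at distance 0)
        r_jT = cKDTree(T).query(T, k=[K + 1])[0][:, 0]
        p_B_hat = K / (N_B * omega_D * r_jB ** D)
        p_T_hat = K / ((N_T - 1) * omega_D * r_jT ** D)
        return p_B_hat, p_T_hat, r_jB, r_jT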

  16. > 2. Test Statistic
 • Measure of the “distance” between the two PDFs.
 • Define the Test Statistic:
    TS(B, T) = (1/N_T) ∑_{j=1}^{N_T} log[ p̂_T(x_j) / p̂_B(x_j) ]
   (detects under-/over-densities)
 • Related to the Kullback-Leibler divergence as TS(B, T) = D̂_KL(p̂_T || p̂_B), where
    D_KL(p || q) ≡ ∫_{R^D} p(x) log[ p(x) / q(x) ] dx
 • From the NN-estimated PDFs:
    TS_obs = (D/N_T) ∑_{j=1}^{N_T} log( r_{j,B} / r_{j,T} ) + log[ N_B / (N_T − 1) ]
 • Theorem: this estimator converges to D_KL(p_T || p_B) in the large-sample limit. [Wang et al. - 2005, 2006]
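 Written directly in terms of the K-NN distances, the observed test statistic is a one-liner; the sketch below reuses the (hypothetical) r_jB, r_jT arrays returned by the knn_densities sketch above:

    import numpy as np

    def test_statistic(r_jB, r_jT, N_B, N_T, D):
        """TS_obs = (D/N_T) * sum_j log(r_jB/r_jT) + log(N_B/(N_T - 1))."""
        return (D / N_T) * np.sum(np.log(r_jB / r_jT)) + np.log(N_B / (N_T - 1))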

  17. > 3. Test Statistic Distribution
 How is TS distributed? Permutation test!
 Assume p_B = p_T. Union set: U = T ∪ B.
 • Randomly reshuffle U into new samples (B̃, T̃) of the original sizes.
 • Compute the test statistic TS_n on (B̃, T̃).
 • Repeat many times.
 Distribution of TS under H_0: f(TS | H_0) ← {TS_n}
 [asymptotically normal with zero mean]
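 A minimal sketch of the permutation step: pool the two samples, reshuffle the labels, and recompute the statistic many times to build f(TS | H_0). It assumes a ts(B, T, K) helper that evaluates the statistic on a (benchmark, trial) pair, e.g. by combining the two sketches above:

    import numpy as np

    def permutation_distribution(B, T, ts, K=5, n_perm=1000, seed=0):
        """Samples of TS under H0: p_B = p_T, from random reshuffles of B u T."""
        rng = np.random.default_rng(seed)
        U = np.vstack([B, T])                 # union set
        N_B = B.shape[0]
        ts_null = np.empty(n_perm)
        for n in range(n_perm):
            idx = rng.permutation(U.shape[0])
            B_tilde, T_tilde = U[idx[:N_B]], U[idx[N_B:]]
            ts_null[n] = ts(B_tilde, T_tilde, K)
        return ts_null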

  18. > 4. p-value
 • Compute the mean and variance μ̂, σ̂² of the TS distribution f(TS | H_0).
 • Standardize the TS: TS → TS' ≡ (TS − μ̂) / σ̂
 • TS' is distributed according to f'(TS' | H_0) = σ̂ f(μ̂ + σ̂ TS' | H_0).
 • Two-sided p-value:
    p = 2 ∫_{|TS'_obs|}^{+∞} f'(TS' | H_0) dTS'
 • Equivalent significance: Z ≡ Φ^{−1}(1 − p/2)
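 A sketch of this p-value step, assuming the ts_null array from the permutation sketch above: standardize with the permutation mean and standard deviation, estimate the two-sided tail probability, and convert it to an equivalent Gaussian significance:

    import numpy as np
    from scipy.stats import norm

    def p_value_and_significance(ts_obs, ts_null):
        mu_hat, sigma_hat = ts_null.mean(), ts_null.std(ddof=1)
        ts_prime_obs = (ts_obs - mu_hat) / sigma_hat
        ts_prime_null = (ts_null - mu_hat) / sigma_hat
        # two-sided p-value p = 2 * P(TS' >= |TS'_obs|), estimated from the
        # standardized permutation sample (capped at 1)
        p = min(2.0 * np.mean(ts_prime_null >= np.abs(ts_prime_obs)), 1.0)
        Z = norm.ppf(1.0 - p / 2.0)    # equivalent significance
        return p, Z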

  19. > 2D Gaussian Example
 p_B = N(μ_B, Σ_B), p_T = N(μ_T, Σ_T), with Σ_B = Σ_T = [[1, 0], [0, 1]].
 • μ_B = (1.0, 1.0), μ_T = (1.2, 1.2): compared against the exact KL divergence.
 • μ_B = (1.0, 1.0), μ_T = (1.15, 1.15).
 K = 5, N_perm = 1000. More data, more power.
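 A usage sketch of the toy setup quoted on this slide (μ_B = (1.0, 1.0), μ_T = (1.15, 1.15), unit covariance, K = 5, N_perm = 1000), wired through the hypothetical helpers defined in the earlier sketches; the sample size and random seed are illustrative:

    import numpy as np

    rng = np.random.default_rng(1)
    N = 20000
    B = rng.multivariate_normal([1.0, 1.0],  np.eye(2), size=N)   # benchmark
    T = rng.multivariate_normal([1.15, 1.15], np.eye(2), size=N)  # trial

    def ts(B, T, K=5):
        # test statistic of a (benchmark, trial) pair from the K-NN distances
        _, _, r_jB, r_jT = knn_densities(B, T, K)
        return test_statistic(r_jB, r_jT, B.shape[0], T.shape[0], B.shape[1])

    ts_obs = ts(B, T, K=5)
    ts_null = permutation_distribution(B, T, ts, K=5, n_perm=1000)
    p, Z = p_value_and_significance(ts_obs, ts_null)
    print(f"TS_obs = {ts_obs:.4f}, p = {p:.3g}, Z = {Z:.2f}")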

  20. > NN2ST: Summary
 INPUT:
 • Trial sample:     T = {x_1, …, x_{N_T}} ~ p_T  (iid), x_i ∈ R^D
 • Benchmark sample: B = {x'_1, …, x'_{N_B}} ~ p_B (iid)
   (p_B, p_T unknown)
 • K: number of nearest neighbors
 • N_perm: number of permutations
 OUTPUT:
 • p-value of the null hypothesis H_0: p_B = p_T
   [check compatibility between the two samples]

  21. > NN2ST: Summary
 [Flowchart: Benchmark sample + Trial sample → K-NN density estimation → TS_obs → permutation test → TS distribution → two-sided p-value from ±|TS_obs|]
 Python code: github.com/de-simone/NN2ST

  22. > Outline
 1. Statistical test of dataset compatibility
    • Nearest-Neighbors Two-Sample Test
    • Identify Discrepancies
    • Include Uncertainties
 2. Applications to High-Energy Physics

  23. > Where are the discrepancies?
 Bonus: characterize the regions with significant discrepancies.
 1. “Score” field over T:
    Z(x_j) ≡ (u(x_j) − ū) / σ_u,   with u(x_j) ≡ log( r_{j,B} / r_{j,T} )
    (note that TS_obs = D ū + const)
 2. Identify the points x where Z(x) > c: they contribute the most to a large TS_obs → high-discrepancy (anomalous) regions.
 3. Apply a clustering algorithm to group them (see the sketch after this list).
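 A sketch of this localization step, reusing the r_jB, r_jT distances from the density-estimation sketch; DBSCAN is only an illustrative clustering choice (the slide does not fix an algorithm), and the threshold c and DBSCAN parameters are assumptions:

    import numpy as np
    from sklearn.cluster import DBSCAN

    def anomalous_regions(T, r_jB, r_jT, c=2.0, eps=0.3, min_samples=10):
        u = np.log(r_jB / r_jT)             # "score" field u(x_j) over T
        Z = (u - u.mean()) / u.std()        # standardized score Z(x_j)
        hot = T[Z > c]                      # points driving a large TS_obs
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(hot)
        return hot, labels                  # candidate high-discrepancy regions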

  24. > Outline
 1. Statistical test of dataset compatibility
    • Nearest-Neighbors Two-Sample Test
    • Identify Discrepancies
    • Include Uncertainties
 2. Applications to High-Energy Physics

  25. > Sample Uncertainties
 How to include sample uncertainties?
 1. Model the feature uncertainties F_B(x), F_T(x) [e.g. zero-mean gaussians].
 2. Build new samples by adding random noise sampled from F_{B,T}:
    T_u = {x_i + Δx_i}_{i=1…N_T},   B_u = {x'_i + Δx'_i}_{i=1…N_B}
 3. Compute TS on the new samples:
    TS_u ≡ TS(B_u, T_u) = TS_obs + U
 4. Repeat many times to reconstruct f(U). (A sketch of this smearing loop follows below.)
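 A sketch of the smearing loop, assuming zero-mean Gaussian feature uncertainties (as in the example on the slide) and the ts helper from the earlier sketches; sigma_B and sigma_T are per-feature standard deviations broadcastable to the sample shapes:

    import numpy as np

    def smeared_ts(B, T, sigma_B, sigma_T, ts, K=5, n_rep=100, seed=0):
        """Array of TS_u = TS(B_u, T_u) over n_rep random smearings."""
        rng = np.random.default_rng(seed)
        ts_u = np.empty(n_rep)
        for n in range(n_rep):
            B_u = B + rng.normal(0.0, sigma_B, size=B.shape)
            T_u = T + rng.normal(0.0, sigma_T, size=T.shape)
            ts_u[n] = ts(B_u, T_u, K)
        return ts_u    # TS_u = TS_obs + U, so ts_u - ts_obs samples U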

  26. > Sample Uncertainties
 How to include sample uncertainties?
 • f(TS_u) is a convolution: f(TS_u | H_0) = f(TS | H_0) ∗ f(U),
   so f(TS_u) is more spread out than f(TS).
 • The p-value is computed from f(TS_u).
 • Result: weaker significance, power degradation. (See the sketch below.)
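 One numerical way to realize this convolution (an assumption about the implementation, not necessarily how the reference does it) is to shift every permutation value of TS by every sampled value of U = TS_u − TS_obs, and recompute the p-value from the widened null sample:

    def p_value_with_uncertainties(ts_obs, ts_null, ts_u):
        """p-value from f(TS_u|H0) = f(TS|H0) * f(U), built empirically."""
        U_samples = ts_u - ts_obs                 # empirical samples of U
        # empirical convolution: each permutation TS shifted by each noise draw
        ts_u_null = (ts_null[:, None] + U_samples[None, :]).ravel()
        return p_value_and_significance(ts_obs, ts_u_null)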

  27. > 2D Gaussian with Uncertainties
 B, T gaussian samples: p_B = N(μ_B, Σ_B), p_T = N(μ_T, Σ_T), with
    μ_B = (1.0, 1.0), μ_T = (1.15, 1.15), Σ_B = Σ_T = [[1, 0], [0, 1]].
 Gaussian uncorrelated errors (diagonal covariance), with fixed relative uncertainty
    σ_i = ε x_i for each feature component i.
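 Continuing the toy example from the earlier usage sketch, the fixed relative uncertainty σ_i = ε x_i can be fed to the hypothetical smeared_ts helper as per-feature widths; the value of ε below is illustrative:

    eps = 0.10                           # illustrative relative uncertainty
    sigma_B = eps * np.abs(B)            # sigma_i = eps * x_i, per feature component
    sigma_T = eps * np.abs(T)            # (abs() just keeps the widths positive)
    ts_u = smeared_ts(B, T, sigma_B, sigma_T, ts, K=5, n_rep=100)
    p_u, Z_u = p_value_with_uncertainties(ts_obs, ts_null, ts_u)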

  28. > NN2ST: Summary
 ✓ general, model-independent
 ✓ fast, no optimization
   [N_{B,T} = 20k, K = 5, N_perm = 1k, D = 2: t ~ 2 min;  N_{B,T} = 20k, K = 5, N_perm = 1k, D = 8: t ~ 50 min]
 ✓ sensitive to unspecified signals
 ✓ useful when no single variable can separate signal from background
 ✓ helps in finding signal regions, optimal cuts, …
 ✓ flexible enough to incorporate uncertainties
 ✘ needs to be run for each sample pair
 ✘ the permutation test is the bottleneck
