Bayesian perspective on QCD global analysis


  1. Bayesian perspective on QCD global analysis
     Nobuo Sato (University of Connecticut / JLab)
     In collaboration with: A. Accardi, E. Nocera, W. Melnitchouk
     DIS18, Kobe, Japan, April 16-20, 2018

  2. Bayesian methodology in a nutshell
     In QCD global analysis, PDFs are parametrized at some scale Q_0, e.g.
         f(x) = N x^a (1-x)^b (1 + c\sqrt{x} + dx + ...)
         f(x) = N x^a (1-x)^b \, \mathrm{NN}(x; \{\theta, w_i\})
     "Fitting" is essentially the estimation of
         E[f] = \int d^n a \, P(a|\mathrm{data}) \, f(a)
         V[f] = \int d^n a \, P(a|\mathrm{data}) \, (f(a) - E[f])^2
     with parameter vector a = (N, a, b, c, d, ...).
     The probability density P is given by Bayes' theorem:
         P(f|\mathrm{data}) = \frac{1}{Z} L(\mathrm{data}|f) \, \pi(f)
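For concreteness, a minimal sketch (not the JAM code) of the two parametrization choices above; the toy parameter values and the shape of the one-layer network are assumptions for illustration:

```python
import numpy as np

def f_standard(x, N, a, b, c=0.0, d=0.0):
    """Standard functional form: N x^a (1-x)^b (1 + c*sqrt(x) + d*x)."""
    return N * x**a * (1 - x)**b * (1 + c*np.sqrt(x) + d*x)

def f_neural(x, N, a, b, weights):
    """Same x^a (1-x)^b envelope, modulated by a tiny one-layer neural net."""
    w_in, bias, w_out = weights
    hidden = np.tanh(np.outer(x, w_in) + bias)   # shape (len(x), n_hidden)
    return N * x**a * (1 - x)**b * (1 + hidden @ w_out)

x = np.linspace(0.01, 0.99, 5)
print(f_standard(x, N=1.0, a=-0.5, b=3.0, c=0.5))
w = (np.ones(4), np.zeros(4), 0.1*np.ones(4))    # toy weights {theta, w_i}
print(f_neural(x, N=1.0, a=-0.5, b=3.0, weights=w))
```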

  3. Bayesian methodology in a nutshell
     The likelihood function is not unique. A standard choice is the Gaussian likelihood
         L(d|a) = \exp\left[ -\frac{1}{2} \sum_i \left( \frac{d_i - \mathrm{thy}_i(a)}{\delta d_i} \right)^2 \right]
     Priors are designed to veto unphysical regions in parameter space, e.g.
         \pi(a) = \prod_i \theta(a_i - a_i^{\min}) \, \theta(a_i^{\max} - a_i)
     How do we compute E[f], V[f]?
     + Maximum likelihood
     + Monte Carlo methods
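A minimal sketch of how such a Gaussian likelihood and theta-function prior might be coded; the `theory` callable, the parameter bounds, and all names are hypothetical:

```python
import numpy as np

def log_likelihood(a, data, errors, theory):
    """Gaussian log-likelihood: -(1/2) sum_i ((d_i - thy_i(a)) / delta_d_i)^2."""
    residuals = (data - theory(a)) / errors
    return -0.5 * np.sum(residuals**2)

def log_prior(a, a_min, a_max):
    """Product of theta functions: flat inside the allowed box,
    -inf (zero probability) outside, vetoing unphysical regions."""
    inside = np.all((a >= a_min) & (a <= a_max))
    return 0.0 if inside else -np.inf

def log_posterior(a, data, errors, theory, a_min, a_max):
    lp = log_prior(a, a_min, a_max)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_likelihood(a, data, errors, theory)
```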

  4. Maximum likelihood
     Estimation of the expectation value:
         E[f] = \int d^n a \, P(a|\mathrm{data}) \, f(a) \simeq f(a_0)
     where a_0 is estimated from an optimization algorithm,
         \max[P(a|\mathrm{data})] = P(a_0|\mathrm{data})
         \max[L(\mathrm{data}|a) \, \pi(a)] = L(\mathrm{data}|a_0) \, \pi(a_0)
     or, equivalently, chi-squared minimization:
         \min[-2\log(L(\mathrm{data}|a) \, \pi(a))] = -2\log(L(\mathrm{data}|a_0) \, \pi(a_0))
             = \sum_i \left( \frac{d_i - \mathrm{thy}_i(a_0)}{\delta d_i} \right)^2 - 2\log\pi(a_0)
             = \chi^2(a_0) - 2\log\pi(a_0)
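As an illustration, a toy chi-squared minimization with a flat prior using scipy.optimize.minimize; the x^a (1-x)^b theory shape and the synthetic data are assumptions, not any group's actual fit:

```python
import numpy as np
from scipy.optimize import minimize

def chi2(params, x, data, errors):
    """chi^2(a) = sum_i ((d_i - thy_i(a)) / delta_d_i)^2 for a toy theory."""
    N, a, b = params
    theory = N * x**a * (1 - x)**b
    return np.sum(((data - theory) / errors)**2)

# Synthetic data scattered around a known shape (illustrative only)
rng = np.random.default_rng(1)
x = np.linspace(0.05, 0.9, 30)
truth = 1.0 * x**-0.5 * (1 - x)**3
errors = 0.05 * truth
data = truth + errors * rng.standard_normal(x.size)

# With a flat prior, maximizing P(a|data) reduces to chi^2 minimization
res = minimize(chi2, x0=[1.0, -0.4, 2.5], args=(x, data, errors), method="Nelder-Mead")
print("a0 =", res.x, "  chi2(a0) =", res.fun)
```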

  5. Maximum likelihood
     Estimation of the variance (Hessian method):
         V[f] = \int d^n a \, P(a|\mathrm{data}) \, (f(a) - E[f])^2
              \simeq \sum_k \left( \frac{f(t_k = 1) - f(t_k = -1)}{2} \right)^2
     It relies on the factorization of P(a|data) along eigen-directions,
         P(a|\mathrm{data}) \propto \exp\left( -\frac{1}{2} \sum_k t_k^2 + O(\Delta a^3) \right)
     and on a linear approximation of f(a):
         (f(a) - E[f])^2 = \left( \sum_k \frac{\partial f}{\partial t_k} t_k \right)^2 + O(a^3)
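One way the Hessian prescription above could be realized numerically (a sketch, not any PDF group's implementation); the finite-difference step and the Delta chi^2 = 1 tolerance are assumptions, and `chi2` and `f` are hypothetical single-argument callables:

```python
import numpy as np

def hessian_uncertainty(chi2, f, a0, step=1e-4, delta_chi2=1.0):
    """Hessian-method error on an observable f(a) around the minimum a0.

    Diagonalizes H_ij = (1/2) d^2 chi2 / da_i da_j, rescales each
    eigen-direction t_k so chi2 grows by delta_chi2 at t_k = +/-1, then
    combines V[f] ~= sum_k ((f(t_k=+1) - f(t_k=-1)) / 2)^2.
    """
    n = len(a0)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * step, np.eye(n)[j] * step
            H[i, j] = (chi2(a0+ei+ej) - chi2(a0+ei-ej)
                       - chi2(a0-ei+ej) + chi2(a0-ei-ej)) / (8 * step**2)
    vals, vecs = np.linalg.eigh(H)   # flat directions -> tiny vals, unstable
    var = 0.0
    for k in range(n):
        shift = vecs[:, k] * np.sqrt(delta_chi2 / vals[k])
        var += ((f(a0 + shift) - f(a0 - shift)) / 2)**2
    return np.sqrt(var)
```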

  6. Maximum likelihood
     pros
     + Very practical: most PDF groups use this method
     + Computationally inexpensive
     + f and its eigen-directions can be precalculated/tabulated
     cons
     + Assumes a local Gaussian approximation of the likelihood
     + Assumes a linear approximation of the observables O around a_0
     + These assumptions are strictly valid only for linear models
     + Computation of the Hessian matrix is numerically unstable if flat directions are present
     examples (see the numerical sketch below):
     → if f(x) = a + bx + cx^2, then E[f(x)] = E[a] + E[b]x + E[c]x^2
     → but if f(x) = N x^a (1-x)^b, then E[f(x)] \neq E[N] \, x^{E[a]} (1-x)^{E[b]}
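The sketch below illustrates the second example numerically with toy Gaussian posteriors for (N, a, b); the means and widths are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
x = 0.5
# Toy Gaussian posteriors for (N, a, b)
N = rng.normal(1.0, 0.2, 100_000)
a = rng.normal(-0.5, 0.2, 100_000)
b = rng.normal(3.0, 0.5, 100_000)

f = N * x**a * (1 - x)**b
print("E[f]                   =", f.mean())
print("E[N] x^E[a] (1-x)^E[b] =", N.mean() * x**a.mean() * (1 - x)**b.mean())
# The two disagree: expectation does not commute with a nonlinear model
```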

  7. Monte Carlo methods
     Recall that we are interested in computing
         E[f] = \int d^n a \, P(a|\mathrm{data}) \, f(a)
         V[f] = \int d^n a \, P(a|\mathrm{data}) \, (f(a) - E[f])^2
     Any MC method attempts to do this using MC sampling:
         E[f] \simeq \sum_k w_k \, f(a_k)
         V[f] \simeq \sum_k w_k \, (f(a_k) - E[f])^2
     i.e. to construct the sample distribution {w_k, a_k} of the parent distribution P(a|data).
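In code, these weighted estimators are a few lines; a generic sketch with hypothetical names:

```python
import numpy as np

def mc_moments(f, samples, weights=None):
    """Weighted MC estimators: E[f] ~= sum_k w_k f(a_k) and
    V[f] ~= sum_k w_k (f(a_k) - E[f])^2 over a sample {w_k, a_k}."""
    values = np.array([f(a) for a in samples])
    if weights is None:
        weights = np.full(len(values), 1.0 / len(values))
    w = np.asarray(weights) / np.sum(weights)   # ensure sum_k w_k = 1
    E = np.sum(w * values)
    V = np.sum(w * (values - E)**2)
    return E, V

# e.g. with equally weighted replicas a_k from a resampling fit:
# E_f, V_f = mc_moments(lambda a: a[0] * 0.3**a[1] * 0.7**a[2], replicas)
```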

  8. Monte Carlo methods
     + Resampling + cross validation; Ethier et al (2017)
     + Nested Sampling (NS); Lin et al (2018)
     + Hybrid Markov chain (HMC); Gabin Gbedo, Mangin-Brinet (2017)
     [Figures: JAM15/JAM17 helicity PDFs x\Delta u^+, x\Delta d^+, x(\Delta\bar{u} + \Delta\bar{d}), x(\Delta\bar{u} - \Delta\bar{d}), x\Delta s^+, x\Delta s^- compared with DSSV09; transversity \delta u, \delta d and Collins FFs zH_1^{\perp(1)} (favored/unfavored) from SIDIS and SIDIS+lattice fits, with g_T from JAM17 + SU(3)]

  9. Resampling + cross validation (R+CV)
     + Resample the data points within their quoted uncertainties using a Gaussian sampler:
         d_{k,i}^{(\mathrm{pseudo})} = d_i^{(\exp)} + \sigma_i^{(\exp)} R_{k,i}
     + Fit each pseudo-data sample k = 1, ..., N to obtain parameter vectors a_k:
         P(a|\mathrm{data}) → \{w_k = 1/N, \, a_k\}
     + For a large number of parameters, split the data into training and validation sets and find the a_k that best describes the validation sample (a code sketch follows).
     [Workflow diagram: priors + original data → pseudo data → training/validation split → minimization (prior as initial fit guess, stopping steered by the validation sample) → posteriors + fit statistics]
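A compact sketch of the resampling step under these definitions, reusing a `chi2(params, x, data, errors)` function like the one sketched earlier; the training/validation split is omitted for brevity, and all names are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

def resample_fits(x, data, errors, chi2, a_init, n_replicas=200, seed=0):
    """Resampling: shift the data within quoted errors with Gaussian noise
    R_{k,i}, refit each replica, and return the equally weighted {a_k}."""
    rng = np.random.default_rng(seed)
    replicas = []
    for k in range(n_replicas):
        pseudo = data + errors * rng.standard_normal(data.size)  # d^(pseudo)
        res = minimize(chi2, a_init, args=(x, pseudo, errors),
                       method="Nelder-Mead")
        replicas.append(res.x)
    return np.array(replicas)   # w_k = 1/N for every replica
```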

  10. Nested Sampling (NS)
      References: arXiv:astro-ph/0508461, arXiv:astro-ph/0701867, arXiv:1703.09701
      The basic idea: compute
          Z = \int L(\mathrm{data}|a) \, \pi(a) \, d^n a = \int_0^1 L(X) \, dX
      + The procedure collects samples from iso-likelihood contours, weighted by their likelihood values
      + Insensitive to local minima → faithful conversion of P(a|\mathrm{data}) → \{w_k, a_k\}
      + Multiple runs can be combined into one single run → the procedure can be parallelized
      [Figure: L(data|a) in parameter space mapped onto L(X) in prior-volume space X]
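A toy nested-sampling loop illustrating the idea; real implementations use far smarter constrained proposals than the crude rejection step here, so this is only a sketch of the bookkeeping:

```python
import numpy as np

def nested_sampling(loglike, prior_sample, n_live=100, n_iter=2000, seed=0):
    """Minimal nested-sampling sketch: evolve live points upward in
    likelihood; each discarded point carries prior-volume weight w_k."""
    rng = np.random.default_rng(seed)
    live = [prior_sample(rng) for _ in range(n_live)]
    logL = np.array([loglike(a) for a in live])
    dead, weights = [], []
    X = 1.0                                  # enclosed prior volume
    for _ in range(n_iter):
        worst = np.argmin(logL)              # lowest-likelihood live point
        X_new = X * np.exp(-1.0 / n_live)    # expected volume shrinkage
        dead.append(live[worst])
        weights.append((X - X_new) * np.exp(logL[worst]))
        # Replace the worst point by a prior draw above the iso-likelihood
        # contour (crude rejection; real NS uses constrained proposals)
        Lmin = logL[worst]
        while True:
            a = prior_sample(rng)
            if loglike(a) > Lmin:
                break
        live[worst], logL[worst] = a, loglike(a)
        X = X_new
    w = np.array(weights)
    return np.array(dead), w / w.sum()       # posterior samples {w_k, a_k}
```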

  11. Comparison between the methods
      Given a likelihood, does the evaluation of E[f] and V[f] depend on the method?
      → use a stress-testing numerical example
      Setup:
      + Simulate synthetic data via rejection sampling
      + Estimate E[f] and V[f] using the different methods
      [Figure: two panels of synthetic data and f(x) over 10^{-2} < x < 1]
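The rejection-sampling step for generating the synthetic data can be sketched as follows; the toy shape f(x) = x^{-1/2}(1-x)^3 and its bound are assumptions for illustration:

```python
import numpy as np

def rejection_sample(pdf, x_range, pdf_max, n, seed=0):
    """Rejection sampling: draw x uniformly and accept when
    u ~ U(0, pdf_max) falls below pdf(x)."""
    rng = np.random.default_rng(seed)
    lo, hi = x_range
    out = []
    while len(out) < n:
        x = rng.uniform(lo, hi)
        if rng.uniform(0, pdf_max) < pdf(x):
            out.append(x)
    return np.array(out)

# e.g. synthetic events distributed like a toy f(x) = x^-0.5 (1-x)^3
events = rejection_sample(lambda x: x**-0.5 * (1 - x)**3, (0.01, 0.99), 10.0, 5000)
```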

  12. Comparison between the methods
      + HESS, NS and R provide the same uncertainty
      + R+CV overestimates the uncertainty by roughly a factor of 2
      + Uncertainties also depend on the training fraction (tf)
      + The results are confirmed also within a neural-net parametrization
      [Figure: f(x) and \delta f(x) from HESS, NS, R and RCV(50/50); ratio (\delta f/f) / (\delta f/f)_{NS} vs training fraction (40-80%) at x = 0.1, 0.3, 0.5, 0.7]

  13. Beyond Gaussian likelihood
      Gaussian likelihoods are not adequate to describe uncertainties in the presence of incompatible data sets.
      Example:
      + Two measurements of a quantity m: (m_1, \delta m_1), (m_2, \delta m_2)
      + The expectation value and variance can be computed exactly:
          E[m] = \frac{m_1 \, \delta m_2^2 + m_2 \, \delta m_1^2}{\delta m_1^2 + \delta m_2^2}
          V[m] = \frac{\delta m_1^2 \, \delta m_2^2}{\delta m_1^2 + \delta m_2^2}
      + Note: V[m] is independent of |m_1 - m_2|
      To obtain more realistic uncertainties, the likelihood function needs to be modified (e.g. tolerance criterion).
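The exact result is easy to verify numerically; a small sketch demonstrating that the combined error ignores the discrepancy between the two measurements:

```python
import numpy as np

def combine_two(m1, dm1, m2, dm2):
    """Exact posterior moments for two Gaussian measurements of the same m:
    E[m] = (m1 dm2^2 + m2 dm1^2) / (dm1^2 + dm2^2),
    V[m] = dm1^2 dm2^2 / (dm1^2 + dm2^2)."""
    w = dm1**2 + dm2**2
    E = (m1 * dm2**2 + m2 * dm1**2) / w
    V = dm1**2 * dm2**2 / w
    return E, np.sqrt(V)

# V[m] ignores |m1 - m2|: compatible and wildly incompatible
# measurement pairs give exactly the same combined error
print(combine_two(1.0, 0.1, 1.05, 0.1))   # compatible
print(combine_two(1.0, 0.1, 2.00, 0.1))   # incompatible, same error
```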

  14. Likelihood profile in CJ15
      CJ15 fit: 24 parameters, 33 data sets.
      [Figure: \Delta\chi^2 and likelihood profiles along eigen-direction 12, per data set and in total, with and without incompatibilities; projection of eigen-direction 12 onto the fit parameters (valence, sea and gluon shape parameters plus higher-twist ht1-ht3 and off-shell off1, off2 parameters) and the contributing data sets (SLAC, BCDMS, NMC, JLab and HERA DIS, E866 Drell-Yan, CDF/D0 W and Z asymmetries, jet and gamma+jet data)]
