Some Statistical Tools for Particle Physics
Particle Physics Colloquium, MPI für Physik u. Astrophysik, Munich, 10 May 2016
Glen Cowan, Physics Department, Royal Holloway, University of London
g.cowan@rhul.ac.uk, www.pp.rhul.ac.uk/~cowan
MPI Seminar 2016 / Statistics for Particle Physics, G. Cowan

Outline
1) Brief review of HEP context and statistical tests.
2) Statistical tests based on the profile likelihood ratio.
3) A measure of discovery sensitivity is often used to plan a future analysis, e.g., s/√b, which gives the approximate expected discovery significance (test of s = 0) when counting n ~ Poisson(s + b). A measure of discovery significance is proposed that takes into account uncertainty in the background rate.
4) Brief comment on importing tools from Machine Learning and on the choice of variables for multivariate analysis.

Data analysis in particle physics
Particle physics experiments are expensive (e.g. LHC, ~ $10^10 for accelerator and experiments), the competition is intense (ATLAS vs. CMS) vs. many others, and the stakes are high: a 4 sigma effect vs. a 5 sigma effect. Hence the increased interest in advanced statistical methods.

Prototypical HEP analyses
Select events with properties characteristic of the signal process (invariably some background events are selected as well).
Case #1: Existence of the signal process is already well established (e.g. production of top quarks). Study properties of signal events (e.g., measure top quark mass, production cross section, decay properties, ...).
Statistics issues:
  Event selection → multivariate classifiers
  Parameter estimation (usually maximum likelihood or least squares)
  Bias, variance of estimators; goodness-of-fit
  Unfolding (deconvolution)

Prototypical analyses (cont.): a “search”
Case #2: Existence of the signal process is not yet established. The goal is to see whether it exists by rejecting the background-only hypothesis.
  H_0: All of the selected events are background (usually means “standard model” or events from known processes).
  H_1: Selected events contain a mixture of background and signal.
Statistics issues:
  Optimality (power) of the statistical test.
  Rejection of H_0 is usually based on p-value < 2.9 × 10^-7 (5σ).
  Some recent interest in use of Bayes factors.
  In absence of discovery, exclusion limits on parameters of signal models (frequentist, Bayesian, “CLs”, ...).

(Frequentist) statistical tests
Consider a test of a parameter µ, e.g., proportional to a cross section. The result of the measurement is a set of numbers x. To define a test of µ, specify a critical region w_µ such that the probability to find x ∈ w_µ is not greater than α (the size or significance level):
$$ P(x \in w_\mu \mid \mu) \le \alpha $$
(Must use an inequality since x may be discrete, so there may not exist a subset of the data space with probability of exactly α.) Often use, e.g., α = 0.05. If one observes x ∈ w_µ, reject µ.
Equivalently, define a p-value p_µ equal to the probability, assuming µ, to find data at least as “extreme” as the data observed. The critical region of a test of size α can be defined from the set of data outcomes with p_µ < α.
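The inequality in the definition of the size can be illustrated with a single counting experiment. The following is a minimal sketch, assuming n ~ Poisson(ν) and a critical region of the form n ≥ n_crit; the model, the numbers and the helper names are chosen here for illustration and are not taken from the slides.

```python
import math

def poisson_pmf(k, mean):
    """Poisson probability P(k | mean), evaluated via logs for stability."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def poisson_tail(n_min, mean):
    """P(n >= n_min | mean)."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(n_min))

def critical_threshold(mean, alpha):
    """Smallest n_crit such that P(n >= n_crit | mean) <= alpha."""
    n_crit = 0
    while poisson_tail(n_crit, mean) > alpha:
        n_crit += 1
    return n_crit

mean, alpha = 3.0, 0.05           # illustrative values (assumption)
n_crit = critical_threshold(mean, alpha)
size = poisson_tail(n_crit, mean)  # actual size of the test, <= alpha
print(n_crit, round(size, 4))
```

Because n is discrete, the actual size of the test generally falls below α rather than matching it exactly, which is why the definition uses "not greater than".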

Test statistics and p-values
Often construct a scalar test statistic, q_µ(x), which reflects the level of agreement between the data and the hypothesized value µ. For examples of statistics based on the profile likelihood ratio, see, e.g., CCGV, EPJC 71 (2011) 1554; arXiv:1007.1727.
Usually define q_µ such that higher values represent increasing incompatibility between the data and µ, so that the p-value of µ is
$$ p_\mu = \int_{q_{\mu,\mathrm{obs}}}^{\infty} f(q_\mu \mid \mu) \, dq_\mu $$
where q_{µ,obs} is the observed value of q_µ and f(q_µ|µ) is the pdf of q_µ assuming µ.
Equivalent formulation of the test: reject µ if p_µ < α.

Confidence interval from inversion of a test
Carry out a test of size α for all values of µ. The values that are not rejected constitute a confidence interval for µ at confidence level CL = 1 – α. The confidence interval will by construction contain the true value of µ with probability of at least 1 – α.
The interval depends on the choice of the critical region of the test. Put the critical region where the data are likely to be under assumption of the relevant alternative to the µ that is being tested.
  Test µ = 0, alternative is µ > 0: test for discovery.
  Test µ = µ_0, alternative is µ = 0: testing all µ_0 gives an upper limit.
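The inversion of a test can be sketched for the simplest case. The example below assumes a counting experiment n ~ Poisson(s + b) with known b, and tests each hypothesized s with the critical region at low n, so that p_s = P(n ≤ n_obs | s + b). The function names and the bisection search are illustrative choices, not something specified in the slides.

```python
import math

def poisson_cdf(n, mean):
    """P(k <= n | mean)."""
    return sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(n + 1))

def p_value(s, n_obs, b):
    """p-value of a hypothesized signal mean s: P(n <= n_obs | s + b)."""
    return poisson_cdf(n_obs, s + b)

def upper_limit(n_obs, b, alpha=0.05):
    """Largest s that is not rejected by a test of size alpha.

    If even s = 0 is rejected (p < alpha), the result is (approximately) 0.
    """
    lo, hi = 0.0, 1.0
    while p_value(hi, n_obs, b) >= alpha:   # expand until hi is rejected
        hi *= 2.0
    for _ in range(60):                     # bisection; p decreases with s
        mid = 0.5 * (lo + hi)
        if p_value(mid, n_obs, b) >= alpha:
            lo = mid
        else:
            hi = mid
    return lo

print(upper_limit(0, 0.0))   # n = 0, b = 0: limit is ln(20), about 3.0
```

For n_obs = 0 and b = 0 the p-value is exp(-s), so the 95% CL upper limit is ln 20 ≈ 3.0, which provides a quick check of the scan.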

p-value for discovery
Large q_0 means increasing incompatibility between the data and the hypothesis, therefore the p-value for an observed q_{0,obs} is
$$ p_0 = \int_{q_{0,\mathrm{obs}}}^{\infty} f(q_0 \mid 0) \, dq_0 $$
(will get a formula for this later). From the p-value get the equivalent significance,
$$ Z = \Phi^{-1}(1 - p) $$

Significance from p-value
Often define the significance Z as the number of standard deviations that a Gaussian variable would fluctuate in one direction to give the same p-value:
$$ p = \int_Z^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx = 1 - \Phi(Z), \qquad Z = \Phi^{-1}(1 - p) $$
In ROOT: p = 1 - TMath::Freq(Z), Z = TMath::NormQuantile(1 - p).
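The conversion between p and Z can also be sketched with Python's standard library. `statistics.NormalDist` is used here in place of the ROOT functions named on the slide, and the two helper names are hypothetical. The symmetry Φ(-Z) = 1 - Φ(Z) is used to avoid rounding problems for very small p.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def p_from_z(z):
    """One-sided p-value: p = 1 - Phi(Z) = Phi(-Z)."""
    return _STD_NORMAL.cdf(-z)

def z_from_p(p):
    """Significance: Z = Phi^{-1}(1 - p) = -Phi^{-1}(p), for 0 < p < 1."""
    return -_STD_NORMAL.inv_cdf(p)

print(p_from_z(5.0))    # about 2.9e-7, the usual 5 sigma threshold
print(z_from_p(0.05))   # about 1.64
```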

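Prototype search analysis
Search for a signal in a region of phase space; the result is a histogram of some variable x giving numbers
$$ \mathbf{n} = (n_1, \ldots, n_N) $$
Assume the n_i are Poisson distributed with expectation values
$$ E[n_i] = \mu s_i + b_i $$
where µ is the strength parameter and the signal and background contributions are
$$ s_i = s_{\mathrm{tot}} \int_{\mathrm{bin}\ i} f_s(x; \boldsymbol{\theta}_s) \, dx, \qquad b_i = b_{\mathrm{tot}} \int_{\mathrm{bin}\ i} f_b(x; \boldsymbol{\theta}_b) \, dx $$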

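Prototype analysis (II)
Often also have a subsidiary measurement that constrains some of the background and/or shape parameters:
$$ \mathbf{m} = (m_1, \ldots, m_M) $$
Assume the m_i are Poisson distributed with expectation values
$$ E[m_i] = u_i(\boldsymbol{\theta}) $$
where θ = (θ_s, θ_b, b_tot) are the nuisance parameters. The likelihood function is
$$ L(\mu, \boldsymbol{\theta}) = \prod_{j=1}^{N} \frac{(\mu s_j + b_j)^{n_j}}{n_j!} e^{-(\mu s_j + b_j)} \prod_{k=1}^{M} \frac{u_k^{m_k}}{m_k!} e^{-u_k} $$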
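As a rough sketch of how this likelihood might be evaluated, the function below computes ln L for given per-bin expectations. It assumes the caller has already evaluated s_j, b_j and u_k for some value of the nuisance parameters and that all expected counts are positive. The function names and the example numbers are hypothetical.

```python
import math

def poisson_log_pmf(k, mean):
    """ln of the Poisson probability P(k | mean); requires mean > 0."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def log_likelihood(mu, s, b, n, u, m):
    """ln L(mu, theta) for the binned search plus subsidiary measurement.

    s, b : expected signal and background per bin of the main histogram
    n    : observed counts in the main histogram
    u, m : expected and observed counts in the subsidiary measurement
    """
    ln_l = sum(poisson_log_pmf(n_j, mu * s_j + b_j)
               for n_j, s_j, b_j in zip(n, s, b))
    ln_l += sum(poisson_log_pmf(m_k, u_k) for m_k, u_k in zip(m, u))
    return ln_l

# Illustrative (made-up) two-bin example
s, b = [5.0, 10.0], [20.0, 30.0]
n = [25, 40]
u, m = [40.0, 60.0], [40, 60]
print(log_likelihood(1.0, s, b, n, u, m), log_likelihood(0.0, s, b, n, u, m))
```

In this made-up example the observed counts equal the µ = 1 expectations, so ln L is larger at µ = 1 than at µ = 0.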

The profile likelihood ratio
Base the significance test on the profile likelihood ratio:
$$ \lambda(\mu) = \frac{L(\mu, \hat{\hat{\boldsymbol{\theta}}}(\mu))}{L(\hat{\mu}, \hat{\boldsymbol{\theta}})} $$
where the numerator uses the value of θ that maximizes L for the specified µ, and the denominator uses the values of µ and θ that maximize L.
The likelihood ratio of point hypotheses, e.g., λ = L(µ, θ)/L(0, θ), gives the optimum test (Neyman-Pearson lemma). But the distribution of this statistic depends in general on the nuisance parameters θ, and one can only reject µ if it is rejected for all θ.
The advantage of using the profile likelihood ratio is that the asymptotic (large sample) distribution of -2 ln λ(µ) approaches a chi-square form independent of the nuisance parameters θ.
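The profiling step can be made concrete with a model simple enough to solve in closed form. The sketch below assumes a single-bin "on/off" setup, n ~ Poisson(µs + b) with a control measurement m ~ Poisson(τb), where b is the only nuisance parameter and s, τ are known. This specific model and all function names are assumptions made for illustration. Setting ∂ln L/∂b = 0 at fixed µ gives a quadratic equation in b whose positive root is the conditional estimate.

```python
import math

def xlogy(x, y):
    """x * ln(y) with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def log_l(mu, b, n, m, s, tau):
    """ln L(mu, b) up to terms that do not depend on mu or b."""
    return xlogy(n, mu * s + b) - (mu * s + b) + xlogy(m, tau * b) - tau * b

def b_hat_hat(mu, n, m, s, tau):
    """Conditional ML estimate of b for fixed mu >= 0 (positive quadratic root)."""
    a = n + m - (1.0 + tau) * mu * s
    return (a + math.sqrt(a * a + 4.0 * (1.0 + tau) * m * mu * s)) / (2.0 * (1.0 + tau))

def minus_two_ln_lambda(mu, n, m, s, tau):
    """-2 ln lambda(mu) for the on/off model, for mu >= 0."""
    mu_hat = (n - m / tau) / s          # unconditional ML estimates
    b_hat = m / tau
    ln_num = log_l(mu, b_hat_hat(mu, n, m, s, tau), n, m, s, tau)
    ln_den = log_l(mu_hat, b_hat, n, m, s, tau)
    return -2.0 * (ln_num - ln_den)

# Illustrative numbers (assumption): mu_hat = 1 here
print(minus_two_ln_lambda(0.0, n=30, m=20, s=10.0, tau=1.0))
print(minus_two_ln_lambda(1.0, n=30, m=20, s=10.0, tau=1.0))
```

At µ equal to the unconditional estimate the statistic vanishes, and it grows as the hypothesized µ moves away from that estimate.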

Test statistic for discovery
Try to reject the background-only (µ = 0) hypothesis using
$$ q_0 = \begin{cases} -2 \ln \lambda(0) & \hat{\mu} \ge 0 \\ 0 & \hat{\mu} < 0 \end{cases} $$
i.e. here only regard an upward fluctuation of the data as evidence against the background-only hypothesis.
Note that even though physically µ ≥ 0, we allow $\hat{\mu}$ to be negative. In the large sample limit its distribution becomes Gaussian, and this will allow us to write down simple expressions for the distributions of our test statistics.

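Distribution of q_0 in large-sample limit
Cowan, Cranmer, Gross, Vitells, arXiv:1007.1727, EPJC 71 (2011) 1554
Assuming approximations valid in the large sample (asymptotic) limit, we can write down the full distribution of q_0 as
$$ f(q_0 \mid \mu') = \left(1 - \Phi\!\left(\frac{\mu'}{\sigma}\right)\right) \delta(q_0) + \frac{1}{2} \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{q_0}} \exp\!\left[-\frac{1}{2}\left(\sqrt{q_0} - \frac{\mu'}{\sigma}\right)^2\right] $$
The special case µ′ = 0 is a “half chi-square” distribution:
$$ f(q_0 \mid 0) = \frac{1}{2} \delta(q_0) + \frac{1}{2} \frac{1}{\sqrt{2\pi}} \frac{1}{\sqrt{q_0}} e^{-q_0/2} $$
In the large sample limit, f(q_0|0) is independent of the nuisance parameters; f(q_0|µ′) depends on the nuisance parameters through σ, the standard deviation of $\hat{\mu}$.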

Cumulative distribution of q_0, significance
Cowan, Cranmer, Gross, Vitells, arXiv:1007.1727, EPJC 71 (2011) 1554
From the pdf, the cumulative distribution of q_0 is found to be
$$ F(q_0 \mid \mu') = \Phi\!\left(\sqrt{q_0} - \frac{\mu'}{\sigma}\right) $$
The special case µ′ = 0 is
$$ F(q_0 \mid 0) = \Phi\!\left(\sqrt{q_0}\right) $$
The p-value of the µ = 0 hypothesis is
$$ p_0 = 1 - F(q_0 \mid 0) $$
Therefore the discovery significance Z is simply
$$ Z = \Phi^{-1}(1 - p_0) = \sqrt{q_0} $$
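A small numerical comparison may help to see what Z = √q_0 means in practice. The sketch below assumes the simplest possible model, n ~ Poisson(µs + b) with b known exactly, for which q_0 = 2[n ln(n/b) - (n - b)] when n > b. It compares the asymptotic significance with the one obtained from the exact Poisson p-value. The model, the numbers and the function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def z_asymptotic(n, b):
    """Z = sqrt(q0) for n ~ Poisson(mu*s + b) with known b."""
    if n <= b:                     # mu_hat <= 0, so q0 = 0
        return 0.0
    q0 = 2.0 * (n * math.log(n / b) - (n - b))
    return math.sqrt(q0)

def z_exact(n, b):
    """Z from the exact p-value P(k >= n | b), for n >= 1.

    The p-value is computed as 1 - CDF, which is adequate only for
    p-values well above the double-precision rounding level (~1e-15).
    """
    cdf = sum(math.exp(k * math.log(b) - b - math.lgamma(k + 1)) for k in range(n))
    return -NormalDist().inv_cdf(1.0 - cdf)

n_obs, b = 40, 20.0                # illustrative values (assumption)
print(z_asymptotic(n_obs, b), z_exact(n_obs, b))
```

The two numbers are close but not identical; the discreteness of n means the exact p-value does not follow the continuous asymptotic form exactly.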

Monte Carlo test of asymptotic formula
Here take τ = 1. The asymptotic formula is a good approximation up to the 5σ level (q_0 = 25) already for b ~ 20.
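A toy Monte Carlo check of this kind can be sketched as follows. The example assumes the on/off model n ~ Poisson(µs + b), m ~ Poisson(τb) with τ = 1 and b = 20, generates background-only toys, and compares the fraction with q_0 above a threshold to the asymptotic tail probability 1 - Φ(√q_0). The model, the Knuth-style Poisson sampler, the threshold and the number of toys are all choices made here for illustration; reaching the 5σ level would need far more toys than this sketch uses.

```python
import math
import random
from statistics import NormalDist

def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for modest means."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def xlogy(x, y):
    """x * ln(y) with the convention 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def q0_on_off(n, m, tau):
    """q0 for the on/off model when testing mu = 0."""
    if n <= m / tau:                    # mu_hat <= 0: q0 = 0
        return 0.0
    b0 = (n + m) / (1.0 + tau)          # conditional ML estimate of b at mu = 0
    return 2.0 * (xlogy(n, n / b0) + xlogy(m, m / (tau * b0)))

def tail_fraction(threshold, b, tau, n_toys, seed):
    """Fraction of background-only toys with q0 > threshold."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_toys):
        n = poisson_sample(b, rng)
        m = poisson_sample(tau * b, rng)
        if q0_on_off(n, m, tau) > threshold:
            count += 1
    return count / n_toys

threshold = 4.0                         # corresponds to Z = 2
mc = tail_fraction(threshold, b=20.0, tau=1.0, n_toys=20000, seed=1)
asymptotic = 1.0 - NormalDist().cdf(math.sqrt(threshold))
print(mc, asymptotic)
```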

Discovery: the p_0 plot
ATLAS, Phys. Lett. B 716 (2012) 1-29
The “local” p_0 means the p-value of the background-only hypothesis obtained from the test of µ = 0 at each individual m_H, without any correction for the Look-Elsewhere Effect.
The “Expected” (dashed) curve gives the median p_0 under assumption of the SM Higgs (µ = 1) at each m_H.
The blue band gives the width of the distribution (±1σ) of significances under assumption of the SM Higgs.

Test statistic for upper limits
cf. Cowan, Cranmer, Gross, Vitells, arXiv:1007.1727, EPJC 71 (2011) 1554.
For purposes of setting an upper limit on µ use
$$ q_\mu = \begin{cases} -2 \ln \lambda(\mu) & \hat{\mu} \le \mu \\ 0 & \hat{\mu} > \mu \end{cases} $$
where λ(µ) is the profile likelihood ratio. I.e. when setting an upper limit, an upwards fluctuation of the data is not taken to mean incompatibility with the hypothesized µ.
From the observed q_µ find the p-value:
$$ p_\mu = \int_{q_{\mu,\mathrm{obs}}}^{\infty} f(q_\mu \mid \mu) \, dq_\mu $$
Large sample approximation (independent of the nuisance parameters in the large sample limit):
$$ p_\mu = 1 - \Phi\!\left(\sqrt{q_\mu}\right) $$
The 95% CL upper limit on µ is the highest value for which the p-value is not less than 0.05.
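The recipe above can be sketched for a model with no nuisance parameters. The example assumes n ~ Poisson(µs + b) with s and b known (b > 0), so that the profile likelihood ratio reduces to an ordinary likelihood ratio, and it uses the large-sample formula p_µ = 1 - Φ(√q_µ). The model, the bisection search and the function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def q_mu(mu, n, s, b):
    """q_mu for n ~ Poisson(mu*s + b) with s, b known; requires mu*s + b > 0."""
    mu_hat = (n - b) / s
    if mu_hat > mu:                     # upward fluctuation: not evidence against mu
        return 0.0
    mean = mu * s + b
    n_ln = 0.0 if n == 0 else n * math.log(n / mean)
    return max(0.0, 2.0 * (mean - n + n_ln))   # guard against rounding below zero

def p_mu(mu, n, s, b):
    """Asymptotic p-value of the hypothesized mu."""
    return 1.0 - NormalDist().cdf(math.sqrt(q_mu(mu, n, s, b)))

def upper_limit(n, s, b, alpha=0.05):
    """Highest mu >= 0 with p_mu >= alpha (returns about 0 if mu = 0 is rejected)."""
    lo = max(0.0, (n - b) / s)
    step = 1.0
    while p_mu(lo + step, n, s, b) >= alpha:    # expand until the upper end is rejected
        step *= 2.0
    hi = lo + step
    for _ in range(60):                         # bisection; p_mu falls as mu grows
        mid = 0.5 * (lo + hi)
        if p_mu(mid, n, s, b) >= alpha:
            lo = mid
        else:
            hi = mid
    return lo

mu_up = upper_limit(n=100, s=1.0, b=100.0)      # illustrative values (assumption)
print(mu_up)
```

With n equal to b the limit comes out somewhat above the Gaussian estimate 1.64·√b, as expected from the asymmetry of the Poisson likelihood.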
