A New Method for Tackling Limited Monte Carlo
Carlos Argüelles, Austin Schneider, Tianlu Yuan
Analysis Scenario: Binned data, Poisson likelihood, and Simulation
Three requirements:
1. Binned data counts
2. Independent rare processes
3. Modelled by simulation
Applies to much of particle physics and astrophysics.
PhysRevLett.121.221801
Detector / analysis response Generated event properties
Simulating every physical hypothesis theta is too expensive. Reweighting modifies the physical hypothesis while reusing the same simulation set.
Event properties Physical hypothesis
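As a minimal sketch of reweighting, assume a hypothetical one-parameter hypothesis: events are generated once from a power-law energy spectrum and then reweighted to a different spectral index. The power-law model, the energy range, and all function names below are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def power_law_pdf(energy, gamma, e_min, e_max):
    """Normalized density proportional to E^-gamma on [e_min, e_max] (gamma != 1)."""
    norm = (e_max ** (1.0 - gamma) - e_min ** (1.0 - gamma)) / (1.0 - gamma)
    return energy ** (-gamma) / norm

def sample_power_law(rng, n, gamma, e_min, e_max):
    """Draw n energies from the power law by inverting its CDF."""
    u = rng.random(n)
    a = e_min ** (1.0 - gamma)
    b = e_max ** (1.0 - gamma)
    return (a + u * (b - a)) ** (1.0 / (1.0 - gamma))

def reweight(energy, n_expected, gamma, gamma_gen, e_min, e_max):
    """Per-event weights for the hypothesis (n_expected, gamma).

    Each weight is the ratio of the hypothesis density to the generation
    density, scaled so the weights sum to roughly n_expected events.
    """
    ratio = (power_law_pdf(energy, gamma, e_min, e_max)
             / power_law_pdf(energy, gamma_gen, e_min, e_max))
    return n_expected * ratio / len(energy)

# One simulation set, generated with spectral index 2.0 (hypothetical values).
rng = np.random.default_rng(0)
energy = sample_power_law(rng, 100_000, 2.0, 1.0, 100.0)

# Two physical hypotheses evaluated on the same simulated events.
w_same = reweight(energy, 50.0, 2.0, 2.0, 1.0, 100.0)
w_soft = reweight(energy, 50.0, 2.5, 2.0, 1.0, 100.0)
```

Only the weights change between hypotheses; the simulated events themselves are never regenerated.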
Sum weights to obtain the expected number of events. Using this approximation, we construct the AdHoc likelihood from the Poisson likelihood. The error on this approximation vanishes as we approach large simulation size.
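The construction above can be sketched in a few lines: histogram the MC weights to get the per-bin expectation, then plug that expectation into the Poisson likelihood. The function names and the toy numbers are hypothetical, and the sketch assumes every bin has a positive sum of weights.

```python
from math import lgamma

import numpy as np

def expected_counts(observable, weights, bin_edges):
    """Sum the MC weights in each bin to approximate the expected count."""
    lam, _ = np.histogram(observable, bins=bin_edges, weights=weights)
    return lam

def adhoc_log_likelihood(data_counts, lam):
    """Poisson log-likelihood with lambda set to the per-bin sum of weights.

    Assumes lam > 0 in every bin; an empty MC bin would need special care.
    """
    data_counts = np.asarray(data_counts, dtype=float)
    lam = np.asarray(lam, dtype=float)
    log_factorial = np.array([lgamma(k + 1.0) for k in data_counts])
    return float(np.sum(data_counts * np.log(lam) - lam - log_factorial))

# Toy example: four weighted MC events in three bins, and three data counts.
observable = np.array([0.5, 1.5, 1.7, 2.2])
weights = np.array([1.0, 0.5, 0.5, 2.0])
bin_edges = np.array([0.0, 1.0, 2.0, 3.0])

lam = expected_counts(observable, weights, bin_edges)
log_l = adhoc_log_likelihood([1, 0, 3], lam)
```

Nothing in this likelihood depends on how many MC events contributed to each bin, which is the limitation the rest of the talk addresses.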
The signals we look for are small. Our virtual detector works similarly to the real thing. Sometimes we can work around this. Sometimes we can’t, or MC is too expensive.
Without exact knowledge of lambda, we treat lambda probabilistically using Bayes’ theorem. The likelihood is informed by the MC. This generalizes the likelihood. Note that we recover the AdHoc likelihood when the distribution of lambda collapses to a delta function at the sum of the weights.
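Written out, the generalization takes roughly the following form. The symbols k (observed count in a bin), w_i (weight of MC event i), and P (distribution of lambda given the MC) are notation assumed here for illustration; the paper, DOI:10.1007/JHEP06(2019)030, has the authors' exact expressions.

```latex
% Generalized likelihood: marginalize the Poisson probability over lambda
\mathcal{L}_{\mathrm{general}}(\theta \mid k)
  = \int_0^\infty \frac{\lambda^{k} e^{-\lambda}}{k!}\,
    P\bigl(\lambda \mid \vec{w}(\theta)\bigr)\, d\lambda

% AdHoc limit: lambda is taken to be known exactly
P\bigl(\lambda \mid \vec{w}(\theta)\bigr)
  = \delta\Bigl(\lambda - \sum_i w_i(\theta)\Bigr)
\;\Longrightarrow\;
\mathcal{L}_{\mathrm{AdHoc}}(\theta \mid k)
  = \frac{\bigl(\sum_i w_i\bigr)^{k}\, e^{-\sum_i w_i}}{k!}
```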
Monte Carlo can also be modelled by a Poisson process: the number of MC events m is Poisson distributed. For simplicity, consider the case of equal weights.
The data expectation 𝛍 is related to the weight, so the likelihood of lambda follows from the Poisson probability of the MC count m.
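For the equal-weight case, this likelihood can be sketched as follows, assuming each of the m MC events in the bin carries the same weight w, so that lambda / w plays the role of the expected MC count. The notation is an assumption made for illustration.

```latex
% m MC events, each of weight w: the expected MC count is lambda / w
\mathcal{L}(\lambda \mid m)
  = \frac{(\lambda / w)^{m}\, e^{-\lambda / w}}{m!}
```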
For arbitrary weights, we can consider mu and sigma in terms of the “effective” weights and counts. Proceed exactly as before, but now with the factorial replaced by a gamma function*.
*Note: this is fine because this factor does not depend on lambda and cancels in the normalization step, although for an un-normalized likelihood this might present a problem.
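One way to write the effective quantities, assuming mu and sigma squared denote the sum of weights and the sum of squared weights in a bin (the subscript names are illustrative):

```latex
\mu = \sum_i w_i, \qquad \sigma^2 = \sum_i w_i^2

% Effective weight and effective count reproduce mu and sigma^2
w_{\mathrm{eff}} = \frac{\sigma^2}{\mu}, \qquad
m_{\mathrm{eff}} = \frac{\mu^2}{\sigma^2}

% Same form as the equal-weight case, with m! -> Gamma(m_eff + 1)
\mathcal{L}(\lambda \mid \vec{w})
  = \frac{(\lambda / w_{\mathrm{eff}})^{m_{\mathrm{eff}}}\,
          e^{-\lambda / w_{\mathrm{eff}}}}
         {\Gamma(m_{\mathrm{eff}} + 1)}
```

For m identical weights w, these reduce to w_eff = w and m_eff = m, so the equal-weight case is recovered.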
Bohm and Zech (2012) showed that a scaled Poisson distribution is a good approximation to this when the first and second moments are matched. The effective treatment uses the scaled Poisson distribution. More details on this can be found in our paper, DOI:10.1007/JHEP06(2019)030.
                        Equal Weights    Arbitrary Weights
Distribution            Scaled Poisson   Compound Poisson
Used in the likelihood  Scaled Poisson   Scaled Poisson
With the likelihood of lambda, we use Bayes’ theorem to compute the probability of lambda, assuming a uniform prior. The result is a gamma distribution G, whose parameters are set by mu and sigma.
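A sketch of this step, assuming mu and sigma squared are the sum of weights and the sum of squared weights in the bin, and using a shape/rate parametrization (alpha, beta) of the gamma distribution chosen here for illustration:

```latex
% Uniform prior: the posterior is the normalized likelihood of lambda
P(\lambda \mid \vec{w})
  = \frac{\mathcal{L}(\lambda \mid \vec{w})}
         {\int_0^\infty \mathcal{L}(\lambda' \mid \vec{w})\, d\lambda'}
  = \mathcal{G}(\lambda;\, \alpha, \beta)
  = \frac{\beta^{\alpha}\, \lambda^{\alpha - 1}\, e^{-\beta \lambda}}
         {\Gamma(\alpha)}

\alpha = \frac{\mu^2}{\sigma^2} + 1, \qquad
\beta = \frac{\mu}{\sigma^2}
```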
Integrating over the true expectation, we now have the effective likelihood. This accounts for the uncertainty from finite Monte Carlo sample size.
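A minimal per-bin sketch of the result, assuming the integral is a Poisson probability marginalized over a gamma distribution with shape mu^2/sigma^2 + 1 and rate mu/sigma^2. The function names are hypothetical and this is not the authors' implementation, which is linked at the end of the talk.

```python
from math import exp, lgamma, log, log1p

def effective_log_likelihood(k, mu, sigma2):
    """Log of the Poisson-gamma mixture for one bin.

    k      : observed data count in the bin
    mu     : sum of MC weights in the bin (must be > 0)
    sigma2 : sum of squared MC weights in the bin (must be > 0)
    """
    alpha = mu * mu / sigma2 + 1.0  # gamma shape (assumed parametrization)
    beta = mu / sigma2              # gamma rate (assumed parametrization)
    return (alpha * log(beta) + lgamma(k + alpha) - lgamma(k + 1.0)
            - (k + alpha) * log1p(beta) - lgamma(alpha))

def poisson_log_likelihood(k, lam):
    """Plain Poisson log-likelihood, i.e. the AdHoc form with lam = mu."""
    return k * log(lam) - lam - lgamma(k + 1.0)

def equal_weight_bin(mu, n_mc):
    """(mu, sigma2) for a bin filled by n_mc events of equal weight mu / n_mc."""
    w = mu / n_mc
    return mu, n_mc * w * w

# Large MC sample: the effective likelihood approaches the Poisson one.
large_mc = effective_log_likelihood(7, *equal_weight_bin(10.0, 1_000_000))

# Small MC sample: the effective likelihood is broader, so a count far
# from the expectation is penalized less than under the Poisson form.
small_mc = effective_log_likelihood(25, *equal_weight_bin(10.0, 5))

# The mixture is a proper probability distribution over k.
total = sum(exp(effective_log_likelihood(k, 10.0, 20.0)) for k in range(400))
```

Because the function takes only the observed count, the sum of weights, and the sum of squared weights, it can be evaluated wherever the Poisson form is evaluated.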
The effective likelihood produces similar results to the Poisson description. The maximum likelihood is an unbiased estimator for large MC sample size.
We produce 500 independent Monte Carlo sets and 500 data sets to test the coverage, comparing the true coverage to Wilks’ asymptotically approximated coverage. The effective likelihood provides a good estimate.
The AdHoc likelihood vastly underestimates the coverage.
The effective likelihood is also suitable for Bayesian analyses. The effective likelihood broadens in the case of low MC sample size, providing robust error regions. The AdHoc likelihood is liable to underestimate the width.
Increasing MC Size
Comparing the runtime of the effective likelihood to other treatments
Assume that correlated shape uncertainties will be handled implicitly by the reweighting.
If a population of events with large possible contribution to the variance is not included, then the estimate of the variance may be incorrect.
The effective likelihood for limited MC samples:
○ Provides a robust treatment of the errors from limited MC, provided MC is available
○ Converges to the AdHoc likelihood for large MC
○ Has improved coverage properties
○ Can be substituted directly for the AdHoc likelihood
Implementations and paper links can be found here: https://austinschneider.github.io/MCLLH/