A New Method for Tackling Limited Monte Carlo, Carlos Argüelles (PowerPoint PPT presentation)



SLIDE 1

A New Method for Tackling Limited Monte Carlo

Carlos Argüelles Austin Schneider Tianlu Yuan 1

SLIDE 2

Analysis Scenario: Binned data, Poisson likelihood, and Simulation

SLIDE 3

Analysis Scenario


Three requirements:
  1. Binned data counts
  2. Independent rare processes
  3. Modelled by simulation
Applies to much of particle physics and astrophysics.

SLIDE 4

Analysis Scenario


Phys. Rev. Lett. 121, 221801

SLIDE 5

Analysis Scenario


“It is well known that the count of independent, rare natural processes can be described by the Poisson distribution.”
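The Poisson distribution referenced in the quote: for a bin with expected count λ, the probability of observing k events is

```latex
P(k \mid \lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}
```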

SLIDE 6


Analysis Scenario


[Diagram: generated event properties → detector / analysis response]

SLIDE 7

Reweighting

Simulating every physical hypothesis θ is too expensive. Reweighting instead modifies the physical hypothesis using the same simulation set.


[Diagram: physical hypothesis → generated event properties → detector / analysis response]
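As a concrete illustration of reweighting (a hypothetical example, not taken from the talk: changing the spectral index of a power-law energy spectrum), each event's weight is the ratio of the target density to the generation density:

```python
import random

random.seed(0)
LO, HI = 1.0, 100.0  # hypothetical energy range

def power_law_pdf(e, g):
    """Normalized p(E) ~ E^-g on [LO, HI] (assumes g != 1)."""
    norm = (HI ** (1.0 - g) - LO ** (1.0 - g)) / (1.0 - g)
    return e ** (-g) / norm

def sample_gen(n):
    """Inverse-CDF sampling of the generation spectrum E^-2 on [LO, HI]."""
    return [1.0 / (1.0 - random.random() * (1.0 - LO / HI)) for _ in range(n)]

def reweight(energies, gamma_new, gamma_gen=2.0):
    """Weight each event by target pdf / generation pdf."""
    return [power_law_pdf(e, gamma_new) / power_law_pdf(e, gamma_gen)
            for e in energies]

energies = sample_gen(100_000)
weights = reweight(energies, gamma_new=2.5)  # test a steeper hypothesis
```

The same generated events now represent the γ = 2.5 hypothesis: the mean weight is 1 by construction, and the weighted spectrum is softer than the generated one.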

SLIDE 8

Approximate Expectations


We sum the weights to obtain the expected number of events in each bin. Using this approximation we construct the AdHoc likelihood from the Poisson likelihood. The error of this approximation vanishes as the simulation size grows large.
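In symbols (my reconstruction of the slide's equations, following the paper's notation): with per-event weights w_i in a bin and observed count k,

```latex
\lambda \approx \mu \equiv \sum_{i=1}^{m} w_i , \qquad
\mathcal{L}_{\text{AdHoc}}(k \mid \theta) = \frac{\mu^{k} e^{-\mu}}{k!} .
```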

SLIDE 9

The Curse of Rare Processes and Small Signals


The signals we look for are small, and our virtual detector works like the real one, so signal events are rare in simulation too. Sometimes we can work around this; sometimes we cannot, or the MC is too expensive.

SLIDE 10

Accounting for errors

SLIDE 11

Incorporating Errors


Exact knowledge of λ is unavailable, so we treat λ probabilistically using Bayes' theorem. The likelihood is informed by the MC; this generalizes the Poisson likelihood. Note that we recover the AdHoc likelihood when the distribution of λ collapses to a point at μ.
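In equation form (a reconstruction; the slide's formulas are images), the generalized likelihood marginalizes the Poisson likelihood over λ:

```latex
\mathcal{L}(k) = \int_{0}^{\infty} \frac{\lambda^{k} e^{-\lambda}}{k!}\,
P(\lambda \mid \text{MC}) \, d\lambda ,
```

with the AdHoc likelihood recovered when P(λ | MC) = δ(λ − μ).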

SLIDE 12

Obtaining the Distribution of λ


Monte Carlo production can also be modelled as a Poisson process: the number of MC events m is Poisson distributed. For simplicity, consider the case of equal weights w. The data expectation μ is then related to the weight, and the likelihood of λ follows.
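Written out (my reconstruction from the published paper): with m events of equal weight w, m is Poisson distributed with mean λ/w, so

```latex
\mu = m\,w , \qquad
\mathcal{L}(\lambda \mid m) = \frac{(\lambda/w)^{m}\, e^{-\lambda/w}}{m!} .
```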

SLIDE 13

Extension to Arbitrary Weights


For arbitrary weights we can define "effective" weights and counts in terms of μ and σ. We then proceed exactly as before, but with the factorial replaced by a gamma function*.

*Note: this is fine because this factor does not depend on lambda and cancels in the normalization step, although for an un-normalized likelihood this might present a problem.
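The effective quantities, matched to the first two moments of the weight distribution (as in the published paper):

```latex
\mu = \sum_i w_i , \quad \sigma^2 = \sum_i w_i^2 , \quad
m_{\text{eff}} = \frac{\mu^2}{\sigma^2} , \quad
w_{\text{eff}} = \frac{\sigma^2}{\mu} , \qquad
\mathcal{L}(\lambda) = \frac{(\lambda/w_{\text{eff}})^{m_{\text{eff}}}\,
e^{-\lambda/w_{\text{eff}}}}{\Gamma(m_{\text{eff}} + 1)} .
```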

SLIDE 14

Equal → Arbitrary (An implicit assumption)


Bohm and Zech (2012) showed that a scaled Poisson distribution is a good approximation when the first and second moments are matched. The effective treatment uses the scaled Poisson distribution. More details can be found in our paper, DOI:10.1007/JHEP06(2019)030.

                         Equal weights     Arbitrary weights
  Distribution           Scaled Poisson    Compound Poisson
  Used in the likelihood Scaled Poisson    Scaled Poisson

SLIDE 15

With the likelihood of λ in hand, we use Bayes' theorem to compute the probability of λ, assuming a uniform prior.


where G is the gamma distribution with parameters determined by the MC weights.
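With a uniform prior, the posterior takes the gamma form given in the published paper (my transcription of its parameters):

```latex
P(\lambda \mid \text{MC}) = G(\lambda;\, \alpha, \beta) , \qquad
\alpha = \frac{\mu^2}{\sigma^2} + 1 , \qquad
\beta = \frac{\mu}{\sigma^2} .
```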

SLIDE 16

The Effective Likelihood


Integrating over the true expectation λ, we obtain the effective likelihood. This accounts for the uncertainty from finite Monte Carlo sample size.
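A minimal Python sketch of the resulting effective likelihood, implemented as the Poisson-gamma (negative-binomial) mixture that the integral evaluates to; the function and variable names are mine:

```python
import math

def log_poisson(k, lam):
    """Ordinary Poisson log-likelihood, log P(k | lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def log_eff(k, weights):
    """Effective log-likelihood for observed count k given the MC weights
    in the bin: the Poisson likelihood marginalized over a gamma posterior
    with alpha = mu^2/sigma^2 + 1 and beta = mu/sigma^2."""
    mu = sum(weights)
    s2 = sum(w * w for w in weights)
    a = mu * mu / s2 + 1.0
    b = mu / s2
    return (a * math.log(b)
            + math.lgamma(k + a) - math.lgamma(a) - math.lgamma(k + 1.0)
            - (k + a) * math.log(1.0 + b))
```

For a large MC sample the effective likelihood converges to the Poisson one: with 1000 events of weight 0.01 (μ = 10), `log_eff(8, ...)` agrees closely with `log_poisson(8, 10)`.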

SLIDE 17

Performance

SLIDE 18

A Toy Experiment


  • Measure a resonance component on top of a steeply falling background.
  • Simulate comparable amounts of signal and background
  • Generated according to power-law distributions
  • Smeared with different uncertainties

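A sketch of such a toy (numbers and shapes here are hypothetical choices for illustration: an x^-2 background on [1, 11] with a smeared resonance at 5, binned into 20 bins):

```python
import random

random.seed(42)
EDGES = [1.0 + 0.5 * i for i in range(21)]  # 20 bins on [1, 11]

def sample_background(n):
    """Inverse-CDF sampling of a steeply falling power law p(x) ~ x^-2."""
    return [1.0 / (1.0 - random.random() * (1.0 - 1.0 / 11.0))
            for _ in range(n)]

def sample_signal(n, peak=5.0, width=0.3):
    """Resonance smeared with a fixed Gaussian resolution."""
    return [random.gauss(peak, width) for _ in range(n)]

def binned_expectation(events, weights):
    """Sum weights per bin to get the expected counts mu_j."""
    mu = [0.0] * (len(EDGES) - 1)
    for x, w in zip(events, weights):
        if EDGES[0] <= x < EDGES[-1]:
            mu[min(int((x - EDGES[0]) / 0.5), len(mu) - 1)] += w
    return mu

# Comparable MC sample sizes for signal and background, weighted so the
# resonance is a small component of the total expectation.
n_mc = 5000
events = sample_background(n_mc) + sample_signal(n_mc)
weights = [100.0 / n_mc] * n_mc + [10.0 / n_mc] * n_mc
mu = binned_expectation(events, weights)
```

The steeply falling background dominates the first bins, while the resonance shows up as an excess above the falling tail around x = 5.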

SLIDE 19

Point Estimation


The effective likelihood produces results similar to the Poisson description. The maximum-likelihood estimator is unbiased for large MC sample sizes.

SLIDE 20

Coverage


We produce 500 independent Monte Carlo sets and 500 data sets to test the coverage, comparing the true coverage to Wilks' asymptotically approximated coverage. The effective likelihood provides a good estimate of the coverage, while the AdHoc likelihood vastly underestimates it.
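The idea of a Wilks-style coverage check, in a stripped-down 1D form (a hypothetical setup, not the slide's 500-set experiment): draw pseudo-data from a known λ and count how often the 68% likelihood-ratio interval contains the truth.

```python
import math
import random

random.seed(1)

def log_poisson(k, lam):
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def poisson_sample(lam):
    """Knuth's product method for Poisson sampling."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

lam_true = 20.0
trials = 2000
covered = 0
for _ in range(trials):
    k = poisson_sample(lam_true)
    # The MLE is lam = k; Wilks: 2*(lnL_max - lnL(lam_true)) <= 1 at 68% CL.
    ll_max = 0.0 if k == 0 else log_poisson(k, float(k))
    if 2.0 * (ll_max - log_poisson(k, lam_true)) <= 1.0:
        covered += 1
coverage = covered / trials
```

For this well-behaved Poisson model the empirical coverage lands near the nominal 68%; the slide's point is that with a misspecified (AdHoc) likelihood it would not.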

SLIDE 21

A 2D Bayesian Example


The effective likelihood is also suitable for Bayesian analyses. It broadens in the case of low MC sample size, providing robust error regions, whereas the AdHoc likelihood is liable to underestimate the width of the posterior.

[Figure: posterior contours for increasing MC size]

SLIDE 22

Performance Comparison


Comparing the runtime of the effective likelihood to other treatments

SLIDE 23

Caveats


  • Bin to bin correlations are not directly built into the likelihood.

Assume that correlated shape uncertainties will be handled implicitly by the reweighting.

  • Estimate of variance in bin expectation relies on Monte Carlo events in the bin.

If a population of events with large possible contribution to the variance is not included, then the estimate of the variance may be incorrect.

  • Monte Carlo is needed in every bin.
SLIDE 24

Summary


  • The exact expectation in a bin is usually unknown
  • It is important to account for the uncertainty inherent in limited Monte Carlo samples

  • The effective likelihood:

    ○ Provides a robust treatment of these errors, provided MC is available
    ○ Converges to the AdHoc likelihood for large MC
    ○ Has improved coverage properties
    ○ Can be substituted directly for the AdHoc likelihood

https://austinschneider.github.io/MCLLH/

SLIDE 25

Likelihood Summary


Implementations and paper links can be found here: https://austinschneider.github.io/MCLLH/