Two-view 2D->3D matching with calorimetry in pandora



SLIDE 1

Two-view 2D->3D matching with calorimetry in pandora

Dom Brailsford, Etienne Chardonnet FD sim/reco meeting 27/04/20

SLIDE 2

2D->3D matching

  • 2D->3D matching takes 2D clusters (e.g. from each wire view) and matches them across views to make 3D objects
  • Pandora’s main 2D->3D matching algorithm requires a cluster in three distinct views to function
  • Combining positions from clusters in two of the views infers a position in the third view. A pseudo chi2 is calculated for inferred vs actual positions on the cluster
  • This is problematic for any detector technology which only has two views (e.g. the CRP-based dual-phase LArTPCs)

SLIDE 3

Two-view 2D->3D matching

  • We are exploring how calorimetry could enhance two-view 2D->3D matching
  • Exploratory algorithm details:
  • 1. Make every pairwise comparison of 2D clusters in the two views
  • 2. Find the region in which the two clusters overlap each other in time
  • 3. For that overlap region, produce fractional profiles of charge for each cluster
  • 4. Downsample* the two charge profiles so that they are coarse and equally binned
  • 5. Calculate the charge profiles’ correlation coefficient and corresponding p-value (p-value calculation on next slide)
  • All plots are taken from custom samples made in ProtoDUNE dual-phase, but all details are directly relevant for the DUNE far detector, both single-phase and dual-phase
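Steps 1 and 2 above can be sketched in a few lines of Python. This is a minimal illustration, not Pandora code: the cluster representation (a list of (time, charge) hits per cluster) and the names `time_overlap` and `candidate_pairs` are assumptions made for the example.

```python
from itertools import product


def time_overlap(hits_a, hits_b):
    """Return (lo, hi) of the time range shared by two clusters, or None.

    Each cluster is assumed to be a list of (time, charge) tuples.
    """
    lo = max(min(t for t, _ in hits_a), min(t for t, _ in hits_b))
    hi = min(max(t for t, _ in hits_a), max(t for t, _ in hits_b))
    return (lo, hi) if lo < hi else None


def candidate_pairs(u_clusters, v_clusters):
    """Compare every U cluster with every V cluster (step 1) and keep
    the pairs that share a time range (step 2)."""
    pairs = []
    for (u_name, u_hits), (v_name, v_hits) in product(
        u_clusters.items(), v_clusters.items()
    ):
        overlap = time_overlap(u_hits, v_hits)
        if overlap is not None:
            pairs.append((u_name, v_name, overlap))
    return pairs


# Hypothetical toy clusters: two in the U view, one in the V view.
u_clusters = {
    "U1": [(0.0, 1.0), (10.0, 2.0)],
    "U2": [(20.0, 1.0), (30.0, 1.0)],
}
v_clusters = {"V1": [(5.0, 1.0), (25.0, 1.0)]}
pairs = candidate_pairs(u_clusters, v_clusters)
```

Only the hits inside each returned overlap would then feed the charge profiles of steps 3 to 5.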

Figure: example 2D clusters in the U view and in the V view, each drawn against time.

*Resampling method suggested by Andy**
**Suggested to Andy by Tom Junk

SLIDE 4

Correlation coefficient’s p-value

  • For uncorrelated bivariate normal distribution pairs, the t-value derived from the correlation coefficient r follows a Student t-distribution with n-2 degrees of freedom
  • The t-value is $t = r\sqrt{\frac{n-2}{1-r^2}}$, where n is the number of paired values (here, profile bins)
  • P-value is calculated by integrating the t-distribution above the calculated t-value (a one-tailed test)
  • H0: r == 0
  • H1: r > 0
  • The t-distribution supposedly holds approximately for non-Gaussian variables, provided the sample sizes are large enough. I’ll revisit this in a few slides
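The one-tailed test above can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the code behind the slides: the t-value uses the standard relation t = r sqrt((n-2)/(1-r^2)), the upper tail of the Student t-distribution is integrated numerically with Simpson's rule, and the function names are hypothetical.

```python
import math


def t_value(r, n):
    """t-value for a correlation coefficient r measured over n pairs."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def student_t_pdf(t, dof):
    """Probability density of the Student t-distribution."""
    log_norm = (
        math.lgamma((dof + 1) / 2.0)
        - math.lgamma(dof / 2.0)
        - 0.5 * math.log(dof * math.pi)
    )
    return math.exp(log_norm - 0.5 * (dof + 1) * math.log1p(t * t / dof))


def one_tailed_p_value(r, n, steps=2000):
    """P(T >= t) for H0: r == 0 against H1: r > 0.

    The pdf is integrated from 0 to |t| with Simpson's rule (steps must
    be even) and the result is subtracted from, or added to, one half.
    """
    if abs(r) >= 1.0:
        return 0.0 if r > 0 else 1.0
    dof = n - 2
    t = t_value(r, n)
    upper = abs(t)
    h = upper / steps
    total = student_t_pdf(0.0, dof) + student_t_pdf(upper, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_t_pdf(i * h, dof)
    central = total * h / 3.0
    return 0.5 - central if t >= 0 else 0.5 + central
```

For example, `one_tailed_p_value(0.0, 30)` returns 0.5, as expected when there is no measured correlation. The fixed-step integration is only adequate for moderate t-values; a statistics library would be the natural choice in practice.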

SLIDE 5

Resampled fractional charge profiles (di-muon sample)

Figure: four panels, each overlaying the U cluster and V cluster fractional charge profiles. Axes: x (cm) against fractional value (no units).

Values printed on the panels, in extraction order:
  • r: 0.142, p-value: 0.062
  • r: 0.804, p-value: 7e-17
  • r: 0.821, p-value: 1e-24
  • r: 0.161, p-value: 0.119

SLIDE 6

Further 2D->3D matching exploration

  • We can do a little bit more and try to correlate local regions of the profiles
  • Once the resampled fractional charge profiles have been constructed:
  • Slide a fixed-length window of N bins across the equally-binned charge profiles
  • Repeatedly calculate correlation coefficient and p-value as the window slides across the profiles
  • For each p-value calculation, calculate a ‘matching score’
  • Score == 1-p
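The sliding-window score can be sketched as follows. This is a hedged illustration rather than the algorithm's implementation: the function names are hypothetical, and the default p-value uses the Fisher z approximation as a compact stand-in for the Student t integral described earlier (it needs windows of more than 3 bins). Any function with the signature `p_value_fn(r, n)` can be passed instead.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # flat window: treated as uncorrelated in this sketch
    return sxy / math.sqrt(sxx * syy)


def fisher_p_value(r, n):
    """One-tailed p-value for H1: r > 0 via the Fisher z approximation.

    Stand-in for the t-distribution integral; requires n > 3.
    """
    r = max(-0.999999, min(0.999999, r))  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def sliding_window_scores(profile_u, profile_v, window, p_value_fn=fisher_p_value):
    """Matching score (1 - p) for every position of a fixed-length window."""
    scores = []
    for start in range(len(profile_u) - window + 1):
        r = pearson_r(
            profile_u[start:start + window], profile_v[start:start + window]
        )
        scores.append(1.0 - p_value_fn(r, window))
    return scores
```

A profile of B bins and a window of N bins gives B - N + 1 scores, one per window position.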

SLIDE 7

Sliding window scores (same di-muon event)

Figure: four panels of matching score against x (cm).

SLIDE 8

Resampled fractional charge profiles (1mu1p sample)

Figure: four panels, each overlaying the U cluster and V cluster fractional charge profiles. Axes: x (cm) against fractional value (no units).

Values printed on the panels, in extraction order:
  • r: 0.778, p-value: 6e-16
  • r: 0.449, p-value: 0.010
  • r: 0.781, p-value: 0.001
  • r: 0.995, p-value: -6.9e-10

SLIDE 9

Sliding window scores (1mu1p event)

Figure: four panels of matching score against x (cm).

SLIDE 10

Toy study

  • Revisiting the Student t-distribution assumption
  • Produce 10000 fake fractional charge profiles
  • Fill 3 histograms with Landau throws, smeared with a Gaussian
  • Two hists. are filled with the same Landau values but smeared separately
  • Third hist. filled with separate Landau values
  • Each bin is filled N times with distinct throws to mimic the downsampling
  • Calculate correlation coefficient and p-value
  • Landau (315, 13)
  • Gaus (1, 0.1)
  • N hist bins == 30
  • N samples per bin == 5
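A rough stdlib-only sketch of this toy setup is given below. Several choices are assumptions, not taken from the slides: Python has no Landau generator, so a Moyal distribution (a common analytic approximation to the Landau) is swapped in; the two Landau parameters are read as most probable value and width; the Gaussian smearing is applied multiplicatively; and all function names are hypothetical.

```python
import math
import random


def moyal_throw(rng, loc, scale):
    """Throw from a Moyal distribution, used here in place of a Landau.

    If Z is standard normal, -ln(Z^2) follows the standard Moyal law.
    """
    z = rng.gauss(0.0, 1.0)
    while z == 0.0:
        z = rng.gauss(0.0, 1.0)
    return loc - scale * math.log(z * z)


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fake_profiles(rng, n_bins=30, n_samples=5, mpv=315.0, width=13.0,
                  smear_mean=1.0, smear_sigma=0.1):
    """Return three fractional profiles: the first two share their
    underlying throws (smeared separately), the third is independent."""
    first, second, third = [], [], []
    for _ in range(n_bins):
        a = b = c = 0.0
        for _ in range(n_samples):  # several throws per bin mimic downsampling
            shared = moyal_throw(rng, mpv, width)
            separate = moyal_throw(rng, mpv, width)
            a += shared * rng.gauss(smear_mean, smear_sigma)
            b += shared * rng.gauss(smear_mean, smear_sigma)
            c += separate * rng.gauss(smear_mean, smear_sigma)
        first.append(a)
        second.append(b)
        third.append(c)
    return tuple([v / sum(p) for v in p] for p in (first, second, third))


def run_toy(n_universes=200, seed=1):
    """Correlation coefficients for correlated and uncorrelated pairs."""
    rng = random.Random(seed)
    correlated, uncorrelated = [], []
    for _ in range(n_universes):
        first, second, third = fake_profiles(rng)
        correlated.append(pearson_r(first, second))
        uncorrelated.append(pearson_r(first, third))
    return correlated, uncorrelated
```

Because the Moyal tail is lighter than a true Landau tail, the absolute r values from this sketch should not be expected to reproduce the plots; it only illustrates how the correlated and uncorrelated pairs are built.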

Figure: two panels, titled "Correlated fake profile" and "Uncorrelated fake profile".

SLIDE 11

Toy study

  • Top plot shows correlation coefficient for the 10,000 universes
  • Black: correlated distributions
  • Red: uncorrelated distributions
  • Bottom plot shows corresponding p-values
  • The red distribution should be flat, since p-values are uniformly distributed when the null hypothesis holds, but it is not

Figure: top panel, No. universes against correlation coefficient; bottom panel, No. universes (log scale) against p-value.

SLIDE 12

P-value vs r (t-distribution)

Figure: 2D histogram of p against r.

SLIDE 13

Toy study

  • Instead, calculate the p-value using permutation tests
  • Randomly shuffle the bins for one distribution in a comparison and recalculate r
  • P-value == fraction of times you measure an r that is more extreme than your original r measurement
  • Top plot shows correlation coefficient (same as previous slide)
  • Bottom plot shows corresponding p-value

Figure: top panel, No. universes against correlation coefficient; bottom panel, No. universes (log scale) against p-value.
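The permutation test described above can be sketched as follows. It is a minimal illustration with hypothetical names; "more extreme" is taken to mean a shuffled r at least as large as the measured one, matching the one-tailed H1: r > 0, and the number of permutations is an arbitrary choice.

```python
import math
import random


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(profile_u, profile_v, n_permutations=1000, rng=None):
    """Fraction of shuffles whose r is at least as large as the measured r."""
    rng = rng or random.Random()
    r_measured = pearson_r(profile_u, profile_v)
    shuffled = list(profile_v)  # shuffle a copy; the input is left untouched
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if pearson_r(profile_u, shuffled) >= r_measured:
            n_extreme += 1
    return n_extreme / n_permutations
```

The smallest non-zero p-value this can report is 1/n_permutations, so the permutation count limits how strongly a match can be expressed.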

SLIDE 14

p-value vs r (permutation test)

Figure: 2D histogram of p against r.

SLIDE 15

Future and summary

  • Pandora’s main 2D->3D matching algorithm requires three views to operate
  • Not usable by detectors with two views, such as the CRP-based dual-phase LArTPCs
  • We are exploring the use of calorimetry in a new 2D->3D matching algorithm, suitable for two-view LArTPCs
  • The new matching algorithm tries to correlate the energy depositions between the two views to match clusters
  • So far, the calorimetry-based checks seem to do a good job of separating matching clusters vs non-matching clusters
  • While the t-test based p-value shows strong separation power, it is not a robust p-value, statistically speaking. Swapping to a permutation-test based p-value would rectify this

SLIDE 16

Backup

SLIDE 17

New two view matching alg

  • Overall aim is to 3D match 2D clusters by correlating the charge depositions in the clusters’ common transverse (time) overlap region
  • Recipe (part 1) *
  • Collect candidate matching cluster hits that lie in the overlap region
  • Create cumulative distribution for each cluster’s overlap
  • Smooth the cumulative distribution using linear/quadratic/cubic interpolation. We are currently using linear interpolation due to quadratic/cubic instability in some regions of the distributions
  • Resample the two smoothed cumulative distributions such that they have coarser, equal binning. The current scheme downsamples to 1/5th the number of hits in the sparser cumulative distribution
  • Differentiate the two resampled cumulative distributions to produce two equally binned fractional charge profiles
  • Calculate the correlation coefficient and corresponding p-value between the two distributions

*resampling suggested by Andy** **suggested to Andy by Tom Junk
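The recipe above can be sketched in plain Python. This is an illustration under assumptions, not the Pandora implementation: hits are (x, charge) tuples, linear interpolation is used as stated in the recipe, the final renormalisation (so that each profile sums to 1) is a choice made for this example, and all names are hypothetical.

```python
from bisect import bisect_right


def cumulative_distribution(hits):
    """Sorted positions and cumulative charge fractions for (x, charge) hits."""
    ordered = sorted(hits)
    total = sum(q for _, q in ordered)
    xs, cum, running = [], [], 0.0
    for x, q in ordered:
        running += q
        xs.append(x)
        cum.append(running / total)
    return xs, cum


def interpolate(xs, ys, x):
    """Linear interpolation, clamped to the end values outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)
    x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def resampled_profiles(hits_u, hits_v, lo, hi, downsample=5):
    """Equally binned fractional charge profiles over the overlap [lo, hi]."""
    in_u = [(x, q) for x, q in hits_u if lo <= x <= hi]
    in_v = [(x, q) for x, q in hits_v if lo <= x <= hi]
    # Bin count: 1/5th of the number of hits in the sparser cluster.
    n_bins = max(1, min(len(in_u), len(in_v)) // downsample)
    edges = [lo + (hi - lo) * i / n_bins for i in range(n_bins + 1)]
    profiles = []
    for hits in (in_u, in_v):
        xs, cum = cumulative_distribution(hits)
        sampled = [interpolate(xs, cum, edge) for edge in edges]
        # Differentiate: the change in cumulative fraction across each bin.
        diffs = [b - a for a, b in zip(sampled, sampled[1:])]
        norm = sum(diffs)
        profiles.append([d / norm for d in diffs])
    return profiles[0], profiles[1]
```

Both profiles share the same bin edges, so they can be passed straight to a correlation coefficient calculation.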

SLIDE 18

How binning alters r (toy study)

Figure: four panels of No. universes against correlation coefficient, one per configuration:
  • N. profile bins: 30, N. samples per bin: 1
  • N. profile bins: 30, N. samples per bin: 5
  • N. profile bins: 150, N. samples per bin: 1
  • N. profile bins: 150, N. samples per bin: 5

SLIDE 19

Before

SLIDE 20

Now

SLIDE 21

Now

SLIDE 22

Example event

  • Di-muon vertex-like particle gun
  • Both particles have the same x-direction sign
  • Both particles aim roughly downstream in the geometry
  • The event on the right will be used as the example throughout the rest of the slides

SLIDE 23

The first idea

  • For each cluster comparison doublet, e.g. U1V1, U2V1, U1V2, U2V2
  • Find transverse overlap region between two clusters
  • Construct fractional cumulative charge distributions for the two clusters, keeping within the bounds of the overlap region
  • Calculate test statistic and probability, e.g. KS test
  • Store the transverse overlap and matching probability in the TwoViewTransverseOverlapResult, which is stored in the overlap comparison matrix
  • Tools then assess each element in the overlap matrix
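The test statistics for this first idea can be sketched as below, assuming the two fractional cumulative distributions have already been sampled at the same positions. Only the KS and Kuiper statistics are computed; converting them to probabilities needs an effective sample size and is left out. The function name is hypothetical and nothing here reflects the actual TwoViewTransverseOverlapResult interface.

```python
def ks_and_kuiper(cumulative_u, cumulative_v):
    """Return (KS statistic, Kuiper statistic) for two cumulative
    distributions evaluated at the same positions.

    KS is the largest absolute difference between the two curves.
    Kuiper is the largest positive difference plus the largest
    negative difference.
    """
    diffs = [u - v for u, v in zip(cumulative_u, cumulative_v)]
    d_plus = max(0.0, max(diffs))
    d_minus = max(0.0, -min(diffs))
    return max(d_plus, d_minus), d_plus + d_minus


# Hypothetical cumulative distributions sampled at four common positions.
ks, kuiper = ks_and_kuiper([0.1, 0.4, 0.7, 1.0], [0.2, 0.3, 0.8, 1.0])
```

Because Kuiper adds the deviations in both directions, it responds to curves that cross each other, where KS only sees the single largest gap.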

SLIDE 24

Cumulative distributions from the same event

Figure: four panels, each overlaying the U cluster and V cluster cumulative distributions. Axes: x (cm) against fractional value (no units).

Values printed on the panels, in extraction order:
  • KS prob.: 0.525927, Kuiper prob.: 0.196262, Correlation: 0.998779
  • KS prob.: 1, Kuiper prob.: 1, Correlation: 0.999841
  • KS prob.: 0.994437, Kuiper prob.: 1, Correlation: 0.999928
  • KS prob.: 0.994469, Kuiper prob.: 0.996247, Correlation: 0.998721

SLIDE 25

Observations and suggestions

  • By eye, it’s pretty clear which clusters should go together
  • The KS test looks to be too weak to discern the fine differences between two charge cumulative distributions
  • Kuiper test seems more powerful but still not quite enough to sufficiently separate
  • There is some very, very, very modest separation in the correlation coefficient, but the monotonically increasing nature of the cumulative distribution is dominating the metric
  • Options:
  • Segment the cumulative distributions and assess each segment
  • Slide a window across the two clusters and assess the cumulative distribution in each window. The window would slide such that each hit would be at the centre of a window
  • Assess fractional charge profiles rather than cumulative distribution
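The point about monotonicity dominating the correlation coefficient can be seen with a small made-up example. The two profiles below are invented for illustration and are perfectly anti-correlated bin by bin, yet their cumulative distributions still correlate strongly because both only ever increase.

```python
import math
from itertools import accumulate


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fractional_cumulative(profile):
    """Running sum of a profile, normalised so the last entry is 1."""
    total = sum(profile)
    return [c / total for c in accumulate(profile)]


# Invented charge profiles that alternate out of phase with each other.
profile_u = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
profile_v = [3.0, 1.0, 3.0, 1.0, 3.0, 1.0, 3.0, 1.0]

r_profiles = pearson_r(profile_u, profile_v)
r_cumulative = pearson_r(
    fractional_cumulative(profile_u), fractional_cumulative(profile_v)
)
```

Here `r_profiles` is -1 while `r_cumulative` is roughly 0.98, which is why the profiles, rather than the cumulative distributions, are the more discriminating input to the correlation coefficient.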

SLIDE 26

Segmenting the distribution (sorry, no segmented plots to show yet)

Figure: four panels, each overlaying the U cluster and V cluster cumulative distributions. Axes: x (cm) against fractional value (no units).

Segment KS prob. printed on each panel, in extraction order:
  • Panel 1: segment 1: 0.009, segment 2: 0.59, segment 3: 0.82, segment 4: 1, segment 5: 0.23
  • Panel 2: segment 1: 0.84, segment 2: 1, segment 3: 0.97, segment 4: 1, segment 5: 1
  • Panel 3: segment 1: 0.43, segment 2: 1, segment 3: 0.98, segment 4: 1, segment 5: 1
  • Panel 4: segment 1: 0.88, segment 2: 0.85, segment 3: 0.93, segment 4: 1, segment 5: 0.99

SLIDE 27

Fractional charge profiles

Figure: four panels, each overlaying the U cluster and V cluster fractional charge profiles. Axes: x (cm) against fractional value (no units).

Values printed on the panels, in extraction order:
  • Correlation: -0.000817181
  • Correlation: +0.0189502
  • Correlation: +0.00655381
  • Correlation: -0.0644948

SLIDE 28

Summary

  • A draft of the two view overlap result class infrastructure has now been written
  • It is not supposed to be a final blueprint, but enough to start studying matching metrics
  • We’re now looking into calorimetry-based matching metrics for the two view matching
  • Findings so far
  • Calorimetry looks to be a powerful hook in the dual phase
  • The pain is going to be how best to discern the subtle-yet-obvious differences between charge distributions
  • We still have some avenues to explore
