SLIDE 1

Terascale School on data combination and limit setting

04 October to 07 October 2011 – DESY

Data Combination in Particle Physics

Correlated and non-Gaussian data with systematic uncertainties

Volker Blobel − Universität Hamburg

Constrained least squares, as a natural method, is a more general alternative to χ²-function minimization, especially for data combination.

Part 1. Combining correlated data
Part 2. Constrained least squares
Part 3. Combining non-Gaussian data


SLIDE 2

Part 1. Combining correlated data

  • 1. Combining data
  • 2. Averaging by linear least squares
  • 3. Mean values, variances and covariances, correlations
  • 4. Combining correlated data of a single quantity
  • 5. Weights by Lagrange multiplier method
  • 6. Least squares: Gauss, Legendre and Lagrange
  • 7. Charm particle lifetime
  • 8. Common additive systematic error I
  • 9. Common multiplicative systematic error I
  • 10. Average of two correlated data
  • 11. Two-by-two covariance matrix from maximum likelihood
  • 12. The two-dimensional normal distribution
  • 13. Dzero result
  • 14. Dzero: how big is the correlation coefficient?
  • 15. Two alternative, but equivalent data combination methods
  • 16. The PDG strategy
  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 2

SLIDE 3
  • 1. Combining data

For many physics analyses, several channels are combined into one result. In a similar way results from different experiments are merged. The aim of this procedure is to increase the precision of the results: no bias, as accurate as possible.

  • Lifetime of charmed particles: four different values extracted with different methods from the same data – hence correlated
  • Like-sign dimuon charge asymmetry: two values determined from different event samples – the average has a significant deviation from the SM
  • Mass of the top quark: several measured values with several systematic errors
  • Non-linearity: e.g. normalization uncertainty – straightforward methods produce biased results
  • Non-linearity: combining over-determined measurements of triangle parameters
  • Non-Gaussian data: Poisson-distributed data, lognormal factors
  • “Error” propagation: a quantity depends on several measured values

From linear problems with Gaussian variables . . . to non-linear problems with non-Gaussian variables and several sources of systematic errors. Essential: understanding of physics, detector behaviour and data analysis.

SLIDE 4
  • 2. Averaging by linear least squares

Weighted average: The average value x_ave of single values x_i, i = 1, 2, . . . n, with covariance matrix V_x, is the weighted sum

    x_ave = Σ_i w_i x_i   with   Σ_i w_i = 1   (x_ave unbiased, if the x_i are unbiased)

where the weights w_i are usually positive, but can be negative in certain cases. The weighted average according to this equation is unbiased if the single values are unbiased:

    E[x_ave] = Σ_i w_i E[x_i] = Σ_i w_i µ = µ   for   Σ_i w_i = 1

The variance σ²_ave follows from the law of (linear) propagation of uncertainties.

Uncorrelated data x_i ± σ_i (V_x diagonal):

    w_i = ( Σ_j 1/σ_j² )^-1 · 1/σ_i²        σ²_ave = ( Σ_i 1/σ_i² )^-1
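These formulas for uncorrelated data fit in a few lines of Python; this is a sketch, with an illustrative function name not taken from the slides:

```python
# Inverse-variance weighted average of uncorrelated data x_i +- sigma_i:
#   w_i = (sum_j 1/sigma_j^2)^-1 * 1/sigma_i^2,  sigma_ave^2 = (sum_i 1/sigma_i^2)^-1
import math

def weighted_average(x, sigma):
    """Return (x_ave, sigma_ave) for uncorrelated measurements."""
    inv_var = [1.0 / s**2 for s in sigma]
    norm = sum(inv_var)                      # sum_i 1/sigma_i^2
    weights = [iv / norm for iv in inv_var]  # weights sum to 1
    x_ave = sum(w * xi for w, xi in zip(weights, x))
    return x_ave, math.sqrt(1.0 / norm)

# Example: two measurements 10 +- 1 and 12 +- 2
x_ave, s_ave = weighted_average([10.0, 12.0], [1.0, 2.0])  # 10.4 and ~0.894
```

The more precise measurement dominates: with variances 1 and 4 the weights are 0.8 and 0.2.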

SLIDE 5
  • 3. Mean values, variances and covariances, correlations

1-dim. random variable:

    mean value   µ = E[x] = ∫ x · p(x) dx        (p(x) = pdf)
    variance     σ² = E[(x − µ)²] = V[x] = ∫ (x − µ)² · p(x) dx

n-dim. random variable:

    mean value          µ = E[x] = ∫ · · · ∫ x · p(x) dx
    covariance matrix   V_x = E[(x − µ)(x − µ)^T] = V[x]

Random vector x, covariance matrix V_x and correlation matrix C_x:

    x = (x1, x2, . . . , xn)^T

    V_x = ( σ11 σ12 . . . σ1n          C_x = ( 1   ρ12 . . . ρ1n
            σ21 σ22 . . . σ2n                  ρ21 1   . . . ρ2n
            . . .                              . . .
            σn1 σn2 . . . σnn )                ρn1 ρn2 . . . 1   )

The elements (V)jk are the covariances (V)jk = σjk = ρjk σj σk, and the ρjk are the correlation coefficients with −1 ≤ ρjk ≤ +1 (values of the correlation coefficients are often printed in %).

SLIDE 6
  • 4. Combining correlated data of a single quantity

Correlated data x_i with (non-diagonal) covariance matrix V_x: needs the inverse V_x^-1

    w_i = ( Σ_j,k (V_x^-1)jk )^-1 · Σ_j (V_x^-1)ij        σ²_ave = w V_x w^T = Σ_i,j w_i w_j (V_x)ij

Optimal values of the weights are obtained by χ²-minimization (linear least squares). The inverse of V_x is used as weight matrix in the weighted sum of squares (x = (x1, x2, . . . xn)^T):

    χ²-function   S(x_ave) = (x_ave · 1 − x)^T V_x^-1 (x_ave · 1 − x) = Σ_i,j (x_ave − x_i) (V_x^-1)ij (x_ave − x_j)

(all components of the vector 1 are one). The χ²-function S(x_ave) is minimized with respect to the one parameter x_ave.

Generalization to different problems with multiple quantities is possible, but constrained least squares is simpler.

See NIMA A 500 (2003) 391–405; NIMA A 270 (1988) 110–117
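A minimal Python sketch of these weights (not from the slides; a small hand-written solver stands in for the matrix inversion): solving V u = 1 and normalizing gives the same weights as the explicit V^-1 formula, without ever forming the inverse.

```python
# BLUE weights for correlated data: solve V u = 1, then w_i = u_i / sum(u);
# this equals w_i = (sum_jk (V^-1)_jk)^-1 * sum_j (V^-1)_ij.
def solve(A, b):
    """Solve A u = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        u[r] = (M[r][n] - sum(M[r][c] * u[c] for c in range(r + 1, n))) / M[r][r]
    return u

def blue_weights(V):
    u = solve(V, [1.0] * len(V))
    s = sum(u)                             # s = 1^T V^-1 1
    return [ui / s for ui in u], 1.0 / s   # weights and variance of the average

# Example: two measurements with variances 1 and 4, covariance 0.5
w, var_ave = blue_weights([[1.0, 0.5], [0.5, 4.0]])
```

For this 2x2 case the closed-form two-point formulas (slide 13) give w = (0.875, 0.125) and variance 0.9375, which the solver reproduces.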

SLIDE 7

. . . continued

Correlated data x_i with (non-diagonal) covariance matrix V_x: The derivative of the weighted sum of squares S(x_ave) with respect to the parameter x_ave,

    (1/2) ∂S/∂x_ave = 1^T V_x^-1 1 · x_ave − 1^T V_x^-1 x ,

is set to zero to obtain the solution

    x_ave = [ 1^T V_x^-1 1 ]^-1 ( 1^T V_x^-1 ) x = w x

The expression in []-parentheses, together with the row vector, is the weight w:

    w = [ 1^T V_x^-1 1 ]^-1 (scalar) · 1^T V_x^-1 (row vector)        w_i = ( Σ_j,k (V_x^-1)jk )^-1 · Σ_j (V_x^-1)ij

The weight w is a 1-by-n matrix w = (w1, w2 . . . wn) (or a row vector), and the average value x_ave and its variance σ²_ave are given by

    x_ave = w x = Σ_i w_i x_i        σ²_ave = w V_x w^T = Σ_i,j w_i w_j (V_x)ij

SLIDE 8
  • 5. Weights by Lagrange multiplier method

Alternative method for the determination of the weights: minimization of the variance σ²_ave of the average,

    σ²_ave = w V_x w^T = Σ_i,j w_i w_j (V_x)ij ,

subject to the equality constraint Σ_i w_i = 1. Method of Lagrange multipliers with the Lagrange function

    L(w, λ) = Σ_i,j w_i w_j (V_x)ij + λ ( Σ_i w_i − 1 )

with a single Lagrange multiplier λ.

The constrained problem is solved by setting the derivative of L(w, λ) with respect to the weights wi and to the Lagrange multiplier λ to zero. The result is identical to the previous result.

SLIDE 9
  • 6. Least squares: Gauss, Legendre and Lagrange

Carl Friedrich Gauß (1777–1855): Used the least squares method already around 1794, but did not publish it at that time; he published it later, in 1809. “Our principle, which we have made use of since 1795, has lately been published by Legendre . . . ” Proved in 1821 and 1823 the optimality of the least squares estimate without any assumption that the random variables follow a particular distribution (rediscovered by Markoff in 1912).

Adrien-Marie Legendre (1752–1833): Was the first to publish the method, in 1805.

Joseph-Louis Lagrange (1736–1813): Method of Lagrange multipliers (optimization of functions of several variables subject to equality constraints), formulated in “Leçons sur le calcul des fonctions” (1804). Used the principle of minimizing the sum of the absolute residuals Σ_i |r_i|, with Σ_i r_i = 0, in 1799.

Gauß-Markoff theorem: The vector x ∈ Rⁿ of measurements with a vector ε of random errors is assumed to be related to an unknown parameter (or parameters) by a fixed linear model relation. All elements ε_i of ε have zero means (no bias), are uncorrelated and have the same variance. The residuals r_i are the differences between the measured values x_i and the values given by the parametrization (linear model). The best (i.e. minimum-variance) linear unbiased estimator (BLUE) for the parameter(s) is the least squares estimator, minimizing the sum of squared residuals ||r||². No assumptions are required that the random errors follow a particular distribution; i.e. the normal (Gaussian) distribution is not required.

SLIDE 10
  • 7. Charm particle lifetime

Lifetime (in units of 10⁻¹³ s) of charmed particles in a CERN experiment, determined using four different methods from the same data (hence correlated):

    τ1 =  9.5 +1.7 −1.2
    τ2 = 11.9 +1.5 −1.3
    τ3 = 11.1 +1.8 −1.2
    τ4 =  8.9 +1.6 −1.2

    V_τ = ( 2.74  1.15  0.86  1.31        correlation coefficients:
            1.15  1.67  0.82  1.32        ρ12 = 0.54   ρ13 = 0.36   ρ14 = 0.46
            0.86  0.82  2.12  1.05        ρ23 = 0.44   ρ24 = 0.60
            1.31  1.32  1.05  2.93 )      ρ34 = 0.42

Upper and lower errors are transformed to a symmetric interval by the geometric mean, for example +1.7/−1.2 is replaced by √(1.7 · 1.2) = 1.43. The full error matrix is estimated by MC simulation of a large number of “experiments”, analysed in exactly the same way as the real data (plus some corrections).

    weight factors   w = (0.14, 0.47, 0.35, 0.04)

Final average for the charm lifetime:

    τ_ave = (11.16 ± 1.13) × 10⁻¹³ s     [ τ_ave = (10.62 ± 0.75) × 10⁻¹³ s if correlations are ignored ]

Bias of ≈ 1 standard deviation and underestimation of accuracy if correlations are ignored.

  • L. Lyons et al., Combining correlated estimates of a single physical quantity, NIMA A270 (1988) 110–117
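The numbers on this slide can be reproduced with a short Python sketch (the small solver is illustrative, not the method actually used in the paper):

```python
# Reproduce the charm-lifetime combination: solve V u = 1, w_i = u_i / sum(u),
# tau_ave = sum_i w_i tau_i, sigma_ave^2 = 1 / sum(u).  Units of 1e-13 s.
import math

tau = [9.5, 11.9, 11.1, 8.9]
V = [[2.74, 1.15, 0.86, 1.31],
     [1.15, 1.67, 0.82, 1.32],
     [0.86, 0.82, 2.12, 1.05],
     [1.31, 1.32, 1.05, 2.93]]

def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        u[r] = (M[r][n] - sum(M[r][k] * u[k] for k in range(r + 1, n))) / M[r][r]
    return u

u = solve(V, [1.0] * 4)
w = [ui / sum(u) for ui in u]                      # weights, approx. (0.145, 0.470, 0.347, 0.038)
tau_ave = sum(wi * ti for wi, ti in zip(w, tau))   # approx. 11.16
sigma_ave = math.sqrt(1.0 / sum(u))                # approx. 1.13
```

This matches the quoted weights (0.14, 0.47, 0.35, 0.04) and the average (11.16 ± 1.13) × 10⁻¹³ s.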
SLIDE 11
  • 8. Common additive systematic error I

Common additive uncertainty: x_i ± σ_i ± ∆ (identical systematic error ∆)

  • (PDG:) first average the x_i ± σ_i, then combine the error with ∆ in quadrature;
  • (PDG:) apply the factor (1 + ∆² Σ_i 1/σ_i²)^1/2 to all errors, and treat the data as uncorrelated;
  • systematic uncertainty ∆ as an additional measured value . . . as correlated data . . .

    x1 = x′1 + a        x′1 ± σ1
    x2 = x′2 + a        x′2 ± σ2        a = 0 ± ∆

    V(x′1, x′2, a) = ( σ1²  0    0
                       0    σ2²  0
                       0    0    ∆² )

  • define the non-diagonal covariance matrix by the law of (linear) propagation of uncertainties, with

    (V_x)ii = σ_i² + ∆²        (V_x)ij = ∆²   (i ≠ j):

    V(x1, x2) = B V(x′1, x′2, a) B^T = ( σ1² + ∆²   ∆²                 with   B = ( 1 0 1
                                         ∆²         σ2² + ∆² )                      0 1 1 )

  • all methods are equivalent.
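The equivalence of the two representations can be checked numerically; a sketch with illustrative numbers for σ1, σ2 and ∆ (not from the slides):

```python
# Check that the "extra variable a" representation and the non-diagonal
# covariance matrix agree: x_i = x'_i + a gives the Jacobian B = [[1,0,1],[0,1,1]],
# and V(x1,x2) = B diag(s1^2, s2^2, D^2) B^T.
s1, s2, D = 0.3, 0.4, 0.5            # illustrative sigma_1, sigma_2, Delta

B = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
Vdiag = [s1**2, s2**2, D**2]

# V = B diag(Vdiag) B^T, element by element
V = [[sum(B[i][k] * Vdiag[k] * B[j][k] for k in range(3)) for j in range(2)]
     for i in range(2)]

assert abs(V[0][0] - (s1**2 + D**2)) < 1e-12   # diagonal: sigma_i^2 + Delta^2
assert abs(V[1][1] - (s2**2 + D**2)) < 1e-12
assert abs(V[0][1] - D**2) < 1e-12             # off-diagonal: Delta^2
```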
SLIDE 12
  • 9. Common multiplicative systematic error I

Common multiplicative uncertainty, i.e. normalization or calibration uncertainty, e.g. (x_i ± σ_i)(1 ± ∆): non-linear!

Data with common multiplicative systematic uncertainty . . . as correlated data . . .

    x1 = x′1 × a        x′1 ± σ1
    x2 = x′2 × a        x′2 ± σ2        a = 1 ± ∆

    V(x′1, x′2, a) = ( σ1²  0    0
                       0    σ2²  0
                       0    0    ∆² )

Non-diagonal covariance matrix by the law of propagation of uncertainties – a non-linear transformation:

    V(x1, x2) = B V(x′1, x′2, a) B^T = ( σ1² + x1²∆²   x1x2∆²                 with   B = ( a 0 x1
                                         x1x2∆²        σ2² + x2²∆² )                       0 a x2 ),  a = 1

Dangerous: the elements of the transformation matrix are not constant; the two representations are not equivalent. . . . discussed later.

SLIDE 13
  • 10. Average of two correlated data

Data vector x and covariance matrix V_x:

    x = ( x1        V_x = ( σ1²     ρσ1σ2
          x2 )              ρσ1σ2   σ2²   )

    inverse covariance matrix   V_x^-1 = 1/(1 − ρ²) · (  1/σ1²        −ρ/(σ1σ2)
                                                         −ρ/(σ1σ2)    1/σ2²     )

Weight factors and the variance of the average are determined by σ1, σ2 and ρ:

    w1 = (σ2² − ρσ1σ2) / (σ1² + σ2² − 2ρσ1σ2)        w2 = 1 − w1 = (σ1² − ρσ1σ2) / (σ1² + σ2² − 2ρσ1σ2)

    σ²_ave = (1 − ρ²) σ1²σ2² / (σ1² + σ2² − 2ρσ1σ2) = variance of the average

No correlation, ρ = 0:

    w1 = σ2² / (σ1² + σ2²)        w2 = σ1² / (σ1² + σ2²)        σ²_ave = ( 1/σ1² + 1/σ2² )^-1

Negative correlation, ρ < 0: both weights are always positive and x_ave is between x1 and x2.
Positive correlation, ρ > 0: the weight of the less accurate value may become negative, and then the average x_ave is outside the range x1 . . . x2.
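The two-point formulas can be sketched directly in Python (illustrative helper name and input values, not from the slides):

```python
# Average of two correlated measurements:
#   w1 = (s2^2 - rho s1 s2) / (s1^2 + s2^2 - 2 rho s1 s2),  w2 = 1 - w1,
#   var_ave = (1 - rho^2) s1^2 s2^2 / (s1^2 + s2^2 - 2 rho s1 s2)
import math

def average_two(x1, s1, x2, s2, rho):
    denom = s1**2 + s2**2 - 2.0 * rho * s1 * s2
    w1 = (s2**2 - rho * s1 * s2) / denom
    w2 = 1.0 - w1
    var = (1.0 - rho**2) * s1**2 * s2**2 / denom
    return w1 * x1 + w2 * x2, math.sqrt(var), (w1, w2)

# rho = 0 reduces to the usual inverse-variance weighting
ave, err, (w1, w2) = average_two(10.0, 1.0, 12.0, 2.0, 0.0)

# large positive correlation: the weight of the less accurate value turns
# negative, and the average lies outside [x1, x2]
ave_c, err_c, (w1_c, w2_c) = average_two(10.0, 1.0, 12.0, 2.0, 0.8)
```

With ρ = 0.8 the weights are (4/3, −1/3) and the average is 9.33, below both inputs, illustrating the slide's last remark.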

SLIDE 14

. . . continued

[Figures: weights w1, w2 and the standard deviation of the average as a function of the correlation coefficient ρ, for σ1 = σ2 = 1 (left) and σ1 = 1, σ2 = 2 (right).]

  • Weight w2 < 0 for large positive correlation ρ > σ1/σ2 – not meaningful?
  • negative weights are impossible for correlations caused by overlapping data samples;
  • x_ave ≡ x1 and σ_ave ≡ σ1, i.e. no improvement, for ρ = σ1/σ2;
  • expected squared difference E[(x1 − x2)²] = σ1² + σ2² − 2ρσ1σ2;
  • smaller value of σ_ave for negative correlation.
SLIDE 15
  • 11. Two-by-two covariance matrix from maximum likelihood

Maximum likelihood estimate x̂: L(x̂) = max_x L(x); likelihood equations ∂ log L/∂x_j = 0, j = 1, 2, . . . , n. Hessian (second-order derivative matrix of the negative log-likelihood function) with elements

    (H_x)jk = −∂² log L / ∂x_j ∂x_k

Cramér-Rao inequality: (V_x)jk ≥ (H_x^-1)jk . . . in practice: V_x = H^-1.

The 2-dimensional case:

    H = ( H11 H12        V_x = ( σ1²    ρσ1σ2
          H12 H22 )              ρσ1σ2  σ2²   )

    ρ = − H12 / √(H11 H22)        σ1² = 1/(1 − ρ²) · 1/H11        σ2² = 1/(1 − ρ²) · 1/H22

[Figure: variance inflation factor VIF = 1/(1 − ρ²) as a function of the correlation coefficient ρ.]

For non-zero correlation ρ ≠ 0 the variances are larger by the factor 1/(1 − ρ²), the variance inflation factor, compared to zero correlation.
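The 2-dimensional relations can be checked against a direct matrix inversion; a sketch with illustrative Hessian elements (not from the slides):

```python
# From a 2x2 Hessian H of -log L, recover rho and the variances:
#   rho = -H12 / sqrt(H11 H22),  sigma_i^2 = 1/(1 - rho^2) * 1/Hii,
# and compare with the direct 2x2 inverse V = H^-1.
import math

def cov_from_hessian(H11, H22, H12):
    rho = -H12 / math.sqrt(H11 * H22)
    vif = 1.0 / (1.0 - rho**2)          # variance inflation factor
    return vif / H11, vif / H22, rho

H11, H22, H12 = 2.0, 5.0, 1.5           # illustrative Hessian elements
s1sq, s2sq, rho = cov_from_hessian(H11, H22, H12)

# direct inverse of H for comparison
det = H11 * H22 - H12**2
V11, V22, V12 = H22 / det, H11 / det, -H12 / det
assert abs(V11 - s1sq) < 1e-12 and abs(V22 - s2sq) < 1e-12
assert abs(V12 - rho * math.sqrt(s1sq * s2sq)) < 1e-12
```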

SLIDE 16
  • 12. The two-dimensional normal distribution
SLIDE 17
  • 13. Dzero result

Measurement of the anomalous like-sign dimuon charge asymmetry with 9 fb⁻¹ of pp̄ collisions, arXiv:1106.6308 [hep-ex]; Fermilab-Pub-11/307-E:

    inclusive muon sample       A^b_sl = (−1.04 ± 1.30 (stat) ± 2.31 (syst)) %
    like-sign dimuon sample     A^b_sl = (−0.808 ± 0.202 (stat) ± 0.222 (syst)) %
    combined result             A^b_sl = (−0.787 ± 0.172 (stat) ± 0.093 (syst)) %

[Figure: anomalous dimuon charge asymmetry – SM average, like-sign dimuon and inclusive muon results, with deviations ∆ = 3.9σ and ∆ = 2.6σ.]

SLIDE 18
  • 14. Dzero: how big is the correlation coefficient?

Statistical and systematic errors added in quadrature:

    inclusive muon sample       A^b_sl = (−1.04 ± 2.65 (tot)) %
    like-sign dimuon sample     A^b_sl = (−0.808 ± 0.300 (tot)) %

Weights w1, w2 and the standard deviation of the average, as a function of ρ:

[Figures: weights w1 and w2 versus the correlation coefficient ρ; standard deviation of the average versus ρ, compared with the published D0 standard deviation; measured values (star) with error ellipse, inclusive muon sample versus like-sign dimuon sample.]

SLIDE 19
  • 15. Two alternative, but equivalent data combination methods

Assume the two measured values have uncorrelated statistical and fully (ρ = +1) correlated systematic uncertainties.

Three uncorrelated data and one constraint:

    x′1 = (−1.04 ± 1.30) %
    x′2 = (−0.808 ± 0.202) %
    ∆ = (0 ± 1) %
    f1 = (x′1 + 2.31 · ∆) − (x′2 + 0.222 · ∆)

i.e. an extra variable ∆ for the systematic uncertainty.

Two correlated data and one constraint:

    x1 = (−1.04 ± 1.30) %        x2 = (−0.808 ± 0.202) %

    V = ( 1.30² + 2.31²    2.31 · 0.222                f1 = x1 − x2
          2.31 · 0.222     0.202² + 0.222² )

i.e. a covariance matrix with statistical and systematic uncertainties. The two results are identical:

    A^b_sl = (−0.792 ± 0.246) %        χ²(1) = 0.009        p-value = 92.5 %

    published:  A^b_sl = (−0.787 ± 0.196) %
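The second variant (covariance matrix with fully correlated systematics) can be reproduced with the two-point average formulas; a short Python sketch:

```python
# Combine the two D0 values assuming uncorrelated statistical and fully
# (rho = +1) correlated systematic uncertainties (all values in %).
import math

x1, stat1, syst1 = -1.04, 1.30, 2.31      # inclusive muon sample
x2, stat2, syst2 = -0.808, 0.202, 0.222   # like-sign dimuon sample

V11 = stat1**2 + syst1**2
V22 = stat2**2 + syst2**2
V12 = syst1 * syst2                        # rho = +1 between the systematics

denom = V11 + V22 - 2.0 * V12
w1 = (V22 - V12) / denom                   # negative here
w2 = 1.0 - w1
ave = w1 * x1 + w2 * x2                    # approx. -0.792
err = math.sqrt((V11 * V22 - V12**2) / denom)   # approx. 0.246
chi2 = (x1 - x2)**2 / denom                # approx. 0.009
```

Note that w1 < 0: the large positive correlation pushes the combined value below both inputs' weighted midpoint, as discussed on slide 13.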

SLIDE 20
  • 16. The PDG strategy

Linear least squares, minimizing the value of the χ² expression, using standard averaging: value x with error δx. Selection of reliable data from the literature:

  • add statistical and systematic errors in quadrature, handle “asymmetric errors”;
  • e.g. do not use data from preprints;
  • check the consistency of the data using χ², compared to the expected value N − 1, for N data.

    χ² ≤ N − 1 :   accept the result x as average and δx as error.
    χ² ≫ N − 1 :   bad χ² – data incompatible – reject the result for the average?
    χ² > N − 1, but not greatly so :   errors may be underestimated – the average x remains unchanged, but the error δx is increased by the scale factor S = ( χ² / (N − 1) )^1/2 . . . undetected correlations?

    χ² very large:   data inconsistent, or errors underestimated, or undetected negative correlations;
    χ² very small:   either errors overestimated (could decrease the uncertainty δx of the average), or undetected positive correlations (could increase the uncertainty δx of the average).

SLIDE 21

Part 2. Constrained Least Squares

  • 1. x-y-data with uncertainties in both coordinates
  • 2. Constrained Least Squares
  • 3. Alternative least squares methods for fitting/averaging
  • 4. Comparison
  • 5. Constrained least squares fit program Aplcon
  • 6. Averaging correlated scattering lengths
  • 7. Straight line with uncertainties in both coordinates
  • 8. Straight line and correlated data
  • 9. Uncertainties of fit parameters
  • 10. Mathematics: solution of constrained least squares
SLIDE 22
  • 1. x-y-data with uncertainties in both coordinates

The subject is discussed by Press et al. (Numerical Recipes) with the remarks: “If experimental data are subject to measurement error not only in the yi’s, but also in the xi’s, then the task of fitting a straight-line model y(x) = a + bx is considerably harder . . . Be aware that the literature on the seemingly straightforward subject of this section is generally confusing and sometimes plain wrong.”

What is the uncertainty of the residual r_i = y_i − (a + b x_i)?

[Figure: x-y data with uncertainties in both coordinates; data from C.A. Cantrell.]

C.A. Cantrell [Atmos. Chem. Phys., 8, 5477–5487, 2008] lists > 30 publications on methods (including methods giving wrong results), only for straight-line fits, almost all for uncorrelated data only. Try “Deming regression” or “errors-in-variables model (EIV)” or “total least squares (TLS)” in Google.

SLIDE 23
  • 2. Constrained Least Squares

. . . fundamental equations

Vector of variables x, including
  • measured variables x_m (with covariance matrix V_x) and
  • unmeasured variables x_u (e.g. model parameters):

    x = ( x_m
          x_u )

    minimize   ∆x_m^T V_x^-1 ∆x_m

    subject to the equality constraints   f_j(x + ∆x) = 0,   j = 1, 2 . . . m

The best estimate for the variables is x̂ = x + ∆x.

Rudolf Boeck, Application of a generalized method of least squares for kinematical analysis of tracks in bubble chambers, CERN 60–30

From publications: “In practice, the added technical complexity of a constrained fit with extra free parameters is not justified . . . ” “The application of Lagrange multipliers is unnecessarily complicated and the linear approximation requires additional assumptions and iterations.”
SLIDE 24
  • 3. Alternative least squares methods for fitting/averaging

x_m = measured variables, with covariance matrix V_x
x_u = unmeasured variables, “parameters”
x = (x_m, x_u) = measured and unmeasured variables
t = independent coordinates

χ²-function minimization:

    S(x_u) = Σ_i ( (x_m)_i − f(t_i, x_u) )² / σ_i² = min   →   r^T V_x^-1 r = min

Residuals r: the χ²-function to be minimized is the sum of squares of the residuals; problems arise if residuals depend on > 1 measurement, and/or depend on > 1 error contribution, especially contributions changing the normalization.

Constrained least squares:

    S(∆x_m) = ∆x_m^T V_x^-1 ∆x_m = min

    f_j(x_m + ∆x_m, x_u + ∆x_u, t) = 0,   j = 1, 2 . . . m

Individual corrections ∆x_m for the measured variables: the expression to be minimized is the sum of squares of the corrections.
  • Constraints f_j(x) = 0 may be implicit expressions;
  • bias is reduced or avoided.

Both alternatives are equivalent, with identical results, for simple problems. In both alternatives the data may be correlated and the functions/constraints may be non-linear.

SLIDE 25
  • 4. Comparison

χ²-function minimization, e.g. using Minuit

  • The user has to provide the function S(x), which is “seen” by Minuit. The user function includes all data, uncertainties, and the physical and statistical model.
  • Minuit calculates the first derivative of S(x) by finite differences and approximates, using the VM method, the full Hessian in ≥ n iterations for linear and non-linear problems.
  • Variables are the parameters (= unmeasured variables).

Constrained least squares, e.g. using Aplcon 2.0

  • The user describes the set of variables incl. covariance matrix, and the individual model functions f_j(x).
  • Aplcon calculates the first derivatives of all individual model functions f_j(x) by finite differences, which allows the full Hessian (Gauss-Newton matrix) to be calculated during each iteration.
  • Many variables: measured and unmeasured variables plus Lagrange multipliers.
  • The principle has been used in HEP for > 50 years, mainly with kinematical constraints for particle reactions and decays; Aplcon 1.0 has been in use for 35 years.

SLIDE 26
  • 5. Constrained least squares fit program Aplcon

    minimize   ∆x^T V_x^-1 ∆x        subject to   f_j(x, t) = 0,   j = 1, 2 . . . m

Properties:

  • Extreme form of constrained least squares, with separation into a quadratic expression and a set of constraints f_j(x) containing all nonlinearities; solved using Lagrange multipliers;
  • simple to use: derivatives are calculated by numerical methods, no step definition necessary, no principal distinction between measured (X_m) and unmeasured variables (X_u); full initial and final covariance matrix, and pulls;
  • extension to non-Gaussian variables: selected variables can be treated e.g. as Poisson- or log-normal-distributed;
  • extension to advanced analysis of uncertainties: profile likelihood;
  • Aplcon is a method for difficult problems: it follows accurately the assumed physical and statistical model of the measurement process, and avoids a bias in the result;
  • Aplcon (Fortran) is available from www.desy.de/~blobel
SLIDE 27
  • 6. Averaging correlated scattering lengths

50-year-old data on the isospin 1/2 and 3/2 scattering lengths in πp scattering in the s-state:

    Experiment (1):  a1 = 0.170 ± 0.0240;  a3 = −0.107 ± 0.0197;  corr. coefficient ρ = −39.1 %.
    Experiment (2):  a′3 = −0.104 ± 0.0060.

Input to the Aplcon fit to average the two a3-values and, at the same time, improve the correlated a1:

    x_m = ( a1        (  0.170 ± 0.0240                (  0.580  −0.185  0
            a3    =     −0.107 ± 0.0197        V_x =     −0.185   0.388  0        × 10⁻³
            a′3 )       −0.104 ± 0.0060 )                 0       0      0.036 )

and after the constraint code f1 = a3 − a′3 the result from Aplcon is

    x̂ = ( a1        (  0.169  ± 0.0220                (  0.499   −0.0157  −0.0157
           a3    =     −0.1043 ± 0.0057        V_x =     −0.0157   0.0329   0.0329      × 10⁻³
           a′3 )       −0.1043 ± 0.0057 )               −0.0157   0.0329   0.0329 )

SLIDE 28
  • 7. Straight line with uncertainties in both coordinates

    X := . . .    (variable array)
    Vx := . . .   (matrix array)
    for j = 1 to N
        f(j) = a + b · x_j − y_j
    final result in X and Vx

    variable array   X = ( X_m     with   X_m = ( x1, y1, x2, y2, . . . , xN, yN )^T
                           X_u )   and    X_u = ( a, b )^T

[Figure: straight-line fit to x-y data with uncertainties in both coordinates.]

Note: the order of measured and unmeasured variables is irrelevant – they are distinguished by zero elements in the input covariance matrix Vx. If a measurement of the slope b exists beforehand: add the variance of b to V_x, with no change in the program code.

SLIDE 29
  • 8. Straight line and correlated data

Now: correlation between x and y in the data ≠ 0, and a fit of a straight line is required.

    X := . . .    (variable array)
    Vx := . . .   (matrix array)
    for j = 1 to N
        f(j) = a + b · x_j − y_j
    final result in X and Vx

with the same variable array X = (X_m, X_u) as before.

[Figure: straight-line fit to correlated x-y data; the red star is a fitted xy-value.]

Add the off-diagonal elements to V_x – no change of code. The extension to a parabola is simple: include c in X and add + c · x_j² to f(j).

SLIDE 30
  • 9. Uncertainties of fit parameters

Aplcon provides:

  • the full covariance matrix V_x for the combined variables – fitted values of the measured variables and of the unmeasured variables (“parameters”) – from the inverse of the Hessian (by the law of propagation of uncertainties);
  • pulls for all measured variables: these should follow N(0, 1) distributions;
  • the covariance matrix is accurate in simple cases: measured data Gaussian and constraints linear, or asymptotically in the limit of ∞ data;
  • the matrix may be inaccurate (and non-Gaussian) for non-Gaussian data, constraints from non-linear models, and low statistics – statistically improved information is then required on confidence intervals for important parameters;
  • confidence intervals on selected parameters by profile analysis (optional): realized by repeated fits with one additional internal constraint;
  • contours for selected parameter pairs by profile analysis (optional): realized by repeated fits with two additional internal constraints.

SLIDE 31
  • 10. Mathematics: solution of constrained least squares

Special case: no unmeasured variables (parameters), x ≡ x_m, and linear equality constraints:

    minimize   ∆x^T V_x^-1 ∆x        subject to the equality constraints   f + A ∆x = 0

Solved by the stationary point of the Lagrange function

    L(∆x, λ) = ∆x^T V_x^-1 ∆x + 2 λ^T (A ∆x + f)

Derivatives w.r.t. ∆x and λ set to zero:

    V_x^-1 ∆x + A^T λ = 0        A ∆x = −f

or in matrix form

    ( V_x^-1  A^T     ( ∆x        (  0
      A       0   )     λ  )  =     −f )

with the solution

    ( ∆x        ( V_x^-1  A^T  ^-1   (  0
      λ  )  =     A       0   )        −f )

Solution by determination of the partitioned matrix, with W_A = (A V_x A^T)^-1:

    ( V_x^-1  A^T  ^-1     ( V_x − V_x A^T W_A A V_x    V_x A^T W_A
      A       0   )     =    W_A A V_x                  −W_A        )

Note: the inverse matrix V_x^-1 is not used in the algorithm (V_x may be singular).
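The partitioned solution can be sketched generically in Python (illustrative helper names, small dense solver; not the Aplcon implementation):

```python
# Partitioned solution: lam = W_A f with W_A = (A V A^T)^-1, dx = -V A^T lam.
# This satisfies A dx = -f exactly, and V^-1 is never formed.
def solve(M, b):
    """Solve M u = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    T = [row[:] + [bi] for row, bi in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(T[r][c]))
        T[c], T[p] = T[p], T[c]
        for r in range(c + 1, n):
            fac = T[r][c] / T[c][c]
            for k in range(c, n + 1):
                T[r][k] -= fac * T[c][k]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):
        u[r] = (T[r][n] - sum(T[r][k] * u[k] for k in range(r + 1, n))) / T[r][r]
    return u

def constrained_step(V, A, f):
    """Linear constrained-least-squares step for constraints f + A dx = 0."""
    n, m = len(V), len(A)
    AVAt = [[sum(A[i][k] * V[k][l] * A[j][l] for k in range(n) for l in range(n))
             for j in range(m)] for i in range(m)]
    lam = solve(AVAt, f)                                   # Lagrange multipliers
    dx = [-sum(V[i][k] * A[j][k] * lam[j] for k in range(n) for j in range(m))
          for i in range(n)]
    return dx, lam

# illustrative 3-variable, 2-constraint example
V = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]]
A = [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]]
f = [0.2, -0.1]
dx, lam = constrained_step(V, A, f)   # check: A dx = -f
```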

SLIDE 32

Part 3. Non-Gaussian data and nonlinearities

  • 1. Poisson distributed data – counts
  • 2. Cross section measurement
  • 3. Log-normal distribution
  • 4. Averaging with “normalization” uncertainty
  • 5. Covariance matrix plot
  • 6. Solution with constraints
  • 7. Triangle parameters
  • 8. Correlated additive systematic errors II
  • 9. Correlated multiplicative systematic errors II
  • 10. Combination of FNAL results on the mass of the top quark
  • 11. Branching ratios
SLIDE 33
  • 1. Poisson distributed data – counts

Poisson data, having a variance equal to the mean, have the problem of non-uniform variance. The Poisson distribution for a small mean value is asymmetric. The normal distribution is a bad approximation for small mean values and in the tails.

[Figure: Poisson and normal densities for µ = σ² = 7.]

Least squares requires data with constant variance, independent of the fit result. What happens if the x_i are not normally distributed or do not have constant variance? Bias!

Example: average of data following (or proportional to) a Poisson distribution:

    x1 = 9 ± 3        x2 = 16 ± 4

    Weighted mean (LS)              x_ave = 11.52 ± 2.40
    Using Poisson statistics (ML)   x_ave = 12.5 ± 2.5
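The two numbers are easy to verify; a Python sketch of the comparison:

```python
# LS weighted mean vs Poisson maximum likelihood for x1 = 9 +- 3, x2 = 16 +- 4
# (errors sqrt(n), i.e. variance = mean).
import math

x = [9.0, 16.0]
sigma = [3.0, 4.0]

# least squares: inverse-variance weights favour the downward-fluctuated count
norm = sum(1.0 / s**2 for s in sigma)
ls_ave = sum(xi / s**2 for xi, s in zip(x, sigma)) / norm
ls_err = math.sqrt(1.0 / norm)           # 11.52 +- 2.40

# Poisson ML: for two counts of the same mean, the estimate is the plain mean
ml_ave = sum(x) / len(x)                 # 12.5
ml_err = math.sqrt(ml_ave / len(x))      # 2.5
```

The LS average is biased low because a downward fluctuation also lowers the assigned error and therefore raises the weight of that point.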

SLIDE 34
  • 2. Cross section measurement

Least squares is popular in particle physics for cross section fits and averaging, using data from ≥ 1 experiment. Cross sections x_i are measured via counted numbers n_i = S · x_i of events:

    cross section   x_i = S^-1 · n_i,   i = 1, . . .

where the sensitivity factor S is a product

    S = A1 · A2 · · · Aa · ∫ L dt · ∆x

of many factors (trigger, detection, reconstruction . . . probabilities, luminosity, bin width).

The number n_i follows a Poisson distribution; the sensitivity S will follow a log-normal distribution (log of S normally distributed) – the inverse S^-1 will follow a log-normal distribution too:

    cross section   x_i = S^-1 (log-normal) × n_i (Poisson)   – possible with Aplcon

Normalization factors will approximately follow the log-normal distribution, as a consequence of the Central Limit Theorem: a product of many factors with small uncertainty.

Cross section: log-normal × Poisson, but in practice often assumed to follow the normal distribution, with the x_i assumed to be independent (diagonal covariance matrix); even resolution-corrected (“unfolded”) cross sections are usually assumed to be independent (!).

SLIDE 35
  • 3. Log-normal distribution

Log-normal distribution, e.g. for normalization factors (and other variables which by definition are positive). Log-normal variable (with uncertainty ∝ value): external α ⇒ exp[α′] with a new internal Gaussian variable α′ ≡ ln α. Example: α = 1 ± 0.2.

[Figures: log-normal density of x; normal density compared with the density of 1/x.]

Data in HEP are often given with the uncertainty in %, i.e. a relative uncertainty. This indicates the log-normal (instead of the normal) distribution, with constant relative uncertainty.

SLIDE 36
  • 4. Averaging with “normalization” uncertainty

“χ2-function”

In a publication (NIM A) the following measurement of two data points x1, x2 and a common normalization factor α with uncertainty ǫ is given:

x1 = 8.0 ± 2%   x2 = 8.5 ± 2%   α = 1 ± ǫ with ǫ = 0.1

“Assuming that the two measurements refer to the same physical quantity, the best estimate of its true value can be obtained by fitting the points to a constant” (from the publication). Result of the straightforward unweighted average: xave = (x1 + x2)/2 = 8.25, but . . .

Publication: average xave by “χ²-function minimization”, where the covariance matrix V is defined to include the normalization uncertainty:

χ² = ∆ᵀ V⁻¹ ∆ = minimum  with  V11 = σ1² + ǫ²x1²,  V22 = σ2² + ǫ²x2²,  V12 = V21 = ǫ²x1x2

(∆ is the vector of the differences between the xi and the average xave).

Resulting average is xave = 7.87 ± 0.81, outside (!) the range of the two input values . . . an apparently wrong, large bias from the constructed non-diagonal covariance matrix. Note: weights w1 = +1.25 and w2 = −0.25, because σ1 < σ2 and the correlation is large and positive.

Try “Peelle’s puzzle” in Google (or NIM A 346 (1994) 306–311).
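The puzzling average can be reproduced in a few lines; a sketch using only the numbers on this slide (not aplcon):

```python
import math

x1, x2, eps = 8.0, 8.5, 0.1
s1, s2 = 0.02 * x1, 0.02 * x2              # 2% statistical errors
# covariance matrix including the normalization uncertainty
V11 = s1**2 + (eps * x1)**2
V22 = s2**2 + (eps * x2)**2
V12 = eps**2 * x1 * x2
det = V11 * V22 - V12**2
u1 = (V22 - V12) / det                     # components of V^-1 * (1, 1)
u2 = (V11 - V12) / det                     # note: u2 < 0 (negative weight)
xave = (u1 * x1 + u2 * x2) / (u1 + u2)     # BLUE average, approx. 7.87
err = math.sqrt(1.0 / (u1 + u2))           # approx. 0.81
```

The negative weight on x2 pulls the average below both input values.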

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 36

SLIDE 37
  • 5. Covariance matrix plot

“χ²-function”

The axis of the covariance ellipse is slightly tilted (left) because the input values x1 and x2 (and σ1, σ2) are not equal; this causes the “strange” value of the average.

χ² = ∆ᵀ V⁻¹ ∆ = minimum  with  V11 = σ1² + ǫ²x1²,  V22 = σ2² + ǫ²x2²,  V12 = V21 = ǫ²x1x2

(∆ is the vector of the differences between the xi and the average xave).

[Figure: covariance ellipses in the (x1, x2) plane, both axes from 6 to 10; left panel: tilted axis, right panel: non-tilted axis.]

Axis of covariance ellipse is not tilted for σ1 = σ2 (right).
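The tilt can be quantified from the eigenvectors of V; a small sketch (numbers as on slide 36; “non-tilted” is read here as a major axis along the 45° diagonal):

```python
import math

def tilt_deg(x1, x2, s1, s2, eps):
    """Orientation (degrees) of the major axis of the covariance ellipse."""
    V11 = s1**2 + (eps * x1)**2
    V22 = s2**2 + (eps * x2)**2
    V12 = eps**2 * x1 * x2
    return 0.5 * math.degrees(math.atan2(2.0 * V12, V11 - V22))

tilted = tilt_deg(8.0, 8.5, 0.16, 0.17, 0.1)        # x1 != x2: axis leaves 45 deg
untilted = tilt_deg(8.25, 8.25, 0.165, 0.165, 0.1)  # equal inputs: exactly 45 deg
```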

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 37

SLIDE 38
  • 6. Solution with constraints

. . . with aplcon

With two constraints the average xave, multiplied by the normalization factor α, is forced to agree with the two measurements:

X := . . . (variable array)   Vx := . . . (matrix array)
f(1) = α · xave − x1
f(2) = α · xave − x2

X = (Xm | Xu) = (x1, x2, α, xave)

variable   measured    fit result       pull
x1         8.0 ± 2%    8.235 ± 0.116    2.14
x2         8.5 ± 2%    8.235 ± 0.116    −2.14
α          1 ± 10%     1.000 ± 0.100    −2.14
xave                   8.235 ± 0.832

χ² = 4.6, ndf = 1, p-value = 3.2%

No problem with the normalization uncertainty in constrained least squares. The data contain no information about the normalization, so the normalization factor stays unchanged!
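For this special case the constrained-fit result has a closed form: the data carry no information on α, so α stays at 1, xave is the statistically weighted mean, and the normalization uncertainty is propagated into the total uncertainty afterwards. A sketch (not aplcon itself):

```python
import math

x1, x2, eps = 8.0, 8.5, 0.1
s1, s2 = 0.02 * x1, 0.02 * x2
w1, w2 = 1.0 / s1**2, 1.0 / s2**2
xave = (w1 * x1 + w2 * x2) / (w1 + w2)        # approx. 8.235, inside the range
stat = math.sqrt(1.0 / (w1 + w2))             # approx. 0.116
total = math.sqrt(stat**2 + (eps * xave)**2)  # approx. 0.832, with normalization
```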

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 38

SLIDE 39
  • 7. Triangle parameters

Determination of triangle parameters:

  • three sides a, b and c are measured,
  • one angle γ is measured.

Three values are sufficient for a complete definition of a triangle; thus the least squares method can be used to improve the measured values. The parameter of interest is assumed to be the triangle area A, with uncertainty from (“error”) propagation:

A = √( p(p − a)(p − b)(p − c) ) = p (p − c) tan(γ/2)   with   p = (a + b + c)/2

[Figure: triangle with sides a, b, c and the angle γ between sides a and b.]

The least squares problem with four measured values, possibly with a non-diagonal covariance matrix, and one (unmeasured) parameter can be solved easily by aplcon, including propagation of uncertainties.

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 39

SLIDE 40

. . . continued

X := . . . (variable array)   Vx := . . . (matrix array)
p = (a + b + c)/2                   ! circumference/2
A = √( p(p − a)(p − b)(p − c) )     ! area of triangle
f(1) = tan(γ/2) − A/(p(p − c))      ! angle constraint

X = (Xm | Xu) = (a, b, c, γ, A)

variable   measured    fit result       pull
a          10 ± 0.05   10.01 ± 0.05     1.75
b          7 ± 0.2     7.06 ± 0.20      1.75
c          9 ± 0.2     8.72 ± 0.12      −1.75
γ          1 ± 0.02    1.019 ± 0.017    1.75
A                      30.10 ± 0.87

e.g. the unitarity triangle, representing interactions between quarks
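For comparison, the area and its uncertainty can be propagated from the three measured sides alone, ignoring the γ measurement; this is the pre-fit value, slightly different from the fitted 30.10 ± 0.87:

```python
import math

def heron(a, b, c):
    """Triangle area from the three sides (Heron's formula)."""
    p = 0.5 * (a + b + c)
    return math.sqrt(p * (p - a) * (p - b) * (p - c))

a, b, c = 10.0, 7.0, 9.0
sigmas = (0.05, 0.2, 0.2)
area = heron(a, b, c)                        # approx. 30.59
# linear error propagation with central numerical derivatives
h = 1e-6
grads = (
    (heron(a + h, b, c) - heron(a - h, b, c)) / (2 * h),
    (heron(a, b + h, c) - heron(a, b - h, c)) / (2 * h),
    (heron(a, b, c + h) - heron(a, b, c - h)) / (2 * h),
)
sigma_area = math.sqrt(sum((g * s)**2 for g, s in zip(grads, sigmas)))  # approx. 0.91
```

Adding the γ measurement through the constraint is what shifts and sharpens the result in the fit above.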

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 40

SLIDE 41
  • 8. Correlated additive systematic errors II

x1 ± σ1 ± ∆a1 ± ∆b1
x2 ± σ2 ± ∆a2 ± ∆b2

Statistical errors σ1 and σ2 are uncorrelated, but the two sources a and b of systematic errors are assumed to be fully (positively) correlated between ∆a1, ∆a2 and between ∆b1, ∆b2.

  • Either define a non-diagonal covariance matrix by the law of (linear) propagation of uncertainties,

V11 = σ1² + ∆a1² + ∆b1²,  V22 = σ2² + ∆a2² + ∆b2²,  V12 = V21 = ∆a1∆a2 + ∆b1∆b2,

i.e. insert all systematic uncertainty contributions into the covariance matrix, which has to be inverted for the fit;

  • or introduce one additional measured variable for each source of uncertainty:

x1 = x1′ + a · ∆a1 + b · ∆b1
x2 = x2′ + a · ∆a2 + b · ∆b2

with measured values x1′ ± σ1, x2′ ± σ2, a = 0 ± 1, b = 0 ± 1 and the diagonal covariance matrix V(x1′, x2′, a, b) = diag(σ1², σ2², 1, 1), using aplcon with constraints: e.g. a pull for each source of systematic uncertainty is available.

Both methods are equivalent (for additive uncertainties).
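The claimed equivalence can be checked numerically; a sketch with hypothetical numbers and, for brevity, a single systematic source a:

```python
x = (10.2, 9.5)            # hypothetical measurements
s = (0.4, 0.6)             # uncorrelated statistical errors
d = (0.5, 0.3)             # fully correlated additive systematic shifts

# method 1: fold the systematic into the covariance matrix (BLUE average)
V11, V22, V12 = s[0]**2 + d[0]**2, s[1]**2 + d[1]**2, d[0] * d[1]
det = V11 * V22 - V12**2
u1, u2 = (V22 - V12) / det, (V11 - V12) / det
mu_cov = (u1 * x[0] + u2 * x[1]) / (u1 + u2)

# method 2: nuisance parameter a = 0 +- 1, model x_i = mu + a * d_i;
# normal equations for (mu, a), solved with Cramer's rule
w = [1.0 / si**2 for si in s]
Sw = w[0] + w[1]
Swd = w[0] * d[0] + w[1] * d[1]
Swdd = w[0] * d[0]**2 + w[1] * d[1]**2 + 1.0   # +1.0 from the a = 0 +- 1 prior
bx = w[0] * x[0] + w[1] * x[1]
bxd = w[0] * x[0] * d[0] + w[1] * x[1] * d[1]
D = Sw * Swdd - Swd**2
mu_nui = (bx * Swdd - bxd * Swd) / D
```

Both routes give the same average (here 9.875), as the slide states for additive uncertainties.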

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 41

SLIDE 42
  • 9. Correlated multiplicative systematic errors II

Multiplicative systematic errors (e.g. normalization) are expressed by relative errors ∆rel, sometimes written as a factor (1 + ∆rel) applied to measured or fitted values x. Now the errors ∆a1, ∆b1, ∆a2, ∆b2 are defined as relative errors!

1. factor: (1 ± ∆a1)(1 ± ∆b1) · · ·
2. factor: (1 ± ∆a2)(1 ± ∆b2) · · ·

Statistical errors σ1 and σ2 are uncorrelated, but the systematic multiplicative relative errors are assumed to be fully (positively) correlated between ∆a1, ∆a2 and between ∆b1, ∆b2. The factors can be assumed to follow the log-normal distribution:

exp (∆rel) ≈ 1 + ∆rel + (1/2!) ∆rel² + . . .

Factors applied to the fitted values x:

1. factor: (1 + a · ∆a1)(1 + b · ∆b1) · · · → exp (a · ∆a1) exp (b · ∆b1) · · ·
2. factor: (1 + a · ∆a2)(1 + b · ∆b2) · · · → exp (a · ∆a2) exp (b · ∆b2) · · ·

with measured values x1′ ± σ1, x2′ ± σ2, a = 0 ± 1, b = 0 ± 1 and the diagonal covariance matrix V(x1′, x2′, a, b) = diag(σ1², σ2², 1, 1).
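One way to see the multiplicative case at work: taking logarithms turns the fully correlated factor exp (a · ∆) into an additive shift, so the additive machinery applies in log space. A sketch with hypothetical numbers and one common normalization source:

```python
import math

x = (98.0, 102.0)          # hypothetical measurements of the same quantity
r = (0.03, 0.04)           # relative statistical errors
dn = 0.05                  # one fully correlated relative normalization error

# model in log space: ln x_i = m + a * dn, with nuisance a = 0 +- 1
y = [math.log(v) for v in x]
w = [1.0 / ri**2 for ri in r]
Sw = w[0] + w[1]
Swd = dn * Sw
Swdd = dn * dn * Sw + 1.0                  # +1.0 from the a = 0 +- 1 prior
by = w[0] * y[0] + w[1] * y[1]
byd = dn * by
D = Sw * Swdd - Swd**2
m = (by * Swdd - byd * Swd) / D
a = (Sw * byd - Swd * by) / D              # data carry no information on a common factor
mu = math.exp(m)                           # combined value, back in linear space
```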

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 42

SLIDE 43
  • 10. Combination of FNAL results on the mass of the top quark

2010 data: 11 measurements and 14 systematic uncertainty categories.

Experiment         Data    mass ± stat      weight wi
CDF Run I          l+j     176.1 ± 5.1      −0.025
                   di-l    167.4 ± 10.3     −0.005
                   all-j   186.0 ± 10.0     −0.007
D0 Run I           l+j     180.1 ± 3.6      +0.013
                   di-l    168.4 ± 12.3     +0.002
CDF Run II (publ)  all-j   174.80 ± 1.70    +0.105
                   trk     175.30 ± 6.20    −0.005
D0 Run II (prel)   l+j*    173.75 ± 0.83    +0.262
                   di-l    174.66 ± 2.92    −0.021
CDF Run II (prel)  l+j*    173.00 ± 0.65    +0.700
                   di-l    170.56 ± 2.19    −0.018

Systematic uncertainty categories (top mass uncertainty, combined values in GeV/c²):

cat.    ∆m      cat.      ∆m
iJES    0.46    Signal    0.19
aJES    0.21    Backgr.   0.23
bJES    0.20    Fit       0.11
cJES    0.13    MC        0.40
dJES    0.19    UN/MI     0.02
rJES    0.15    CR        0.39
LepPt   0.15    MHI       0.09

FERMILAB-TM-2466-E (arXiv:1007.3178v1 [hep-ex]): Combination of CDF and D0 results on the mass of the top quark using up to . . . , based on BLUE (best linear unbiased estimate); all systematic uncertainties are treated as additive uncertainties, either uncorrelated (Fit, iJES) or with full (positive) correlation across runs, experiments, or channels.

From the paper: “The weights of some of the measurements are negative . . . a negative weight means that it affects the resulting mtop central value and helps reduce the total uncertainty.”
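The quoted effect is easy to reproduce for two correlated measurements; a sketch with hypothetical numbers (the weight of the less precise measurement turns negative once ρ > σ1/σ2):

```python
# BLUE combination of two correlated measurements (hypothetical numbers)
s1, s2, rho = 1.0, 2.0, 0.8
denom = s1**2 + s2**2 - 2.0 * rho * s1 * s2
w1 = (s2**2 - rho * s1 * s2) / denom
w2 = (s1**2 - rho * s1 * s2) / denom        # negative, since rho > s1/s2
var = (s1**2 * s2**2 * (1.0 - rho**2)) / denom
```

The negative weight still reduces the combined variance below that of the best single measurement, exactly as the paper notes.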

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 43

SLIDE 44

. . . continued

Averaging with aplcon, treating all uncertainties (except the statistical error and multiple hadron interactions) as multiplicative, assuming log-normal factors (see before).

Comparison:
2009:    mtop = 173.12 ± 1.26 GeV/c²
2010:    mtop = 173.32 ± 1.06 GeV/c²
aplcon:  mtop = 173.09 ± 1.13 GeV/c²
2011:    mtop = 173.18 ± 0.94 GeV/c²

Note: the average mass value is dominated by two experimental values. In aplcon, for each systematic error category the corresponding factor (a, b, . . .) is fitted and its pull is calculated, but no weights wi with Σi wi = 1 exist.

The calculated uncertainty in aplcon is slightly larger.

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 44

SLIDE 45
  • 11. Branching ratios

In total 16 values x1 . . . x16 were measured for five different decay channels of the f1(1285), as ratios of partial widths such as

Γ4/Γ1,  Γ4/(Γ2 + Γ3),  Γ2/(Γ2 + Γ3),  Γ1/(Γ2 + Γ3),  Γ5/(1/3 Γ1),  Γ4/Γ,  (Γ2 + Γ3)/Γ5.

The branching ratios B1 . . . B5 are determined by a constrained fit using all data, with constraints e.g.

f(.) = B4 − X1 · B1
. . .
f(.) = 3 · B5 − B1 · X13
. . .
f(.) = Σj Bj − 1

Symbol   Mode                  PDG fraction Γi/Γ    aplcon: Γi/Γ    (Γi/Γ)∗
B1       f1(1285) → 4π         (33.1 +2.1 −1.9)%    (29.7 ± 1.7)%   (31.9 ± 1.6)%
B2       f1(1285) → a0(980)π   (36 ± 7)%            (39 ± 8)%       (37 ± 8)%
B3       f1(1285) → ηππ        (16 ± 7)%            (16 ± 8)%       (16 ± 7)%
B4       f1(1285) → K K̄ π      (9.0 ± 0.4)%         (7.7 ± 0.4)%    (8.6 ± 0.4)%
B5       f1(1285) → γρ0        (5.5 ± 1.3)%         (6.8 ± 0.7)%    (6.4 ± 0.6)%

PDG: 10 scale factors up to S = 2.8 are applied to the original data. aplcon 1st result: fit with the unmodified original data; aplcon 2nd result (∗): scale factor S = 2.8 applied to the value with the largest pull.
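The sum constraint f(.) = Σj Bj − 1 alone can be imposed on otherwise independent fractions by the standard linearly constrained least-squares update B → b + V h λ with h = (1, . . . , 1); a sketch with hypothetical numbers and a diagonal covariance matrix:

```python
b = [0.33, 0.36, 0.16, 0.09, 0.055]   # hypothetical measured fractions
sig = [0.02, 0.07, 0.07, 0.004, 0.013]

hVh = sum(s * s for s in sig)         # h^T V h for h = (1, ..., 1), V diagonal
lam = (1.0 - sum(b)) / hVh            # Lagrange multiplier for sum(B) = 1
B = [bi + s * s * lam for bi, s in zip(b, sig)]
```

Each fraction moves in proportion to its variance, so the most precise fractions move least.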

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 45

SLIDE 46

Summary Data combination

  • Increase of precision by data combination, and test for data compatibility;
  • Requires understanding of physics, detector behaviour and data analysis, . . .
  • . . . and estimation of all relevant systematic uncertainties, . . .
  • . . . and of the concepts of statistical correlations and distributions.

Constrained least squares fits allow solution of complicated combination problems:

  • non-linearities in the model;
  • model relations with > 1 measured values and with correlated uncertainties;
  • non-Gaussian variables, e.g. following the Poisson and the log-normal distribution

with simple code, reduced to the essentials.

  • V. Blobel – University of Hamburg

Data Combination in Particle Physics page 46

SLIDE 47

Contents

Part 1. Combining correlated data . . . 2

  • 1. Combining data . . . 3
  • 2. Averaging by linear least squares . . . 4
  • 3. Mean values, variances and covariances, correlations . . . 5
  • 4. Combining correlated data of a single quantity . . . 6
  • 5. Weights by Lagrange multiplier method . . . 8
  • 6. Least squares: Gauss, Legendre and Lagrange . . . 9
  • 7. Charm particle lifetime . . . 10
  • 8. Common additive systematic error I . . . 11
  • 9. Common multiplicative systematic error I . . . 12
  • 10. Average of two correlated data . . . 13
  • 11. Two-by-two covariance matrix from maximum likelihood . . . 15
  • 12. The two-dimensional normal distribution . . . 16
  • 13. Dzero result . . . 17
  • 14. Dzero: how big is the correlation coefficient? . . . 18
  • 15. Two alternative, but equivalent data combination methods . . . 19
  • 16. The PDG strategy . . . 20

Part 2. Constrained Least Squares . . . 21

  • 1. x-y-data with uncertainties in both coordinates . . . 22
  • 2. Constrained Least Squares . . . 23
  • 3. Alternative least squares methods for fitting/averaging . . . 24
  • 4. Comparison . . . 25
  • 5. Constrained least squares fit program Aplcon . . . 26
  • 6. Averaging correlated scattering lengths . . . 27
  • 7. Straight line with uncertainties in both coordinates . . . 28
  • 8. Straight line and correlated data . . . 29
  • 9. Uncertainties of fit parameters . . . 30
  • 10. Mathematics: solution of constrained least squares . . . 31

Part 3. Non-Gaussian data and nonlinearities . . . 32

  • 1. Poisson distributed data – counts . . . 33
  • 2. Cross section measurement . . . 34
  • 3. Log-normal distribution . . . 35
  • 4. Averaging with “normalization” uncertainty . . . 36
  • 5. Covariance matrix plot . . . 37
  • 6. Solution with constraints . . . 38
  • 7. Triangle parameters . . . 39
  • 8. Correlated additive systematic errors II . . . 41
  • 9. Correlated multiplicative systematic errors II . . . 42
  • 10. Combination of FNAL results on the mass of the top quark . . . 43
  • 11. Branching ratios . . . 45

Summary . . . 46