
Practical data analysis - PowerPoint PPT Presentation



  1. Practical data analysis
     Doru Constantin (doru.constantin@u-psud.fr) and Guillaume Tresset (guillaume.tresset@u-psud.fr), Laboratoire de Physique des Solides, Orsay.
     Outline: References; Variability; Probability Distributions; Large Number Theorems; Width of a distribution; Sampling; Chi-squared distribution; Errors.

  2. References I
     ◮ Barlow, R. J. (1993). Statistics: A Guide to the Use of Statistical Methods in the Physical Sciences. Chichester, England; New York: Wiley.
     ◮ Bevington, P. R. (1969). Data Reduction and Error Analysis for the Physical Sciences. New York: McGraw-Hill.
     ◮ Bevington, P. R. and K. Robinson (2003). Data Reduction and Error Analysis for the Physical Sciences (3 ed.). New York: McGraw-Hill.
     ◮ Bohm, G. and G. Zech (2010). Introduction to Statistics and Data Analysis for Physicists. Hamburg: Verlag Deutsches Elektronen-Synchrotron. Freely available online from http://www-library.desy.de/preparch/books/vstatmp_engl.pdf

  3. References II
     ◮ Drosg, M. (2009). Dealing with Uncertainties (2 ed.). Springer.
     ◮ Feller, W. (1968). An Introduction to Probability Theory and Its Applications (3 ed.). New York: Wiley.
     ◮ Grinstead, C. M. and J. L. Snell (1997). Introduction to Probability (2 ed.). American Mathematical Society. Freely available online from http://www.dartmouth.edu/~chance/
     ◮ Hughes, I. G. and T. P. A. Hase (2010). Measurements and their Uncertainties. Oxford: Oxford University Press. Short and very legible introduction.

  4. References III
     ◮ Jaynes, E. T. (2003). Probability Theory – The Logic of Science. Cambridge: Cambridge University Press.
     ◮ Press, W. H., S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery (1992). Numerical Recipes in C: The Art of Scientific Computing (2 ed.). Cambridge: Cambridge University Press.
     ◮ Taylor, J. R. (1997). An Introduction to Error Analysis (2 ed.). Sausalito: University Science Books.

  5. Variability
     1. When measuring the height of all adult males in a certain town, one finds 177 ± 5 cm.
     2. The charge of the electron is $(1.602176565 \pm 0.000000035) \times 10^{-19}$ C.

  6. The meaning of probability
     Casting a die:
     1. Out of a large number of trials, each face will come on top about 1 in 6 times.
     2. Our state of knowledge gives us no reason to prefer one of the faces over the others.
     Each face has a 1/6 probability of coming up.

  7. Random variables
     ◮ A random variable “is simply an expression whose value is the outcome of a particular experiment” (Grinstead & Snell, 1997). It takes values in a certain domain $\Omega$.
     ◮ This domain (or sample space) can be discrete, $\Omega = \{\omega_1, \omega_2, \ldots, \omega_k, \ldots\} \subset \mathbb{Z}^n$ (finite or countably infinite), or continuous, $\Omega \subset \mathbb{R}^n$.
     ◮ The elements of the sample space ($\omega_k$ or $x \in \mathbb{R}^n$) are called outcomes. Subsets of $\Omega$ are called events.
     ◮ We introduce a probability distribution, characterized by a distribution function $m$. In the discrete case, this function satisfies:
       $m(\omega) \geq 0, \ \forall \omega \in \Omega$ and $\sum_{\omega \in \Omega} m(\omega) = 1$.
       The probability of an event $E$ is defined as: $P(E) = \sum_{\omega \in E} m(\omega)$.
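The discrete definitions above can be sketched in a few lines of Python. The example assumes a fair six-sided die as the sample space; the names `omega`, `m` and `probability` are chosen here for illustration and are not part of the slides.

```python
from fractions import Fraction

# Sample space of a six-sided die (assumed fair for this illustration).
omega = range(1, 7)

# Distribution function m: every outcome gets probability 1/6.
m = {w: Fraction(1, 6) for w in omega}

def probability(event):
    """P(E) = sum of m(w) over the outcomes w in the event E."""
    return sum((m[w] for w in event), Fraction(0))

total = probability(omega)            # normalization: sum over all of Omega
p_even = probability({2, 4, 6})       # event "even face"
p_at_least_5 = probability({5, 6})    # event "5 or more"
print(total, p_even, p_at_least_5)
```

Exact fractions are used so that the normalization condition can be checked without rounding error.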

  8. Continuous distributions
     Let $X$ be a continuous real-valued random variable. A density function for $X$ is a function $f : \Omega \to \mathbb{R}$ such that
     $P(a \leq X \leq b) = \int_a^b f(x)\,\mathrm{d}x, \ \forall a, b \in \mathbb{R}$.
     More generally, $P(X \in E) = \int_E f(x)\,\mathrm{d}x, \ \forall E \subset \mathbb{R}$, and in particular $P([x, x + \mathrm{d}x]) = f(x)\,\mathrm{d}x$: $f(x)\,\mathrm{d}x$ is the probability of an outcome in the infinitesimal interval $[x, x + \mathrm{d}x]$.
     The cumulative distribution function of $X$ is:
     $F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)\,\mathrm{d}t$, with $\frac{\mathrm{d}}{\mathrm{d}x} F(x) = f(x)$.
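As a rough numerical illustration of these relations, the sketch below integrates a density with the midpoint rule to obtain interval probabilities and the cumulative distribution function. The standard normal density is used only as an example of $f$, and the finite lower limit of −10 standing in for $-\infty$ is an assumption of this sketch.

```python
import math
from statistics import NormalDist

def density(x):
    """Standard normal density, used here only as an example f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def cdf(x, lower=-10.0):
    """F(x) = P(X <= x); the lower limit -10 stands in for minus infinity."""
    return integrate(density, lower, x)

p_interval = integrate(density, -1.0, 1.0)   # P(-1 <= X <= 1)
print(p_interval, cdf(0.0), NormalDist().cdf(0.0))
```

A central finite difference of `cdf` recovers `density`, which is the numerical counterpart of $\mathrm{d}F/\mathrm{d}x = f$.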

  9. Central tendency
     Figure: Log-normal distribution with parameters µ = 0 and σ = 0.25 (solid line) and σ = 1 (dashed line). The mean (blue), median (green) and mode (red) are shown for both curves.
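The three measures marked in the figure can be evaluated from the standard closed forms for a log-normal distribution: mean $e^{\mu + \sigma^2/2}$, median $e^{\mu}$ and mode $e^{\mu - \sigma^2}$. These formulas are not stated on the slide; the sketch below simply evaluates them for the two parameter sets of the caption.

```python
import math

def lognormal_summary(mu, sigma):
    """Closed-form mean, median and mode of a log-normal distribution."""
    mean = math.exp(mu + sigma**2 / 2)
    median = math.exp(mu)
    mode = math.exp(mu - sigma**2)
    return mean, median, mode

for sigma in (0.25, 1.0):
    mean, median, mode = lognormal_summary(0.0, sigma)
    print(f"sigma={sigma}: mean={mean:.4f} median={median:.4f} mode={mode:.4f}")
```

For both curves the ordering is mode < median < mean, and the three values move apart as σ grows and the distribution becomes more skewed.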

  10. Spread
      Figure: Boxplot details. A boxplot is compared with a normal density on an axis from −4σ to 4σ. The quartiles Q1 and Q3 lie at ∓0.6745σ around the median and enclose 50% of the probability (the interquartile range, IQR). The whisker limits Q1 − 1.5 × IQR and Q3 + 1.5 × IQR lie at ∓2.698σ, with 24.65% of the probability between each quartile and its whisker limit. The last panel is labelled 15.73%, 68.27%, 15.73%, with 68.27% lying within ±1σ.
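The numbers in the boxplot figure can be reproduced for a standard normal distribution with Python's standard library. This is a minimal check, assuming a normal density as in the figure; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()                      # standard normal: mu = 0, sigma = 1

q1, q3 = z.inv_cdf(0.25), z.inv_cdf(0.75)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr          # Q1 - 1.5 x IQR
upper_fence = q3 + 1.5 * iqr          # Q3 + 1.5 x IQR

between_q3_and_fence = z.cdf(upper_fence) - z.cdf(q3)
within_one_sigma = z.cdf(1.0) - z.cdf(-1.0)

print(f"Q3 = {q3:.4f} sigma, upper fence = {upper_fence:.3f} sigma")
print(f"Q3 to fence: {between_q3_and_fence:.2%}, within 1 sigma: {within_one_sigma:.2%}")
```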

  11. Higher-order moments
      Skewness: $\gamma_1 = \left\langle \left( \frac{X - \mu}{\sigma} \right)^3 \right\rangle$; kurtosis: $\gamma_2 = \left\langle \left( \frac{X - \mu}{\sigma} \right)^4 \right\rangle - 3$.
      Graphics by MarkSweep. Licensed under Public domain via Wikimedia Commons.
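Applied to a finite data set, the two definitions can be sketched as below. The function name `standardized_moments` is hypothetical, and the averages are plain 1/N (population) moments with no small-sample bias correction, which is an assumption of this sketch rather than something the slide specifies.

```python
import math

def standardized_moments(data):
    """Return (gamma_1, gamma_2): skewness and excess kurtosis of the data.

    Uses population (1/N) moments, i.e. no small-sample bias correction.
    """
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / n)
    z = [(x - mu) / sigma for x in data]          # standardized values
    gamma_1 = sum(t ** 3 for t in z) / n
    gamma_2 = sum(t ** 4 for t in z) / n - 3.0
    return gamma_1, gamma_2

print(standardized_moments([-1.0, 1.0]))            # symmetric two-point data
print(standardized_moments([0.0, 0.0, 0.0, 4.0]))   # long right tail
```

The symmetric data set has zero skewness, while the data set with a long right tail has positive skewness.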

  12. Uniform
      ◮ All outcomes have equal probability.
      ◮ $U(x; a, b) = \begin{cases} \frac{1}{b - a} & \text{for } x \in [a, b] \\ 0 & \text{otherwise} \end{cases}$
      ◮ Mean $\mu = \frac{1}{2}(a + b)$, median $m = \frac{1}{2}(a + b)$, mode $M$ = any value in $[a, b]$.
      ◮ $\sigma^2 = \frac{1}{12}(b - a)^2$, $\gamma_1 = 0$, $\gamma_2 = -6/5$
      ◮ One cannot have a uniform distribution over an infinite domain (discrete or continuous)!
      Figure: density $f(x)$, equal to $\frac{1}{b - a}$ between $a$ and $b$, and cumulative distribution $F(x)$, rising from 0 at $a$ to 1 at $b$. Graphics by IkamusumeFan. Licensed under CCA-SA 3.0 via Wikimedia Commons.
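As a worked check of the width and kurtosis quoted for the uniform distribution, the central moments can be integrated directly. The shorthand $w = b - a$ and $u = x - \mu$ is introduced only for this derivation and does not appear on the slide; $\gamma_1 = 0$ follows from the symmetry of the density about $\mu$.

```latex
\begin{align*}
\sigma^2 &= \int_a^b \frac{(x-\mu)^2}{b-a}\,\mathrm{d}x
          = \frac{1}{w}\int_{-w/2}^{w/2} u^2\,\mathrm{d}u
          = \frac{1}{w}\cdot\frac{2}{3}\left(\frac{w}{2}\right)^3
          = \frac{w^2}{12}, \\
\left\langle (X-\mu)^4 \right\rangle
         &= \frac{1}{w}\int_{-w/2}^{w/2} u^4\,\mathrm{d}u
          = \frac{1}{w}\cdot\frac{2}{5}\left(\frac{w}{2}\right)^5
          = \frac{w^4}{80}, \\
\gamma_2 &= \frac{w^4/80}{\left(w^2/12\right)^2} - 3
          = \frac{144}{80} - 3 = -\frac{6}{5}.
\end{align*}
```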

  13. Binomial
      ◮ Number $k$ of successes in a sequence of $n$ independent yes/no experiments (Bernoulli trials), each of which yields success with probability $p$.
      ◮ $B(k; n, p) = C_n^k \, p^k (1 - p)^{n - k}$; $k \in \{0, 1, \ldots, n\}$
      ◮ $\mu = np$, $m = \lfloor np \rfloor$ or $\lceil np \rceil$, $M = \lfloor (n + 1) p \rfloor$ or $\lceil (n + 1) p \rceil - 1$.
      ◮ $\sigma^2 = np(1 - p)$, $\gamma_1 = \frac{1 - 2p}{\sqrt{np(1 - p)}}$, $\gamma_2 = \frac{1 - 6p(1 - p)}{np(1 - p)}$
      ◮ $k$ is the variable, $n$ and $p$ are parameters.
      Figure: binomial distributions for $p = 0.5$, $n = 20$; $p = 0.7$, $n = 20$; and $p = 0.5$, $n = 40$, plotted for $k$ from 0 to 40. Graphics by Tayste. Licensed under Public domain via Wikimedia Commons.
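The quoted mean, variance and mode can be checked numerically by tabulating the distribution. The sketch below uses one of the parameter sets plotted on the slide ($n = 20$, $p = 0.7$); the helper name `binomial_pmf` is chosen here for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """B(k; n, p) = C(n, k) p^k (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.7                        # one of the cases plotted on the slide
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                          # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))            # compare with n p
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
mode = max(range(n + 1), key=lambda k: pmf[k])            # most probable k

print(total, mean, variance, mode)
```

Here $k$ runs over the table index while $n$ and $p$ stay fixed, which mirrors the distinction between variable and parameters made on the slide.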

  14. Normal
      ◮ Very widely encountered.
      ◮ $N(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$; $x \in \mathbb{R}$
      ◮ $\langle X \rangle = m = M = \mu$
      ◮ $\langle (X - \mu)^2 \rangle = \sigma^2$, $\gamma_1 = 0$, $\gamma_2 = 0$
      Figure (two panels, $x$ from −5 to 5, vertical axis from 0 to 1): normal distribution curves for ($\mu = 0$, $\sigma^2 = 0.2$), ($\mu = 0$, $\sigma^2 = 1.0$), ($\mu = 0$, $\sigma^2 = 5.0$) and ($\mu = -2$, $\sigma^2 = 0.5$). Graphics by Inductiveload. Licensed under Public domain via Wikimedia Commons.

  15. Poisson
      ◮ Probability of a given number $k$ of independent events occurring in a fixed interval with a known average rate.
      ◮ $P(k; \lambda) = \frac{\lambda^k}{k!} \, e^{-\lambda}$; $k \in \mathbb{N}$, $\lambda \in \mathbb{R}^+$
      ◮ $\mu = \lambda$, $m \simeq \lfloor \lambda + 1/3 - 0.02/\lambda \rfloor$, $M = \lceil \lambda \rceil - 1, \lfloor \lambda \rfloor$
      ◮ $\sigma^2 = \lambda$, $\gamma_1 = \lambda^{-1/2}$, $\gamma_2 = \lambda^{-1}$
      ◮ Can be seen as the limit of a binomial distribution for large $n$: $P(k; \lambda = np) \simeq B(k; n, p)$
      ◮ Approaches $N$ for large $\lambda$: $P(k; \lambda) \simeq N(x = k; \mu = \lambda, \sigma^2 = \lambda)$
      Graphics by Skbkekas. Licensed under CCA 3.0 via Wikimedia Commons.
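The binomial limit can be illustrated numerically by holding $\lambda = np$ fixed and letting $n$ grow. In the sketch below, $\lambda = 3$ and the range $k = 0, \ldots, 10$ are arbitrary choices made for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """B(k; n, p) = C(n, k) p^k (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(k; lambda) = lambda^k exp(-lambda) / k!."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 3.0                             # fixed mean, so p = lam / n shrinks with n
max_diff = {}
for n in (10, 100, 1000):
    p = lam / n
    # Largest pointwise gap between the two distributions for k = 0..10.
    max_diff[n] = max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11)
    )
    print(n, max_diff[n])
```

The largest discrepancy shrinks as $n$ increases, which is the sense in which the binomial distribution approaches the Poisson one.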

  16. Lorentzian
      ◮ Shape of resonance peaks. Also named after Cauchy (in mathematics) and Breit and Wigner (in spectroscopy).
      ◮ $L(x; x_0, \gamma) = \frac{1}{\pi \gamma \left[ 1 + \left( \frac{x - x_0}{\gamma} \right)^2 \right]}$; $x \in \mathbb{R}$, $x_0 \in \mathbb{R}$, $\gamma \in \mathbb{R}^+$
      ◮ $m = M = x_0$
      ◮ No $\mu$ or higher moments!
      Graphics by Skbkekas. Licensed under CCA 3.0 via Wikimedia Commons.
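The absence of a mean can be seen in a small simulation. The sketch below draws Lorentzian samples by inverting the cumulative distribution function $F(x) = \frac{1}{2} + \frac{1}{\pi} \arctan\frac{x - x_0}{\gamma}$; the parameter values, seed and sample size are arbitrary choices for illustration.

```python
import math
import random
import statistics

def sample_lorentzian(x0, gamma, rng):
    """Draw one value by inverting the CDF: x = x0 + gamma * tan(pi * (u - 1/2))."""
    u = rng.random()
    return x0 + gamma * math.tan(math.pi * (u - 0.5))

rng = random.Random(12345)            # fixed seed, chosen arbitrarily
x0, gamma = 2.0, 1.0                  # illustrative parameter values
samples = [sample_lorentzian(x0, gamma, rng) for _ in range(100_000)]

median = statistics.median(samples)
# Fraction of samples within x0 +/- gamma (the half width at half maximum).
inside = sum(1 for x in samples if abs(x - x0) <= gamma) / len(samples)
# Sample means over growing prefixes of the same data.
running_means = [statistics.fmean(samples[:n]) for n in (100, 1_000, 10_000, 100_000)]

print(f"median = {median:.3f}, fraction within x0 +/- gamma = {inside:.3f}")
print("running means:", running_means)
```

The sample median settles close to $x_0$ and about half of the samples fall within $x_0 \pm \gamma$, whereas the running means need not settle down as more samples are included, because the distribution has no mean.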
