Introduction to Statistics for Quantitative Analysis To support - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Introduction to Statistics for Quantitative Analysis

To support P113, P121, P114, P122, P141 labs

Standard deviations like you’ve never seen them before!

Reviews of Intro to Stats:

“Spellbinding! The talk at all the parties!” Jason Huber, 2nd floor Hoeing RA
“Brilliant! Fun for geeks of all ages!” Rhett Butler, CNN reviews
“A seminal work. Brought to you by the author of Solutions to P113 Problem Set #2.” Leona Helmsley, The Shopping Channel

slide-2
SLIDE 2
slide-3
SLIDE 3

1988 US Presidential election

            Bush   Dukakis   Undecided
Month 1     42%    40%       18%
Month 2     41%    43%       16%

Headline: Dukakis surges past Bush in polls!

4%

slide-4
SLIDE 4

Is statistics relevant to you personally?

  • Global warming
  • Effect of EM radiation
  • Analytical medical diagnostics

slide-5
SLIDE 5

What kinds of things can you measure quantitatively?
What kinds of things can you measure qualitatively?
What is the difference between a qualitative and quantitative measurement?
Which of these types of measurement are important in science?

Insofar as possible, physics is exact and quantitative … though you will repeatedly see mathematical approximations made to get at the qualitative essence of phenomena.

slide-6
SLIDE 6

A quantitative measurement is meaningless without a unit and error.

slide-7
SLIDE 7

Accuracy: A measure of closeness to the “truth”.
Precision: A measure of reproducibility.

slide-8
SLIDE 8

Accuracy vs. precision

[Figure: two panels labeled “accurate” and “precise”]

slide-9
SLIDE 9

Types of errors

Statistical error: Results from a random fluctuation in the process of measurement. Often quantifiable in terms of “number of measurements or trials”. Tends to make measurements less precise.

Systematic error: Results from a bias in the observation due to observing conditions or apparatus or technique or analysis. Tends to make measurements less accurate.

slide-10
SLIDE 10

[Plot: axes labeled “#” and “time”, with the true value marked]

slide-11
SLIDE 11

[Plot: axes labeled “#” and “time”, with the true value marked]

Parent distribution (infinite number of measurements)

slide-12
SLIDE 12

[Plot: axes labeled “#” and “time”, with the true value marked]

The game: From N (not infinite) observations, determine “μ” and the “error on μ” … without knowledge of the “truth”.

slide-13
SLIDE 13

The parent distribution can take different shapes, depending on the nature of the measurement. The two most common distributions one sees are the Gaussian and Poisson distributions.

slide-14
SLIDE 14

[Plot: probability or number of counts vs. x]

Most probable value: highest on the curve. Most likely to show up in an experiment.

slide-15
SLIDE 15

[Plot: probability or number of counts vs. x]

Most probable value
Median: value of x where 50% of measurements fall below and 50% of measurements fall above.

slide-16
SLIDE 16

[Plot: probability or number of counts vs. x]

Most probable value
Median
Mean: the average value of x.

slide-17
SLIDE 17

[Plot: counts vs. x]

The most common distribution one sees (and that which is best for guiding intuition) is the Gaussian distribution.

slide-18
SLIDE 18

[Plot: counts vs. x]

For this distribution, the most probable value, the median value and the average are all the same due to symmetry.

slide-19
SLIDE 19

[Plot: counts vs. x, with x̄ and the true value, μ, marked]

The most probable estimate of μ is given by the mean of the distribution of the N observations.

slide-20
SLIDE 20

[Plot: counts vs. x, with x̄ and the true value, μ, marked]

$$\text{“}\mu\text{”} = \bar{x} = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

slide-21
SLIDE 21

[Plot: measurements along x, with x̄ and the true value, μ, marked]

slide-22
SLIDE 22

[Plot: measurements along x, with x̄ and the true value, μ, marked]

Error goes like

$$\sum_{i=1}^{N} (x_i - \mu)$$

But this particular quantity “averages” out to zero.

Try $(\mu - x_i)^2$ instead.
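The cancellation can be checked numerically. The following is a minimal sketch with made-up measurements; since μ is not known in practice, it assumes the sample mean as a stand-in, about which the signed deviations cancel exactly.

```python
# Hypothetical sample of N repeated measurements (made-up values).
measurements = [4.9, 5.2, 5.0, 4.8, 5.1]

n = len(measurements)
mean = sum(measurements) / n  # stand-in for the unknown true value

# Signed deviations from the mean cancel each other.
sum_dev = sum(x - mean for x in measurements)

# Squared deviations are all non-negative, so they cannot cancel.
sum_sq_dev = sum((x - mean) ** 2 for x in measurements)

print(f"sum of deviations         = {sum_dev:.2e}")
print(f"sum of squared deviations = {sum_sq_dev:.3f}")
```

The first number is zero up to floating-point rounding, while the second grows with the scatter of the data, which is why the squared form is the useful one.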

slide-23
SLIDE 23

[Plot: measurements along x, with x̄ and the true value, μ, marked]

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

The “standard deviation” is a measure of the error in each of the N measurements.
slide-24
SLIDE 24

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$

μ is unknown. So use the mean (which is your best estimate of μ). Change the denominator to increase the error slightly due to having used the mean. This is the form of the standard deviation you use in practice. This quantity cannot be determined from a single measurement.
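A minimal sketch comparing the two denominators, using made-up measurements. The hand-written sums are cross-checked against Python's standard `statistics` module, where `pstdev` divides by N and `stdev` divides by N − 1.

```python
import math
import statistics

# Hypothetical repeated measurements (made-up values).
data = [4.9, 5.2, 5.0, 4.8, 5.1]

n = len(data)
mean = sum(data) / n
sum_sq = sum((x - mean) ** 2 for x in data)

sigma_n = math.sqrt(sum_sq / n)          # denominator N
sigma_n1 = math.sqrt(sum_sq / (n - 1))   # denominator N - 1 (used in practice)

print(f"N denominator:     {sigma_n:.4f}  (statistics.pstdev: {statistics.pstdev(data):.4f})")
print(f"N - 1 denominator: {sigma_n1:.4f}  (statistics.stdev:  {statistics.stdev(data):.4f})")
```

The N − 1 form is always the larger of the two, and with a single measurement it divides by zero, consistent with the remark that it cannot be determined from one measurement.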

slide-25
SLIDE 25

Gaussian distribution

[Plot: counts vs. x]

$$g(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x - \bar{x})^2}{2\sigma^2}}$$

slide-26
SLIDE 26

Gaussian distribution intuition

[Plot: counts vs. x]

1σ is roughly the half width at half max.
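How rough is “roughly”? A small sketch, assuming a unit-width Gaussian centered at zero (the function name `gaussian` is only illustrative): setting the curve equal to half its peak gives a half width of σ√(2 ln 2), about 1.18σ.

```python
import math

def gaussian(x, mean, sigma):
    """Normalized Gaussian with the given mean and standard deviation."""
    return math.exp(-(x - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mean, sigma = 0.0, 1.0  # illustrative choice

# Exact half width at half max, from solving g(mean + w) = g(mean) / 2.
hwhm = sigma * math.sqrt(2 * math.log(2))

peak = gaussian(mean, mean, sigma)
ratio_at_hwhm = gaussian(mean + hwhm, mean, sigma) / peak
ratio_at_sigma = gaussian(mean + sigma, mean, sigma) / peak

print(f"HWHM = {hwhm:.3f} sigma")
print(f"height at HWHM    / peak = {ratio_at_hwhm:.3f}")
print(f"height at 1 sigma / peak = {ratio_at_sigma:.3f}")
```

At exactly 1σ the curve is still at about 61% of its peak, so reading σ off as the half width at half max overestimates it by roughly 18%.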

slide-27
SLIDE 27

Gaussian distribution intuition

[Plot: counts vs. x]

The probability of a measurement falling within 1σ of the mean is 0.683.

slide-28
SLIDE 28

Gaussian distribution intuition

[Plot: counts vs. x]

The probability of a measurement falling within 2σ of the mean is 0.954.

slide-29
SLIDE 29

Gaussian distribution intuition

[Plot: counts vs. x]

The probability of a measurement falling within 3σ of the mean is 0.997.
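These three probabilities can be reproduced from the error function. For a Gaussian, the probability of landing within k standard deviations of the mean is erf(k/√2); the helper name `prob_within` below is only illustrative.

```python
import math

def prob_within(k):
    """Probability that a Gaussian measurement falls within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.3f}")
```

Rounded to three decimals, this gives 0.683, 0.954 and 0.997.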

slide-30
SLIDE 30

            Bush   Dukakis   Undecided
Month 1     42%    40%       18%
Month 2     41%    43%       16%

Headline: Dukakis surges past Bush in polls!

4%

slide-31
SLIDE 31

The standard deviation is a measure of the error made in each individual measurement. Often you want to measure the mean and the error in the mean. Which should have a smaller error, an individual measurement or the mean?

Error in the mean:

$$\sigma_m = \frac{\sigma}{\sqrt{N}}$$
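One way to build confidence in the 1/√N behavior is a small simulation. This sketch assumes Gaussian measurement errors and made-up values for the true value and σ: it repeats an N-measurement experiment many times and compares the scatter of the resulting means with σ/√N.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_VALUE = 10.0   # hypothetical true value
SIGMA = 2.0         # hypothetical error on a single measurement
N = 25              # measurements averaged in each experiment
EXPERIMENTS = 2000  # number of repeated experiments

means = []
for _ in range(EXPERIMENTS):
    sample = [random.gauss(TRUE_VALUE, SIGMA) for _ in range(N)]
    means.append(sum(sample) / N)

spread_of_means = statistics.stdev(means)  # observed scatter of the means
predicted = SIGMA / math.sqrt(N)           # sigma / sqrt(N)

print(f"observed scatter of the mean: {spread_of_means:.3f}")
print(f"sigma / sqrt(N):              {predicted:.3f}")
```

The mean of 25 measurements scatters about five times less than a single measurement does.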

slide-32
SLIDE 32

Numerical example:

Some say if Dante were alive now, he would describe hell in terms of taking a university course in physics. One vision brought to mind by some of the comments I’ve heard is that of the devil standing over the pit of hell gleefully dropping young, innocent, and hardworking students into the abyss in order to measure “g”, the acceleration due to gravity.

  Student 1: 9.0 m/s²
  Student 2: 8.8 m/s²
  Student 3: 9.1 m/s²
  Student 4: 8.9 m/s²
  Student 5: 9.1 m/s²

slide-33
SLIDE 33

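$$\bar{a} = \frac{9.0 + 8.8 + 9.1 + 8.9 + 9.1}{5} = 8.98\ \mathrm{m/s^2}$$

Error on each individual measurement:

$$\sigma = \sqrt{\frac{(9.0 - 8.98)^2 + (8.8 - 8.98)^2 + (9.1 - 8.98)^2 + (8.9 - 8.98)^2 + (9.1 - 8.98)^2}{5 - 1}} = 0.13\ \mathrm{m/s^2}$$

Error in the mean:

$$\sigma_m = \frac{\sigma}{\sqrt{N}} = \frac{0.13}{\sqrt{5}} = 0.06\ \mathrm{m/s^2}$$

$$a = 8.98 \pm 0.06\ \mathrm{m/s^2}$$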
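The same calculation as a short script, using the five student values and the N − 1 form of the standard deviation. The variable names are only illustrative.

```python
import math

# The five student measurements of g, in m/s^2.
g_values = [9.0, 8.8, 9.1, 8.9, 9.1]

n = len(g_values)
mean = sum(g_values) / n

# Standard deviation with the N - 1 denominator: error on each measurement.
sigma = math.sqrt(sum((g - mean) ** 2 for g in g_values) / (n - 1))

# Error in the mean.
sigma_m = sigma / math.sqrt(n)

print(f"mean    = {mean:.2f} m/s^2")
print(f"sigma   = {sigma:.2f} m/s^2")
print(f"sigma_m = {sigma_m:.2f} m/s^2")
```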

slide-34
SLIDE 34

How does an error in one measurable affect the error in another measurable?

[Plot: y = F(x) vs. x; about the point (x1, y1), the interval from x − σ_x to x + σ_x maps onto the interval from y − σ_y to y + σ_y]

slide-35
SLIDE 35

The degree to which an error in one measurable affects the error in another is driven by the functional dependence of the variables (or the slope: dy/dx).

[Plot: y = F(x) vs. x; about the point (x1, y1), the interval from x − σ_x to x + σ_x maps onto the interval from y − σ_y to y + σ_y]
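For a single variable, the slope picture amounts to σ_y ≈ |dy/dx| σ_x. A minimal sketch, assuming the hypothetical relationship y = x² and a made-up measurement x = 3.0 ± 0.1; the slope is estimated numerically with a central difference.

```python
def F(x):
    """Hypothetical relationship between the two measurables."""
    return x ** 2

x1, sigma_x = 3.0, 0.1  # made-up measurement and its error

# Central-difference estimate of the slope dy/dx at x1.
h = 1e-6
slope = (F(x1 + h) - F(x1 - h)) / (2 * h)

# A steeper slope turns the same error in x into a larger error in y.
sigma_y = abs(slope) * sigma_x

print(f"y = {F(x1):.2f} +/- {sigma_y:.2f}")
```

Here the slope is 6, so an error of 0.1 in x becomes an error of 0.6 in y.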

slide-36
SLIDE 36

The complication

Most physical relationships involve multiple measurables!

$$x = x_0 + v_0 t + \tfrac{1}{2} a t^2 \qquad F = Ma \qquad P = Mv$$

y = F(x1, x2, x3, …)

Must take into account the dependence of the final measurable on each of the contributing quantities.

slide-37
SLIDE 37

Partial derivatives

What’s the slope of this graph??

For multivariable functions, one needs to define a “derivative” at each point for each variable that projects out the local slope of the graph in the direction of that variable … this is the “partial derivative”.

slide-38
SLIDE 38

Partial derivatives

The partial derivative with respect to a certain variable is the ordinary derivative of the function with respect to that variable where all the other variables are treated as constants.

$$\frac{\partial F(x, y, z, \ldots)}{\partial x} = \left.\frac{dF(x, y, z, \ldots)}{dx}\right|_{y,\, z,\, \ldots\ \mathrm{const}}$$

slide-39
SLIDE 39

Example

$$F(x, y, z) = x^2 y z^3$$

$$\frac{\partial F}{\partial x} = 2xyz^3$$

$$\frac{\partial F}{\partial y} = x^2 z^3$$

$$\frac{\partial F}{\partial z} = 3x^2 y z^2$$
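These results can be checked numerically by nudging one variable at a time while holding the others fixed. The sketch below uses a central difference; the helper `partial` and the evaluation point are illustrative assumptions.

```python
def F(x, y, z):
    return x ** 2 * y * z ** 3

def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f with
    respect to point[index], with the other variables held constant."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)

x, y, z = 1.5, 2.0, 0.5  # arbitrary evaluation point

numeric = [partial(F, (x, y, z), i) for i in range(3)]
analytic = [2 * x * y * z ** 3, x ** 2 * z ** 3, 3 * x ** 2 * y * z ** 2]

for name, num, ana in zip("xyz", numeric, analytic):
    print(f"dF/d{name}: numeric = {num:.6f}, analytic = {ana:.6f}")
```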

slide-40
SLIDE 40

Dude! Just give us the freakin’ formula!

slide-41
SLIDE 41

The formula for error propagation

If f = F(x, y, z, …) and you want σ_f and you have σ_x, σ_y, σ_z, …, then use the following formula:

$$\sigma_f = \sqrt{\left(\frac{\partial F}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial F}{\partial y}\right)^2 \sigma_y^2 + \left(\frac{\partial F}{\partial z}\right)^2 \sigma_z^2 + \cdots}$$

slide-42
SLIDE 42

The formula for error propagation

If f = F(x, y, z, …) and you want σ_f and you have σ_x, σ_y, σ_z, …, then use the following formula:

$$\sigma_f = \sqrt{\left(\frac{\partial F}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial F}{\partial y}\right)^2 \sigma_y^2 + \left(\frac{\partial F}{\partial z}\right)^2 \sigma_z^2 + \cdots}$$

σ_x: measure of error in x.

slide-43
SLIDE 43

The formula for error propagation

If f = F(x, y, z, …) and you want σ_f and you have σ_x, σ_y, σ_z, …, then use the following formula:

$$\sigma_f = \sqrt{\left(\frac{\partial F}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial F}{\partial y}\right)^2 \sigma_y^2 + \left(\frac{\partial F}{\partial z}\right)^2 \sigma_z^2 + \cdots}$$

∂F/∂x: measure of dependence of F on x.

slide-44
SLIDE 44

The formula for error propagation

If f = F(x, y, z, …) and you want σ_f and you have σ_x, σ_y, σ_z, …, then use the following formula:

$$\sigma_f = \sqrt{\left(\frac{\partial F}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial F}{\partial y}\right)^2 \sigma_y^2 + \left(\frac{\partial F}{\partial z}\right)^2 \sigma_z^2 + \cdots}$$

Similar terms for each variable, add in quadrature.
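The quadrature sum can be written once and reused for any function. This is a minimal sketch rather than a definitive implementation: it assumes the errors on the inputs are independent, estimates each partial derivative numerically, and uses hypothetical helper names (`partial`, `propagate`) and made-up inputs.

```python
import math

def partial(f, values, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f with
    respect to values[index], holding the other variables fixed."""
    up = list(values)
    down = list(values)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)

def propagate(f, values, sigmas):
    """Return sigma_f by adding each (dF/dx_i * sigma_i) term in quadrature.
    Assumes the input errors are independent of one another."""
    total = 0.0
    for i, sigma in enumerate(sigmas):
        total += (partial(f, values, i) * sigma) ** 2
    return math.sqrt(total)

def product(x, y):
    """Hypothetical example function: f = x * y."""
    return x * y

# Made-up inputs: x = 2.0 +/- 0.1, y = 3.0 +/- 0.2.
sigma_f = propagate(product, [2.0, 3.0], [0.1, 0.2])
print(f"f = {product(2.0, 3.0):.1f} +/- {sigma_f:.2f}")
```

For the product, the two terms are (3 × 0.1)² and (2 × 0.2)², which add to 0.25, so σ_f = 0.5.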

slide-45
SLIDE 45

Example

A pitcher throws a baseball a distance of 30±0.5 m at 40±3 m/s (~90 mph). From this data, calculate the time of flight of the baseball.

$$t = F(d, v) = \frac{d}{v} \qquad \frac{\partial F}{\partial d} = \frac{1}{v} \qquad \frac{\partial F}{\partial v} = -\frac{d}{v^2}$$

slide-46
SLIDE 46

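$$\sigma_t^2 = \left(\frac{1}{v}\right)^2 \sigma_d^2 + \left(\frac{d}{v^2}\right)^2 \sigma_v^2 = \left(\frac{0.5}{40}\right)^2 + \left(\frac{30}{40^2} \cdot 3\right)^2$$

$$\sigma_t = 0.058\ \mathrm{s}$$

$$t = 0.75 \pm 0.058\ \mathrm{s}$$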
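The baseball numbers can be verified in a few lines, using the analytic partial derivatives of t = d/v and the values from the example. The variable names are only illustrative.

```python
import math

d, sigma_d = 30.0, 0.5   # distance in m
v, sigma_v = 40.0, 3.0   # speed in m/s

t = d / v

# Partial derivatives of t = d / v.
dt_dd = 1.0 / v
dt_dv = -d / v ** 2

# Add the two contributions in quadrature.
sigma_t = math.sqrt((dt_dd * sigma_d) ** 2 + (dt_dv * sigma_v) ** 2)

print(f"t = {t:.2f} +/- {sigma_t:.3f} s")  # t = 0.75 +/- 0.058 s
```

The speed term (0.056 s) dominates the distance term (0.0125 s), so a better speed measurement would do the most to shrink the final error.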

slide-47
SLIDE 47

Why are linear relationships so important in analytical scientific work?

[Plot: y = F(x) vs. x, with the point (x1, y1) marked]

slide-48
SLIDE 48

[Plot: data points with the line y = F(x) = mx + b]

Is this a good “fit”?

slide-49
SLIDE 49

[Plot: data points with the line y = F(x) = mx + b]

Is this a good fit? Why?

slide-50
SLIDE 50

[Plot: data points with the line y = F(x) = mx + b]

Is this a good fit?

slide-51
SLIDE 51

[Plot: data points with the line y = F(x) = mx + b]

Graphical analysis: pencil and paper still work!

Slope (m) is rise/run.
b is the y-intercept.

slide-52
SLIDE 52

[Plot: data points with the line y = F(x) = mx + b]

Graphical determination of error in slope and y-intercept

slide-53
SLIDE 53

[Plot: data points with the line y = F(x) = mx + b]

Linear regression

With computers: garbage in, garbage out.

slide-54
SLIDE 54

Simple linear regression

y = F(x) = mx + b

Hypothesize a line. Minimize the sum of squared “residuals” of the data with respect to the line (m and b):

$$\frac{\partial}{\partial m} \sum_i (y_i - m x_i - b)^2 = 0 \qquad \frac{\partial}{\partial b} \sum_i (y_i - m x_i - b)^2 = 0$$
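Setting those two derivatives to zero gives a pair of linear equations in m and b with a closed-form solution. The sketch below applies that solution to made-up data points; the function name `line_fit` is illustrative.

```python
def line_fit(xs, ys):
    """Least-squares slope and intercept for y = m*x + b (unweighted)."""
    n = len(xs)
    s_x = sum(xs)
    s_y = sum(ys)
    s_xx = sum(x * x for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    m = (n * s_xy - s_x * s_y) / (n * s_xx - s_x ** 2)
    b = (s_y - m * s_x) / n
    return m, b

# Hypothetical data scattered about a straight line.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

m, b = line_fit(xs, ys)
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]

print(f"m = {m:.2f}, b = {b:.2f}")
print(f"sum of squared residuals = {sum(r * r for r in residuals):.3f}")
```

At the minimum, the residuals sum to zero and are uncorrelated with x, which is exactly what the two derivative conditions say.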

slide-55
SLIDE 55

Simple linear regression

y = F(x) = mx + b

Hypothesize a line.

$$\frac{\partial}{\partial m} \sum_i (y_i - m x_i - b)^2 = 0 \qquad \frac{\partial}{\partial b} \sum_i (y_i - m x_i - b)^2 = 0$$

Can include point-by-point error information to weight better-known points more heavily …
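One common way to do this weighting, assumed here rather than taken from the slides, is to multiply each squared residual by 1/σ_i², where σ_i is the error on point i. The sketch below uses made-up data and a hypothetical function name, `weighted_line_fit`.

```python
def weighted_line_fit(xs, ys, sigmas):
    """Slope and intercept minimizing sum(((y - m*x - b) / sigma)**2).
    The 1/sigma**2 weighting is an assumption, not taken from the slides."""
    weights = [1.0 / s ** 2 for s in sigmas]
    s_w = sum(weights)
    s_x = sum(w * x for w, x in zip(weights, xs))
    s_y = sum(w * y for w, y in zip(weights, ys))
    s_xx = sum(w * x * x for w, x in zip(weights, xs))
    s_xy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    delta = s_w * s_xx - s_x ** 2
    m = (s_w * s_xy - s_x * s_y) / delta
    b = (s_xx * s_y - s_x * s_xy) / delta
    return m, b

# Hypothetical data; the middle point is much less well known.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
sigmas = [0.1, 0.1, 0.5, 0.1, 0.1]

m_w, b_w = weighted_line_fit(xs, ys, sigmas)
print(f"weighted fit: m = {m_w:.3f}, b = {b_w:.3f}")
```

When all the σ_i are equal the weights cancel and the result reduces to the unweighted fit, which is a useful sanity check on any weighted fitting code.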