1 Introduction to Statistics and Data Analysis 2 1.1 Overview: - - PDF document



SLIDE 1

Probability, Statistics, and Statistical Methods

1 Introduction to Statistics and Data Analysis

SLIDE 2

1.1 Overview: Statistical Inference, Samples, Populations, and the Role of Probability

SLIDE 3

1.2 Sampling Procedures; Collection of Data

SLIDE 4

1.3 Random Sampling

SLIDE 5

1.4 Measures of Location: Sample Mean, Sample Mode, and Sample Median

SLIDE 6

The sample mode, denoted by xmode, is the observation with the highest frequency.

The sample median can be a better measure of central tendency than the sample mean, since it is not affected by extreme values in the sample.

1.5 Measures of Variability: Sample Range, Sample Variance, and Sample Standard Deviation

SLIDE 7

The sample range, denoted by r, is given by r = Xmax – Xmin

Data collected on a pH meter from a sample of 10 observations are: 7.07, 7.00, 7.10, 6.97, 7.00, 7.03, 7.01, 7.01, 6.98, 7.08

• The sample mean: x̄ = (7.07 + 7.00 + … + 7.08)/10 = 7.025
• The sample mode: xmode = 7.00 and 7.01 (each occurs twice)
• The sample median: x̃ = (7.01 + 7.01)/2 = 7.01
• The sample range: r = 7.10 – 6.97 = 0.13
• The sample variance: s² = [(7.07 – 7.025)² + (7.00 – 7.025)² + … + (7.08 – 7.025)²]/9 = 0.00194
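The summary statistics for the pH sample can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and not part of the original slides. Note that 7.10 – 6.97 evaluates to 0.13.

```python
import statistics

# pH readings from the example (10 observations)
ph = [7.07, 7.00, 7.10, 6.97, 7.00, 7.03, 7.01, 7.01, 6.98, 7.08]

mean = statistics.mean(ph)
modes = statistics.multimode(ph)       # all values tied for the highest frequency
median = statistics.median(ph)         # average of the two middle values for even n
sample_range = max(ph) - min(ph)       # r = Xmax - Xmin
variance = statistics.variance(ph)     # sample variance, divides by n - 1
std_dev = statistics.stdev(ph)         # square root of the sample variance

print(round(mean, 3), modes, round(median, 2),
      round(sample_range, 2), round(variance, 5), round(std_dev, 4))
```

`statistics.variance` uses the n – 1 divisor, which matches the division by 9 in the worked example.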

SLIDE 8

1.6 Discrete and Continuous Data

Discrete Data – countable, could be finite or infinite, no additional data point between two consecutive data points. Example: number of defects in an automobile, number of trees in a forest, …

Continuous Data – measurable, infinite, additional data points could be found between any two data points. Example: time, weight, density, …

SLIDE 9

1.7 Statistical Modeling, Scientific Inspection, and Graphical Diagnostics

SLIDE 10

SLIDE 11

[Figure: Histogram of Battery Life, shown before editing and after editing. Horizontal axis: Battery Life; vertical axis: Frequency.]

SLIDE 12

[Figure: Boxplot of Battery Life.]

SLIDE 13

2 Probability

Consider the experiment of tossing a die. If we are interested in the number facing up, the sample space would be S = {1, 2, 3, 4, 5, 6}

SLIDE 14

SLIDE 15


A B = A  B  C = C’ = (A  B ) (B  C)  (A  C) =

SLIDE 16

2.3 Counting Sample Points

SLIDE 17

In how many different ways can a buyer order one of these homes? n = 4 · 3 = 12

Sam is going to assemble a computer. He has two choices of chips, four choices of hard drive, three choices of memory, and five choices of case. In how many different ways could Sam assemble the computer?

n = 2 · 4 · 3 · 5 = 120

SLIDE 18

SLIDE 19

In how many different ways could three individuals, A, B, and C, be arranged in a row from left to right? ABC, ACB, BAC, BCA, CAB, CBA: 3! = 6

Example: If three medals (gold, silver, bronze) are to be given to three students in a class of 25 and each student can receive at most one medal, how many possible selections could be made?

25P3 = 25!/(25-3)! = 13,800

SLIDE 20

Example: If three identical medals are to be given to three students in a class of 25 and each student can receive at most one medal, in how many possible ways could the three medals be given?

25C3 = 25!/[3!(25−3)!] = 2,300
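The counting results in this section (multiplication rule, arrangements, permutations, combinations) can be reproduced with Python's standard `math` module. This is a small sketch; the variable names are illustrative only.

```python
import math

# Multiplication rule: 2 chips x 4 hard drives x 3 memory options x 5 cases
computer_builds = math.prod([2, 4, 3, 5])

# Arrangements of three individuals in a row: 3!
row_arrangements = math.factorial(3)

# Distinct medals (gold, silver, bronze): order matters, so use permutations
distinct_medals = math.perm(25, 3)

# Identical medals: order does not matter, so use combinations
identical_medals = math.comb(25, 3)

print(computer_builds, row_arrangements, distinct_medals, identical_medals)
# prints: 120 6 13800 2300
```

The two medal counts differ by a factor of 3! = 6, because each unordered group of three students corresponds to six ordered assignments of the distinct medals.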

SLIDE 21

2.4 Probability of an Event

SLIDE 22

2.5 Additive Rules

SLIDE 23

SLIDE 24

SLIDE 25

SLIDE 26

2.6 Conditional Probability, Independence, and the Product Rule

SLIDE 27

Example: If an adult is chosen at random, what is the probability that a male is chosen, given that the person chosen is employed?

P(E) = n(E)/n(S) = 600/900
P(E ∩ M) = n(E ∩ M)/n(S) = 460/900
P(M|E) = P(E ∩ M)/P(E) = 460/600 = 23/30
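The conditional probability can be verified with exact fractions. This is a minimal sketch that assumes only the counts quoted in the example (900 adults, 600 employed, 460 employed males); the variable names are illustrative. The ratio 460/600 reduces to 23/30.

```python
from fractions import Fraction

n_total = 900          # n(S): all adults in the sample space
n_employed = 600       # n(E): employed adults
n_employed_male = 460  # n(E and M): employed males

p_e = Fraction(n_employed, n_total)             # P(E)
p_e_and_m = Fraction(n_employed_male, n_total)  # P(E and M)

# Definition of conditional probability: P(M | E) = P(E and M) / P(E)
p_m_given_e = p_e_and_m / p_e

print(p_e_and_m, p_e, p_m_given_e)
# prints: 23/45 2/3 23/30
```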

SLIDE 28

SLIDE 29

3 Random Variables and Probability Distributions

SLIDE 30

3.2 Discrete Probability Distribution

SLIDE 31

SLIDE 32

SLIDE 33

Suppose that the number of crashes observed at an intersection on the Memorial Day weekend has the following probability distribution:

x      0    1    2    3    4
f(x)   0.2  0.1  0.3  0.3  0.1

Find the probability of having 3 crashes. Find the probability of having 3 or more crashes.

P(X = 3) = 0.3
P(X ≥ 3) = 0.3 + 0.1 = 0.4

3.3 Continuous Probability Distributions

SLIDE 34

SLIDE 35

For a continuous random variable X, P(X = a) = 0 and P(a < X < b) = P(X < b) – P(X < a).

SLIDE 36

The weekly demand for Pepsi, in 1,000 liters, from a local store is a continuous random variable X with the probability density function

f(x) = 2(x – 1) for 1 < x < 2, and f(x) = 0 elsewhere.

Find the probability that X = 1.5. Find the probability that X ≤ 1.5.

P(X = 1.5) = 0
$P(X \le 1.5) = \int_1^{1.5} 2(x - 1)\,dx = 2\left(\frac{x^2}{2} - x\right)\Big|_1^{1.5} = 0.25$

[Figure: plot of the density f(x).]

SLIDE 37

4 Mathematical Expectation

SLIDE 38

SLIDE 39

SLIDE 40

Suppose that the number of crashes observed at an intersection on the Memorial Day weekend has the following probability distribution:

x      0    1    2    3    4
f(x)   0.2  0.1  0.3  0.3  0.1

Find the mean μ and the variance σ² of X.

μ = E(X) = 0·0.2 + 1·0.1 + 2·0.3 + 3·0.3 + 4·0.1 = 2.0
σ² = E[(X – μ)²] = (0–2)²·0.2 + (1–2)²·0.1 + (2–2)²·0.3 + (3–2)²·0.3 + (4–2)²·0.1 = 1.6
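The expectation sums for the crash distribution can be sketched in a few lines of Python. The x values 0 through 4 are read off the terms of the expectation sum in the example; the dictionary and variable names are illustrative.

```python
# Probability mass function of the number of crashes: x -> f(x)
pmf = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.3, 4: 0.1}

# A valid pmf must sum to 1 (allowing for floating-point rounding)
total = sum(pmf.values())

# P(X = 3) and P(X >= 3)
p_three = pmf[3]
p_at_least_three = sum(p for x, p in pmf.items() if x >= 3)

# Mean: E(X) = sum of x * f(x)
mean = sum(x * p for x, p in pmf.items())

# Variance: E[(X - mean)^2] = sum of (x - mean)^2 * f(x)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(round(total, 10), p_three, round(p_at_least_three, 10),
      round(mean, 10), round(variance, 10))
```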

The weekly demand for Pepsi, in 1,000 liters, from a local store is a continuous random variable X with the probability density function

f(x) = 2(x – 1) for 1 < x < 2, and f(x) = 0 elsewhere.

Find the mean μ and the variance σ² of X.
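The slide poses this question without a worked answer. Integrating by hand gives E(X) = 5/3 and E(X²) = 17/6, so σ² = 17/6 – (5/3)² = 1/18. The sketch below checks these values numerically with a simple midpoint rule. The helper `integrate` is a hypothetical illustration written for this example, not a library function.

```python
def f(x):
    """Density of weekly demand: 2(x - 1) on (1, 2), 0 elsewhere."""
    return 2.0 * (x - 1.0) if 1.0 < x < 2.0 else 0.0

def integrate(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

total_area = integrate(f, 1.0, 2.0)                         # should be close to 1
p_le_1_5 = integrate(f, 1.0, 1.5)                           # P(X <= 1.5)
mean = integrate(lambda x: x * f(x), 1.0, 2.0)              # E(X)
second_moment = integrate(lambda x: x * x * f(x), 1.0, 2.0) # E(X^2)
variance = second_moment - mean ** 2                        # E(X^2) - [E(X)]^2

print(round(total_area, 6), round(p_le_1_5, 6), round(mean, 6), round(variance, 6))
```

The shortcut σ² = E(X²) – μ² is used here instead of E[(X – μ)²]; the two are algebraically equivalent.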

SLIDE 41

5 Some Discrete Probability Distributions

The binomial distribution is a discrete probability distribution of the number of successes in a sequence of n independent success/failure experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. If the random variable X follows the binomial distribution with parameters n and p, the probability of getting exactly x successes in n trials is given by the probability mass function:

$b(x; n, p) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$

SLIDE 42

The probability that a certain kind of component will pass a shock test is 0.75. Find the probability that exactly 2 of the next 4 components tested passed. Find the probability that 2 or more of the next 4 components tested passed.
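The slide states this problem without a worked answer. A minimal sketch of the calculation follows, assuming the standard binomial probability mass function C(n, x) p^x (1 – p)^(n – x) with n = 4 and p = 0.75. The function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.75

# Exactly 2 of the 4 components pass
p_exactly_2 = binom_pmf(2, n, p)

# 2 or more pass: sum the pmf over x = 2, 3, 4
p_at_least_2 = sum(binom_pmf(x, n, p) for x in range(2, n + 1))

print(round(p_exactly_2, 4), round(p_at_least_2, 4))
# prints: 0.2109 0.9492
```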

SLIDE 43

The hypergeometric distribution is a discrete probability distribution that describes the probability of x successes in n draws, without replacement, from a finite population of size N containing k successes. A random variable X follows the hypergeometric distribution if its probability mass function is given by:

$h(x; N, n, k) = \dfrac{\binom{k}{x}\binom{N - k}{n - x}}{\binom{N}{n}}$

Lots of 40 components each are called unacceptable if they contain 3 or more defectives. The procedure for sampling a lot is to select 5 components at random and to reject the lot if a defective is found. What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot? Here n = 5, N = 40, k = 3, and x = 1.
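The slide lists the parameters but not the result. A small sketch of the computation follows, assuming the standard hypergeometric probability mass function C(k, x) C(N – k, n – x) / C(N, n). The function name `hypergeom_pmf` is illustrative.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """Probability of x successes in n draws without replacement
    from a population of N items containing k successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Lot of N = 40 with k = 3 defectives; sample n = 5; exactly x = 1 defective
p_one_defective = hypergeom_pmf(1, N=40, n=5, k=3)

print(round(p_one_defective, 4))
# prints: 0.3011
```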

SLIDE 44

The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event.

A discrete random variable X is said to have a Poisson distribution with parameter λ > 0, t > 0, if for x = 0, 1, 2, ... the probability mass function of X is given by:

$p(x; \lambda t) = \dfrac{e^{-\lambda t} (\lambda t)^x}{x!}$

SLIDE 45

During a laboratory experiment the average number of radioactive particles passing through a counter in 1 millisecond is 4. What is the probability that 6 particles enter the counter in a given millisecond? Here x = 6, λ = 4, and t = 1.
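The slide gives the parameters but not the numerical answer. A minimal sketch follows, assuming the standard Poisson probability mass function e^(–λt) (λt)^x / x!. The function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, rate, t=1.0):
    """Probability of exactly x events in an interval of length t
    when events occur at an average rate of `rate` per unit interval."""
    mu = rate * t  # expected number of events in the interval
    return exp(-mu) * mu ** x / factorial(x)

# Average of 4 particles per millisecond; exactly 6 in one millisecond
p_six = poisson_pmf(6, rate=4, t=1)

print(round(p_six, 4))
# prints: 0.1042
```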

6 Some Continuous Probability Distributions

SLIDE 46

The continuous uniform distribution is a family of probability distributions such that, for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by the two parameters, A and B, which are its minimum and maximum values. The probability density function of the continuous uniform random variable X is:

$f(x; A, B) = \dfrac{1}{B - A}$ for A ≤ x ≤ B, and 0 elsewhere.

SLIDE 47

In probability theory, the normal (or Gaussian) distribution is a continuous probability distribution that has a bell-shaped probability density function, known as the Gaussian function or informally the bell curve:

$f(x; \mu, \sigma) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$

The parameter μ is the mean or expectation (location of the peak) and σ² is the variance. σ is known as the standard deviation. The distribution with μ = 0 and σ² = 1 is called the standard normal distribution.

SLIDE 48

SLIDE 49

6.3 Areas under the Normal Curve

SLIDE 50

SLIDE 51

An arbitrary normal random variable X can be transformed into a standard normal variable Z by means of the transformation Z = (X – μ)/σ

(a) Pr(Z > 1.84) = 1 – Pr(Z < 1.84) = 1 – 0.9671 = 0.0329
(b) Pr(-1.97 < Z < 0.86) = Pr(Z < 0.86) – Pr(Z < -1.97) = 0.8051 – 0.0244 = 0.7807

SLIDE 52

Find k such that (a) Pr(Z > k) = 0.3015 and (b) Pr(k < Z < -0.18) = 0.4197.

(a) Pr(Z > k) = 0.3015; thus Pr(Z < k) = 1 – 0.3015 = 0.6985. From the normal table, k = 0.52.
(b) Pr(k < Z < -0.18) = 0.4197; thus Pr(Z < -0.18) – Pr(Z < k) = 0.4197, so 0.4286 – Pr(Z < k) = 0.4197 and Pr(Z < k) = 0.0089, giving k = -2.37.

Given a random variable X having a normal distribution with μ = 50 and σ = 10, find the probability that X will fall between 45 and 62.

Pr(45 < X < 62) = Pr[ (45-50)/10 < Z < (62-50)/10] = Pr (-0.5 < Z < 1.2) = Pr (Z < 1.2) – Pr (Z < -0.5) = 0.8849 – 0.3085 = 0.5764
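The table lookups for normal-curve areas can be cross-checked in software. This sketch uses `statistics.NormalDist` from the Python standard library in place of a printed normal table, so results agree with the table values only to about four decimal places. The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Pr(Z > 1.84) = 1 - Pr(Z < 1.84)
p_upper = 1 - z.cdf(1.84)

# Pr(-1.97 < Z < 0.86) = Pr(Z < 0.86) - Pr(Z < -1.97)
p_between_z = z.cdf(0.86) - z.cdf(-1.97)

# Pr(45 < X < 62) for X normal with mean 50 and standard deviation 10;
# NormalDist standardizes internally, so no manual Z transformation is needed
x = NormalDist(mu=50, sigma=10)
p_between_x = x.cdf(62) - x.cdf(45)

print(round(p_upper, 4), round(p_between_z, 4), round(p_between_x, 4))
```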

SLIDE 53

Given a normal distribution with μ = 40 and σ = 6, find the value x that has (a) 45% of the area to its left, and (b) 14% of the area to its right.

(a) Pr(X < x) = 0.45 => Pr(Z < z) = 0.45. From the normal table, z = -0.13, so z = (x – μ)/σ = (x – 40)/6 = -0.13 => x = 39.22
(b) Pr(X > x) = 0.14 => Pr(Z > z) = 0.14 => Pr(Z < z) = 1 – 0.14 = 0.86. From the normal table, z = 1.08, so z = (x – μ)/σ = (x – 40)/6 = 1.08 => x = 46.48
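Working backward from an area to a value of x is an inverse-CDF lookup. The sketch below uses `NormalDist.inv_cdf` from the Python standard library. Because it does not round z to two decimals as a printed table does, its answers differ slightly from the table-based values 39.22 and 46.48. The variable names are illustrative.

```python
from statistics import NormalDist

x = NormalDist(mu=40, sigma=6)

# (a) value with 45% of the area to its left
x_left_45 = x.inv_cdf(0.45)

# (b) value with 14% of the area to its right, i.e. 86% to its left
x_right_14 = x.inv_cdf(1 - 0.14)

# Equivalent manual route: find z first, then undo the transformation x = mu + sigma * z
z_a = NormalDist().inv_cdf(0.45)
x_manual = 40 + 6 * z_a

print(round(x_left_45, 2), round(x_right_14, 2), round(z_a, 2))
```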

6.4 Applications of the Normal Distribution

SLIDE 54

A certain type of storage battery lasts, on average, 3.0 years with a standard deviation of 0.5 years. Assuming that battery life is normally distributed, find the probability that a given battery will last less than 2.3 years.

Pr(X < 2.3) = Pr[Z < (2.3 – 3.0)/0.5] = Pr(Z < -1.4) = 0.0808

In an industrial process, the diameter of a ball bearing is an important measurement. Specifications for the diameter are 3.00 ± 0.01 cm. It is known that the diameter of the ball bearings follows a normal distribution with a mean of 3.0 cm and a standard deviation of 0.005 cm. What proportion of manufactured ball bearings will not meet the specifications?

Pr (X < 2.99) + Pr (X > 3.01) = Pr (Z < -2.0) + Pr (z > 2.0) =2 (0.0228) = 0.0456
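Both application problems reduce to evaluating normal CDFs. This is a minimal sketch with `statistics.NormalDist`. The unrounded result for the ball bearings is about 0.0455, slightly below the table-based 0.0456, because the table rounds each tail to 0.0228. The variable names are illustrative.

```python
from statistics import NormalDist

# Battery life: mean 3.0 years, standard deviation 0.5 years
battery = NormalDist(mu=3.0, sigma=0.5)
p_short_life = battery.cdf(2.3)  # Pr(X < 2.3)

# Ball bearing diameter: mean 3.0 cm, standard deviation 0.005 cm
bearing = NormalDist(mu=3.0, sigma=0.005)
lower_spec, upper_spec = 2.99, 3.01

# Out of specification: below the lower limit or above the upper limit
p_out_of_spec = bearing.cdf(lower_spec) + (1 - bearing.cdf(upper_spec))

print(round(p_short_life, 4), round(p_out_of_spec, 4))
```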

[Figure: Distribution Plot, Normal, Mean = 3, StDev = 0.005. Horizontal axis: X; vertical axis: Density. Tail areas of 0.02275 are marked at X = 2.99 and X = 3.01.]