Introduction to Inferential Statistics Jaranit Kaewkungwal, Ph.D. - - PowerPoint PPT Presentation



slide-1
SLIDE 1

1

Introduction to Inferential Statistics

Jaranit Kaewkungwal, Ph.D.

Faculty of Tropical Medicine Mahidol University

slide-2
SLIDE 2

2

Data & Variables

slide-3
SLIDE 3

Types of Data

QUALITATIVE
  • Data expressed by type
  • Data that has been described

QUANTITATIVE
  • Data classified by numeric value
  • Data that has been measured or counted

QUALITATIVE and QUANTITATIVE data are not mutually exclusive

Adapted from: Dr. Craig Jackson, University of Central England

slide-4
SLIDE 4

Types of Data: Qualitative (Categorical) Data

NOMINAL DATA
  • values that the data may have do not have specific order
  • values act as labels with no real meaning
  • Binomial: two possible values (categories, states)
  • Multinomial: more than two possible values (categories, states)

e.g. Health status: healthy = 1, sick = 2
e.g. Treatment: new regimen = 1, standard regimen = 2
e.g. hair colour: brown = 1, blond = 2, black = 100

ORDINAL DATA
  • values with some kind of ordering
  • data that has been measured or counted

e.g. social class: upper = 1, middle = 2, working = 3
e.g. glioblastoma tumor grade: 1, 2, 3, 4, 5
e.g. position in a race: 1st, 2nd, 3rd

Adapted from: Dr. Craig Jackson, University of Central England

slide-5
SLIDE 5

Types of Data: Quantitative Data

DISCRETE
  • distinct or separate parts, with no finite detail
e.g. children in family

CONTINUOUS
  • between any two values, there would be a third
e.g. between meters there are centimetres

INTERVAL
  • equal intervals between values and an arbitrary zero on the scale
e.g. temperature gradient

RATIO
  • equal intervals between values and an absolute zero
e.g. body mass index

Adapted from: Dr. Craig Jackson, University of Central England

slide-6
SLIDE 6

Levels of Variables: Temperature

[Figure: temperature expressed at different levels of measurement: White Hot / Red Hot / Cold; "Dangerous" / "Unpleasant" / "Uncomfortable" / "Tolerable" / "Comfortable" / "Cold"; 80 °C / 60 °C / 40 °C / 20 °C / 10 °C; Unsafe / Safe]

Adapted from: Dr. Craig Jackson, University of Central England

slide-7
SLIDE 7

7

Examples of Data Coding

[Figure: questionnaire items coded 1 2 3 4, labelled Nominal/Cat. Var and Ordinal/Cat. Var; one item also carries the codes 88 and 99, with the question "Exclude from Analysis?"]

slide-8
SLIDE 8

8

Examples of Data Coding

[Figure: items coded 1, 2, 99, alongside continuous (Cont.) variables that also use the code 99]

slide-9
SLIDE 9

9

Example of Descriptive Statistics

slide-10
SLIDE 10

10

Constant vs. Variable

Variables are the specific properties that can take different values. Constants are the specific properties that cannot vary or won't be made to vary.
slide-11
SLIDE 11

Terminology - Variables

INDEPENDENT (syn: treatment, experimental, predictor, input, exposure, explanatory variable) is a stimulus or activity that is identified or manipulated to predict the dependent variable; independent variables are considered the causal factors, or the factors that you may manipulate.
e.g. new drug, working hours, exposure, worker attitudes, policies

DEPENDENT (syn: effect, criterion, criterion measure, outcome, output variable) is a response that the researcher wants to predict; dependent variables are considered the outcomes of the treatments or the responses to changes in the independent variables.
e.g. symptomatology, productivity, accident rates, attitudes, health status, performance on neuropsychological test

Adapted from: Dr. Craig Jackson, University of Central England

slide-12
SLIDE 12

Terminology - Variables

CONTROLLED

An extraneous variable is a variable that has the potential to distort the relationship between dependent and independent variables.

  • Controlled extraneous variables are recognized before the study is initiated and are controlled in the design and selection criteria.
  • Uncontrolled extraneous variables are not recognized before the study is initiated or, sometimes, even if recognized, cannot be controlled in the design and selection phase. Usually an attempt is made to assess and adjust for them through sophisticated statistical tools.

e.g. working hours, temperatures, extraneous exposure, diet, class, income, ambient noise and temperature in testing room

Adapted from: Dr. Craig Jackson, University of Central England

slide-13
SLIDE 13

13

Study Variables

Independent Variables & Dependent Variables

[Diagrams relating independent variables (X, X2), a dependent variable (Y), and an extraneous variable]

slide-14
SLIDE 14

14

Study Variables

Confounding Variable

  • When the effects of two or more variables cannot be separated.
slide-15
SLIDE 15

15

Study Variables

Confounding Variable

STD rate by condom use:

  Condom Use    STD rate
  Yes           55/95  (58%)
  No            45/105 (43%)

"Condom Use increases the risk of STD" BUT ...

Explanation: Individuals with more partners are more likely to use condoms. But individuals with more partners are also more likely to get STD.

STD rate by condom use, stratified by number of partners:

  # Partners    Condom Use    STD rate
  < 5           Yes           5/15  (33%)
  < 5           No            30/82 (37%)
  > 5           Yes           50/80 (62%)
  > 5           No            15/23 (65%)

  • When the effects of two or more variables cannot be separated.
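
The condom-use example can be checked by recomputing the crude and stratified STD rates. The sketch below uses only the counts shown on the slide; the dictionary layout and function names are illustrative choices, not part of the original deck.

```python
# Counts as (STD cases, group size), copied from the slide.
strata = {
    "<5 partners": {"condom": (5, 15), "no condom": (30, 82)},
    ">5 partners": {"condom": (50, 80), "no condom": (15, 23)},
}

def rate(cases, total):
    """Proportion of the group with an STD."""
    return cases / total

def crude(strata, group):
    """Pool the strata to get the crude (unstratified) counts for one group."""
    cases = sum(s[group][0] for s in strata.values())
    total = sum(s[group][1] for s in strata.values())
    return cases, total

crude_condom = crude(strata, "condom")
crude_none = crude(strata, "no condom")
print("crude:", round(rate(*crude_condom), 2), "vs", round(rate(*crude_none), 2))

# Within each stratum the direction of the comparison reverses.
for name, s in strata.items():
    print(name, round(rate(*s["condom"]), 2), "vs", round(rate(*s["no condom"]), 2))
```

Pooled over strata, condom users show the higher rate, while within each partner-count stratum they show the lower one, which is the pattern the slide attributes to confounding.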
slide-16
SLIDE 16

16

Example of Study Variables

Dependent Var: Infant/Child Growth
Independent Var: Adult Fatness
Extraneous Var:
  • Confounding Var (Adj./Controlled Var): Child Age, Adult Age, Socio-economic, Smoking, Physical Activity, etc.
  • Uncontrolled Var: Sex (male & female)

slide-17
SLIDE 17

17

Bias & Chance

slide-18
SLIDE 18

18

Measuring Outcomes: Observed vs. Truth

Possible Explanations of Outcome Measured: Bias, Chance, Truth

Observed = Truth + Error
Error = Systematic error (Bias) + Random error (Chance)

slide-19
SLIDE 19

19

Bias vs. Chance

slide-20
SLIDE 20

20

Bias vs. Chance

Bias:
  • A process at any stage of inference tending to produce results that depart systematically from the true values.

Chance:
  • The divergence of an observation on a sample from the true population value in either direction.
  • The divergence due to chance alone is called random variation.

Bias and chance are not mutually exclusive.

slide-21
SLIDE 21

21

Bias vs.Chance

“A well designed, carefully executed study usually gives results that are obvious without a formal analysis and if there are substantial flaws in design or execution a formal analysis will not help.”

Johnson AF. Beneath the technological fix. J Chron Dis 1985 (38), 957-961

slide-22
SLIDE 22

22

Chance

[Figure: "Free kick", probability of being hit, drawn as a distribution curve marked 50% | 50%, with central areas of 68% and 95% and 2.5% in each tail]

slide-23
SLIDE 23

23

Chance

[Figure: "Free kick", probability of getting a goal, drawn as a distribution curve marked 50% | 50%, with central areas of 68% and 95% and 2.5% in each tail]

slide-24
SLIDE 24

24

Normal Distribution in Descriptive Statistics

[Figure: normal curve with a Raw Score axis (20, 25, 30, 35, 40) and a matching Standard Score axis; mean = 30, SD = 5]
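
The link between the raw-score and standard-score axes can be sketched in a few lines. The mean of 30, SD of 5, and the five raw scores come from the slide; the use of Python's `statistics.NormalDist` for the area under the curve is an illustrative choice.

```python
from statistics import NormalDist

mean, sd = 30, 5                    # values shown on the slide
raw_scores = [20, 25, 30, 35, 40]   # raw-score axis from the slide

# Standard score: how many SDs a raw score lies from the mean.
z_scores = [(x - mean) / sd for x in raw_scores]
print(z_scores)

# Share of a normal distribution lying within 1 SD and within 1.96 SD of the mean.
std_normal = NormalDist()
within_1sd = std_normal.cdf(1) - std_normal.cdf(-1)
within_196sd = std_normal.cdf(1.96) - std_normal.cdf(-1.96)
print(round(within_1sd, 3), round(within_196sd, 3))
```

The two areas correspond to the 68% and 95% regions marked on the preceding "Chance" slides.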

slide-25
SLIDE 25

25

Types of Statistical Methods

slide-26
SLIDE 26

26

Types of Statistics

  • By Level of Generalization
      – Descriptive Statistics
      – Inferential Statistics
          • Parameter Estimation
          • Hypothesis Testing
              – Comparison
              – Association
              – Multivariable data analysis
  • By Level of Underlying Distribution
      – Parametric Statistics
      – Non-parametric Statistics

[Diagram labels: Sampling Techniques; Generalization / Inferential Statistics]

slide-27
SLIDE 27

27

Descriptive Statistics

slide-28
SLIDE 28

28

Descriptive Statistics

  • Measure of Location (Categorical Vars)
      – Frequency ( f )

[Figure: bar chart of Count by Gender (Male, Female)]

  • Measure of Location (Continuous Vars)
      – Mean: Average
      – Median: Mid-point
      – Mode: The Most Frequent

Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, or population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

Example data X1 ... X9: (1 2 2 2 2 3 3 4 5)
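
The three location measures can be computed for the nine example values shown on this slide. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

data = [1, 2, 2, 2, 2, 3, 3, 4, 5]  # X1 ... X9 from the slide

mean_value = statistics.mean(data)      # average: sum of values divided by n
median_value = statistics.median(data)  # mid-point of the sorted values
mode_value = statistics.mode(data)      # most frequent value

print(round(mean_value, 2), median_value, mode_value)
```

Here the mean (24/9) sits above the median and mode because the few larger values pull it upward.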
slide-29
SLIDE 29

29

Descriptive Statistics

  • Measure of Spread
      – Range: Max - Min
      – Standard Deviation / Variance: how the $x_i$ deviate from the mean

Population: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$, or sample: $S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$
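
The spread measures can be illustrated with Python's standard `statistics` module. The data reuse the nine values from the preceding slide as an assumption, since this slide gives no data of its own.

```python
import statistics

data = [1, 2, 2, 2, 2, 3, 3, 4, 5]  # assumed: example values from the previous slide

value_range = max(data) - min(data)             # Range = Max - Min
pop_variance = statistics.pvariance(data)       # divides by N
sample_variance = statistics.variance(data)     # divides by n - 1
pop_sd = statistics.pstdev(data)                # square root of pop_variance
sample_sd = statistics.stdev(data)              # square root of sample_variance

print(value_range, round(pop_sd, 3), round(sample_sd, 3))
```

The sample SD is slightly larger than the population SD because its denominator is n - 1 rather than N.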
slide-30
SLIDE 30

30

Descriptive Statistics

  • Measure of Shape
      – Normal Distribution
      – Skewed Distribution
          • Positively skewed
          • Negatively skewed

slide-31
SLIDE 31

31

Inferential Statistics

slide-32
SLIDE 32

32

Inferential Statistics

  • Purpose of Inferential Statistics
      – Generalisability of research results from Sample Statistics to Population Parameters

Sample statistics and the population parameters they estimate: $\bar{X}$ and $\mu$; proportion and $\Pi$; $r_{xy}$ and $\rho_{xy}$
Null hypotheses shown: Ho: $\bar{X}_1 = \bar{X}_2$; Ho: $\mu_1 = \mu_2$; Ho: $r_{xy} = 0$; Ho: $\rho_{xy} = 0$

  • Types of Inferential Statistical Methods
      – Parameter Estimation: to estimate the range of values that is likely to include the true value in the population
      – Hypothesis Testing: to ask whether an effect (difference) is present or not among different groups

slide-33
SLIDE 33

33

Sampling Distribution

[Figure: a population with μ = 27 is sampled (Sampling Method) three times, giving sample means $\bar{x}_1$ = 26 (SD = 9.1), $\bar{x}_2$ = 30 (SD = 11.3), and $\bar{x}_3$ = 25 (SD = 12.2); Inferential Statistics leads from the samples back to the population]

slide-34
SLIDE 34

34

Sampling Distribution (Normal Distribution) in Parameter (μ) Estimation

Population: N = 10,000, μ = 27
Samples (n = 100): $\bar{x}_1$ = 26 (SD = 9.1), $\bar{x}_2$ = 30 (SD = 11.3), $\bar{x}_3$ = 25 (SD = 12.2)

Confidence Limits of μ: $\bar{X} \pm Z_{\alpha/2,\nu} S_{\bar{X}}$, where $S_{\bar{X}} = SD/\sqrt{n}$ is the standard error
95% CI of μ: $\bar{X} \pm (1.96 \times (SD/\sqrt{n}))$ = 25 ± (1.96 × (12.2/√100))

μ = 25 (22.6 to 27.4, 95% CI)
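
The interval on this slide can be reproduced numerically. The sketch below assumes the normal (Z) approximation used on the slide; the function name `mean_ci` and its parameters are illustrative.

```python
import math
from statistics import NormalDist

def mean_ci(xbar, sd, n, level=0.95):
    """Confidence interval for a mean using the normal (Z) approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    standard_error = sd / math.sqrt(n)
    return xbar - z * standard_error, xbar + z * standard_error

# Sample values from the slide: mean 25, SD 12.2, n = 100.
low, high = mean_ci(25, 12.2, 100)
print(f"{low:.1f} to {high:.1f}")
```

With SD/√n = 1.22, the margin is about 2.39 on either side of the sample mean, which matches the 22.6 to 27.4 reported on the slide.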

slide-35
SLIDE 35

35

Parameter Estimation

slide-36
SLIDE 36

36

Parameter Estimation

Point Estimates: Single values (mean, variance, correlation, treatment effect, relative risk, etc.) representing characteristics in the whole population.

Interval Estimates: Ranges of values, usually centered around point estimates, indicating bounds within which we expect the true values for the whole population to lie (stability of the estimate).

Example Confidence Limits of μ: $\bar{X} \pm Z_{\alpha/2,\nu} S_{\bar{X}}$

SD = 0.75; Average: 2.5 - 4.7

slide-37
SLIDE 37

37

Parameter Estimation

Point estimates & Confidence Intervals

Point estimates and confidence intervals are used to characterise the statistical precision of any rate (incidence, prevalence), comparisons of rates (e.g., relative risk), and other statistics.

  • US adults who have used unconventional therapy = 34% (31% - 37%, 95% CI)
  • Sensitivity of clinical examination for splenomegaly = 27% (19% - 36%, 95% CI)
  • Relative risk of lung cancer, smoker vs. non-smoker = 5.6 (2.1 - 8.9, 95% CI)
  • Relative risk of HIV infection, male vs. female = 2.1 (0.5 - 6.9, 95% CI)

slide-38
SLIDE 38

38

Parameter Estimation (Normal Distribution)

Population: N = 10,000, μ = 27
Samples (n = 100): $\bar{x}_1$ = 26 (SD = 9.1), $\bar{x}_2$ = 30 (SD = 11.3), $\bar{x}_3$ = 25 (SD = 12.2)

Confidence Limits of μ: $\bar{X} \pm Z_{\alpha/2,\nu} S_{\bar{X}}$, where $S_{\bar{X}} = SD/\sqrt{n}$ is the standard error
95% CI of μ: $\bar{X} \pm (1.96 \times (SD/\sqrt{n}))$ = 25 ± (1.96 × (12.2/√100))

μ = 25 (22.6 to 27.4, 95% CI)

slide-39
SLIDE 39

39

Parameter Estimation: Consideration in confidence level of the estimate

A Confidence Interval, usually set at 95%, can be interpreted as follows: if the study is unbiased and is repeated 100 times, about 95 of the 100 intervals computed from those samples are expected to include the true value.

Select a cut-point for CI

[Figure: the True Value (in population) compared against the 95% CIs from Sample 1, Sample 2, and Sample 3]
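
The repeated-sampling interpretation can be explored with a small simulation. Everything numeric here is an assumption chosen for illustration (a normal population with mean 27 and SD 12, samples of 100, 1,000 repetitions, a fixed random seed); none of it comes from the slide except the 95% level.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def coverage(mu=27.0, sigma=12.0, n=100, repeats=1000, level=0.95, seed=1):
    """Fraction of simulated confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = mean(sample)
        margin = z * stdev(sample) / math.sqrt(n)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / repeats

observed_coverage = coverage()
print(observed_coverage)
```

The printed fraction should land close to 0.95, though any single run will differ a little because the samples are random.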

slide-40
SLIDE 40

40

Hypothesis Testing - Comparisons

slide-41
SLIDE 41

41

Hypothesis Testing

  • Hypothesis & Tail of the test
      – One-sided vs. Two-sided Test

Two-sided test:
  Ex. Ho: Outcome 1 = Outcome 2
      Ha: Outcome 1 ≠ Outcome 2

One-sided test:
  Ex. Ho: Outcome 1 ≤ Outcome 2
      Ha: Outcome 1 > Outcome 2
  or  Ho: Outcome 1 ≥ Outcome 2
      Ha: Outcome 1 < Outcome 2

[Figure, two-sided test: O1<O2 | O1=O2 | O1>O2, with areas 2.5% | 95% | 2.5%]
[Figure, one-sided test: O1<O2 | O1>=O2, with areas 5% | 95%]
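
The tail areas in these figures translate into different critical values. The sketch below assumes a standard normal test statistic and α = 0.05, which matches the 2.5% / 95% / 2.5% and 5% / 95% splits shown; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Two-sided: alpha is split between both tails (2.5% each).
two_sided_critical = std_normal.inv_cdf(1 - alpha / 2)

# One-sided: all of alpha sits in one tail (5%).
one_sided_critical = std_normal.inv_cdf(1 - alpha)

print(round(two_sided_critical, 3), round(one_sided_critical, 3))
```

Because the one-sided cut-off is smaller, the same data can reach significance in a one-sided test while falling short in a two-sided one.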

slide-42
SLIDE 42

42

Hypothesis Testing

  • Basic steps in hypothesis testing
slide-43
SLIDE 43

43

Hypothesis Testing

Ho: μ1 = μ2, i.e. Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

[Figure: distributions centered at μ1 and μ2]

Not Reject Ho !! μ1 = μ2

slide-44
SLIDE 44

44

Hypothesis Testing

Ho: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

[Figure: distributions centered at μ1 and μ2]

Reject Ho !! μ1 < μ2

slide-45
SLIDE 45

45

Hypothesis Testing

H0: μ1 − μ2 = 0
Ha: μ1 − μ2 ≠ 0

p-value = 0.04

  • at α = 0.05: Reject H0 !! μ1 > μ2
  • at α = 0.01: Not Reject H0 !! μ1 = μ2 (α / 2 = 0.005 in each tail, critical values −2.576 and 2.576), given n = very large
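
The two decisions on this slide follow from comparing the same p-value with two different significance levels. This is a minimal sketch; the helper names `decide` and `two_sided_critical` are hypothetical, and the critical value assumes a standard normal statistic (the slide's "n = very large").

```python
from statistics import NormalDist

def decide(p_value, alpha):
    """Reject H0 when the p-value falls below the chosen significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"

def two_sided_critical(alpha):
    """Critical value of a two-sided Z test at significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

p_value = 0.04  # from the slide
for alpha in (0.05, 0.01):
    print(alpha, round(two_sided_critical(alpha), 3), decide(p_value, alpha))
```

The data do not change between the two lines of output; only the threshold for calling the result significant does.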

slide-46
SLIDE 46

46

Hypothesis Testing

  • Type I & Type II errors

H0: G1 = G2

                         Reality/Truth
  Decision               H0 True (G1 = G2)             H0 False (G1 ≠ G2)
  Accept H0 (G1 = G2)    Correct (Confidence: 1 - α)   Type II Error (β)
  Reject H0 (G1 ≠ G2)    Type I Error (α)              Correct (Power: 1 - β)

  α: 0.01, 0.05          Confidence (1 - α): 0.99, 0.95
  β: 0.10, 0.20          Power (1 - β): 0.90, 0.80
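
The relation between α, β, and power can be made concrete with a small calculation. The sketch below assumes a two-sided one-sample Z test with known SD, which is a simplification chosen for illustration and not a method stated on the slide; `z_test_power` and its parameters are hypothetical names.

```python
import math
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-sided one-sample Z test.

    delta is the true difference from the H0 value, sigma the known SD.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * math.sqrt(n) / sigma  # mean of the test statistic under Ha
    # Probability that the statistic falls in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)

# With no true difference, the rejection probability is just alpha (Type I error).
print(round(z_test_power(0.0, 1.0, 10), 3))

# With a true difference, power grows with sample size (beta shrinks).
print(round(z_test_power(0.5, 1.0, 20), 3), round(z_test_power(0.5, 1.0, 50), 3))
```

This is why the conventional targets of β = 0.10 or 0.20 are usually met by choosing the sample size before the study starts.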

slide-47
SLIDE 47

47

Choosing the Right Statistical Procedure

slide-48
SLIDE 48

48

Choosing the Right Data Analysis Procedure

Basic questions that you should answer before you can choose the correct test:

What is the purpose of the analysis?
  • describe the data; or
  • compare groups of data to make decisions; or
  • examine the association between variables for prediction or forecasting?

Is the distribution of the data approximately normal?

Is the sample size large enough that the Central Limit Theorem will allow a normality assumption?

slide-49
SLIDE 49

49

Selecting a Statistical Method

Goal of the analysis, by Type of Outcome Data

Describe Value of Data (1 Group)
  • Continuous (from Gaussian Population): Mean, SD
  • Continuous (Non-Gaussian): Median, Interquartile range
  • Categorical Binomial: Proportion (Percent)
  • Survival Time (survival analysis): Kaplan-Meier survival curve

Compare Value of Data vs. Hypothetical Value (1 Group)
  • Continuous (from Gaussian Population): One-sample t-test
  • Continuous (Non-Gaussian): Wilcoxon's test
  • Categorical Binomial: Chi-square (χ2) or Binomial/Runs test

slide-50
SLIDE 50

50

Selecting a Statistical Method

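Goal of the analysis, by Type of Outcome Data

Compare Values of Independent Groups, 2 Groups
  • Continuous (from Gaussian Population): Unpaired t-test
  • Continuous (Non-Gaussian): Mann-Whitney test
  • Categorical Binomial: χ2 test, Fisher's Exact
  • Survival Time (survival analysis): Log-rank / Mantel-Haenszel

Compare Values of Independent Groups, >2 Groups
  • Continuous (from Gaussian Population): One-way ANOVA
  • Continuous (Non-Gaussian): Kruskal-Wallis test
  • Categorical Binomial: χ2 test
  • Survival Time (survival analysis): Cox Proportional Hazards Regression

Compare Values of Paired Groups/Variables, 2 Groups
  • Continuous (from Gaussian Population): Paired t-test
  • Continuous (Non-Gaussian): Wilcoxon's test
  • Categorical Binomial: McNemar's χ2 test
  • Survival Time (survival analysis): Conditional Proportional Hazards Regression

Compare Values of Paired Groups/Variables, >2 Groups
  • Continuous (from Gaussian Population): Repeated measures ANOVA
  • Continuous (Non-Gaussian): Friedman's test
  • Categorical Binomial: Cochran's Q test
  • Survival Time (survival analysis): Conditional Proportional Hazards Regression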
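
The comparison rows of this selection table can be encoded as a simple lookup. The sketch below is an illustration only: the cell assignments follow one reading of the flattened table, and the key names (`"independent"`, `"gaussian"`, and so on) and the function `suggest_test` are hypothetical.

```python
# (design, number of groups, outcome type) -> test named on the slide.
# Cell assignments are an assumed reading of the flattened table.
TESTS = {
    ("independent", "2", "gaussian"): "Unpaired t-test",
    ("independent", "2", "non-gaussian"): "Mann-Whitney test",
    ("independent", "2", "binomial"): "Chi-square test or Fisher's Exact",
    ("independent", ">2", "gaussian"): "One-way ANOVA",
    ("independent", ">2", "non-gaussian"): "Kruskal-Wallis test",
    ("independent", ">2", "binomial"): "Chi-square test",
    ("paired", "2", "gaussian"): "Paired t-test",
    ("paired", "2", "non-gaussian"): "Wilcoxon's test",
    ("paired", "2", "binomial"): "McNemar's chi-square test",
    ("paired", ">2", "gaussian"): "Repeated measures ANOVA",
    ("paired", ">2", "non-gaussian"): "Friedman's test",
    ("paired", ">2", "binomial"): "Cochran's Q test",
}

def suggest_test(design, groups, outcome):
    """Look up the test listed for a comparison goal; raises KeyError if unlisted."""
    return TESTS[(design, groups, outcome)]

print(suggest_test("independent", "2", "non-gaussian"))
print(suggest_test("paired", ">2", "gaussian"))
```

A lookup like this only narrows the choice; the assumptions of the suggested test still need to be checked against the data.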

slide-51
SLIDE 51

51

Selecting a Statistical Method

Goal of the analysis, by Type of Outcome Data

Quantify Association, Values of Two variables
  • Continuous (from Gaussian Population): Pearson's Correlation
  • Continuous (Non-Gaussian): Spearman's Correlation
  • Categorical Binomial: Contingency coefficient, Crude Odds Ratio, Relative Risk

Predict Value of Outcome Var: from 1 Var (Simple Reg.) or from > 2 Vars (Multiple Reg.)
  • Continuous (from Gaussian Population): Linear or Non-linear Regression
  • Continuous (Non-Gaussian): Non-parametric Regression
  • Categorical Binomial: Logistic Regression
  • Survival Time (survival analysis): Cox's Proportional Hazards Regression

slide-52
SLIDE 52

52

Selecting a Statistical Method

Goal of the analysis, by Type of Outcome Data (Continuous from Gaussian Population; Continuous Non-Gaussian; Categorical Binomial; Survival Time)

Measures of Agreement, Values from Two Raters/Methods
  • Pearson's Correlation
  • Weighted Kappa (κ)
  • Agreement rate
  • Cohen's κ
  • ICC

Measures of Validity, Values from Two Raters/Methods
  • ANOVA
  • Factor Analysis
  • Non-parametric ANOVA
  • χ2
  • Sensitivity
  • Specificity
  • ROC curve

slide-53
SLIDE 53

53

Example of Inferential Statistical Methods

slide-54
SLIDE 54

54

Example of Parameter Estimates & 95% CI

slide-55
SLIDE 55

55

Example of Parameter Estimates & 95% CI

slide-56
SLIDE 56

56

Example of Comparisons

slide-57
SLIDE 57

57

Example of Comparisons


slide-58
SLIDE 58

58

slide-59
SLIDE 59

59

Example of Comparisons

slide-60
SLIDE 60

60

Example of Comparisons

slide-61
SLIDE 61

61

Example of Association

slide-62
SLIDE 62

62

Example of Association

slide-63
SLIDE 63

63

Example of Association

slide-64
SLIDE 64

64

Example of Association

slide-65
SLIDE 65

65

Example of Association

slide-66
SLIDE 66

66

Example of Association

slide-67
SLIDE 67

67 Dose-Response

slide-68
SLIDE 68

68

slide-69
SLIDE 69

69

Example of Parameter Estimates

slide-70
SLIDE 70

70

Example of Parameter Estimates

[Figure: two survival plots of Survival (%) (20 to 100) against Months (10 to 120). Panel A: overall. Panel B: by serum virus load, with curves for <10,000, 10,000-100,000, and >100,000.]

Figure 2. Survival from time of human immunodeficiency virus (HIV) infection of 194 CSWs. A, Overall. B, By serum virus load (HIV type 1 RNA copies/mL). Each curve is truncated when <10 women remain in that group.

slide-71
SLIDE 71

71

Example of Association

Table 2. Survival from time of infection of 194 HIV-infected CSWs

Characteristics                   No. of     No. of patients   7-Year survival,    Rate ratio
                                  patients   who died (%)      % (95% CI)*         (95% CI)**
Age at infection, years
  <=19                            105        31 (29.5)         72.3 (62.1-80.1)    Referent
  >=20                            89         35 (39.3)         63.3 (50.5-73.6)    1.50 (0.92-2.45)
Sex work
  Brothel                         159        54 (34.0)         69.6 (61.1-76.6)    1.34 (0.71-2.52)
  Nonbrothel                      35         12 (34.3)         62.9 (41.6-78.2)    Referent
Oral contraceptive use
  Yes                             112        36 (32.1)         69.6 (59.2-77.9)    0.83 (0.51-1.36)
  No                              82         30 (36.6)         67.1 (54.7-76.8)    Referent
Depot medroxyprogesterone use
  Yes                             55         18 (32.7)         70.9 (55.7-81.7)    0.78 (0.45-1.37)
  No                              139        48 (34.5)         67.8 (58.4-75.5)    Referent
Infection status
  Seroconverted                   34         7 (20.6)          ***                 1.42 (0.63-3.22)
  Seropositive at enrollment      160        59 (36.9)         69.6 (61.4-76.4)    Referent
Viral load, HIV-1 RNA copies/mL
  >100,000                        34         24 (70.6)         34.5 (18.8-50.9)    15.40 (5.2-45.2)
  10,000-100,000                  113        38 (33.6)         70.3 (60.1-78.4)    4.63 (1.64-13.1)
  <10,000                         47         4 (8.5)           92.5 (78.4-97.5)    Referent
Total                             194        66 (34.0)         68.7 (61.0-75.2)

* Survival analysis
** Cox proportional hazard model
*** Insufficient follow-up time for this more recently converted group; 5-year survival = 77.8 (56.8-89.5) %

slide-72
SLIDE 72

72

Statistical Methods:

Multi-variable Data Analysis

slide-73
SLIDE 73

73

Multi-variable Data Analysis

Causes/Exposures vs. Outcomes

Example: Causes of Tuberculosis

Risk factors for tuberculosis (Distant from Outcome)
  • Crowding
  • Malnutrition
  • Vaccination
  • Genetic

Mechanism of Tuberculosis (Proximal to Outcome)
  • Exposure to Mycobacterium
  • Susceptible Host
  • Infection
  • Tissue Invasion and Reaction
  • Tuberculosis

slide-74
SLIDE 74

74

Multi-variable Data Analysis

Causes/Exposures vs. Outcomes

Example: Relationship between risk factors and disease: hypertension (BP) and congestive heart failure (CHF). Hypertension causes many diseases, including congestive heart failure, and congestive heart failure has many causes, including hypertension.

slide-75
SLIDE 75

75

Example of MDA

Risk (never) = 7/47 = .149
Risk (sporadic) = 9/35 = .257
RR (sporadic vs. never) = 25.7/14.9 = 1.72
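
The risk and relative-risk arithmetic can be reproduced directly from the counts. The sketch below uses only the numbers on the slide; the function and variable names are illustrative.

```python
def risk(cases, total):
    """Proportion of a group that developed the outcome."""
    return cases / total

risk_never = risk(7, 47)      # counts from the slide
risk_sporadic = risk(9, 35)   # counts from the slide

# Relative risk: risk in the exposed group divided by risk in the reference group.
relative_risk = risk_sporadic / risk_never

print(round(risk_never, 3), round(risk_sporadic, 3), round(relative_risk, 2))
```

Computed from the unrounded risks, the ratio rounds to 1.73; the slide's 1.72 comes from dividing the already rounded percentages (25.7/14.9).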

slide-76
SLIDE 76

76

Example of MDA

slide-77
SLIDE 77

77

Example of MDA

Table 3. Mortality from time of first CD4 T lymphocyte count of 157 HIV-infected CSWs (125 women were HIV seropositive at study enrollment and 32 seroconverted during study)

Characteristics                   No. of     No. of patients   5-Year survival,    Rate ratio          Adjusted rate ratio
                                  patients   who died (%)      % (95% CI)*         (95% CI)**          (95% CI)***
Initial CD4 lymphocyte, cells/μL
  <200                            15         14 (93.3)         0                   20.9 (9.00-48.7)    15.5 (6.46-37.4)
  200-500                         88         34 (38.6)         63.4 (51.8-72.9)    2.46 (1.21-5.01)    1.42 (0.67-3.00)
  >500                            54         10 (18.5)         84.7 (70.4-92.4)    Referent            Referent
Viral load, HIV-1 RNA copies/mL
  >100,000                        30         23 (76.7)         26.7 (12.6-43.0)    13.9 (4.78-40.6)    12.5 (4.09-38.2)
  10,000-100,000                  89         31 (34.8)         65.0 (53.1-74.5)    3.87 (1.36-11.0)    3.42 (1.19-9.81)
  <10,000                         38         4 (10.5)          96.7 (78.6-99.5)    Referent            Referent
Total                             157        58 (36.9)         64.6 (56.0-71.9)

* Survival analysis
** Cox proportional hazard model
*** Cox proportional hazard model adjusted for initial CD4 lymphocyte count and virus load

slide-78
SLIDE 78

78

The End of Inferential Statistics

slide-79
SLIDE 79

79

Bias vs. Chance: Observed & Truth in Statistical Methods

Observed = Truth + Error
Error = Systematic error + Random error

Example of Statistical Analysis:

Chi-Square test: $\chi^2 = \sum \frac{(O - E)^2}{E}$

Kappa Statistics: $\kappa = \frac{\text{Observed agreement} - \text{Expected agreement}}{1 - \text{Expected agreement}}$

OBSERVED
                 Outcome
  Exposed        Yes      No       Total
  Yes            43       3        46
  No             2        52       54
  Total          45       55       100

EXPECTED
                 Outcome
  Exposed        Yes      No       Total
  Yes            20.7     25.3     46
  No             24.3     29.7     54
  Total          45       55       100
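
Both formulas on this slide can be applied to its 2x2 table. The sketch below assumes the observed cells are 43, 3, 2, and 52 (exposed yes/no by outcome yes/no), which is the reading consistent with the expected counts 20.7, 25.3, 24.3, and 29.7. Computing kappa on this table treats the two classifications as two ratings purely for illustration; the function names are hypothetical.

```python
# Rows: exposed yes / no. Columns: outcome yes / no (assumed reading of the slide).
observed = [[43, 3], [2, 52]]

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table):
    """Sum of (O - E)^2 / E over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def kappa(table):
    """(Observed agreement - expected agreement) / (1 - expected agreement)."""
    n = sum(sum(row) for row in table)
    expected = expected_counts(table)
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    p_expected = sum(expected[i][i] for i in range(len(table))) / n
    return (p_observed - p_expected) / (1 - p_expected)

print([[round(e, 1) for e in row] for row in expected_counts(observed)])
print(round(chi_square(observed), 2), round(kappa(observed), 3))
```

The expected counts printed first should match the EXPECTED table, which is a quick check that the cells were read in the right order.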

slide-80
SLIDE 80

80

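Bias vs. Chance: Observed & Truth in Statistical Methods

Observed = Truth + Error
Error = Systematic error + Random error

Example of Statistical Analysis: Analysis of variance:

$F = \sigma^2_{(\tau + \varepsilon)} / \sigma^2_{\varepsilon}$

[Figure: systolic (87 to 190) plotted by age group (agegrp 1 to 3: <=20, 21-40, >=41), marking the overall mean μ, the group effects τ, and the errors ε; label: 1890-1962]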
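
The F ratio compares variation between group means (treatment plus error) with variation within groups (error alone). The sketch below computes it from scratch for one-way ANOVA; the three groups of readings are hypothetical values invented for illustration, not data from the slide.

```python
def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within groups: how far each value sits from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical systolic readings for three age groups (not from the slide).
groups = [[110, 120, 130], [120, 130, 140], [130, 140, 150]]
f_ratio = one_way_anova_f(groups)
print(f_ratio)
```

An F near 1 suggests the group means differ no more than random error would produce; larger values point toward a real group effect.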

slide-81
SLIDE 81

81

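Bias vs. Chance: Observed & Truth in Statistical Methods

Observed = Truth + Error
Error = Systematic error + Random error

Example of Statistical Analysis: Regression Analysis:

$Y = \hat{Y} + \varepsilon$
$\hat{Y} = \alpha + \beta_1 X_1$

[Figure: CHD Mortality per 10,000 (10 to 30) plotted against Cigarette Consumption per Adult per Day (2 to 12) with a fitted line; at X = 6 the line gives $\hat{Y}$ = 11, the observed point is Y = 13, and the gap between them is ε]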
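
As a worked instance of the decomposition, take the point marked on the plot, reading the two labelled values at X = 6 as the fitted value (11) and the observed value (13). That reading is an assumption based on the slide's labels.

```latex
\varepsilon = Y - \hat{Y} = 13 - 11 = 2
```

The observed value lies 2 units above the fitted line, and that gap is the error term for this observation.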