Statistical Analysis for Medical and Public Health Data Qazvin - PowerPoint PPT Presentation

Statistical Analysis for Medical and Public Health Data
Qazvin University of Medical Sciences, 2017

Workshop Schedule
1- Types of variables
2- Types of studies
3- Types of data summaries
4- Types of statistical inference
5- Statistical graphs and data analysis with STATA

1. Types of variables
• Qualitative variables: responses are not numbers
– Nominal variable: divides people into groups; the groups cannot be ordered. Examples: gender, health status (ill, healthy)
– Ordinal variable: divides people into groups that can be ordered (<, =, >). Examples: education, social class (I, II, III, IV)

1. Types of variables
• Quantitative variables: responses are numbers
1. Interval variables: values can be grouped and compared; the zero point (origin) was chosen by scientists, so differences are meaningful but ratios are not.
Examples: temperature (0 °C = 32 °F = 273.15 K), poverty line (Toman, $, ...)
In Celsius: 20 C - 10 C = 10 C and 20 C / 10 C = 2
Conversion: F = 32 + 1.8 × C, so 32 + 1.8 × 20 = 68 F and 32 + 1.8 × 10 = 50 F
In Fahrenheit: 68 F - 50 F = 18 F, which is the same 10 C difference (18 / 1.8 = 10), but 68 F / 50 F = 1.36, not 2
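The Celsius/Fahrenheit arithmetic can be checked with a few lines of Python. This is only an illustrative sketch; the function and variable names are not part of the workshop material.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit using F = 32 + 1.8 * C."""
    return 32 + 1.8 * celsius

c_hot, c_cold = 20.0, 10.0
f_hot, f_cold = c_to_f(c_hot), c_to_f(c_cold)

# Differences survive the change of unit (up to the scale factor 1.8).
diff_c = c_hot - c_cold
diff_f = f_hot - f_cold
diff_f_in_c = diff_f / 1.8

# Ratios do not: they depend on where the arbitrary zero was placed.
ratio_c = c_hot / c_cold
ratio_f = f_hot / f_cold

print(f"difference: {diff_c} C = {diff_f} F")
print(f"ratio in C: {ratio_c:.2f}, ratio in F: {ratio_f:.2f}")
```

The difference is the same physical quantity in both units, while the ratio changes from 2 to 1.36, which is why ratios are not meaningful on an interval scale.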

1. Types of variables
2. Ratio variables: values can be grouped and compared; the zero point (origin) is a true zero, so both differences and ratios are meaningful.
Examples: age, weight, height
180 cm - 170 cm = 10 cm; 180 cm / 170 cm = 1.06
180 kg - 170 kg = 10 kg; 180 kg / 170 kg = 1.06
Statistical methods for interval and ratio variables are the same.

1. Types of variables
• Dependent variable (Y), also called the outcome, response, or end point: a function of many factors
• Independent variables (X1, X2, ..., Xk), also called predictors, factors, explanatory variables, or treatments: possible causes of Y

2. Types of studies
• Observational study
Definition: in an observational study,
1. inferences are drawn from a sample to a population;
2. the independent variables are not under the control of the researcher, because of ethical concerns or logistical constraints;
3. randomization of treatment is impossible.

Types of observational studies
• Case-control study: a design originally developed in epidemiology, in which two existing groups differing in outcome are identified and compared on the basis of some supposed causal attribute.
• Cross-sectional study: involves data collection from a population, or a representative subset, at one specific point in time.
• Longitudinal study: a correlational research study that involves repeated observations of the same variables over long periods of time.
• Cohort study or panel study: a particular form of longitudinal study in which a group of patients is closely monitored over a span of time.
• Ecological study: an observational study in which at least one variable is measured at the group level.

Types of observational studies
Disadvantage: they cannot be used as reliable sources for statements of fact about the "safety, efficacy, or effectiveness" of a practice.
Advantages:
1- provide information on "real world" use and practice
2- detect signals about the benefits and risks of practices in the general population
3- help formulate hypotheses to be tested in subsequent experiments
4- provide data needed to design more informative pragmatic clinical trials
5- inform clinical practice

Experimental study
Definition: the investigator actively manipulates which groups receive the agent or exposure under study.
Randomized controlled trials (RCT)
The steps in an RCT are:
1. State the hypothesis.
2. Select the participants. This step includes sample size, inclusion and exclusion criteria, and informed consent.
3. Allocate participants randomly to either the treatment or the control group (randomization).
4. Administer the intervention in a blinded fashion (single blind or double blind).
5. At a pre-determined time, monitor the outcomes.
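Step 3, random allocation, can be sketched in Python as below. This shows simple (unrestricted) randomization into two equal arms only; the participant IDs, the function name `randomize`, and the seed are hypothetical, and real trials often use blocked or stratified schemes instead.

```python
import random

def randomize(participant_ids, seed=None):
    """Shuffle participants and split them into two arms of equal size."""
    rng = random.Random(seed)          # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant IDs P001 ... P020
participants = [f"P{i:03d}" for i in range(1, 21)]
groups = randomize(participants, seed=2017)

print("treatment:", sorted(groups["treatment"]))
print("control:  ", sorted(groups["control"]))
```

Because the allocation depends only on the random number generator, neither the investigator nor the participant chooses the arm.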

3- Types of data summaries
• Tables
• Graphs
• Descriptive statistics

3- Types of data summaries
One-way table: shows the distribution of one variable

Table 1. Distribution of blood group (the title should state who, where, and when)

Blood group    Freq.    Percent
A                 25      18.52
B                 40      29.63
AB                55      40.74
O                 15      11.11
Total            135     100.00
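The percent column of Table 1 can be reproduced from the frequencies. The following Python sketch uses the counts from the slide; the variable names are illustrative.

```python
# Frequencies from Table 1
counts = {"A": 25, "B": 40, "AB": 55, "O": 15}

total = sum(counts.values())
# Percent = 100 * frequency / total, rounded to two decimals as in the table
percents = {group: round(100 * n / total, 2) for group, n in counts.items()}

print(f"{'Blood group':<12}{'Freq.':>6}{'Percent':>9}")
for group, n in counts.items():
    print(f"{group:<12}{n:>6}{percents[group]:>9.2f}")
print(f"{'Total':<12}{total:>6}{100:>9.2f}")
```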

3- Types of data summaries
Two-way table: shows the distribution of one variable by a second one

Table 2. Distribution of ... by ... (who, when, where)

               Disease: Yes     Disease: No      Total
Blood group    Freq.    %       Freq.    %       Freq.    %
A                 20               5               25
B                 20              20               40
AB                40              15               55
O                 10               5               15
Total             90              45              135
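The slide leaves the percent columns of Table 2 empty. One common way to fill them is with row percentages (the share of each blood group that has the disease); column percentages are an equally valid choice, depending on the question. The Python sketch below assumes row percentages and uses illustrative names.

```python
# (disease yes, disease no) frequencies from Table 2
table = {"A": (20, 5), "B": (20, 20), "AB": (40, 15), "O": (10, 5)}

row_totals = {g: yes + no for g, (yes, no) in table.items()}
col_totals = (sum(yes for yes, _ in table.values()),
              sum(no for _, no in table.values()))

# Row percentage: percent of each blood group with the disease
row_pct_yes = {g: round(100 * yes / row_totals[g], 2)
               for g, (yes, _) in table.items()}

for g, (yes, no) in table.items():
    print(f"{g:<4} yes={yes:>3} ({row_pct_yes[g]:6.2f}%)  no={no:>3}  total={row_totals[g]:>3}")
print(f"Total yes={col_totals[0]}  no={col_totals[1]}  n={sum(col_totals)}")
```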

3- Types of data summaries
• Three-way table
Application: effect of exposure on outcome after controlling for a confounder

Age group    Exposure    Disease +    Disease -
25-30        Yes
             No
...          ...         ...          ...
>= 75        Yes
             No

Statistical Graphs
• For qualitative variables:
1. Simple bar chart
2. Clustered bar chart
3. Pie chart
4. Clustered pie chart

[Figure: Bar chart for race. y-axis: count of id. Bars: white 96, black 26, other 67]

[Figure: Distribution of low birth weight by race (clustered bar chart). y-axis: count of id. Bars for 0 and 1 within each race: white 73 and 23, black 15 and 11, other 42 and 25]

[Figure: Distribution of race (pie chart). white 50.79%, black 13.76%, other 35.45%]

[Figure: Distribution of low birth weight by race (one pie chart per race, slices 0 and 1). white: 76.04% and 23.96%; black: 57.69% and 42.31%; other: 62.69% and 37.31%]

Statistical Graphs
• For quantitative variables (continuous or discrete):
• Histogram
• Box plot
• Scatter plot
• Line plot
• ROC (receiver operating characteristic) curve

[Figure: Distribution of volume as a continuous variable (histogram). x-axis: Volume (thousands); y-axis: Percent]

[Figure: Distribution of mileage as a discrete variable (histogram). x-axis: Mileage (mpg); y-axis: Percent]

[Figure: Distribution of blood pressure (bp) by sex; effect of sex on bp. y-axis: Blood pressure; groups: Male, Female]

[Figure: Distribution of blood pressure (bp) by age group and sex; effects of age group and sex on bp. y-axis: Blood pressure; groups: Male and Female within age groups 30-45, 46-59, 60+]

[Figure: Scatter plot of life expectancy by population growth. Variables: Avg. annual % growth, Life expectancy at birth]

[Figure: Line chart for life expectancy over years. x-axis: Year (1900 to 1940); y-axis: life expectancy]

[Figure: Line charts for life expectancy and inflation over years. x-axis: Year (1900 to 1940); series: life expectancy, inflation]

Receiver Operating Characteristic (ROC) curve
• Used to examine whether a clinical marker or a new clinical test is suitable for diagnosing a disease
• Helps find a cutoff point, with its sensitivity and specificity, for a marker or test
• Gives the Area Under the Curve (AUC) and a p-value for examining the performance of the marker or test
• An AUC above 0.5 and closer to 1.0 indicates an acceptable marker or test for diagnosis
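One way to read the AUC is as the probability that a randomly chosen diseased subject has a higher marker value than a randomly chosen non-diseased subject, with ties counted as one half. The Python sketch below computes the AUC directly from that definition; the marker values are made up for illustration and the function name `auc` is not from the workshop.

```python
def auc(scores_diseased, scores_healthy):
    """AUC as the share of (diseased, healthy) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))

# Hypothetical marker values
diseased = [7, 8, 6, 9]
healthy = [3, 5, 6, 2]

print("AUC =", auc(diseased, healthy))
```

A marker that ranks the two groups no better than chance gives an AUC near 0.5, and a value well below 0.5 suggests the marker is oriented in the wrong direction.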

An example of a bad marker
[Figure: ROC curve. y-axis: Sensitivity; x-axis: 1 - Specificity. Area under ROC curve = 0.3870]

             ROC                   -Asymptotic Normal--
    Obs      Area     Std. Err.    [95% Conf. Interval]
    ---------------------------------------------------
    189    0.3870       0.0452      0.29841     0.47564

ROC curve for a good marker
[Figure: ROC curve. y-axis: Sensitivity; x-axis: 1 - Specificity. Area under ROC curve = 0.9964]

             ROC                   -Asymptotic Normal--
    Obs      Area     Std. Err.    [95% Conf. Interval]
    ---------------------------------------------------
   2000    0.9964       0.0013      0.99390     0.99893

Choosing a cutoff point
Detailed report of sensitivity and specificity

                                          Correctly
Cutpoint     Sensitivity   Specificity   Classified
( >= 1 )       100.00%         0.00%       50.00%
( >= 2 )        99.70%        94.20%       96.95%
( >= 3 )        99.50%        96.00%       97.75%
( >= 4 )        99.30%        97.60%       98.45%
( >= 5 )        98.80%        98.30%       98.55%
( >= 6 )        97.80%        98.50%       98.15%
( >= 7 )        97.30%        98.80%       98.05%
( >= 8 )        96.50%        99.70%       98.10%
( >  8 )         0.00%       100.00%       50.00%
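The slide does not say which rule to use for picking the cutoff. Two common choices are the cutpoint with the highest percentage correctly classified and the cutpoint with the highest Youden index (sensitivity + specificity - 100, in percent). The Python sketch below applies both to the reported values; treating either rule as the workshop's method is an assumption.

```python
# (cutpoint, sensitivity %, specificity %, correctly classified %) from the report
rows = [
    (">= 1", 100.00, 0.00, 50.00),
    (">= 2", 99.70, 94.20, 96.95),
    (">= 3", 99.50, 96.00, 97.75),
    (">= 4", 99.30, 97.60, 98.45),
    (">= 5", 98.80, 98.30, 98.55),
    (">= 6", 97.80, 98.50, 98.15),
    (">= 7", 97.30, 98.80, 98.05),
    (">= 8", 96.50, 99.70, 98.10),
    ("> 8", 0.00, 100.00, 50.00),
]

def youden(row):
    """Youden index in percentage points: sensitivity + specificity - 100."""
    return row[1] + row[2] - 100

best_by_accuracy = max(rows, key=lambda r: r[3])
best_by_youden = max(rows, key=youden)

print("highest % correctly classified:", best_by_accuracy[0])
print("highest Youden index:          ", best_by_youden[0])
```

For this report both rules point to the same cutpoint, but they can disagree when the diseased and non-diseased groups have very different sizes.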

Fundamentals of Statistical Testing and Confidence Intervals

Fundamentals of Statistical Testing
Research loop: Population with unknown parameters -> Representative sample -> Statistics

Fundamentals of Statistical Testing
Methods for statistical inference:
1- Estimation
1-1 Point estimation
1-2 Confidence interval estimation
2- Statistical testing (test of hypothesis)

What is a point estimate?
A point estimate is a statistical measure calculated from the data obtained in a sample. Examples: sample mean, sample proportion, etc.

Population parameter               Point estimate
Mean = µ                           Xbar
Proportion = P                     X/n (X = number of successes, n = sample size)
Standard deviation = σ             s
Standard error (Std. Err.)         s/√n
Coefficient of variation = σ/µ     s/Xbar
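The point estimates listed above can be computed with Python's standard library. The sample values and the success count below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math
import statistics

# Hypothetical sample of five measurements
sample = [120, 130, 125, 140, 135]
n = len(sample)

xbar = statistics.mean(sample)    # estimate of the population mean
s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)
std_err = s / math.sqrt(n)        # standard error of the mean
cv = s / xbar                     # coefficient of variation

# Hypothetical proportion: X successes out of n_trials
x_successes, n_trials = 90, 135
p_hat = x_successes / n_trials

print(f"mean={xbar}, s={s:.3f}, std.err={std_err:.3f}, cv={cv:.3f}, p={p_hat:.3f}")
```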

Major problem with point estimates
• To what extent can we be confident in generalizing a point estimate to its parameter in the population?
• No specific answer!
• A point estimate may have confidence anywhere from 0% to 100%
• The question is answered by building an interval with the desired confidence level, centered on the point estimate
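An interval centered on the point estimate can be sketched as estimate ± z × standard error. The Python example below assumes a large sample, so that the normal approximation is reasonable (small samples would usually call for the t distribution instead); the summary numbers and the function name `normal_ci` are hypothetical.

```python
import math
from statistics import NormalDist

def normal_ci(estimate, std_err, confidence=0.95):
    """Normal-approximation confidence interval centered on the point estimate."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return estimate - z * std_err, estimate + z * std_err

# Hypothetical summary statistics: mean 130, s = 15, n = 100
xbar, s, n = 130.0, 15.0, 100
std_err = s / math.sqrt(n)

lower, upper = normal_ci(xbar, std_err, confidence=0.95)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Raising the confidence level widens the interval, and a larger sample narrows it through the smaller standard error.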
