statistics in medicine

Statistics in medicine. Lecture 1, part 1: Describing variation, and graphical presentation (PDF document)



Slide 0
Statistics in medicine
Lecture 1, part 1: Describing variation, and graphical presentation
Fatma Shebl, MD, MS, MPH, PhD
Assistant Professor, Chronic Disease Epidemiology Department
Yale School of Public Health
Fatma.shebl@yale.edu
11/4/2016

Slide 1
Outline
• Sources of variation
• Types of variables

Slide 2
Readings and resources
• Chapter 9, p. 105-118: Jekel's Epidemiology, Biostatistics, Preventive Medicine, and Public Health by David L. Katz et al. (4th edition).

Slide 3
• Almost every characteristic that is measured on a patient varies
• THAT IS WHY IT IS CALLED A VARIABLE
• EXAMPLES
  – Blood glucose level
  – Blood pressure
  – Diet
  – Electrolytes
  – etc.

Slide 4
There are different sources of variation
Let us consider blood pressure as an example
• Biologic differences
  – Age, race, and diet affect blood pressure
    • Older patients, patients of African descent, and those who consume a high-salt diet tend to have high blood pressure
• Measurement conditions
  – Time of day, anxiety, fatigue, etc.
    • High blood pressure is observed following exercise and with anxiety

Slide 5
There are different sources of variation
Let us consider blood pressure as an example
• Measurement error
  – Systematic error
    • Distorts the data in one direction, leading to bias, which obscures the truth
    • Ex. A defective BP cuff that tends to give high readings
  – Random error
    • Slight, inevitable inaccuracies
    • Not systematic, because it makes some readings too high and some too low
Statistics can adjust for random error, but cannot fix systematic error

Slide 6
To understand variation, you have to describe it
• Descriptive statistics definition:
  – Statistics, such as the mean, the standard deviation, the proportion, and the rate, used to describe attributes of a set of data

Slide 7
A variable can be quantitative or qualitative
• Qualitative
  – Skin color
  – Jaundice
  – Heart murmurs
• Quantitative
  – Blood pressure
  – Electrolyte levels
Image source: http://clinicalgate.com/wp-content/uploads/2015/06/B9781437729306000483_f48-02-9781437729306.jpg
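
The statement on Slide 5 that statistics can adjust for random error but cannot fix systematic error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true pressure, the size of the cuff bias, and the noise spread are all hypothetical values chosen for illustration, not figures from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_SBP = 120.0   # hypothetical true systolic pressure (mmHg)
CUFF_BIAS = 8.0    # hypothetical defective cuff that reads 8 mmHg too high
NOISE_SD = 5.0     # hypothetical spread of the random measurement error


def measure(n, bias=0.0):
    """Simulate n readings: true value + constant bias + random noise."""
    return [TRUE_SBP + bias + random.gauss(0.0, NOISE_SD) for _ in range(n)]


good_cuff = measure(10_000)                 # random error only
bad_cuff = measure(10_000, bias=CUFF_BIAS)  # random + systematic error

good_mean = statistics.mean(good_cuff)
bad_mean = statistics.mean(bad_cuff)

# Averaging many readings cancels the random error, so the first mean
# lands close to the true value. The second mean stays shifted by the
# bias no matter how many readings are averaged.
print(f"mean with random error only: {good_mean:.1f} mmHg")
print(f"mean with defective cuff:    {bad_mean:.1f} mmHg")
```

Increasing the number of readings narrows the first mean around the true value but leaves the second one shifted, which is why a biased instrument has to be fixed or recalibrated rather than averaged away.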

Slide 8
There are different types of variables
– Nominal
– Dichotomous (binary)
– Ordinal (ranked)
– Continuous (interval)
– Continuous (ratio)
– Risks and proportions
– Counts and units of observation
– Combining data

Slide 9
Nominal variables (qualitative)
• Nominal variables are "naming" variables
• Definition:
  – The simplest scale of measurement. Used for characteristics that have no numerical values, no measurement scales, and no rank order. It is also called a categorical or qualitative scale.
• Ex. Skin color
  – A different number can be assigned to each color
    • E.g. 1: purple, 2: black, 3: white, 4: blue, 5: tan
  – It makes no difference to the statistical analysis which number is assigned to which color, because the number is merely a numerical name for a color
• Percentages and proportions are commonly used to summarize the data

Slide 10
Dichotomous variables (qualitative)
• Dichotomous (from the Greek for "cut into two") variables
• Ex.: Normal/abnormal skin color, living/dead
• Sometimes it is not enough to describe the data as two categories (living/dead); it is also important to know how long the patient survived (survival analysis)

Slide 11
Ordinal "ranked" variables
• Definition:
  – Used for characteristics that have an underlying order to their values, with a clearly implied direction from better to worse.
• Are categorical (qualitative) scales
• Three or more levels
• Although there is an order among categories, the difference between two adjacent categories is not the same throughout the scale
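
The point on Slide 9 that the number assigned to each color is "merely a numerical name" can be checked directly: summarizing the same observations under two different codings gives identical proportions. The sketch below is illustrative only; the list of observations and the second coding are hypothetical, and the function name is made up for this example.

```python
from collections import Counter

# Hypothetical observations of a nominal variable (skin color).
colors = ["tan", "white", "black", "tan", "tan", "white", "blue", "black"]

# The coding from the slide, and an arbitrary reshuffle of the same codes.
coding_a = {"purple": 1, "black": 2, "white": 3, "blue": 4, "tan": 5}
coding_b = {"purple": 5, "black": 4, "white": 1, "blue": 3, "tan": 2}


def proportions_by_label(observations, coding):
    """Encode labels as numbers, tabulate them, then decode back to labels."""
    decode = {code: label for label, code in coding.items()}
    counts = Counter(coding[label] for label in observations)
    total = len(observations)
    return {decode[code]: count / total for code, count in counts.items()}


props_a = proportions_by_label(colors, coding_a)
props_b = proportions_by_label(colors, coding_b)

# The summary does not depend on which number names which color.
print(props_a == props_b)
for label, share in sorted(props_a.items()):
    print(f"{label}: {100 * share:.1f}%")
```

By contrast, a mean of the numeric codes would change with the coding, which is why nominal data are summarized with percentages and proportions rather than averages.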

Slide 12
Ordinal "ranked" variables
• Ex. Pitting edema grading scale: "0 - no edema" to "4+ - severe edema"
• Ex. Pain scale: "0 - no pain" to "10 - worst imaginable pain"
• Percentages and proportions are commonly used to summarize the data
• Medians are sometimes used to describe the whole data
Image sources: http://biology-forums.com/gallery/2137_18_05_12_2_25_00.jpeg and https://openclipart.org/detail/218053/pain-scale

Slide 13
Numerical scales (quantitative)
• Definition:
  – The highest level of measurement. It is used for characteristics that can be given numerical values; the difference between numbers has meaning, ex. BMI, height.
• Types
  – Interval
  – Ratio
  – Discrete
• Measures of central tendency are usually used to summarize: means, medians

Slide 14
Numerical scales (continuous)
• Has a value on a continuum
• Interval: arbitrary zero point
  – Ex. Centigrade temperature scale
• Ratio: absolute zero point
  – Ex. Kelvin temperature scale
Image source: https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=0ahUKEwiuo6nf8sjOAhUEkh4KHXTZAnUQjRwIBw&url=http%3A%2F%2Fwww.livescience.com%2F39994-kelvin.html&psig=AFQjCNFGVvg1wdLx78W2V44wDlZQDQB17A&ust=1471538633651130

Slide 15
Numerical scales (discrete)
• Has values equal to integers
• Units of observation: person, animal, thing, etc.
• Presented in frequency tables
  – One characteristic on the x-axis, one characteristic on the y-axis, and counts in the cells

Frequency table of gender by whether serum total cholesterol was checked or not

Gender    Checked     Not checked    Total
Female    17 (63%)    10 (37%)       27 (100%)
Male      25 (57%)    19 (43%)       44 (100%)
Total     42 (59%)    29 (41%)       71 (100%)

Source: Jekel's Epidemiology, Biostatistics, Preventive Medicine, and Public Health by David L. Katz et al. (4th edition).
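
The percentages in the cholesterol frequency table on Slide 15 are row percentages: each count divided by its row total, multiplied by 100. A minimal sketch of that arithmetic, using the counts from the table (the variable and function names are illustrative, not from the source):

```python
# Counts from the frequency table of gender by cholesterol check status.
table = {
    "Female": {"Checked": 17, "Not checked": 10},
    "Male": {"Checked": 25, "Not checked": 19},
}


def row_percentages(counts):
    """Return each count as a whole-number percentage of the row total."""
    total = sum(counts.values())
    return {key: round(100 * value / total) for key, value in counts.items()}


female_pct = row_percentages(table["Female"])
male_pct = row_percentages(table["Male"])

# The "Total" row is the column-wise sum of the two gender rows.
totals = {
    column: sum(row[column] for row in table.values())
    for column in ("Checked", "Not checked")
}
total_pct = row_percentages(totals)

for name, counts, pct in (
    ("Female", table["Female"], female_pct),
    ("Male", table["Male"], male_pct),
    ("Total", totals, total_pct),
):
    cells = ", ".join(f"{k}: {counts[k]} ({pct[k]}%)" for k in counts)
    print(f"{name}: {cells}")
```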

Slide 16
Risks and proportions
• Risk is the conditional probability of an event (e.g. death) in a defined population in a defined period.
• Shares some characteristics of discrete and some characteristics of continuous variables
  – Ex. A discrete event (e.g., death) occurred in a fraction of the population
• Calculated as the ratio of counts in the numerator to counts in the denominator

Slide 17
Combining data
• A continuous variable can be converted to an ordinal variable
• When data are converted to categories, individual information is lost
• The fewer the categories, the greater the amount of information lost
[Figure: Histogram of neonatal mortality rate per 1000 live births, by birth weight group, United States 1980. x-axis: birth weight (g); y-axis: mortality rate, 0 to 120.]
Source: Buehler W et al. Public Health Rep 102:151-161, 1987

Slide 18
Statistics in medicine
Lecture 1, part 2: Describing variation, and graphical presentation
Fatma Shebl, MD, MS, MPH, PhD
Assistant Professor, Chronic Disease Epidemiology Department
Yale School of Public Health
Fatma.shebl@yale.edu

Slide 19
Outline
• Frequency distributions
  – Frequency distribution of continuous data
  – Frequency distribution of binary data
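
The calculation described on Slide 16 (counts in the numerator divided by counts in the denominator) can be sketched as a small function. The counts used in the example are hypothetical, and the function name and the `per` scaling parameter are assumptions made for illustration; the slides only state the ratio itself.

```python
def risk(events, population, per=1000):
    """Risk as events / population, scaled to a rate per `per` people.

    `events` counts a discrete outcome (e.g., deaths) observed in a
    defined population over a defined period.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= events <= population:
        raise ValueError("events must be between 0 and population")
    return per * events / population


# Hypothetical example: 12 neonatal deaths among 1500 live births.
deaths, live_births = 12, 1500
print(f"risk as a proportion: {risk(deaths, live_births, per=1):.3f}")
print(f"risk per 1000 live births: {risk(deaths, live_births):.1f}")
```

The numerator is a count of discrete events, but the resulting ratio can take any value between 0 and 1, which is the sense in which risks share characteristics of both discrete and continuous variables.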

Slide 20
Readings and resources
• Chapter 9, p. 105-118: Jekel's Epidemiology, Biostatistics, Preventive Medicine, and Public Health by David L. Katz et al. (4th edition).

Slides 21 and 22
A frequency distribution is:
• A TABLE of data displaying the VALUE of each data point (or range of data points) in one column and the FREQUENCY with which that value occurs in the other column
• A PLOT of data displaying the VALUE of each data point (or range of data points) on one axis and the FREQUENCY with which that value occurs on the other axis

Slide 23
Frequency tables
• Definition
  – A table showing the number and/or the percentage of observations occurring at different values (or ranges of values) of a variable.
• Steps of creating a frequency table
  – Decide on the number of non-overlapping intervals
    • It is better to have equal-width intervals
    • Usually 6 to 14 intervals are adequate to demonstrate the shape of the distribution
    • Creating intervals means a continuous variable is converted to an ordinal variable, so information at the individual level is lost
  – Count the number of observations in each interval
    • Percentages can be calculated as well
    • Percentage = the number of observations in the interval divided by the total number of observations, multiplied by 100
• Presented graphically by a histogram
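
The steps for creating a frequency table on Slide 23 can be sketched as follows. This is one possible implementation under stated assumptions, not the lecture's own procedure: the blood pressure readings are hypothetical, the intervals are taken as equal-width slices between the smallest and largest value, and the largest value is placed in the last interval.

```python
def frequency_table(values, n_intervals):
    """Build (lower, upper, count, percentage) rows for equal-width intervals."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("values must not all be identical")
    width = (high - low) / n_intervals

    # Step 1: non-overlapping, equal-width intervals [lower, upper).
    counts = [0] * n_intervals

    # Step 2: count the observations falling in each interval.
    for value in values:
        index = int((value - low) / width)
        index = min(index, n_intervals - 1)  # put the maximum in the last interval
        counts[index] += 1

    # Step 3: percentage = count / total observations * 100.
    total = len(values)
    return [
        (low + i * width, low + (i + 1) * width, count, 100 * count / total)
        for i, count in enumerate(counts)
    ]


# Hypothetical systolic blood pressure readings (mmHg).
readings = [100, 104, 110, 112, 118, 120, 121, 125, 130, 134, 140, 160]
rows = frequency_table(readings, n_intervals=6)

for lower, upper, count, percent in rows:
    print(f"{lower:5.0f} to {upper:5.0f}: {count:2d} ({percent:4.1f}%)")
```

Once the readings are grouped, the table no longer distinguishes 100 from 104; both are simply members of the first interval, which is the loss of individual-level information the slide refers to.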
