SLIDE 1 Accurate communication of statistics
Thomas Lumley
SLIDE 2
“The statistician’s task is to go into the light and spread darkness”
–Scott Emerson, MD PhD
SLIDE 3 Introductions: me
Statistician (Seattle, now Auckland)
Health researcher: heart disease, genomics, air pollution
StatsChat: statistics and medical research in the media
Sings bass
SLIDE 4
Introductions: you
Who are you?
What sort of medical writing do you do?
What are you hoping to get out of the workshop?
SLIDE 5
Outline
Talking about risk
p’s and t’s: the stuff with maths
Extrapolation: what was really measured?
(stuff I didn’t know whether you were going to be interested in)
SLIDE 6
Risk
SLIDE 7
Absolute risks
“Men are more than twice as likely to have prostate cancer and 60 per cent more likely to have testicular cancer.” (compared to 1980s)
Lifetime risk: 1 in 195 vs 1 in 312
Or: 5 in 1000 vs 3 in 1000
“Two more cases for every thousand men”
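Worked arithmetic for the slide above, as a minimal sketch. It assumes the two lifetime risks belong to the "60 per cent more likely" claim (their ratio is 1.6), with 1 in 312 as the 1980s figure and 1 in 195 as the current one.

```python
from fractions import Fraction

# Lifetime risks quoted on the slide (assumed order: 1980s, then now).
risk_then = Fraction(1, 312)
risk_now = Fraction(1, 195)

# Relative change: the headline number.
relative_increase = float(risk_now / risk_then - 1)

# Absolute change: extra cases per 1000 men.
extra_per_1000 = float((risk_now - risk_then) * 1000)

print(f"Relative increase: {relative_increase:.0%}")
print(f"Then: {float(risk_then) * 1000:.1f} per 1000")
print(f"Now:  {float(risk_now) * 1000:.1f} per 1000")
print(f"Extra cases per 1000 men: {extra_per_1000:.1f}")
```

The same change reads as "60 per cent more likely" or as "about two more cases per thousand"; only the second tells the reader how big the risk is.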
SLIDE 8
SLIDE 9 Cancer Research UK,
as cigarettes’
SLIDE 10 Forward and backward
Studies often look at probability of positive test given disease
We care about probability of disease given positive test
Not the same.
App: spectrumnews.org
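A minimal sketch of why forward and backward probabilities differ, using Bayes' rule. The prevalence, sensitivity and specificity are hypothetical numbers chosen for illustration; they do not come from the slides.

```python
def prob_disease_given_positive(prevalence, sensitivity, specificity):
    """Bayes' rule: P(disease | positive test)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical test: catches 90% of cases, wrongly flags 5% of healthy people,
# used in a population where 1% have the disease.
ppv = prob_disease_given_positive(prevalence=0.01, sensitivity=0.90, specificity=0.95)

print("P(positive | disease) = 90%")
print(f"P(disease | positive) = {ppv:.0%}")
```

With a rare disease, most positive tests come from the large healthy group, so the backward probability is far below the forward one.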
SLIDE 11 Microlives
“Every hour wounds. The last one kills.”
― Neil Gaiman, American Gods
SLIDE 12
Micromort: 1 in a million chance of death
Scuba diving: 5 micromorts/dive
MDMA: 0.5 micromorts/dose
Climbing Everest: 39000/attempt
Driving 400km: 1 micromort
SLIDE 13
Microlife: 1 part in a million reduction in life expectancy (1/2 hour)
‘Using up’ your life faster
2 cigarettes = 1 microlife
7 units of alcohol* = 1 microlife
hazard ratio of 1.09 = 1 μlife/exposed day
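Unit conversions for micromorts and microlives, as a rough sketch. The 57 years of remaining adult life and the 20-a-day smoker are assumptions added here for illustration; the other figures are from the slides.

```python
MICRO = 1e-6

# Micromorts: convert back to an ordinary probability of death.
everest_micromorts = 39000                      # per attempt, from the slide
everest_risk = everest_micromorts * MICRO       # as a probability
scuba_dives_per_everest = everest_micromorts / 5  # 5 micromorts per dive

# Microlives: one millionth of remaining life expectancy.
# Assumption (not on the slide): about 57 years of adult life remaining.
remaining_years = 57
minutes_remaining = remaining_years * 365.25 * 24 * 60
microlife_minutes = minutes_remaining * MICRO

# 2 cigarettes = 1 microlife, so a hypothetical 20-a-day smoker uses up:
microlives_per_day = 20 / 2
hours_lost_per_day = microlives_per_day * microlife_minutes / 60

print(f"Everest attempt: {everest_risk:.1%} chance of death")
print(f"Same risk as {scuba_dives_per_everest:.0f} scuba dives")
print(f"One microlife is about {microlife_minutes:.0f} minutes")
print(f"20 cigarettes a day: about {hours_lost_per_day:.0f} hours of life per day")
```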
SLIDE 14
Risk and rate
Risk: proportion or probability (%)
Rate: proportion or probability per unit time (%/year)
Rate of death varies.
Risk of death (without a time period) doesn’t.
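A small sketch of how a rate turns into a risk once a time period is attached. It assumes a constant rate, which is a simplification, and the 1%/year figure is hypothetical.

```python
import math

def risk_from_rate(rate_per_year, years):
    """Probability of at least one event in the period, assuming a constant rate."""
    return 1 - math.exp(-rate_per_year * years)

# Hypothetical death rate of 1% per year.
for years in (1, 10, 50, 1000):
    print(f"{years:>4} years: risk = {risk_from_rate(0.01, years):.1%}")
```

As the time period grows the risk heads to 100% whatever the rate, which is why a risk of death with no time period attached carries no information.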
SLIDE 15
Trick question
NZ is introducing bowel cancer screening. Will this increase or decrease the rate of bowel cancer?
SLIDE 16
Trick question
If you screen for a type of cancer where no treatment is possible, what happens to survival in that cancer?
SLIDE 17
Trick question
If the average time current residents have already spent in an aged-care facility is 3 years, the average total length of stay is
At least three years
Three years
You can’t say.
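A sketch of why the aged-care question has no single answer. It assumes admissions are in steady state, in which case the mean time already spent by current residents is E[T²] / (2 E[T]), where T is the total length of stay. Both length-of-stay distributions are hypothetical.

```python
from fractions import Fraction

def mean_total_stay(stays):
    """Mean total length of stay over everyone admitted."""
    return sum(p * t for t, p in stays)

def mean_time_so_far(stays):
    """Mean time already spent by current residents in steady state."""
    second_moment = sum(p * t * t for t, p in stays)
    return second_moment / (2 * mean_total_stay(stays))

# Each entry is (length of stay in years, proportion of admissions).
everyone_stays_six_years = [(Fraction(6), Fraction(1))]
mostly_short_stays = [(Fraction(1), Fraction(27, 32)),
                      (Fraction(9), Fraction(5, 32))]

for name, stays in [("all stays 6 years", everyone_stays_six_years),
                    ("mostly short stays", mostly_short_stays)]:
    so_far = float(mean_time_so_far(stays))
    total = float(mean_total_stay(stays))
    print(f"{name}: time so far {so_far} years, total stay {total} years")
```

Both facilities show 3 years spent so far, but the average total stay is 6 years in one and 2.25 years in the other. Long-stay residents are over-represented among the people present on any given day.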
SLIDE 18
Inference
SLIDE 19
Questions
Is it even a thing?
How big do we think it is?
How precise is that?
SLIDE 20 http://xkcd.com/552/
SLIDE 21 All the reasons
Chance
Causation
Reverse causation
Confounding (including by time)
Selection
SLIDE 22
Is it even a thing?
For science, hypothesis testing is overrated.
For scicomm, it’s a useful filter.
Caveat: weak studies or implausible hypotheses.
Caveat: strong confounding
SLIDE 23
p-value
If there was no effect, how likely would we be to get an estimate this big or bigger?
How surprising would the data be with no real effect?
If an effect is plausible and would make the data much more likely, we should believe it.
NOT ‘probability of no effect’
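A minimal sketch of the p-value definition above, using an exact binomial calculation. The data (60 of 100 patients preferring a new treatment, against a no-effect value of 50%) are hypothetical.

```python
from math import comb

def one_sided_p_value(successes, trials, null_prob=0.5):
    """P(X >= successes) when X ~ Binomial(trials, null_prob), i.e. no real effect."""
    return sum(
        comb(trials, k) * null_prob**k * (1 - null_prob) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 60 of 100 patients prefer the new treatment.
p = one_sided_p_value(60, 100)
print(f"p = {p:.3f}")
```

The result is the chance of a split at least this lopsided if preferences were really 50:50. It is not the probability that there is no effect.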
SLIDE 24
Highly plausible hypothesis, good power: significant results mostly true
Moderately plausible hypothesis, low power: significant results often false, and always overestimated!
SLIDE 25
Highly plausible hypothesis, good power: significant results mostly true
Moderately plausible hypothesis, low power: significant results often false, and always overestimated!
Highly implausible hypothesis: significant results almost always false
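A rough sketch of the reasoning behind these three cases: the share of significant results that reflect a real effect depends on prior plausibility and power. The prior probabilities and power values are assumptions picked to illustrate each case, with the usual 5% significance threshold.

```python
def prob_true_given_significant(prior, power, alpha=0.05):
    """Share of significant results that are real effects."""
    true_hits = prior * power          # real effects that reach significance
    false_hits = (1 - prior) * alpha   # null effects that reach significance
    return true_hits / (true_hits + false_hits)

# (prior probability the effect is real, power of the study): all hypothetical.
scenarios = {
    "highly plausible, good power": (0.9, 0.8),
    "moderately plausible, low power": (0.2, 0.2),
    "highly implausible": (0.001, 0.8),
}

results = {name: prob_true_given_significant(prior, power)
           for name, (prior, power) in scenarios.items()}

for name, share in results.items():
    print(f"{name}: {share:.0%} of significant results are true")
```

This covers only the "true or false" part. The overestimation in low-powered studies is a separate effect: only estimates that happen to come out large reach significance.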
SLIDE 26
Daily Mail
SLIDE 27
But Bayesian inference?
Not magic fairy dust
Automates combining plausibility and data
Doesn’t fix reporting bias
SLIDE 28 Healthy people are healthy
Figure: mortality (%) by compliance (good vs poor), clofibrate arm
Coronary Drug Project trial,
SLIDE 29 Healthy people are healthy
Figure: mortality (%) by compliance (good vs poor), clofibrate and placebo arms
Coronary Drug Project trial,
SLIDE 30
Figure: FEV1 for groups 1 and 2
Lung function (FEV1) in 654 children, comparing smokers and non-smokers
SLIDE 31
Figure: FEV1 for groups 1 and 2
Lung function (FEV1) in 654 children, comparing smokers and non-smokers
Figure: FEV1 against height
SLIDE 32
If a confounding variable is measured accurately and modelled accurately, the bias can be removed.
“Smoking: current, former, never” isn’t enough
SLIDE 33 Francesca Dominici, Johns Hopkins
SLIDE 34
Effect size
Very large studies can detect effects too small to care about
Very small studies can only detect effects too large to believe
SLIDE 35 Measured an insulin resistance indicator, differences tiny
SLIDE 36 It’s not that people are dying at a rapid rate. But men who drink more than four cups a day are 56 per cent more likely to die and women have double the chance compared with moderate drinkers, according to the University of Queensland and the University of South Carolina study.
— NZ Herald 18/9/2013
Under 55 Over 55
SLIDE 37 It’s not that people are dying at a rapid rate. But men who drink more than four cups a day are 56 per cent more likely to die and women have double the chance compared with moderate drinkers, according to the University of Queensland and the University of South Carolina study.
— NZ Herald 18/9/2013
Under 55 Over 55
SLIDE 38
Interval estimates
95% of confidence intervals include the true value
Not: ‘probability the value is in this interval is 95%’ (but not bad if no publication bias)
Range of values ‘consistent with the data’
Always check the boring end of the interval
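A simulation sketch of the first point: across many repeated experiments, about 95% of the intervals cover the true value. The true mean, standard deviation and sample size are hypothetical, and the standard deviation is treated as known so that the interval is a simple 1.96 standard errors either side.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, SIGMA, N, EXPERIMENTS = 10.0, 2.0, 25, 2000
half_width = 1.96 * SIGMA / N**0.5   # 95% interval with known sigma

covered = 0
for _ in range(EXPERIMENTS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    estimate = statistics.fmean(sample)
    # Does this experiment's interval include the true value?
    if estimate - half_width <= TRUE_MEAN <= estimate + half_width:
        covered += 1

coverage = covered / EXPERIMENTS
print(f"{coverage:.1%} of intervals include the true value")
```

The 95% describes the procedure over many experiments, not any one published interval. Selective publication of the more exciting intervals would break it.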
SLIDE 39
Figure: effect estimate plotted for each experiment
SLIDE 40 Data consistent with very small excess — even before cherry-picking
SLIDE 41
Compare, if you want to compare
“p<0.05 in one group, p>0.05 in the other” is NOT evidence of a difference between groups
subsets: under 55 vs over 55
experiments: significant change in treatment group, not in control group
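A minimal sketch of comparing two groups directly rather than comparing their p-values. The subgroup estimates and standard errors are hypothetical, the groups are assumed independent, and the p-values use a normal approximation.

```python
import math

def two_sided_p(estimate, std_error):
    """Two-sided p-value from a normal approximation."""
    z = abs(estimate / std_error)
    return math.erfc(z / math.sqrt(2))

# Hypothetical subgroup results: (estimated effect, standard error).
under_55 = (0.25, 0.10)
over_55 = (0.10, 0.10)

p_under = two_sided_p(*under_55)
p_over = two_sided_p(*over_55)

# The comparison that matters: the difference between the two estimates.
diff = under_55[0] - over_55[0]
diff_se = math.sqrt(under_55[1] ** 2 + over_55[1] ** 2)
p_diff = two_sided_p(diff, diff_se)

print(f"Under 55: p = {p_under:.3f}")
print(f"Over 55:  p = {p_over:.3f}")
print(f"Difference between groups: p = {p_diff:.3f}")
```

One subgroup is "significant" and the other is not, yet the test of the difference between them is nowhere near significant.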
SLIDE 42 It helps to combine studies
Figure: forest plot of odds ratios by study (Auckland, Block, Doran, Gamsu, Morrison, Papageorgiou, Tauesch) with a summary estimate
Individual studies not convincing, but combined result is
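A sketch of how combining studies can work, using a fixed-effect inverse-variance summary of log odds ratios. This is one common method and is not necessarily the one behind the plot. The four studies are hypothetical and are NOT the data from the slide.

```python
import math

def fixed_effect_summary(log_odds_ratios, std_errors):
    """Inverse-variance weighted average of log odds ratios, with its standard error."""
    weights = [1 / se**2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, log_odds_ratios)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

def ci95_odds_ratio(log_or, se):
    """95% confidence interval on the odds ratio scale."""
    return math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)

# Hypothetical studies: log odds ratios and their standard errors.
log_ors = [-0.5, -0.3, -0.45, -0.35]
ses = [0.3, 0.25, 0.35, 0.3]

study_intervals = [ci95_odds_ratio(b, se) for b, se in zip(log_ors, ses)]
pooled, pooled_se = fixed_effect_summary(log_ors, ses)
summary_interval = ci95_odds_ratio(pooled, pooled_se)

for low, high in study_intervals:
    print(f"Study:   odds ratio interval {low:.2f} to {high:.2f}")
print(f"Summary: odds ratio interval {summary_interval[0]:.2f} to {summary_interval[1]:.2f}")
```

Every single-study interval includes an odds ratio of 1, but the summary interval does not.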
SLIDE 43 Regression adjustment
Figure: FEV1 against height
SLIDE 44
Figure: FEV1 against height
SLIDE 45
Figure: FEV1 against height, groups 1 and 2
Mean difference
was (0.5, 0.9) L/s
now (-0.13, 0.14) L/s
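A toy sketch of regression adjustment. In this made-up data FEV1 depends only on height, and the smokers are taller, so the crude difference is pure confounding. The data are hypothetical and noise-free to keep the arithmetic transparent; they are not the 654 children from the slide.

```python
def mean(xs):
    return sum(xs) / len(xs)

def residuals(y, x):
    """Residuals from a straight-line least-squares fit of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def slope_through_origin(y, x):
    """Least-squares slope of y on x with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical children: 16 non-smokers, then 12 taller smokers.
heights = list(range(50, 66)) + list(range(60, 72))
smoker = [0] * 16 + [1] * 12
fev = [0.1 * h - 3 for h in heights]   # FEV1 depends on height only

# Crude comparison: smokers minus non-smokers, ignoring height.
crude = (mean([f for f, s in zip(fev, smoker) if s])
         - mean([f for f, s in zip(fev, smoker) if not s]))

# Adjusted comparison: the smoking coefficient in a regression of FEV1 on
# smoking and height, found by removing height from both variables first.
adjusted = slope_through_origin(residuals(fev, heights), residuals(smoker, heights))

print(f"Crude difference:    {crude:.2f}")
print(f"Adjusted for height: {adjusted:.2f}")
```

The crude comparison shows smokers ahead by 0.8, and comparing children of the same height removes the difference entirely.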
SLIDE 46
“Comparing children of the same height, there was no evidence of a difference in average FEV1 between smokers and non-smokers.”
“Moderate differences could not be ruled out, and there was no information about the kids’ health at that time or later in life”
SLIDE 47 http://www.nzherald.co.nz/lifestyle/news/article.cfm?c_id=6&objectid=11685829
SLIDE 48
SLIDE 49
Interlude: trends
Which of these are getting more common?
Heart attack
Dementia
Prostate cancer
Colon cancer
Teenage pregnancy
SLIDE 50
Extrapolation
SLIDE 51
Goal: everyone lives happily ever after
Subgoal: less heart disease
subsubgoal: less heart disease in diabetics
subsubsubgoal: lower blood glucose
subsubsubsubgoal: reduce insulin resistance
subsubsubsubsubgoal: activate PPAR-γ
SLIDE 52
Why surrogate outcomes?
Showing you reduce blood sugar: a few hundred patients for a few weeks
Showing you prevent heart attacks: several thousand patients for several years
SLIDE 53 Why not surrogate outcomes?
Invaluable for initial research
Not reliable: Phase III trials with real outcomes fail about 50% of the time
SLIDE 54 Class 1c antiarrhythmics
1970s: After heart attack, particular heartbeat irregularities predicted high risk of death
early 1980s: New drugs prevented these irregularities. Lots of people were given the drugs
late 1980s: the drugs were tested…
SLIDE 55 Probability of not experiencing cardiac arrest or death
Cardiac Arrhythmia Suppression Trial (CAST)
SLIDE 56
If we can’t wait?
Immune checkpoint inhibitors for cancer
Dramatic responses in a minority
Don’t know how long they last: drugs too new
SLIDE 57
“We demand rigidly defined areas of doubt and uncertainty!”
–Hitchhiker’s Guide to the Galaxy, Douglas Adams
SLIDE 58
Types of uncertainty
Couldn’t measure the right thing
Don’t know how much reporting bias
Don’t know if statistical adjustment worked
Actual sampling uncertainty
Helps some people, but maybe not you