Causality Workshop 2018
The Book of Why
Published in May 2018; current Amazon bestseller #1 in the category “statistics”
(followed by Elements of Statistical Learning)
Pearl received the Turing Award in 2011.
Beate Sick
Topics of today
“Have you eaten from the tree which I forbade you?”
“The woman you gave me for a companion, she gave me fruit from the tree and I ate.”
Epidemiological studies of CHD and the evolution of preventive cardiology. Nature Reviews Cardiology 11, 276–289 (2014).
Aargauer Zeitung
Galton in 1877 at the Friday Evening Discourse at the Royal Institution of Great Britain in London.
Image credits: “The Book of Why”
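Figure: IQ of fathers vs. IQ of sons, with a reference line of slope 1. For the group of fathers with IQ = 115, the IQ distribution in their sons has E(IQ_sons | IQ_fathers = 115) = 112.
$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 100 \\ 100 \end{pmatrix}, \begin{pmatrix} 15^2 & \operatorname{cov}(X_1, X_2) \\ \operatorname{cov}(X_1, X_2) & 15^2 \end{pmatrix} \right)$$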
Remark: The correlation of the IQs of parents and children is only 0.42, see https://en.wikipedia.org/wiki/Heritability_of_IQ
$$X_1 \sim N(\mu_1 = 100,\ \sigma_1^2 = 15^2), \qquad X_2 \sim N(\mu_2 = 100,\ \sigma_2^2 = 15^2)$$
Image credits (changed): https://www.youtube.com/watch?v=aLv5cerjV0c
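Figure: IQ of fathers vs. IQ of sons, with a reference line of slope 1. For sons with IQ = 115 (1 SD above the mean), the IQ distribution in their fathers has E(IQ_fathers | IQ_sons = 115) = 112 (0.8 SD above the mean).
$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 100 \\ 100 \end{pmatrix}, \begin{pmatrix} 15^2 & \operatorname{cov}(X_1, X_2) \\ \operatorname{cov}(X_1, X_2) & 15^2 \end{pmatrix} \right)$$
$$X_1 \sim N(\mu_1 = 100,\ \sigma_1^2 = 15^2), \qquad X_2 \sim N(\mu_2 = 100,\ \sigma_2^2 = 15^2)$$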
Image credits (changed): https://www.youtube.com/watch?v=aLv5cerjV0c
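Figure: IQ of sons vs. IQ of fathers. For the group of sons with IQ = 115, the IQ distribution in their fathers has E(IQ_fathers | IQ_sons = 115) = 112.
$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 100 \\ 100 \end{pmatrix}, \begin{pmatrix} 15^2 & \operatorname{cov}(X_1, X_2) \\ \operatorname{cov}(X_1, X_2) & 15^2 \end{pmatrix} \right)$$
$$X_1 \sim N(\mu_1 = 100,\ \sigma_1^2 = 15^2), \qquad X_2 \sim N(\mu_2 = 100,\ \sigma_2^2 = 15^2)$$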
Image credits (changed): https://www.youtube.com/watch?v=aLv5cerjV0c
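Regression line equation:
$$\hat{X}_2 = E(X_2 \mid X_1) \overset{\text{stand.}}{=} c \cdot X_1$$
with
$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & c \\ c & 1 \end{pmatrix} \right), \qquad c = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(x_{i2} - \bar{x}_2)}{\operatorname{sd}(x_1)\, \operatorname{sd}(x_2)}$$
After standardization of the RV:
$$X_1 \sim N(\mu_1 = 0,\ \sigma_1^2 = 1), \qquad X_2 \sim N(\mu_2 = 0,\ \sigma_2^2 = 1)$$
The correlation c quantifies regression to the mean.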
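Regression to the mean in both directions can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses the correlation c = 0.8 implied by the 115 to 112 example (not the empirical 0.42 from the remark), and the function names, sample size, selection window of ±1 IQ point and seed are all hypothetical choices.

```python
import math
import random


def simulate_iq_pairs(n, c, mean=100.0, sd=15.0, seed=0):
    """Draw n (father, son) IQ pairs from a bivariate normal with correlation c."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        father = mean + sd * z1
        # c*z1 + sqrt(1 - c^2)*z2 has variance 1 and correlation c with z1.
        son = mean + sd * (c * z1 + math.sqrt(1.0 - c * c) * z2)
        pairs.append((father, son))
    return pairs


def mean_of_other_member(pairs, selected_index, target=115.0, half_width=1.0):
    """Mean IQ of the other family member when the selected one scores near target."""
    other_index = 1 - selected_index
    values = [p[other_index] for p in pairs
              if abs(p[selected_index] - target) <= half_width]
    return sum(values) / len(values)


pairs = simulate_iq_pairs(n=200_000, c=0.8)
sons_of_115_fathers = mean_of_other_member(pairs, selected_index=0)
fathers_of_115_sons = mean_of_other_member(pairs, selected_index=1)
print(round(sons_of_115_fathers, 1), round(fathers_of_115_sons, 1))
```

Both conditional means land near 100 + 0.8 · 15 = 112, whichever variable is conditioned on. The pattern is symmetric, so by itself it says nothing about the direction of causation.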
To get this test result, a person might
Figure axes: IQ in test 1 vs. IQ in test 2; result in test 1 vs. result in test 2
Terry Speed, president of the Biometric Society 1994
total quality management
See also http://bigdata-madesimple.com/30-tweetable-quotes-data-science/
The book of Why https://www.quantamagazine.org/to-build-truly-intelligent-machines-teach-them-cause-and-effect-20180515/
(if we observe a certain number of storks, what is our best estimate of human birth rate?)
(if we change the number of storks, what will happen with the human birth rate?)
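The difference between the two questions can be made concrete with a toy simulation. The following is a sketch under assumptions that are not in the slides: a hypothetical common cause called `rural` drives both stork counts and birth rates, storks have no causal effect on births at all, and every coefficient is made up for illustration.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_slope(intervene, n=50_000, seed=0):
    """Slope of births on storks, either observed or under do(storks)."""
    rng = random.Random(seed)
    storks, births = [], []
    for _ in range(n):
        rural = rng.gauss(0.0, 1.0)  # hypothetical common cause
        if intervene:
            # do(storks): we set the stork count ourselves, ignoring rural.
            s = rng.gauss(0.0, 1.0)
        else:
            # Observation: rural regions attract more storks.
            s = rural + rng.gauss(0.0, 1.0)
        # Births depend on rural only; storks have no causal effect here.
        b = 2.0 * rural + rng.gauss(0.0, 1.0)
        storks.append(s)
        births.append(b)
    return ols_slope(storks, births)


seeing_slope = simulate_slope(intervene=False)
doing_slope = simulate_slope(intervene=True)
print(round(seeing_slope, 2), round(doing_slope, 2))
```

In this toy world the observed slope is clearly positive, so storks are useful for prediction, while the slope under intervention is close to zero, so changing the number of storks changes nothing.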
Image credits: “The Book of Why”
$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$$
Figure axes: size of shoe, salary
# Do not adjust for school
m1 <- lm(academic ~ sport, data = dat)
# Adjust (control) for school
m2 <- lm(academic ~ sport + school, data = dat)
m1 <- lm(academic ~ sport, data = dat)
m2 <- lm(academic ~ sport + school, data = dat)
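How much the two models can disagree is easiest to see on simulated data. The sketch below is written in plain Python rather than R so that it runs without extra packages, and everything in it is an assumption for illustration: two hypothetical schools, a within-school effect of sport on academic score of +1, and a school that has both more sport and a lower baseline score. For a categorical `school`, the sport coefficient of `academic ~ sport + school` equals the slope obtained after centering both variables within each school, which is what `center_by_group` does.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def center_by_group(values, groups):
    """Subtract each group's mean from the values of its members."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


rng = random.Random(1)
school, sport, academic = [], [], []
for i in range(20_000):
    s = i % 2  # hypothetical: two schools of equal size
    hours = 2.0 + 4.0 * s + rng.gauss(0.0, 1.0)  # school 1 does more sport
    score = 70.0 + 1.0 * hours - 10.0 * s + rng.gauss(0.0, 1.0)
    school.append(s)
    sport.append(hours)
    academic.append(score)

# Analogue of m1: academic ~ sport
unadjusted_slope = ols_slope(sport, academic)
# Analogue of the sport coefficient in m2: academic ~ sport + school
adjusted_slope = ols_slope(center_by_group(sport, school),
                           center_by_group(academic, school))
print(round(unadjusted_slope, 2), round(adjusted_slope, 2))
```

With these made-up numbers the unadjusted slope is negative while the adjusted slope recovers the within-school effect of +1. Whether adjusting is the right choice depends on the causal role of the variable, not on the data alone.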
Y ~ X
Y ~ X + M
Red: enzyme works
Blue: enzyme does not work
Y ~ X
Y ~ X + M
y ~ x + C
y ~ x
Y ~ X
Groups: coffee drinker by choice; non-coffee drinker by choice
Since the treatment is assigned randomly, both treatment groups are exchangeable. Hence observed differences of the
RCT through the lens of a causal graphical model
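Why random assignment makes the groups exchangeable can be illustrated by comparing coffee drinkers by choice with a randomized coffee assignment. This is a minimal sketch under assumptions that are not in the slides: a hypothetical unobserved `trait` influences both the choice to drink coffee and the outcome, and the true effect of coffee on the outcome is set to 1.0.

```python
import random


def mean(values):
    return sum(values) / len(values)


def difference_in_means(assign_by_choice, n=50_000, effect=1.0, seed=2):
    """Mean outcome of coffee drinkers minus mean outcome of non-drinkers."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        trait = rng.gauss(0.0, 1.0)  # hypothetical unobserved common cause
        if assign_by_choice:
            # Self-selection: the trait makes drinking coffee more likely.
            coffee = trait + rng.gauss(0.0, 1.0) > 0.0
        else:
            # RCT: a coin flip decides, independent of the trait.
            coffee = rng.random() < 0.5
        outcome = effect * coffee + 2.0 * trait + rng.gauss(0.0, 1.0)
        (treated if coffee else untreated).append(outcome)
    return mean(treated) - mean(untreated)


by_choice = difference_in_means(assign_by_choice=True)
randomized = difference_in_means(assign_by_choice=False)
print(round(by_choice, 2), round(randomized, 2))
```

Under self-selection the difference in means mixes the coffee effect with the effect of the trait. Under randomization the arrow from the trait into the treatment is cut, and the difference is close to the assumed effect of 1.0.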
Figure: causal graph with nodes X, Y, D, E; for each adjusted variable, one of the models y ~ x, y ~ x + C, y ~ x + M, y ~ x + D, y ~ x + E is shown.
'(y | do(X
Figure: causal graph with nodes V1, V2, V3, V4, V5, V6 and V7.
(This ensures that we do not block a causal path from X to Y)
A path is blocked if a single triple segment on it is blocked!
Control for a variable = using the variable in the regression model
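The blocking rule can be sketched in code by checking every triple segment of a path. The graph and the node names below (`U` as a common cause, `M` as a mediator, `K` as a collider, `W` as a descendant of the collider) are hypothetical and serve only to illustrate the three kinds of triples; the rule used for colliders is the standard d-separation rule.

```python
def descendants(node, edges):
    """All nodes reachable from node by following directed edges."""
    found = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def is_triple_blocked(a, b, c, edges, controlled):
    """Decide whether the triple a - b - c is blocked given the controlled set."""
    is_collider = (a, b) in edges and (c, b) in edges
    if is_collider:
        # a -> b <- c is blocked unless b or one of its descendants is controlled.
        return b not in controlled and not (descendants(b, edges) & controlled)
    # Chains and forks are blocked exactly when the middle node is controlled.
    return b in controlled


def is_path_blocked(path, edges, controlled):
    """A path is blocked if at least one of its triple segments is blocked."""
    controlled = set(controlled)
    return any(
        is_triple_blocked(path[i], path[i + 1], path[i + 2], edges, controlled)
        for i in range(len(path) - 2)
    )


# Hypothetical graph: each edge is (parent, child).
edges = {
    ("U", "X"), ("U", "Y"),   # fork: X <- U -> Y
    ("X", "M"), ("M", "Y"),   # chain: X -> M -> Y
    ("X", "K"), ("Y", "K"),   # collider: X -> K <- Y
    ("K", "W"),               # W is a descendant of the collider K
}

print(is_path_blocked(["X", "U", "Y"], edges, controlled=set()))
print(is_path_blocked(["X", "U", "Y"], edges, controlled={"U"}))
print(is_path_blocked(["X", "K", "Y"], edges, controlled=set()))
print(is_path_blocked(["X", "K", "Y"], edges, controlled={"W"}))
```

In this sketch, controlling for `U` blocks the backdoor path, controlling for `M` would block a causal path from X to Y, and controlling for `K` or `W` opens a path that was blocked before.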
Figure: causal graph with nodes V1, V2, V3, V4, V5, V6 and V7.
Figure series: causal graph examples with a marked treatment node.
A controlled parent blocks the backdoor path either as controlled mediator or controlled confounder.
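Written as a formula, controlling for the parents of the treatment corresponds to the standard adjustment for direct causes. The notation below is an assumption, since the slides do not fix one: X is the treatment, Y the outcome, and pa ranges over the value combinations of the parents of X in the graph, all of which must be observed.

```latex
$$P(y \mid do(x)) \;=\; \sum_{pa} P(y \mid x, pa)\, P(pa)$$
```

Every backdoor path leaves X through one of its parents, so conditioning on all parents blocks every backdoor path at its first triple, whether that parent sits in a chain or in a fork.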
Image credits: “The Book of Why”