ICML’2019
Automatic Classifiers as Scientific Instruments: One Step Further Away from Ground-Truth
Jacob Whitehill and Anand Ramakrishnan Worcester Polytechnic Institute (WPI), Massachusetts, USA
instruments, e.g.:
instead of questionnaires.
instead of electromyography.
instead of observational protocols.
[Figures: Empatica E4 (EDA); Emotient/iMotions; Kaur et al. 2018]
between two constructs U and V, e.g.:
from a sample of n participants.
Assume w.l.o.g. that u, v ∈ ℝⁿ have zero mean and unit length.
Only the angle between the two vectors determines their correlation.
r = ρ(u, v) = uᵀv = cos ∠(u, v)
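The identity above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper `standardize` and the synthetic data are hypothetical and only illustrate that the Pearson correlation of two raw measurement vectors equals the dot product (and the cosine of the angle) of their centered, unit-length versions.

```python
import numpy as np

def standardize(x):
    """Center x to zero mean and scale it to unit Euclidean length."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return x / np.linalg.norm(x)

rng = np.random.default_rng(0)
n = 50  # number of participants (illustrative)

# Hypothetical raw measurements of two constructs U and V.
raw_u = rng.normal(size=n)
raw_v = 0.4 * raw_u + rng.normal(size=n)

u, v = standardize(raw_u), standardize(raw_v)

r_dot = float(u @ v)                                  # u^T v
r_pearson = float(np.corrcoef(raw_u, raw_v)[0, 1])    # Pearson correlation
angle = float(np.arccos(np.clip(r_dot, -1.0, 1.0)))   # angle between u and v

print(r_dot, r_pearson, np.cos(angle))
```

All three printed values should agree up to floating-point error.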
stress detector d whose correlation with ground-truth measurements is q (known from prior validation)?
between U and V could result?
(u: ground-truth measurements; û: the detector's estimates)
[Figure: unit vectors u and v drawn in the plane (both axes from −1.0 to 1.0), annotated with their correlation r.]
correlation with u is q.
much larger than, but same sign as, the ground-truth correlation.
[Figure: vectors û, u, and v drawn in the plane (both axes from −1.0 to 1.0), annotated with the correlation q.]
[Figure: vectors û′, û, u, and v drawn in the plane (both axes from −1.0 to 1.0), annotated with the correlation q.]
with u is also q.
this has the opposite sign from the ground-truth correlation.
We call this a false correlation.
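One way to see how the sign can flip is to split the detector output into a part along u and a part orthogonal to it. This is a sketch under the setup above (zero-mean, unit-length vectors, with ρ(û, u) = q and ρ(u, v) = r); the unit vector w is introduced here only for illustration and is not a symbol from the slides.

```latex
\hat{u} = q\,u + \sqrt{1-q^{2}}\;w, \qquad w^{\top}u = 0,\quad \lVert w \rVert = 1
\\[4pt]
\hat{u}^{\top}v = q\,\underbrace{u^{\top}v}_{r} + \sqrt{1-q^{2}}\;w^{\top}v
              = q\,r + \sqrt{1-q^{2}}\;w^{\top}v
```

For q, r > 0, the measured correlation ûᵀv is negative whenever √(1−q²)·wᵀv < −qr, that is, when the part of û unrelated to u happens to point sufficiently against v.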
1. The set of all vectors whose correlation with u is q is an (n−3)-sphere T_n ⊂ ℝⁿ.
2. If the correlation between u and v is r, then the expected sample correlation between û and v, where û is drawn uniformly at random from T_n, is qr.
3. We derive a formula h(n, q, r) for the probability of a false correlation.
4. We show that h is monotonically decreasing in q and n.
But h can still be non-negligible for the values of n and q used in recent affective computing studies, even when the p-value is small.
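The results above can be explored with a small Monte Carlo simulation. This is a sketch under stated assumptions, not the authors' derivation: it assumes NumPy, samples û uniformly from the set of zero-mean, unit-length vectors with correlation q to u (by adding a uniformly random unit direction orthogonal to both u and the all-ones vector), and counts a "false correlation" simply as a sign flip relative to r. The formula h(n, q, r) in the talk may impose further conditions (for example, statistical significance). The names `unit_in_complement` and `simulate` and the parameter values are illustrative.

```python
import numpy as np

def unit_in_complement(g, basis):
    """Remove from g its components along the orthonormal rows of `basis`,
    then scale the remainder to unit length."""
    g = g - basis.T @ (basis @ g)
    return g / np.linalg.norm(g)

def simulate(n, q, r, trials=20000, seed=0):
    """Return (mean correlation of u_hat with v, fraction of sign flips)."""
    rng = np.random.default_rng(seed)
    ones = np.ones(n) / np.sqrt(n)  # unit vector along the mean direction

    # Ground-truth u: zero mean, unit length.
    u = unit_in_complement(rng.normal(size=n), ones[None, :])
    basis = np.stack([ones, u])     # orthonormal rows

    # v: zero mean, unit length, correlation exactly r with u.
    z = unit_in_complement(rng.normal(size=n), basis)
    v = r * u + np.sqrt(1.0 - r**2) * z

    # Uniform unit directions orthogonal to both `ones` and u.
    G = rng.normal(size=(trials, n))
    G = G - (G @ basis.T) @ basis
    W = G / np.linalg.norm(G, axis=1, keepdims=True)

    # Each row is a detector output with correlation exactly q to u.
    U_hat = q * u + np.sqrt(1.0 - q**2) * W

    corr = U_hat @ v                       # correlation of each u_hat with v
    false_rate = float(np.mean(corr * r < 0))  # opposite sign to ground truth
    return float(corr.mean()), false_rate

mean_corr, false_rate = simulate(n=30, q=0.5, r=0.37)
print(f"mean correlation = {mean_corr:.3f} (q*r = {0.5 * 0.37:.3f})")
print(f"sign-flip frequency = {false_rate:.3f}")
```

Rerunning `simulate` with larger n or q should show the sign-flip frequency shrinking, in line with the monotonicity claim.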
U: Engagement V: Cognitive task performance
detector d (q=0.50).
[Figure: "Engagement new: Probability of "false negative" correlation (q = 0.5, r = 0.37)". x-axis: # participants (n), 25 to 200; y-axis: probability, 0.0 to 0.5.]