On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection
Vivian Lai and Chenhao Tan
@vivwylai | @chenhaotan
vivlai.github.io | chenhaot.com
University of Colorado Boulder
https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing
Risk assessment: COMPAS
Most previous studies are concerned with the impact of such tools used in full automation
Judges are required to take account of the algorithm’s limitations in Wisconsin
In the end, though, Justice Bradley allowed sentencing judges to use Compas. They must take account of the algorithm's limitations and the secrecy surrounding it, she wrote, but she said the software could be helpful “in providing the sentencing court with as much information as possible in order to arrive at an individualized sentence.”
https://www.nytimes.com/2017/05/01/us/politics/sent-to-prison-by-a-software-programs-secret-algorithms.html
Full automation is not desired
How do judges make decisions with COMPAS?
How do humans make decisions with machine assistance in challenging tasks?
Full human agency | Full automation
Showing machine predicted labels
Showing machine predicted labels and explanations
Showing machine predicted labels and suggesting high accuracy
Showing only explanations (by highlighting salient information)
A spectrum between full human agency and full automation
Deception Detection as a Case Study
Machine accuracy: 87% | Human accuracy: ~50%
I would not stay at this hotel again. The rooms had a fowl odor. It seemed as though the carpets have never been cleaned. The neighborhood was also less than desirable. The housekeepers seemed to be snooping around while they were cleaning the rooms. I will say that the front desk staff was friendly albeit slightly dimwitted.
The machine predicts that the review below is deceptive:
I would not stay at this hotel again. The rooms had a fowl odor. It seemed as though the carpets have never been cleaned. The neighborhood was also less than desirable. The housekeepers seemed to be snooping around while they were cleaning the rooms. I will say that the front desk staff was friendly albeit slightly dimwitted.
Can explanations alone improve human performance?
Explanations alone slightly improve human performance
Accuracy (%): Machine 87%, Heatmap 57.6%, Highlight 55.9%, Examples 54.4%, Control 51.1%
Reported p-values: p=0.006, p<0.001, p=0.056
Predicted labels > explanations
Stating the machine's accuracy explicitly improves human performance drastically
Accuracy (%): Machine 87%, Predicted label with accuracy 74.6%, Predicted label without accuracy 61.9%, Heatmap 57.6%, Control 51.1%
Reported p-values: p<0.001, p<0.001, p<0.001
Tradeoff between human performance and human agency
Higher agency, lower performance | Lower agency, higher performance
Can explanations moderate this tradeoff?
Predicted labels + explanations ≈ explicit accuracy
Accuracy (%): Machine 87%, Predicted label with accuracy 74.6%, Predicted label & heatmap 72.5%, Predicted label without accuracy 61.9%
Reported p-values: p<0.001, p<0.001
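To make the comparison across conditions easier to read, the accuracies reported on the slides can be put side by side. The sketch below is only an illustrative summary: the numbers are copied from the slides, while the helper names `gain_over_control` and `gap_closed` and the "share of the control-to-machine gap" measure are assumptions introduced here, not something the presentation defines.

```python
# Accuracy (%) per condition, copied from the slides.
ACCURACY = {
    "Control": 51.1,
    "Examples": 54.4,
    "Highlight": 55.9,
    "Heatmap": 57.6,
    "Predicted label without accuracy": 61.9,
    "Predicted label & heatmap": 72.5,
    "Predicted label with accuracy": 74.6,
}
MACHINE = 87.0  # machine accuracy (%) reported on the slides


def gain_over_control(condition):
    """Percentage-point improvement of a condition over the control group."""
    return round(ACCURACY[condition] - ACCURACY["Control"], 1)


def gap_closed(condition):
    """Fraction of the control-to-machine accuracy gap that a condition recovers.

    This is a hypothetical summary measure, not one used in the slides.
    """
    return (ACCURACY[condition] - ACCURACY["Control"]) / (MACHINE - ACCURACY["Control"])


for name in ACCURACY:
    print(f"{name:34s} +{gain_over_control(name):4.1f} pts  {gap_closed(name):.0%} of gap")
```

Read this way, the explanation-only conditions recover a small part of the gap to the machine, while the conditions that show a predicted label recover considerably more.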
How much do humans trust the predictions?
Explanations help increase human trust in predictions
Trust (%): Predicted label with accuracy 79.6%, Predicted label & heatmap 78.7%, Predicted label without accuracy 64.4%
Reported p-values: p<0.001, p<0.001
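The slides do not say how the trust percentage is computed. One plausible reading, assumed here, is the share of decisions in which a participant's label matches the machine's predicted label. The sketch below illustrates that assumed definition; the function name `trust_rate` and the toy labels are hypothetical and are not data from the study.

```python
def trust_rate(human_labels, machine_labels):
    """Percentage of decisions where the human label matches the machine label.

    Assumed definition of "trust"; the slides do not specify the measure.
    """
    if len(human_labels) != len(machine_labels):
        raise ValueError("label lists must have the same length")
    if not human_labels:
        raise ValueError("at least one decision is required")
    agreements = sum(h == m for h, m in zip(human_labels, machine_labels))
    return 100.0 * agreements / len(human_labels)


# Hypothetical toy data: five reviews, with the human agreeing on four of them.
machine = ["deceptive", "genuine", "deceptive", "genuine", "deceptive"]
human = ["deceptive", "genuine", "genuine", "genuine", "deceptive"]
print(trust_rate(human, machine))  # 80.0
```

Under this definition, trust counts agreement with the machine whether or not the machine is right, so it is a separate quantity from accuracy.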