Why is Chicago deceptive?: Towards Building Model-Driven Tutorials for Humans

Vivian Lai, Han Liu, and Chenhao Tan (@vivwylai | @HanLiuAI | @ChenhaoTan), University of Colorado Boulder, machineintheloop.com


  1. “Why is ‘Chicago’ deceptive?”: Towards Building Model-Driven Tutorials for Humans. Vivian Lai, Han Liu, and Chenhao Tan (@vivwylai | @HanLiuAI | @ChenhaoTan), University of Colorado Boulder, machineintheloop.com

  2. AI used in societally critical tasks: recidivism prediction, medical diagnosis, Amazon secret AI hiring tool, autonomous driving (Geiger et al. 2012; European Parliament 2016; Kleinberg et al. 2017; Dastin 2018)

  4. Explanations!

  5. Explaining AI is tricky

  6. Why is explaining AI tricky? Two distinct learning modes: emulating and discovering.

  7. Why is explaining AI tricky? Two distinct learning modes: emulating.

  8. Why is explaining AI tricky? Two distinct learning modes: discovering.

  9. Why is explaining AI tricky? Two distinct learning modes: discovering. AI can discover inconspicuous and counterintuitive patterns.

  10. So, how can explaining AI be less tricky? Model-driven tutorials: • Elucidate counterintuitive patterns • Enhance humans' ability to understand patterns

  11. Model-driven tutorials: Guidelines. State-of-the-art science communication.

  12. Model-driven tutorials: Examples. How do we choose examples? • SP-LIME (Ribeiro et al. 2016) • Spaced repetition

  13. Model-driven tutorials: Examples. How do we choose examples? • SP-LIME (Ribeiro et al. 2016) • Spaced repetition

  14. Experimental Design & Research Questions. RQ1: Effect of different tutorials. Phases: Training, Prediction.

  15. Experimental Design & Research Questions. RQ1: Effect of different tutorials. Training: different tutorials. Prediction: no assistance.

  16. Experimental Design & Research Questions. Training: RQ1 (effect of different tutorials).

  17. Experimental Design & Research Questions. RQ2: Effect of real-time assistance. Training: same tutorial (spaced repetition). Prediction: different real-time assistance.

  18. Experimental Design & Research Questions. Training: RQ1 (effect of different tutorials). Prediction: RQ2 (effect of real-time assistance).

  19. Experimental Design & Research Questions. RQ1 & RQ2: linear model. RQ3: deep model.

  20. Experimental Design & Research Questions. Training: RQ1 (effect of different tutorials). Prediction: RQ2 (effect of real-time assistance). RQ3: Effect of model complexity.

  21. Experimental Design & Research Questions. Training: RQ1 (effect of different tutorials). Prediction: RQ2 (effect of real-time assistance). RQ3: Effect of model complexity. Performed qualitative study to improve interface design.

  22. Research question 1: Can model-driven tutorials improve human performance without any real-time assistance in the prediction phase? (Diagram: model-driven tutorials in Training; human accuracy? in Prediction.)

  23. Tutorials are useful to some extent. Accuracy (%): Control 54.6%; Guidelines 60.4%; Spaced repetition 57.9%; Spaced repetition + guidelines 59.2%. Annotated p-values: p=0.018*, p=0.1 (# of stars indicates p-values: *** p < 0.001, ** p < 0.01, * p < 0.05).

  24. Tutorials are useful to some extent. Accuracy (%): Control 54.6%; Guidelines 60.4%; Spaced repetition 57.9%; Spaced repetition + guidelines 59.2%. Annotated p-values: p=0.018*, p=0.1 (# of stars indicates p-values: *** p < 0.001, ** p < 0.01, * p < 0.05). Participant quote: “The tutorial is helpful but it’s just hard not being able to reference it.”

  25. Research question 2: If not, how do varying levels of real-time assistance in the prediction phase affect human performance after training? (Spectrum: full human agency to full automation.)

  26. Prediction: various levels of real-time assistance, from full human agency to full automation: Unsigned explanations; Signed explanations; Signed explanations + predicted label; Signed explanations + predicted label + guidelines; Signed explanations + predicted label + guidelines + accuracy statement. Information from AI increases from left to right.

  27. Prediction: various levels of real-time assistance, from full human agency to full automation: Unsigned explanations; Signed explanations; Signed explanations + predicted label; Signed explanations + predicted label + guidelines; Signed explanations + predicted label + guidelines + accuracy statement.

  28. Unsigned explanations; signed explanations.

  29. Real-time assistance improves performance. Accuracy (%): No assistance 60.4%; Unsigned 57.8%; Signed 70.7%; Signed + predicted label + guidelines + accuracy 74%; Machine 86%. Annotated p-values: p=0.001***, p=0.001*** (# of stars indicates p-values: *** p < 0.001, ** p < 0.01, * p < 0.05).

  30. Signed highlights are sufficient. Accuracy (%): No assistance 60.4%; Unsigned 57.8%; Signed 70.7%; Signed + predicted label + guidelines + accuracy 74%; Machine 86%. Annotated p-value: p>0.05 (# of stars indicates p-values: *** p < 0.001, ** p < 0.01, * p < 0.05).

  31. Gap between human+AI & AI. Accuracy (%): No assistance 60.4%; Unsigned 57.8%; Signed 70.7%; Signed + predicted label + guidelines + accuracy 74%; Machine 86%. (Poursabzi-Sangdeh et al. 2018; Green & Chen 2019; Lage et al. 2019; Lai & Tan 2019; Carton et al. 2020; Lai et al. 2020)

  32. Research question 3: Can our results generalize to other models? How do model complexity and explanation methods affect human performance with/without training? (Simple model vs. deep model.)

  33. SVM explanations; BERT attention explanations.

  34. SVM explanations; BERT LIME explanations.

  35. Simple model = better human performance. Accuracy (%), training vs. no training: SVM 72.8% vs. 64.1%; BERT-ATT 58.2% vs. 54.1%; BERT-LIME 64.9% vs. 59.2%. Annotated p-values: p=0.001***, p=0.001*** (# of stars indicates p-values: *** p < 0.001, ** p < 0.01, * p < 0.05). (Lai et al. 2019)

  36. Simple model = better human performance. Accuracy (%), training vs. no training: SVM 72.8% vs. 64.1%; BERT-ATT 58.2% vs. 54.1%; BERT-LIME 64.9% vs. 59.2%. Annotated p-values: p=0.001***, p=0.001***.

  37. Training leads to better performance. Accuracy (%), training vs. no training: SVM 72.8% vs. 64.1%; BERT-ATT 58.2% vs. 54.1%; BERT-LIME 64.9% vs. 59.2%. Annotated p-values: p=0.001***, p=0.001***, p=0.001*** (# of stars indicates p-values: *** p < 0.001, ** p < 0.01, * p < 0.05).

  38. Takeaway: • Tutorials somewhat improve human performance • Explanations from simple models are preferred • Future directions for human-centered tutorials and explanations. Vivian Lai, Han Liu, Chenhao Tan (@vivwylai | vivwylai@gmail.com | @HanLiuAI | @ChenhaoTan), University of Colorado Boulder. Website: machineintheloop.com. Paper: https://tinyurl.com/model-driven-tutorials. Workshop: https://tinyurl.com/harness-explanations
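
Slides 12 and 13 name SP-LIME (Ribeiro et al. 2016) as one way to choose tutorial examples. The sketch below illustrates the greedy "submodular pick" idea behind SP-LIME: given a matrix of per-example explanation weights, repeatedly pick the example that covers the most not-yet-covered important features. It is a minimal illustration and not the authors' code; the function name `submodular_pick`, the toy matrix `W`, and the budget are assumptions made for the example.

```python
import numpy as np

def submodular_pick(W, budget):
    """Greedy SP-LIME-style selection (illustrative sketch).

    W: (n_examples, n_features) explanation weights, e.g. one row of
       per-feature weights per explained example (hypothetical input).
    budget: number of examples to select.
    Returns the indices of the selected examples, in pick order.
    """
    W = np.abs(np.asarray(W, dtype=float))
    # Global importance of each feature: sqrt of its total absolute weight.
    importance = np.sqrt(W.sum(axis=0))
    covered = np.zeros(W.shape[1], dtype=bool)
    chosen = []
    for _ in range(min(budget, W.shape[0])):
        best_gain, best_i = -1.0, None
        for i in range(W.shape[0]):
            if i in chosen:
                continue
            # Features this example would newly cover.
            newly = (W[i] > 0) & ~covered
            gain = importance[newly].sum()
            if gain > best_gain:
                best_gain, best_i = gain, i
        chosen.append(best_i)
        covered |= W[best_i] > 0
    return chosen

# Toy explanation matrix: 4 examples, 3 features (made-up numbers).
W = [[1, 0, 0],
     [0, 2, 0],
     [1, 2, 0],
     [0, 0, 0.5]]
print(submodular_pick(W, budget=2))  # [2, 3]
```

Example 2 is picked first because it covers the two most important features at once; example 3 follows because it is the only one that adds a new feature.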

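Slides 26 to 30 contrast unsigned and signed explanations. One plausible reading, sketched below, is that unsigned highlights mark only which words matter most, while signed highlights also show which label each word pushes toward. The word weights, the label names ("deceptive" and "genuine"), and the helper `highlight` are hypothetical and chosen for illustration; they are not taken from the talk.

```python
def highlight(tokens, weights, top_k=3, signed=True):
    """Mark the top_k tokens by absolute weight (illustrative sketch).

    tokens: list of words in a review.
    weights: dict mapping lowercase word -> model weight; here a positive
             weight is ASSUMED to push toward the "deceptive" label.
    signed: if True, report the direction of each highlighted word;
            if False, report only that the word is highlighted.
    """
    scores = [weights.get(t.lower(), 0.0) for t in tokens]
    ranked = sorted(range(len(tokens)), key=lambda i: abs(scores[i]), reverse=True)
    top = {i for i in ranked[:top_k] if scores[i] != 0.0}
    out = []
    for i, tok in enumerate(tokens):
        if i not in top:
            out.append((tok, None))
        elif not signed:
            out.append((tok, "highlight"))
        else:
            out.append((tok, "deceptive" if scores[i] > 0 else "genuine"))
    return out

# Made-up weights for a made-up sentence.
weights = {"chicago": 1.2, "my": 0.8, "location": -0.9, "husband": 0.3}
tokens = ["My", "husband", "loved", "the", "Chicago", "location"]

print(highlight(tokens, weights, signed=False))
print(highlight(tokens, weights, signed=True))
```

Both calls highlight the same three words; only the signed version tells the reader which direction each word points.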