SLIDE 1

Guidelines for Intelligent Interfaces

Daniel Weld University of Washington

Acknowledgements

Krzysztof Gajos
Corin Anderson
Mary Czerwinski
Pedro Domingos
Oren Etzioni
Raphael Hoffman
Tessa Lau
Desney Tan
Steve Wolfman
UW AI Group
DARPA, NSF, ONR, WRF, Microsoft Research

14-Mar-19 Daniel S. Weld / Univ. Washington 12

SLIDE 2

Early Adaptation: Mitchell, Maes

Predict:

Email message priorities
Meeting locations, durations

Principle 1:

Defaults minimize cost of errors

Principle 2:

Allow users to adjust thresholds
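
A minimal sketch of how Principles 1 and 2 might combine in code. The slides do not give an implementation, so the names `Prediction` and `choose_label`, the fallback label, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str         # e.g. the predicted priority of an email message
    confidence: float  # model's estimated probability that the label is right


def choose_label(pred: Prediction,
                 threshold: float = 0.8,        # hypothetical default value
                 safe_default: str = "normal"   # hypothetical low-cost fallback
                 ) -> str:
    """Use the prediction only when the model is confident enough.

    Below the threshold, fall back to a default whose errors are cheap
    (Principle 1). The threshold is a parameter so that a user-facing
    setting can adjust it (Principle 2).
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0, 1]")
    return pred.label if pred.confidence >= threshold else safe_default
```

Lowering `threshold` makes the interface act on its predictions more often; raising it makes the interface fall back to the default more often.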

Adaptation in Lookout: Horvitz

Adapted from Horvitz

SLIDE 3

Resulting Principles

Decision-Theoretic Framework
Graceful degradation of service precision
Use dialogs to disambiguate

(Considering cost of user time, attention)

Adapted from Horvitz

[Horvitz CHI-99]
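
The decision-theoretic idea can be sketched as a one-step expected-utility comparison between acting, asking the user through a dialog, and doing nothing. This is an illustrative sketch only: the utility values and the names `UTILITIES`, `expected_utility`, and `best_action` are assumptions, not taken from the slides or from the cited paper.

```python
# Hypothetical utilities per action: (utility if the user has the goal,
# utility if the user does not). The numbers are illustrative only and
# are meant to fold in the cost of the user's time and attention.
UTILITIES = {
    "act":        (1.0, -1.0),   # helpful if wanted, disruptive if not
    "ask":        (0.6, -0.2),   # a dialog costs some attention either way
    "do_nothing": (-0.3, 0.0),   # missed opportunity if the goal was present
}


def expected_utility(p_goal: float, u_if_goal: float, u_if_no_goal: float) -> float:
    """Expected utility of one action given P(user has the goal)."""
    return p_goal * u_if_goal + (1.0 - p_goal) * u_if_no_goal


def best_action(p_goal: float) -> str:
    """Pick the action with the highest one-step expected utility."""
    return max(UTILITIES, key=lambda a: expected_utility(p_goal, *UTILITIES[a]))


for p in (0.9, 0.5, 0.1):
    print(p, best_action(p))
```

With these assumed numbers, a high goal probability favors acting, an intermediate one favors a clarifying dialog, and a low one favors doing nothing. The choice is made one step at a time, with no lookahead.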

SLIDE 4

Horvitz <-> POMDP?

What’s Shared?
  Policy mapping from belief state to action
  Idea of maximizing utility
What’s Different?
  No model of state transition
  No lookahead or notion of time
  Greedy policy

Principles About Invocation

Allow efficient invocation, correction & dismissal

Timeouts minimize cost of prediction errors

SLIDE 5

20 Year Retrospective

More guidelines
https://medium.com/microsoft-design/guidelines-for-human-ai-interaction-9aa1535d72b9

Human-AI Teams

Environment gives percept
AI makes recommendation [+ explanation]
Human decides whether to
  Trust AI’s advice, or
  Get more info and decide herself
Reward based on speed/accuracy
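
One round of this interaction can be sketched as a small reward function. The slides only say that reward depends on speed and accuracy, so the function name `team_reward` and the payoff values (`reward_correct`, `penalty_wrong`, `checking_cost`) are hypothetical.

```python
def team_reward(ai_correct: bool,
                human_trusts: bool,
                human_correct_when_checking: bool = True,
                reward_correct: float = 1.0,    # hypothetical payoff
                penalty_wrong: float = -1.0,    # hypothetical payoff
                checking_cost: float = 0.3      # hypothetical cost of lost speed
                ) -> float:
    """Reward for one round of the human-AI team.

    If the human trusts the AI, the outcome is whatever the AI recommended.
    If she gathers more information and decides herself, the outcome depends
    on her own decision, minus a cost for the extra time spent.
    """
    if human_trusts:
        return reward_correct if ai_correct else penalty_wrong
    outcome = reward_correct if human_correct_when_checking else penalty_wrong
    return outcome - checking_cost
```

Under these assumed payoffs, trusting pays off most when the AI is right, while checking is the better choice when the AI is wrong, so the human benefits from knowing when the AI tends to err.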

SLIDE 6

Updates in Human-AI Teams

Environment gives percept
AI makes recommendation [+ explanation]
Human decides whether to
  Trust AI’s advice, or
  Get more info and decide herself
Reward based on speed/accuracy

Gagan Bansal, Ece Kamar, Besa Nushi, Eric Horvitz, Walter Lasecki [Bansal et al. AAAI19]

SLIDE 7

Many ML Algorithms aren’t Stable wrt Updates

When trained on more data (same distribution)…

Updates (h2) increase ROC,
but have a low compatibility score (CS)

Classifier  Dataset      ROC h1  ROC h2  CS
LR          Recidivism   0.68    0.72    0.74
LR          Credit Risk  0.72    0.77    0.68
LR          Mortality    0.68    0.77    0.54
MLP         Recidivism   0.59    0.73    0.62
MLP         Credit Risk  0.70    0.80    0.69
MLP         Mortality    0.71    0.84    0.77

Compatibility score (CS):

\[ C(h_1, h_2) = 1 - \frac{\mathrm{count}(h_1 = y,\ h_2 \neq y)}{\mathrm{count}(h_2 \neq y)} \]
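
The compatibility score C(h1, h2) given on this slide can be computed directly from the two models' predictions. This is a minimal sketch: the function name `compatibility_score` is hypothetical, and returning 1.0 when h2 makes no errors is an assumption, since the formula is undefined in that case.

```python
from typing import Sequence


def compatibility_score(y_true: Sequence, h1_pred: Sequence, h2_pred: Sequence) -> float:
    """C(h1, h2) = 1 - count(h1 = y, h2 != y) / count(h2 != y).

    The numerator counts "new" errors: examples the old model h1 got right
    but the updated model h2 gets wrong. The denominator counts all of
    h2's errors.
    """
    if not (len(y_true) == len(h1_pred) == len(h2_pred)):
        raise ValueError("all inputs must have the same length")
    new_errors = sum(1 for y, a, b in zip(y_true, h1_pred, h2_pred)
                     if a == y and b != y)
    h2_errors = sum(1 for y, b in zip(y_true, h2_pred) if b != y)
    if h2_errors == 0:
        return 1.0  # assumption: an error-free update counts as fully compatible
    return 1.0 - new_errors / h2_errors


y  = [1, 0, 1, 1, 0]
h1 = [1, 0, 0, 1, 0]   # old model: wrong only on index 2
h2 = [1, 0, 0, 1, 1]   # update: wrong on index 2 (old error) and index 4 (new error)
print(compatibility_score(y, h1, h2))  # 0.5: one of h2's two errors is new
```

A score of 1 means every error of h2 was already an error of h1, so the user's experience of where the AI fails still applies. A score of 0 means all of h2's errors are new.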

SLIDE 8

But for Teams, Updates should be Compatible

[Figure: plot of team performance over time]