THE ISSUE OF BIAS
TRADEOFFS AND BALANCE IN ML
Prof. dr. Mireille Hildebrandt
Interfacing Law & Technology, Vrije Universiteit Brussel
Smart Environments, Data Protection & the Rule of Law, Radboud University
1. Three Types of Bias
   1. inherent bias
   2. bias as unfairness
   3. bias on prohibited grounds
2. […]le Transparency
3. Automated Decisions
4. Purpose
5. GDPR
17/11/16 Hildebrandt's KNUT MEMORIAL LECTURE 2016 2
8/12/2016 Hildebrandt - NISP ML and the LAW 3
■ three types of bias:
■ bias inherent in any action-perception-system (APS)
– Thomas Nagel’s ‘Seeing like a bat’
– the salience of the output of the APS depends on the agent & the environment
– perception is a means to anticipate the consequences of action: ‘enaction’
– there is no such thing as objective neutrality, but
– this does not imply that anything goes
– on the contrary: life and death may depend on getting it ‘right’
■ ML is about
– choosing and pruning relevant, correct and sufficiently complete training sets
– developing and training the right algorithm to detect the right mathematical function
– ML is based on a productive bias, cp. Hume as well as Gadamer
– optimization
– there are always trade-offs!
– reliability depends on the extent to which the future confirms the past
– David Wolpert’s no free lunch theorem should inform our assessment
27 October '16 Robolegal: paralegal or toplawyer? 7
Hume, Gadamer, Wolpert: no free lunch theorem
Where d = training set; f = ‘target’ input-output relationships; h = hypothesis (the algorithm's guess for f made in response to d); and C = off-training-set ‘loss’ associated with f and h (‘generalization error’)
How well you do is determined by how ‘aligned’ your learning algorithm P(h|d) is with the actual posterior, P(f|d).
Check http://www.no-free-lunch.org
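As a rough illustration of the no free lunch point, the following Python sketch enumerates every possible target f on a tiny input space and compares three very different learners. The eight-point input space, the training/off-training split and the three learners are assumptions made for this illustration only; they are not taken from the lecture or from Wolpert's papers.

```python
from itertools import product

# Assumed toy setting: 8 input points with binary labels.
POINTS = list(range(8))
TRAIN = POINTS[:5]      # points whose labels the learner sees (d)
OFF_TRAIN = POINTS[5:]  # points used to measure generalization error (C)


def majority_learner(train_labels):
    """Predict the most common training label everywhere (ties -> 1)."""
    guess = 1 if 2 * sum(train_labels) >= len(train_labels) else 0
    return lambda x: guess


def minority_learner(train_labels):
    """Deliberately predict the least common training label."""
    guess = 0 if 2 * sum(train_labels) >= len(train_labels) else 1
    return lambda x: guess


def constant_zero_learner(train_labels):
    """Ignore the data and always predict 0."""
    return lambda x: 0


def average_off_training_error(learner):
    """Average 0/1 loss on OFF_TRAIN over all 2**8 possible targets f."""
    total_errors = 0
    n_targets = 0
    for labels in product([0, 1], repeat=len(POINTS)):
        f = dict(zip(POINTS, labels))       # one possible target
        h = learner([f[x] for x in TRAIN])  # hypothesis h given d
        total_errors += sum(h(x) != f[x] for x in OFF_TRAIN)
        n_targets += 1
    return total_errors / (n_targets * len(OFF_TRAIN))


nfl_results = {
    learner.__name__: average_off_training_error(learner)
    for learner in (majority_learner, minority_learner, constant_zero_learner)
}
print(nfl_results)
```

Averaged uniformly over all possible targets, each learner is wrong on half of the unseen points. A learner only does better than that when its bias happens to match the targets that actually occur.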
Hume, Gadamer, Wolpert: no free lunch theorem
Implications:
– The bias that is necessary to mine the data will co-determine the results
– This relates to the fact that the data used to train an algorithm is finite
– ‘Reality’, whatever that is, escapes the inherent reduction
– Data is not the same as what it refers to or what it is a trace of
8 July 2016 Privacy Hub Summerschool 9
“We shall see that most current theory of machine learning rests on the crucial assumption that the distribution of training examples is identical to the distribution of test examples. Despite our need to make this assumption in order to obtain theoretical results, it is important to keep in mind that this assumption must often be violated in practice.” Tom Mitchell
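A minimal Python sketch of what a violation of this assumption can look like: a threshold classifier is fitted on data that follows one rule and is then scored on data that follows a different rule. The one-dimensional data, both labelling rules and the helper names are invented for the illustration.

```python
def fit_threshold(xs, ys):
    """Learn a 1-D threshold halfway between the two classes."""
    highest_negative = max(x for x, y in zip(xs, ys) if y == 0)
    lowest_positive = min(x for x, y in zip(xs, ys) if y == 1)
    return (highest_negative + lowest_positive) / 2


def accuracy(threshold, xs, ys):
    """Share of points where 'x > threshold' matches the given label."""
    predictions = [1 if x > threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(predictions, ys)) / len(xs)


# Training data: the assumed rule is "label 1 when x > 0".
train_xs = [i / 10 for i in range(-20, 21)]
train_ys = [1 if x > 0 else 0 for x in train_xs]
threshold = fit_threshold(train_xs, train_ys)

# Test labels A follow the same rule as the training data.
test_xs = [i / 10 + 0.07 for i in range(-20, 20)]
same_ys = [1 if x > 0 else 0 for x in test_xs]

# Test labels B: the world has changed, the rule is now "label 1 when x > 1".
shifted_ys = [1 if x > 1 else 0 for x in test_xs]

acc_same = accuracy(threshold, test_xs, same_ys)
acc_shifted = accuracy(threshold, test_xs, shifted_ys)
print(f"threshold={threshold:.2f} same={acc_same:.2f} shifted={acc_shifted:.2f}")
```

The fitted model is unchanged between the two evaluations; only the relation between inputs and labels differs. That is why test performance measured at training time says little once the assumption fails.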
Michael Veale: i. ‘the common assumption that future populations are not functions of past decisions is often violated in the public sector;’
■ actually, present futures do co-determine the future present
– predictions influence the move from training to test set
– they change the probability and the hypothesis space
– they enlarge both uncertainty and possibility
■ the point is about the distribution of both: who gets how much of what
– this depends on who gets to act on the output
– if machines define a situation as real it is real in its consequences
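The feedback effect described above can be sketched with a deliberately crude Python simulation. Two districts are assumed to have the same true incident rate, but incidents are only recorded where attention is sent, and attention is sent wherever the record is longest. The scenario, the numbers and the function name are hypothetical and serve only to show the mechanism.

```python
def simulate_feedback(rounds, recorded, incidents_per_round=5):
    """Send all attention to whichever district has more recorded incidents.

    Both districts are assumed to have the same true incident rate, but an
    incident is only recorded where attention is directed.
    """
    recorded = list(recorded)
    history = [tuple(recorded)]
    for _ in range(rounds):
        target = 0 if recorded[0] >= recorded[1] else 1  # the "prediction"
        recorded[target] += incidents_per_round          # only observed here
        history.append(tuple(recorded))
    return history


feedback_history = simulate_feedback(rounds=10, recorded=[11, 10])
for step, counts in enumerate(feedback_history):
    print(step, counts)
```

A one-incident difference in the initial record grows into a large gap, although nothing distinguishes the districts in the assumed reality. The prediction produces the data that later appears to confirm it.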
■ ML involves a training set, algorithms, a test set
– whether supervised, reinforced or unsupervised
■ trade-offs are inevitable:
– choice of training & test set: size, relevance, accuracy, completeness
– choice of learning algorithms: clustering, decision tree, deep learning, random forests, back propagation, linear regression etc.
– speed of output (e.g. real-time)
– accuracy of predictions
– outlier detection
■ N=All is humbug, though it may apply in a specific sense under certain conditions
■ suppose:
– experts train algorithms on relevant data sets
– and keep on testing the output (reinforcement learning)
– until the system does very well (e.g. Zeb, student paper grading, legal intelligence)
– and the experts get bored and do other things (semiotic desensitization)?
– while the systems start feeding increasingly on each other’s output
■ who can test whether the system is still doing well 2 years later?
■ e.g. medical diagnosis, legal intelligence, critical infrastructure
■ what is ‘doing well’?
■ who gets to determine what it means to ‘do well’?
■ so, replacement is high risk high gain in terms of functionality, fairness and our ability to cognize our environment – as this cognition is mediated by ML systems
■ APoJ used as a means to provide feedback to lawyers, clients, prosecutors, courts
■ APoJ could involve a sensitivity analysis, modulating facts, legal precepts, claims
■ APoJ as a domain for experimentation, developing new insights, argumentation patterns, testing alternative approaches
■ APoJ could detect missing information (facts, legal arguments), helping to improve (instead of merely predict) the outcome of cases
■ APoJ can be used to improve the acuity of human judgment, if not used to replace it
■ if APoJ is used to replace, it should not be confused with law; then it becomes administration – the difference is crucial, critical and pertinent
■ cp. http://www.vikparuchuri.com/blog/on-the-automated-scoring-of-essays/
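The sensitivity analysis mentioned above could, in its simplest form, look like the following Python sketch: modulate one fact at a time and see whether the predicted outcome changes. The scoring function is a hypothetical stand-in for a trained predictor; the feature names, weights and threshold are invented and do not describe any actual APoJ system.

```python
# Hypothetical stand-in for a trained outcome predictor: a weighted vote
# over case features. Feature names and weights are invented for illustration.
WEIGHTS = {
    "written_contract": 2.0,
    "prior_warning_given": 1.0,
    "claim_filed_in_time": 3.0,
    "witness_available": 0.5,
}
DECISION_THRESHOLD = 3.5


def predict_outcome(case):
    """Return True if the toy model predicts that the claim succeeds."""
    score = sum(WEIGHTS[name] for name, present in case.items() if present)
    return score >= DECISION_THRESHOLD


def sensitivity_analysis(case):
    """Flip one feature at a time and report which flips change the prediction."""
    baseline = predict_outcome(case)
    decisive = []
    for name in case:
        variant = dict(case)
        variant[name] = not case[name]
        if predict_outcome(variant) != baseline:
            decisive.append(name)
    return baseline, decisive


example_case = {
    "written_contract": True,
    "prior_warning_given": False,
    "claim_filed_in_time": True,
    "witness_available": False,
}
baseline, decisive_features = sensitivity_analysis(example_case)
print(baseline, decisive_features)
```

Used this way, the output is feedback for a lawyer (which facts the prediction hinges on) rather than a replacement for the legal judgment itself.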
■ bias that some would qualify as unfair
– this is a matter of ethics
– we may not agree about goals (values), means (nudging, forcing, negotiating) or evaluation:
– deontological? utilitarian? virtue ethics? pragmatarian?
– that is why we need law
■ bias that discriminates on the basis of prohibited legal grounds
– this is unlawful and can result in legal redress:
– fines, tort liability, compensation
– invalidation of contracts or legislation
■ explanation, interpretability: if you cannot test it you cannot contest it
– flesh out the productive bias that ensures functionality: test & contest
– figure out the unfairness in the training set & the algos: test & contest
– infer discrimination on prohibited legal grounds: test & contest
■ explanation, interpretability: if you cannot test it you cannot contest it:
1. deliberate concealment: trade secrets, IP rights and public security
2. we are not wired for understanding statistics, ML or cyber-physical infrastructures
3. mismatch between high dimensional math and meaning attribution
■ software verification: mathematical (intestines) & empirical (input-output)
■ ‘softwire’ verification: real life implications, safety and reliability issues
■ explorative experiment, a posteriori control (Schiaffonati & Amigoni), A/B testing
■ pTA: citizens’ juries, participatory social science research, Wynne’s public understanding of science, Stirling’s matrix of uncertitude
■ need for agonistic discourse (Rip in STS, Mouffe in political theory)
■ current choice architecture of AI:
■ ML, IoT, AI are meant to pre-empt intent
■ to run smoothly under the radar of everyday life
■ it is all about continuous surreptitious automated decisions
= choice architecture for data subjects (EU legislation)
1. the right not to be subject to automated decisions that have a significant impact
2. the right to a notification, an explanation and anticipation if an exception applies
3. the right to object against profiling based on legitimate interest of the controller
= choice architecture for data subjects:
1. the right not to be subject to automated decisions that have a significant impact, unless
   a. necessary for contract
   b. authorised by EU or MS law
   c. explicit consent
■ under a and c: right to human intervention, possibility to contest
■ prohibition to make such decisions based on sensitive data
= choice architecture for data subjects:
2. the right to a notification, an explanation and anticipation if an exception applies
– existence of decisions based on profiling
– meaningful explanation of the logic involved
– significance and envisaged consequences of such processing
= choice architecture for data subjects:
3. the right to object against profiling when based on interests of the controller
– at any time against profiling for direct marketing
– or on grounds relating to their particular situation
■ individual citizens need:
– the capability to reinvent themselves,
– segregate their data-driven audiences,
– have their human dignity respected by the data-driven infrastructures
– make sure ML applications don’t tell on them beyond what is necessary
– the capability to detect and contest bias in their data-driven environments
■ the architects of our new data-driven world need to mind:
– integrity of method: rigorously sound and contestable methodologies
– accountability: (con)testability of both data sets and algorithms
– fairness: testing bias in the training set, testing bias in the learning algorithm
– privacy & data protection: reduce manipulability, go for participation and respect
■ law should enable (not force) companies to act ethically (Montesquieu)
■ need to create a level playing field that puts a threshold in the market
■ to the extent that one cannot give reasons for an automated decision, it can be contested
■ GDPR enforcement:
– fines of up to 4% global turnover
– investigative powers of DPAs, including access to any premises, data processing equipment and means