GETTING DATA PROTECTION RIGHT Prof. dr. Mireille Hildebrandt Interfacing Law & Technology Vrije Universiteit Brussel Smart Environments, Data Protection & the Rule of Law Radboud University 21/2/17 Hildebrandt SNS seminar Stockholm


slide-1
SLIDE 1

GETTING DATA PROTECTION RIGHT

Prof. dr. Mireille Hildebrandt
Interfacing Law & Technology, Vrije Universiteit Brussel
Smart Environments, Data Protection & the Rule of Law, Radboud University

slide-2
SLIDE 2

21/2/17 Hildebrandt SNS seminar Stockholm 2

slide-3
SLIDE 3

what’s next?

  • 1. From online to onlife
  • 2. Machine Learning
  • 3. Data Protection


slide-4
SLIDE 4

what’s next?

  • 1. From online to onlife


slide-5
SLIDE 5
online → onlife

■ internet: packet switching & routing, network structure
■ world wide web: hyperlinking
■ search engines, blogs, social media, web portals
■ web platforms [network effects & filter bubbles; reputation & fake news]
■ mobile applications [moving towards IoT, wearables]
■ IoT: cyberphysical infrastructures [connected cars, smart energy grids]
■ cloud computing, fog computing & edge computing


slide-6
SLIDE 6


slide-7
SLIDE 7


slide-8
SLIDE 8


slide-9
SLIDE 9
onlife: data driven agency

■ creating added value from big data or small data
■ predicting behaviours
■ pre-empting behaviours
■ interplay of backend & frontend of computing systems
■ interfaces enable but they also hide, nudge and force [AB testing, ‘by design’ paradigms]


slide-10
SLIDE 10
onlife: digital unconscious

Big Data Space:
■ accumulation of behavioural and other data
■ mobile and polymorphous data & hypothesis spaces
■ distributed storage [once data has been shared, control becomes a challenge]
■ distributed access [access to data or to the inferences, to training set & algos]


slide-11
SLIDE 11


slide-12
SLIDE 12
onlife: digital unconscious

Big Data Space:
the envelope of big data space drives human agency, providing convenience & resilience
Weiser’s calm computing, IBM’s autonomic computing:

  • increasing dependence on the dynamics of interacting data driven cyberphysical systems


slide-13
SLIDE 13


slide-14
SLIDE 14

what’s next?

  • 2. Machine Learning


slide-15
SLIDE 15


slide-16
SLIDE 16

big data, open data, personal data

■ BIG
– volume (but, n=all is nonsense)
– variety (unstructured in sense of different formats)
– velocity (real time, streaming)
■ OPEN as opposed to proprietary? reuse? repurposing? public-private?
– creating added value is hard work, not evident, no guarantees for return on investment
■ PERSONAL data: IoT will contribute to a further explosion of personal data
– high risk high gain (think DPIA)? anonymisation will mostly be pseudonymisation!
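
The last point, that anonymisation will mostly be pseudonymisation, can be illustrated with a minimal sketch. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation technique; the key, the field names and the sample records are hypothetical and not taken from the slides.

```python
import hashlib
import hmac

# Hypothetical secret key held by the controller. Whoever holds it can
# re-link pseudonyms to identifiers, so the data remains personal data.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (a pseudonym)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical smart-meter style records.
records = [
    {"name": "alice", "postcode": "1050", "kwh": 12.5},
    {"name": "bob", "postcode": "6525", "kwh": 9.1},
    {"name": "alice", "postcode": "1050", "kwh": 13.0},
]

pseudonymised = [
    {"pid": pseudonymise(r["name"]), "postcode": r["postcode"], "kwh": r["kwh"]}
    for r in records
]

# The same person still maps to the same pseudonym, so records stay linkable.
same_person = pseudonymised[0]["pid"] == pseudonymised[2]["pid"]
```

The name is gone, but the records of one person can still be linked to each other and combined with the postcode, which is why such data is pseudonymous rather than anonymous.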


slide-17
SLIDE 17

machine learning (ML)

“we say that a machine learns:

  • with respect to a particular task T,
  • performance metric P, and
  • type of experience E,

if

  • the system reliably improves its performance P
  • at task T,
  • following experience E.”

(Tom Mitchell)

http://www.cs.cmu.edu/~tom/mlbook.html


slide-18
SLIDE 18

types of machine learning

■ supervised (learning from examples – requires labelling, domain expertise)
■ reinforcement (learning by correction – requires prior domain expertise)
■ unsupervised (bottom up, inductive – danger of overfitting)


slide-19
SLIDE 19


slide-20
SLIDE 20

bias

optimisation

spurious correlations

have a network trained to recognize animal faces:

  • 1. present it with a picture of a flower
  • 2. run the algorithms
  • 3. check the output (see what it sees)

http://www.nature.com/news/can-we-open-the-black-box-of-ai-1.20731


slide-21
SLIDE 21

Wolpert: no free lunch theorem

Where d = training set; f = ‘target’ input-output relationships; h = hypothesis (the algorithm's guess for f made in response to d); and C = off-training-set ‘loss’ associated with f and h (‘generalization error’)

How well you do is determined by how ‘aligned’ your learning algorithm P(h|d) is with the actual posterior, P(f|d).
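
The formula these symbols belong to did not survive extraction. Assuming the slide showed the standard decomposition of expected off-training-set loss from Wolpert's work (an assumption, though it matches the definitions of d, f, h and C given here), it would read:

```latex
E(C \mid d) \;=\; \sum_{h,f} E(C \mid h, f, d)\, P(h \mid d)\, P(f \mid d)
```

The sum has the form of an inner product between P(h|d) and P(f|d), which is the sense in which performance depends on how ‘aligned’ the two distributions are.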

Check http://www.no-free-lunch.org


slide-22
SLIDE 22

Wolpert: no free lunch theorem

Summary:
– The bias that is necessary to mine the data will co-determine the results
– This relates to the fact that the data used to train an algorithm is finite
– ‘Reality’, whatever that is, escapes the inherent reduction
– Data is not the same as what it refers to or what it is a trace of


slide-23
SLIDE 23


slide-24
SLIDE 24


slide-25
SLIDE 25

trade-offs

■ NFL theorem
– overfitting, overgeneralization
■ training set, domain knowledge, hypotheses space, test set
– accuracy, precision, speed, iteration
■ low hanging fruit
– may be cheap and/or available but not very helpful
■ neither data nor algorithms are objective
– bias in the data, bias of the algos, guess what: bias in the output
■ the more data, the larger the hypotheses space, the more patterns
– spurious correlations, computational artefacts
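
The last trade-off can be made concrete with a small sketch. It correlates a purely random target with a growing number of purely random features, so by construction every correlation it finds is spurious. The function names, sample size and feature counts are illustrative assumptions, not part of the talk.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_features, n_samples=20, seed=0):
    """Largest |r| between a random target and n_features random features."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, target)))
    return best


# Same seed, so the 5 features of the first run are also the first
# 5 features of the second run: more candidates can only raise the maximum.
few = strongest_spurious_correlation(n_features=5)
many = strongest_spurious_correlation(n_features=5000)
```

With only 20 samples and thousands of candidate features, some feature will correlate strongly with the target even though nothing is related to anything. This is the "more patterns" effect of a larger hypotheses space.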


slide-26
SLIDE 26

data hoarding & obesitas

■ data obesitas: lots of data, but often incorrect, incomplete, irrelevant (low hanging fruit)
– any personal data stored presents security and other risks (need for DPIA, DPbD)
– purpose limitation is crucial: select before you collect (and while, and after)
■ pattern obesitas: trained algorithms can see patterns anywhere, added value?
– training set and algorithms necessarily contain bias, this may be problematic (need for DPIA, DPbD)
– purpose limitation is crucial: to prevent spurious correlations, to test relevance


slide-27
SLIDE 27

agile and lean computing

■ agile software development:
– iteration instead of waterfall
– collaboration domain experts, data scientists, whoever invests
– initial purpose (prediction of behaviour, example: tax office, car insurance)
– granular purposing (testing specific patterns, AB testing to nudge specific behaviour)
■ lean computing:
– less data = more effective & more efficient
■ methodological integrity:
– make your software testable and contestable: mathematical & empirical software verification
– secure logging, open source
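
One possible reading of "secure logging" is a tamper-evident log in which each entry is chained to the hash of the previous one. The sketch below assumes that reading. The entry format and function names are hypothetical, and a real deployment would also need to anchor the latest hash somewhere external, since this scheme does not detect removal of the most recent entries.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(event, prev_hash):
    """Hash an event together with the hash of the previous entry."""
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(event, prev_hash)})


def verify(log):
    """Return True if every entry matches its hash and links to its predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"decision": "reject", "model": "v1"})
append_entry(log, {"decision": "accept", "model": "v1"})
intact = verify(log)

log[0]["event"]["decision"] = "accept"  # simulate after-the-fact tampering
tampered_detected = not verify(log)
```

A log of this kind supports contestability: a decision that was logged cannot be quietly rewritten afterwards without the chain failing verification.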


slide-28
SLIDE 28

what’s next?

  • 3. Data Protection Law


slide-29
SLIDE 29

privacy and autonomy

■ the implications of pre-emptive computing:
– AB testing & nudging
– pre-emption of our intent, playing with our autonomy
– we become subject to decisions of data-driven agents
– this choice architecture may generate manipulability


slide-30
SLIDE 30

non-discrimination

■ three types of bias:
– bias inherent in any action-perception-system (APS)
– bias that some would qualify as unfair
– bias that discriminates on the basis of prohibited legal grounds


slide-31
SLIDE 31

the opacity argument in ML:

1. intentional concealment
– trade secrets, IP rights, public security
2. we have learned to read and write, not to code or do machine learning
– monopoly of the new ‘clerks’, the end of democracy
3. mismatch between mathematical optimization and human semantics
– when it comes to law and justice we cannot settle for ‘computer says no’

– inspired by: Jenna Burrell, ‘How the machine ‘thinks’: Understanding opacity in machine learning algorithms’, in Big Data & Society, January-June 2016, 1-12

slide-32
SLIDE 32

due process

■ in the case of automated decisions taken by AI systems we need:

  • 1. to know that ML or other algorithms determined the decision
  • 2. to know which data points inform the decision and how they are weighted
  • 3. to know the envisaged consequences of the employment of the algorithms
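
For a simple model the first two requirements can be met directly. The sketch below assumes a linear scoring model, which is an illustrative choice and not something the slides prescribe. The weights, feature names and threshold are hypothetical. More complex models would need other explanation techniques.

```python
# Hypothetical weights of a linear scoring model; not taken from the slides.
WEIGHTS = {"income": 0.5, "missed_payments": -2.0, "years_at_address": 0.25}
BIAS = -1.0
THRESHOLD = 0.0


def decide_with_explanation(applicant):
    """Return the decision plus each data point's contribution to the score."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    return {
        "automated": True,  # 1. an algorithm determined the decision
        "decision": "accept" if score >= THRESHOLD else "reject",
        "score": score,
        "contributions": contributions,  # 2. which data points, and how weighted
    }


result = decide_with_explanation(
    {"income": 4.0, "missed_payments": 2, "years_at_address": 4}
)
```

Here `result["contributions"]` shows that the missed payments outweigh the positive contributions of income and residence. The third requirement, the envisaged consequences, cannot be computed from the model and has to be documented by whoever deploys it.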


slide-33
SLIDE 33


slide-34
SLIDE 34


slide-35
SLIDE 35

Nature editorial 22 September 2016

■ “To avoid bias and improve transparency, algorithm designers must make data sources and profiles public.”
■ “People should have the right to see their own data, how profiles are derived and have the right to challenge them.”
■ “Some proposed remedies are technical, such as developing new computational techniques that better address and correct discrimination both in training data sets and in the algorithms – a sort of affirmative algorithmic action.”


slide-36
SLIDE 36

the Onlife’s Choice Architecture

■ nudge theory, cognitive psychology, behavioural economics
■ what options does an environment give its inhabitants?
■ what options does a data-driven environment give its ‘users’?
■ which are the defaults? e.g. withdrawal of consent in GDPR
■ architecture is politics


slide-37
SLIDE 37

Data Protection Law’s Choice Architecture

■ how does DP law constrain and reconfigure the Onlife’s choice architectures?

  • 1. what choice architecture does DP law provide data subjects?
  • 2. what choice architecture does DP law provide data controllers?


slide-38
SLIDE 38

data minimisation

= a choice architecture for data controllers:

■ think ‘training sets’: select before you collect
■ think of how to avoid ‘low hanging fruit’
■ think of how to ensure accuracy, relevance, pertinence
■ data minimisation, if done well, should avoid both data and pattern obesitas
– detect productive bias, while also detecting unfair or prohibited bias
– make data sets available for inspection and contestation
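
"Select before you collect" can be sketched as a filter that is defined per purpose before any data is stored. The purposes, field names and sample record below are hypothetical and only illustrate the idea.

```python
# Hypothetical purposes and the fields each one needs, fixed before collection.
FIELDS_BY_PURPOSE = {
    "billing": {"customer_id", "kwh", "period"},
    "grid_balancing": {"postcode", "kwh", "period"},
}


def minimise(record, purpose):
    """Keep only the fields that the stated purpose requires."""
    allowed = FIELDS_BY_PURPOSE[purpose]  # an undeclared purpose raises KeyError
    return {key: value for key, value in record.items() if key in allowed}


raw = {
    "customer_id": "c-17",
    "name": "Alice",
    "postcode": "1050",
    "kwh": 12.5,
    "period": "2017-02",
    "device_ids": ["tv", "fridge"],
}

billing_record = minimise(raw, "billing")
balancing_record = minimise(raw, "grid_balancing")
```

Because an undeclared purpose fails loudly instead of defaulting to "keep everything", the data set that ends up stored is documented by construction, which also helps when it has to be made available for inspection.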


slide-39
SLIDE 39

purpose limitation

= a choice architecture for data controllers

■ think ‘training sets’: select before you collect (and while you collect and after)
■ think of how to avoid ‘low hanging fruit’ (GIGA)
■ think of how to ensure accuracy, relevance, pertinence (depending on purpose)
– purpose specification, if done well, should avoid both data and pattern obesitas
– purpose should direct the development and employment of data-driven applications
– experimentation can be a purpose, but not in itself
■ the choice of algorithms should be informed by the purpose


slide-40
SLIDE 40

automated decision rights

■ current choice architecture of AI:
■ ML, IoT is meant to pre-empt our intent
■ to run smoothly under the radar of everyday life
■ it is all about continuous surreptitious automated decisions


slide-41
SLIDE 41

automated decision rights

= choice architecture for data subjects (EU legislation)

1. the right not to be subject to automated decisions that have a significant impact
2. the right to a notification, an explanation and anticipation if exception applies


slide-42
SLIDE 42

automated decision rights

= choice architecture for data subjects:

1. the right not to be subject to automated decisions that have a significant impact, unless
a. necessary for contract
b. authorised by EU or MS law
c. explicit consent
under a and c: right to human intervention, possibility to contest
prohibition to make such decisions based on sensitive data


slide-43
SLIDE 43

automated decision rights

= choice architecture for data subjects:

2. the right to a notification, an explanation and anticipation if exception applies
– existence of decisions based on profiling
– meaningful information about the logic involved (= explanation?)
– significance and envisaged consequences of such processing


slide-44
SLIDE 44

legal protection by design

■ Data Protection by Design:
– JASP (open source) v SPSS (proprietary)
– can give you output based on Bayesian and classical statistics – on same data set
– imagine training same algorithms on different data sets
■ science and open society fit well together:
– make systems testable (science), make systems contestable (Rule of Law)
– if you can’t test it you can’t contest it


slide-45
SLIDE 45

DP & Privacy Law: Choice Architecture

■ individual citizens need:
– the capability to reinvent themselves,
– to segregate their data-driven audiences,
– have their human dignity respected by the data-driven infrastructures
– make sure their robotic social companions don’t tell on them beyond necessary
– the capability to detect and contest bias in their data-driven environments


slide-46
SLIDE 46

DP & Privacy Law: Choice Architecture

■ the architects of our new data-driven world need:
– integrity of method: rigorously sound and contestable methodologies (bias)
– accountability: (con)testability of both data sets and algorithms
– fairness: testing bias in the training set, testing bias in the learning algorithm
– privacy & data protection: reduce manipulability, go for participation and respect
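
Testing bias in a training set can start with something as simple as comparing outcome rates per group. The sketch below assumes the gap between the highest and lowest selection rate (often called demographic parity difference) as the metric. That is one fairness measure among several and is not named in the slides. The data and field names are hypothetical.

```python
def selection_rates(rows, group_key, outcome_key):
    """Share of positive outcomes per group."""
    totals, positives = {}, {}
    for row in rows:
        group = row[group_key]
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if row[outcome_key] else 0)
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and the lowest selection rate."""
    return max(rates.values()) - min(rates.values())


# Hypothetical labelled training set for a hiring model.
training_set = [
    {"group": "A", "hired": 1}, {"group": "A", "hired": 1},
    {"group": "A", "hired": 0}, {"group": "A", "hired": 1},
    {"group": "B", "hired": 0}, {"group": "B", "hired": 1},
    {"group": "B", "hired": 0}, {"group": "B", "hired": 0},
]

rates = selection_rates(training_set, "group", "hired")
gap = parity_gap(rates)
```

Running the same two functions on a model's predictions instead of the training labels gives a first test of bias in the learning algorithm. Whether a given gap is unfair or prohibited is a legal and normative question that the number alone cannot answer.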


slide-47
SLIDE 47

‘by design’ paradigm

■ architecture is politics
■ translate fairness, methodological integrity, fundamental rights into the architecture
■ Data Protection by Default: engineer data minimisation as a requirement
■ Data Protection by Design: engineer state of the art DP tools as a requirement


slide-48
SLIDE 48

‘by design’ paradigm

■ architecture is politics
– we cannot always be saved by design
– resilience will depend on: testability & contestability
– ultimately this is in the interest of data subjects and controllers
■ ML as a utility


slide-49
SLIDE 49


slide-50
SLIDE 50
