Making AI forget you: Data deletion in machine learning


Tony Ginart, Melody Guan, Greg Valiant, James Zou. Advances in Neural Information Processing Systems, December 12, 2019.


SLIDE 1

Making AI forget you: Data deletion in machine learning

Advances in Neural Information Processing Systems, December 12, 2019

TONY GINART, MELODY GUAN, GREG VALIANT, JAMES ZOU

SLIDE 2

AI systems today...

[Diagram labels: Users, Data, Algorithm, Model]

SLIDE 3

AI systems today...

[Diagram labels: Users, deletion, Data, Algorithm, Model]

SLIDE 4

AI systems today...

[Diagram labels: Users, deletion, Data, Algorithm, Model, Deletion Op, Updated Model]

SLIDE 5

Deletion requests in the wild...

EMAIL ---- UK BIOBANK ----

Subject: UK Biobank Application [REDACTED], Participant Withdrawal Notification [REDACTED]

Dear Researcher,

As you are aware, participants are free to withdraw from the UK Biobank at any time and request that their data no longer be used. Since our last review, some participants involved with Application [REDACTED] have requested that their data should no longer be used.

SLIDE 6

Contributions

1) Define deletion in ML systems and a notion of efficient deletion
2) Propose general principles for the co-design of ML algorithms and deletion operations
3) Introduce deletion-efficient unsupervised learning

SLIDE 7

What is “data deletion” for an ML system?

Informal definition: Deleting a data point from a trained ML model means updating the model as if this point had never existed.
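The informal definition can be illustrated with the simplest possible deletion operation: drop the point and retrain from scratch. The sketch below is only a baseline under stated assumptions; the toy model (a sample mean) and the names `train_mean` and `delete_by_retraining` are hypothetical and not taken from the slides.

```python
from typing import Callable, List, Sequence


def train_mean(data: Sequence[float]) -> float:
    """Toy deterministic 'learning algorithm': the model is the sample mean."""
    return sum(data) / len(data)


def delete_by_retraining(
    data: List[float],
    index: int,
    train: Callable[[Sequence[float]], float],
) -> float:
    """Baseline deletion: remove the point at `index`, then retrain from scratch."""
    remaining = data[:index] + data[index + 1:]
    return train(remaining)


data = [1.0, 2.0, 3.0, 10.0]

# Model after deleting the last point from the trained system.
model_after_deletion = delete_by_retraining(data, 3, train_mean)

# Model trained on a dataset in which that point never existed.
model_never_saw_point = train_mean([1.0, 2.0, 3.0])

assert model_after_deletion == model_never_saw_point
```

Retraining satisfies the definition trivially but costs a full training run per request. For a randomized training algorithm, the comparison would presumably be between output distributions rather than single outputs.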

SLIDE 8

What is “deletion efficiency” for an ML system?

▪ Setting: online deletion requests from users
▪ Figure of merit: amortized computation

[Figure: X X ... X]
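One way to read "amortized computation" is as the average cost per request over a sequence of deletions. The symbols below are assumptions introduced for illustration: $m$ is the number of deletion requests and $T_t$ is the computation spent serving the $t$-th request.

```latex
\[
  \text{amortized cost per deletion}
  \;=\; \frac{1}{m} \sum_{t=1}^{m} T_t
\]
```

Under this reading, retraining from scratch makes every $T_t$ as large as a full training run, while a deletion-efficient method may pay that price only occasionally, as long as the average stays small.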

SLIDE 9

Toolbox for deletion efficient ML

▪ Linearity: fast deletion, O(1) with respect to the number of data points n
▪ Laziness: e.g. nearest neighbors
▪ Modularity: control the dependency of parameters on data
▪ Quantization: efficiently check whether a deletion matters
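As an illustration of the linearity principle, here is a minimal sketch of one-dimensional least squares stored as running sums. Deleting a point subtracts its contribution from each sum, so the cost does not depend on n. The class `SimpleLinearRegression` and its methods are hypothetical names chosen for this example, not an API from the talk.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SimpleLinearRegression:
    """Fit y = slope * x + intercept from running sums (sufficient statistics)."""

    n: int = 0
    sx: float = 0.0   # sum of x
    sy: float = 0.0   # sum of y
    sxx: float = 0.0  # sum of x * x
    sxy: float = 0.0  # sum of x * y

    def add(self, x: float, y: float) -> None:
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def delete(self, x: float, y: float) -> None:
        """Remove one previously added point in O(1), independent of n."""
        self.n -= 1
        self.sx -= x
        self.sy -= y
        self.sxx -= x * x
        self.sxy -= x * y

    def coefficients(self) -> Tuple[float, float]:
        """Closed-form least squares; needs at least two distinct x values."""
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept


model = SimpleLinearRegression()
for x, y in [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 100.0)]:
    model.add(x, y)

# A user asks for the point (3, 100) to be deleted.
model.delete(3.0, 100.0)
slope, intercept = model.coefficients()
```

Because the sums are floating-point numbers, the result after a deletion matches retraining only up to rounding error; a production system would need to decide whether that is acceptable.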

SLIDE 10

State of progress

Supervised learning:
▪ Linear regressions/models
▪ Non-parametric (k-NN)
▪ Incremental SVMs

Unsupervised learning:
▪ 1) Quantized k-means
▪ 2) Divide-and-Conquer k-means
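The k-NN entry is an instance of laziness: the "model" is the stored data itself, so deleting a point amounts to removing it from the store. A minimal sketch, assuming Euclidean distance and majority voting; the class `LazyKNN` and its method names are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


class LazyKNN:
    """k-nearest-neighbor classifier that does all its work at query time."""

    def __init__(self, k: int = 1) -> None:
        self.k = k
        # point id -> (feature vector, label)
        self.points: Dict[Hashable, Tuple[List[float], str]] = {}

    def add(self, point_id: Hashable, features: Sequence[float], label: str) -> None:
        self.points[point_id] = (list(features), label)

    def delete(self, point_id: Hashable) -> None:
        """Deletion is a dictionary removal; nothing needs to be retrained."""
        del self.points[point_id]

    def predict(self, query: Sequence[float]) -> str:
        def squared_distance(item: Tuple[List[float], str]) -> float:
            return sum((a - b) ** 2 for a, b in zip(item[0], query))

        nearest = sorted(self.points.values(), key=squared_distance)[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


knn = LazyKNN(k=1)
knn.add("alice", [0.0], "red")
knn.add("bob", [10.0], "blue")

before = knn.predict([1.0])  # nearest stored point is alice's
knn.delete("alice")
after = knn.predict([1.0])   # only bob's point remains
```

The trade-off is that every prediction scans the stored points, so cheap deletion is paid for at query time.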

SLIDE 11

State of progress

Supervised learning:
▪ Linear regressions/models
▪ Non-parametric (k-NN)
▪ Incremental SVMs

Unsupervised learning:
▪ 1) Quantized k-means
▪ 2) Divide-and-Conquer k-means

100X faster deletion without loss of clustering quality
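The quantization idea behind the first unsupervised method can be sketched on a single centroid: publish only a rounded version of the mean, and after a deletion check whether the rounded value moved. This is a deliberately simplified illustration and not the quantized k-means algorithm from the talk; the class `QuantizedMean`, the function `quantize`, and the fixed grid `step` are assumptions made for the example.

```python
from typing import Sequence


def quantize(value: float, step: float) -> float:
    """Snap a value to the nearest multiple of `step`."""
    return round(value / step) * step


class QuantizedMean:
    """Publishes the mean of 1-D points rounded to a grid of width `step`."""

    def __init__(self, points: Sequence[float], step: float) -> None:
        self.total = sum(points)
        self.count = len(points)
        self.step = step
        self.model = quantize(self.total / self.count, step)

    def delete(self, point: float) -> bool:
        """Remove a point; return True only if the published model changed."""
        self.total -= point
        self.count -= 1
        updated = quantize(self.total / self.count, self.step)
        changed = updated != self.model
        self.model = updated
        return changed


stable = QuantizedMean([0.9, 1.0, 1.1, 1.0], step=0.5)
stable_changed = stable.delete(1.1)    # mean moves slightly, rounded value stays

shifted = QuantizedMean([0.0, 0.0, 3.0], step=1.0)
shifted_changed = shifted.delete(3.0)  # mean moves from 1.0 to 0.0
```

When the check reports no change, any computation that depends only on the published centroid can presumably be reused as-is, which is where the savings would come from in an iterative method such as k-means.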

SLIDE 12

Next steps in deletion efficient ML

Models:
▪ Decision trees/forests
▪ Artificial neural networks

Settings:
▪ Approximate deletions
▪ Adversarial requests

Paradigms:
▪ Reinforcement learning
▪ Representation/embedding learning

Want to know more?
Poster session @ 5pm, #123, East Exhibition Hall B + C

Thank you! Happy to chat more: tginart@stanford.edu