Fast and Private Submodular and k-Submodular Functions Maximization

SLIDE 1

Fast and Private Submodular and k-Submodular Functions Maximization with Matroid Constraints

ICML | 2020

Thirty-seventh International Conference on Machine Learning

Yuichi Yoshida Akbar Rafiey

SLIDE 2

Core message

  • What is the problem?
  • What do we want to achieve?
  • What do we achieve in this paper?

SLIDE 3

What is the problem?

Sensitive data. Examples:

  • Medical data
  • Web search data
  • Social networks
  • Salary data
  • etc.

Analyst: wants to do statistical analysis of the data.

How can we answer queries while preserving the privacy of the data?

SLIDE 4

What do we want to achieve?

We need an algorithm such that:

  • It returns an almost correct answer to a query
  • It is efficient and fast
  • It preserves privacy when the data is sensitive

SLIDE 5

What do we achieve in this paper? (part 1)

  • We consider a class of set function queries, namely submodular set functions.
  • We present an algorithm for submodular maximization and prove that it:
  • is computationally efficient,
  • outputs solutions close to an optimal solution,
  • preserves the privacy of the dataset.

SLIDE 6

What do we achieve in this paper? (part 2)

  • Further, we consider a generalization of submodular functions, namely k-submodular functions.
  • This allows us to capture more problems.
  • We present an algorithm for k-submodular maximization and prove that it:
  • is computationally efficient,
  • outputs solutions close to an optimal solution,
  • preserves the privacy of the dataset.

SLIDE 7

Differential privacy:

A rigorous notion of privacy

[Figure: an analyst (e.g., a health insurance company) asks "How many people have diabetes?" The same analysis/computation is run on the dataset and on the dataset without individual X's data, and the two answers differ (99 vs. 100).]

SLIDE 8

Differential privacy:

A rigorous notion of privacy

[Figure: the same query from the analyst (e.g., a health insurance company), "How many people have diabetes?", now answered with NOISE added to the analysis/computation. Both the dataset and the dataset without individual X's data yield $100 \pm \epsilon$.]

SLIDE 9

Differential privacy:

A rigorous notion of privacy

[Figure: the analysis/computation, with added NOISE, is run on the dataset and on the dataset without X's data; the "difference" between the two outputs is at most $\epsilon$.]

Intuitively, any one individual's data should NOT significantly change the outcome.

SLIDE 10

Differential Privacy (definition)

  • For $\epsilon, \delta \in \mathbb{R}_+$, we say that a randomized computation $M$ is $(\epsilon, \delta)$-differentially private if
  • 1. for any neighboring datasets $D \sim D'$, and
  • 2. for any set of outcomes $S \subseteq \mathrm{range}(M)$,

$$\Pr[M(D) \in S] \le e^{\epsilon} \Pr[M(D') \in S] + \delta.$$

Neighboring datasets: two datasets that differ in at most one record.
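As an illustration of the definition, here is a minimal sketch of the standard Laplace mechanism applied to a counting query such as "How many people have diabetes?". A count changes by at most 1 between neighboring datasets, so adding Laplace noise of scale $1/\epsilon$ gives $(\epsilon, 0)$-differential privacy. The function names and the record layout are assumptions made for this example, not taken from the paper.

```python
import math
import random

def laplace_noise(scale, rng):
    """Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    """Counting query (sensitivity 1) released with Laplace(1/epsilon) noise."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

def laplace_density(x, mu, scale):
    """Density of the released value when the true count is mu."""
    return math.exp(-abs(x - mu) / scale) / (2.0 * scale)

# Hypothetical dataset: 100 records with diabetes, 50 without.
rng = random.Random(0)
records = [{"diabetes": True}] * 100 + [{"diabetes": False}] * 50
noisy = private_count(records, lambda r: r["diabetes"], epsilon=0.5, rng=rng)
print(round(noisy, 2))

# Privacy check: for true counts 100 (with X) and 99 (without X), the output
# densities differ by a factor of at most e^epsilon at every point x.
eps = 0.5
for x in (90.0, 99.5, 100.0, 110.0):
    ratio = laplace_density(x, 100, 1 / eps) / laplace_density(x, 99, 1 / eps)
    print(x, ratio <= math.exp(eps) + 1e-12)
```

A smaller $\epsilon$ means a larger noise scale, so privacy improves while accuracy drops.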

SLIDE 11

Set function queries

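Dataset $D$ ($m$ features):

| Id | gender | diabetes | … | asthma | Class |
|----|--------|----------|---|--------|-------|
| 1  | F      |          | … | 1      | C1    |
| 2  | M      | 1        | … | 1      | C1    |
| 3  | F      |          | … | 1      | C1    |
| 4  | M      | 1        | … |        | C1    |
| 5  | F      |          | … |        | C1    |
| 6  | NA     | 1        | … |        | C1    |
| 7  | F      |          | … | 1      | C2    |
| 8  | M      | 1        | … | 1      | C2    |
| 9  | NA     |          | … | 1      | C2    |
| 10 | M      | 1        | … | 1      | C2    |

Set function $f_D : 2^E \to \mathbb{R}$

  • Given dataset $D$, the function $f_D(S)$ measures the "value" of set $S$ in dataset $D$
  • $f_D(\{\text{gender}, \text{diabetes}\}) = 5$
  • $f_D(\{\text{asthma}\}) = 7$

Query: what are the $k$ most informative features?

Answer while preserving individuals' privacy?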

SLIDE 12

Submodular Function

  • In words: the marginal contribution of any element $e$ to the value of the function $f(S)$ diminishes as the input set $S$ grows.
  • Mathematically, a function $f : 2^E \to \mathbb{R}$ is submodular if
  • for all $A \subseteq B \subseteq E$,
  • and all elements $e \in E \setminus B$, we have

$$f(A \cup \{e\}) - f(A) \ge f(B \cup \{e\}) - f(B)$$

(the diminishing gain property).
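The diminishing gain inequality can be checked by brute force on a small ground set. The sketch below does this for a coverage function (a standard example of a submodular function) and for $|S|^2$, whose marginal gains grow and which is therefore not submodular. The instance and helper names are hypothetical.

```python
from itertools import combinations

def coverage(sets, chosen):
    """f(S) = number of items covered by the sets indexed by S."""
    covered = set()
    for i in chosen:
        covered |= sets[i]
    return len(covered)

def is_submodular(f, ground):
    """Check f(A+e) - f(A) >= f(B+e) - f(B) for all A <= B and e not in B."""
    ground = list(ground)
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for B in subsets:
        for A in subsets:
            if not A <= B:
                continue
            for e in ground:
                if e in B:
                    continue
                if f(A | {e}) - f(A) < f(B | {e}) - f(B):
                    return False
    return True

# Hypothetical instance: four sets over the items 1..6.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1, 6}}
f = lambda S: coverage(sets, S)

print(is_submodular(f, sets.keys()))                       # True
print(is_submodular(lambda S: len(S) ** 2, sets.keys()))   # False
```

The check enumerates all pairs of subsets, so it is only practical for very small ground sets.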

SLIDE 13

Problem

  • Design a framework for differentially private submodular maximization under a matroid constraint.
  • A pair $M = (E, I)$ of a set $E$ and $I \subseteq 2^E$ is called a matroid if
  • $\emptyset \in I$,
  • $A \in I$ for any $A \subseteq B \in I$,
  • for any $A, B \in I$ with $|A| < |B|$, there exists $e \in B \setminus A$ such that $A \cup \{e\} \in I$.
  • Our objective:

$$\operatorname*{argmax}_{S \in I} f(S)$$
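To make the objective concrete, here is a minimal sketch of the classical non-private greedy algorithm over a partition matroid, which is one simple kind of matroid. This is not the algorithm of the paper (the slides mention a multilinear extension for that); it only illustrates how an independence oracle constrains the search. The instance, the coverage objective, and all names are assumptions for the example.

```python
def partition_matroid(part_of, capacity):
    """Independence oracle: S is independent if it uses at most
    capacity[p] elements from each part p (part_of maps element -> part)."""
    def independent(S):
        used = {}
        for e in S:
            used[part_of[e]] = used.get(part_of[e], 0) + 1
        return all(n <= capacity[p] for p, n in used.items())
    return independent

def greedy(f, ground, independent):
    """Repeatedly add the feasible element with the largest marginal gain."""
    S = frozenset()
    while True:
        candidates = [e for e in ground
                      if e not in S and independent(S | {e})]
        if not candidates:
            return S
        best = max(candidates, key=lambda e: f(S | {e}) - f(S))
        S = S | {best}

# Hypothetical instance: a coverage objective over four sets, where sets 0, 1
# belong to part "a", sets 2, 3 to part "b", and one set may be taken per part.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6}, 3: {1, 6}}

def coverage(S):
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

part_of = {0: "a", 1: "a", 2: "b", 3: "b"}
independent = partition_matroid(part_of, {"a": 1, "b": 1})
solution = greedy(coverage, [0, 1, 2, 3], independent)
print(sorted(solution), coverage(solution))   # [0, 2] 6
```

Because the algorithm only calls `independent`, the same loop works for any matroid for which such an oracle is available.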

SLIDE 14

Examples of submodularity

  • Feature selection
  • Influence maximization
  • Facility location
  • Maximum coverage
  • Data summarization
  • Image summarization
  • Document summarization

SLIDE 15

A toy example

  • Set $E$: $m$ resources $r_1, r_2, \ldots, r_m$
  • $n$ agents: agent 1, agent 2, …, agent $n$
  • Each agent $i$ has a private submodular function $F_i : 2^E \to \mathbb{R}$ (that is, $F_1, F_2, \ldots, F_n$)

Objective: find $S \subseteq E$ in the matroid that maximizes

$$\sum_{i=1}^{n} F_i(S)$$
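The toy example can be sketched in code as follows. The objective is the sum of the agents' private functions, and one differentially private selection step is done with the standard exponential mechanism, which picks a candidate with probability proportional to $\exp(\epsilon \cdot \text{score} / (2\Delta))$. This is a generic building block shown under the assumption that each agent's function is monotone with values in $[0, \Delta]$; it is not presented as the algorithm of the paper, and the data and names are hypothetical.

```python
import math
import random

def total_value(agent_functions, S):
    """F_D(S): the sum of the agents' private valuations of S."""
    return sum(F(S) for F in agent_functions)

def exponential_mechanism(candidates, score, epsilon, sensitivity, rng):
    """Pick one candidate with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    scores = [score(c) for c in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity))
               for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical data: each agent wants some resources and has value 1 if S
# contains at least one of them, so one agent changes any score by at most 1.
wanted = [{"r1"}, {"r1", "r2"}, {"r2"}, {"r3"}, {"r1"}]
agents = [lambda S, w=w: 1.0 if S & w else 0.0 for w in wanted]
resources = ["r1", "r2", "r3"]

S = frozenset()
gain = lambda e: total_value(agents, S | {e}) - total_value(agents, S)
print([gain(e) for e in resources])   # [3.0, 2.0, 1.0]

rng = random.Random(0)
pick = exponential_mechanism(resources, gain, epsilon=1.0,
                             sensitivity=1.0, rng=rng)
print(pick)  # random, biased towards the resource with the largest gain
```

With a large $\epsilon$ the mechanism behaves almost like a plain argmax, and with a small $\epsilon$ it approaches a uniform choice.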

SLIDE 16

Our contributions

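|         | non-private | previous result (Mitrovic et al.) | our result |
|---------|-------------|-----------------------------------|------------|
| utility | $(1 - \frac{1}{e})\,\mathrm{OPT}$ | $\frac{1}{2}\,\mathrm{OPT} - O\left(\frac{\Delta \cdot r(M) \cdot \ln(\lvert E \rvert)}{\epsilon}\right)$ | $(1 - \frac{1}{e})\,\mathrm{OPT} - O\left(\epsilon + \frac{\Delta \cdot r(M) \cdot \ln(\lvert E \rvert)}{\epsilon^{3}}\right)$ |
| privacy | - | $\epsilon \cdot r(M)$ | $\epsilon \cdot r(M)^2$ |

  • $(1 - \frac{1}{e})\,\mathrm{OPT}$ is the best possible approximation guarantee unless P = NP.
  • Our algorithm uses an almost cubic number of function evaluations: $O(r(M) \cdot |E|^2 \cdot \ln(\cdots))$.
  • Our privacy factor is worse than that of the previous work because we work with the multilinear extension.
  • Please see our paper for details and proofs.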

SLIDE 17

Generalization of submodularity:

k-submodular functions

A function $f : (k+1)^E \to \mathbb{R}_+$ defined on $k$-tuples of pairwise disjoint subsets of $E$ is called k-submodular if, for all $k$-tuples $S = (S_1, \ldots, S_k)$ and $T = (T_1, \ldots, T_k)$ of pairwise disjoint subsets of $E$,

$$f(S) + f(T) \ge f(S \sqcap T) + f(S \sqcup T),$$

where we define

$$S \sqcap T = (S_1 \cap T_1, \ldots, S_k \cap T_k),$$

$$S \sqcup T = \Big( (S_1 \cup T_1) \setminus \bigcup_{i \ne 1} (S_i \cup T_i), \; \ldots, \; (S_k \cup T_k) \setminus \bigcup_{i \ne k} (S_i \cup T_i) \Big).$$

A simpler definition: a monotone function is k-submodular if its restriction to each orthant (fix the domain of each element to be $\{0, i\}$ for some $i \in \{1, 2, \ldots, k\}$) is submodular.
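The two operations in the definition can be sketched in code. Here a $k$-tuple of pairwise disjoint subsets is stored as a dictionary mapping each assigned element to its label in $\{1, \ldots, k\}$. This representation, the example weights, and the helper names are assumptions for illustration. The brute-force check confirms the inequality for a label-dependent weight function with nonnegative weights and rejects the squared number of assigned elements.

```python
from itertools import product

def meet(S, T):
    """S ⊓ T: keep elements that carry the same label in S and in T."""
    return {e: i for e, i in S.items() if T.get(e) == i}

def join(S, T):
    """S ⊔ T: keep elements labelled in S or T, dropping label conflicts."""
    out = {}
    for e in set(S) | set(T):
        labels = {x for x in (S.get(e), T.get(e)) if x is not None}
        if len(labels) == 1:
            out[e] = labels.pop()
    return out

def all_tuples(ground, k):
    """All assignments of labels 0..k (0 = unassigned), stored as dicts."""
    for labels in product(range(k + 1), repeat=len(ground)):
        yield {e: i for e, i in zip(ground, labels) if i != 0}

def is_k_submodular(f, ground, k):
    """Brute-force check of f(S) + f(T) >= f(S ⊓ T) + f(S ⊔ T)."""
    tuples = list(all_tuples(ground, k))
    return all(f(S) + f(T) >= f(meet(S, T)) + f(join(S, T))
               for S in tuples for T in tuples)

# Hypothetical instance with k = 2 labels and nonnegative weights.
ground, k = ["a", "b", "c"], 2
weight = {("a", 1): 3, ("a", 2): 1, ("b", 1): 0,
          ("b", 2): 2, ("c", 1): 1, ("c", 2): 1}
f = lambda S: sum(weight[(e, i)] for e, i in S.items())

print(is_k_submodular(f, ground, k))                       # True
print(is_k_submodular(lambda S: len(S) ** 2, ground, k))   # False
```

For example, with $S = (\{a\}, \emptyset)$ and $T = (\emptyset, \{a\})$ the labels of $a$ conflict, so $a$ is dropped from both $S \sqcap T$ and $S \sqcup T$.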

SLIDE 18

Examples of k-submodularity

  • Coupled feature selection
  • Sensor placement with k kinds of measures
  • Influence maximization with k topics
  • Variant of facility location
  • ….

[Figure: (a) Example placement]

Picture from: "Near-optimal Sensor Placements: Maximizing Information while Minimizing Communication Cost", A. Krause, A. Gupta, C. Guestrin, J. Kleinberg.

Picture from: "On Bisubmodular Maximization", A. P. Singh, A. Guillory, J. Bilmes.

SLIDE 19

A toy example

Objective: allocate at most $B \le m$ ad slots to ad agencies so as to maximize the number of influenced users.

  • $G_1$: influence graph of ad agency 1.
  • $G_2$: influence graph of ad agency 2.
  • ⋮
  • $G_k$: influence graph of ad agency $k$.

[Figure: each influence graph connects the ad slots $v_1, v_2, v_3, \ldots, v_m$ to the users $u_1, u_2, u_3, \ldots, u_n$.]

Edges incident to a user $u_i$ in $G_1, \ldots, G_k$ are sensitive data about $u_i$.

SLIDE 20

Our contributions

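|         | non-private | previous result | our result |
|---------|-------------|-----------------|------------|
| utility | $\frac{1}{2}\,\mathrm{OPT}$ | ✗ | $\frac{1}{2}\,\mathrm{OPT} - O\left(\frac{\Delta \cdot r(M) \cdot \ln(\lvert E \rvert)}{\epsilon}\right)$ |
| privacy | ✗ | ✗ | $\epsilon \cdot r(M)$ |

  • Our algorithm is the first differentially private k-submodular maximization algorithm.
  • $\frac{1}{2}\,\mathrm{OPT}$ is asymptotically tight assuming P ≠ NP.
  • Our algorithm uses an almost linear number of function evaluations, i.e., $O(k \cdot |E| \cdot \ln(r(M)))$.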

SLIDE 21

Thanks!

SLIDE 22

Definition of submodular function

A function $f : 2^E \to \mathbb{R}$ is submodular if

  • for all $A \subseteq B \subseteq E$,
  • and all elements $e \in E \setminus B$, we have

$$f(A \cup \{e\}) - f(A) \ge f(B \cup \{e\}) - f(B)$$

Applications

  • Viral marketing
  • Information gathering
  • Feature selection for classification
  • Influence maximization in social networks
  • Document summarization, …

What is our objective?

We need an optimization method such that

  • It returns an almost optimal solution
  • It is efficient and fast
  • It preserves individuals' privacy when we have sensitive data: medical data, web search data, social networks

Differential privacy

A rigorous notion of privacy that allows statistical analysis of sensitive data while providing strong privacy guarantees.

Result 1

We present a differentially private algorithm for submodular maximization and:

  • Prove that our algorithm returns a solution with quality at least $(1 - \frac{1}{e})\,\mathrm{OPT}$, up to a small additive error
  • Prove that our algorithm preserves privacy
  • Improve the number of function evaluations via a sampling technique while still preserving privacy

Result 2 (generalization of submodularity)

We present the first differentially private algorithm for k-submodular maximization and:

  • Prove that our algorithm returns a solution with quality at least $\frac{1}{2}\,\mathrm{OPT}$, up to a small additive error
  • Prove that our algorithm preserves privacy
  • Reduce the number of function evaluations to almost linear by a sampling technique while preserving privacy