m-Invariance and Dynamic Datasets based on: Xiaokui Xiao, Yufei Tao - PowerPoint PPT Presentation

m-Invariance and Dynamic Datasets
based on: Xiaokui Xiao, Yufei Tao, "m-Invariance: Towards Privacy Preserving Re-publication of Dynamic Datasets"
Slawomir Goryczka

Panta rhei (Heraclitus): "everything is in a state of flux"
● To provide the most recent anonymized data, the publisher needs to re-publish it
● Most current approaches do not consider this!
● Exception (supports only insertions of data):
– J.-W. Byun, Y. Sohn, E. Bertino, and N. Li, "Secure anonymization for incremental datasets" (2006)
● Where is the problem?

Maybe it's simple? We just need to ensure that:
● The dataset is not published too often (movie effect)
● We use a different algorithm for each dataset snapshot ("white" noise instead of the movie effect, but it may be used to identify part of the data!)
● We adjust the data to keep similar statistics of attribute values. But what about long-term trends, e.g. a flu pandemic, which change the global and local statistics of the data?

Deletion of tuples
● Deletion of data may introduce critical absence: by comparing the snapshots, an adversary can infer that Bob has dyspepsia
● Solution(?): ignore deletions
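The inference behind critical absence can be sketched as a set intersection over the sensitive values of the QI groups that contain Bob in each snapshot. This is a minimal illustration, not the figure from the slides: the function name `candidate_values` and every disease other than dyspepsia are assumptions made up for the example.

```python
def candidate_values(group_value_sets):
    """Intersect the sensitive-value sets of all QI groups the victim appeared in."""
    candidates = set(group_value_sets[0])
    for values in group_value_sets[1:]:
        candidates &= set(values)
    return candidates


# Hypothetical data: sensitive values of Bob's QI group in two snapshots.
# Between the snapshots, the tuple carrying "bronchitis" was deleted.
snapshot_1_group = {"dyspepsia", "bronchitis"}
snapshot_2_group = {"dyspepsia", "gastritis"}

print(candidate_values([snapshot_1_group, snapshot_2_group]))  # {'dyspepsia'}
```

Only one value survives the intersection, so the two releases together disclose Bob's disease even though each release alone hides it among two values.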

Counterfeit generalization
● Add some counterfeit tuples to avoid critical absence
● Publish the number and location of these tuples (utility)

Counterfeit generalization (continued)
● The key to preserving privacy is to ensure a certain invariance in all quasi-identifier (QI) groups that a tuple (here: Bob's tuple) is generalized to in different snapshots
● Existing generalization schemes are special cases of counterfeited generalization in which there are no counterfeits
● Goal: minimize the number of counterfeit tuples while ensuring privacy across all snapshots. How?

m-Invariance
m-unique: each QI group in the anonymized table T*(j) contains ≥ m tuples, all with different sensitive values
m-invariant:
● T*(j) is m-unique for all 1 ≤ j ≤ n
● For each tuple t, in every data snapshot where this tuple appears, its generalized QI group has the same set of distinct sensitive values
(The set of distinct sensitive values of a tuple's generalized QI group is constant, so there are no problems with critical absence, but each tuple has a limited number of generalized QI groups it can belong to)
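The m-unique condition is easy to check mechanically. The sketch below assumes a snapshot is represented as a list of QI groups, each given as the list of its tuples' sensitive values; that representation and the name `is_m_unique` are illustrative choices, not part of the slides.

```python
def is_m_unique(groups, m):
    """Return True if every QI group has >= m tuples with pairwise distinct sensitive values."""
    return all(
        len(values) >= m and len(set(values)) == len(values)
        for values in groups
    )


# Hypothetical snapshot: each inner list holds the sensitive values of one QI group.
snapshot = [
    ["dyspepsia", "bronchitis"],
    ["flu", "gastritis", "dyspepsia"],
]

print(is_m_unique(snapshot, 2))  # True
print(is_m_unique(snapshot, 3))  # False: the first group has only 2 tuples
```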

Privacy disclosure risk
● Privacy disclosure risk for tuple t: risk(t) = n_is(t) / n_rs
– n_is(t): number of reasonable surjective functions that correctly reconstruct t
– n_rs: number of all reasonable surjections

m-Invariance (properties)
● If {T*(1), ..., T*(n)} is m-invariant, then risk(t) ≤ 1/m for every tuple t appearing in any snapshot T(j), 1 ≤ j ≤ n
● If {T*(1), ..., T*(n-1)} is m-invariant, then {T*(1), ..., T*(n)} is also m-invariant if and only if:
– T*(n) is m-unique
– For any tuple t ∈ T(n-1) ∩ T(n), its generalized QI groups in snapshots T*(n-1) and T*(n) have the same signature (set of distinct sensitive values).
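The second property means a new release only has to be compared with the previous one. A minimal sketch of that check follows, assuming each snapshot is a list of QI groups and each group is a dict from a tuple id to its sensitive value; the names `tuple_signatures` and `may_publish`, the ids, and the data are all hypothetical.

```python
def tuple_signatures(snapshot):
    """Map each tuple id to the set of distinct sensitive values in its QI group."""
    result = {}
    for group in snapshot:
        signature = frozenset(group.values())
        for tuple_id in group:
            result[tuple_id] = signature
    return result


def may_publish(previous, candidate, m):
    """Check the two conditions for extending an m-invariant sequence by `candidate`."""
    m_unique = all(
        len(group) >= m and len(set(group.values())) == len(group)
        for group in candidate
    )
    if not m_unique:
        return False
    old, new = tuple_signatures(previous), tuple_signatures(candidate)
    # Only tuples present in both snapshots are constrained.
    return all(old[t] == new[t] for t in old.keys() & new.keys())


previous = [{"bob": "dyspepsia", "alice": "bronchitis"}]
# Alice was deleted; a counterfeit "c1" keeps Bob's signature unchanged.
good = [{"bob": "dyspepsia", "c1": "bronchitis"}]
# Without the counterfeit, Bob lands in a group with a different signature.
bad = [{"bob": "dyspepsia", "dave": "gastritis"}]

print(may_publish(previous, good, 2))  # True
print(may_publish(previous, bad, 2))   # False
```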

m-Invariant algorithm
● The n-th publication is allowed only if T(n) − T(n-1) is m-eligible, that is, at most 1/m of the tuples in T(n) − T(n-1) have an identical sensitive value
● Algorithm (4 phases):
1. Division
2. Balancing
3. Assignment
4. Split
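The m-eligibility test on the newly inserted tuples can be sketched as a frequency check. The function name `is_m_eligible` and the sample values are assumptions; treating an empty set of insertions as eligible is also a choice made here, not something the slides state.

```python
from collections import Counter


def is_m_eligible(sensitive_values, m):
    """Return True if at most 1/m of the tuples share any single sensitive value."""
    if not sensitive_values:
        return True  # assumption: nothing inserted means nothing to violate
    most_frequent = Counter(sensitive_values).most_common(1)[0][1]
    # Integer form of most_frequent / len(values) <= 1 / m, avoiding float rounding.
    return most_frequent * m <= len(sensitive_values)


# Hypothetical sensitive values of the tuples in T(n) - T(n-1).
inserted = ["flu", "flu", "gastritis", "dyspepsia"]

print(is_m_eligible(inserted, 2))  # True: "flu" covers 2 of 4 tuples
print(is_m_eligible(inserted, 3))  # False: 2 of 4 exceeds 1/3
```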

m-Invariant algorithm (continued)
● Division – group tuples common to T*(n-1) and T(n) that have the same signature into one bucket
● Balancing – balance the number of tuples in the buckets, using counterfeits if necessary (counterfeits have no values for QI attributes)
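The two phases above can be sketched as follows. This is a simplified illustration under stated assumptions: a bucket is taken to be balanced when every value of its signature occurs equally often, only counterfeits are used to fill the gaps, and the names `divide`, `balance_with_counterfeits` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def divide(prev_signatures, current_tuples):
    """Bucket the tuples that survive from the previous snapshot by their old signature."""
    buckets = defaultdict(list)
    for tuple_id, value in current_tuples.items():
        if tuple_id in prev_signatures:  # newly inserted tuples are handled later
            buckets[prev_signatures[tuple_id]].append((tuple_id, value))
    return dict(buckets)


def balance_with_counterfeits(bucket, signature):
    """Pad a non-empty bucket so every sensitive value in the signature occurs equally often."""
    counts = Counter(value for _, value in bucket)
    target = max(counts.values())
    balanced = list(bucket)
    for value in sorted(signature):
        for i in range(target - counts.get(value, 0)):
            # Counterfeits carry a sensitive value but no QI attribute values.
            balanced.append((f"counterfeit-{value}-{i}", value))
    return balanced


sig_a = frozenset({"dyspepsia", "bronchitis"})
sig_b = frozenset({"flu", "gastritis"})
prev_signatures = {"bob": sig_a, "alice": sig_a, "carol": sig_b}
# Alice was deleted and Dave is new, so only Bob and Carol are divided.
current = {"bob": "dyspepsia", "carol": "flu", "dave": "flu"}

buckets = divide(prev_signatures, current)
print(balance_with_counterfeits(buckets[sig_a], sig_a))
```

For the bucket with Bob's signature, the sketch adds one counterfeit carrying "bronchitis", which restores the value that disappeared with Alice.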

m-Invariant algorithm (continued)
● Assignment – add the tuples that were not in T*(n-1) but are in T(n), using steps similar to Division and Balancing
● Split – split each bucket B into |B|/s generalized QI groups, where s (≥ m) is the number of values in the signature of B. Each group has s tuples, which take the s sensitive values in the signature, respectively.
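The Split phase can be sketched for a bucket that is already balanced. This simplified version pairs tuples in input order and ignores how close their QI values are, so it shows only the group structure, not a utility-aware split; `split_bucket` and the sample data are hypothetical.

```python
from collections import defaultdict


def split_bucket(bucket):
    """Split a balanced bucket into |B|/s groups, each with one tuple per sensitive value."""
    by_value = defaultdict(list)
    for tuple_id, value in bucket:
        by_value[value].append(tuple_id)
    sizes = {len(ids) for ids in by_value.values()}
    if len(sizes) != 1:
        raise ValueError("bucket is empty or not balanced")
    group_count = sizes.pop()  # equals |B| / s
    return [
        {ids[i]: value for value, ids in by_value.items()}
        for i in range(group_count)
    ]


# Hypothetical balanced bucket with signature {flu, gastritis}: |B| = 4, s = 2.
bucket = [("t1", "flu"), ("t2", "gastritis"), ("t3", "flu"), ("t4", "gastritis")]

for group in split_bucket(bucket):
    print(group)
```

Each resulting group contains exactly s tuples with s distinct sensitive values, so every tuple in the bucket keeps the bucket's signature.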

Experiments
● Datasets (Tooc, Tsal):
– 400k tuples (600k in total)
– Attributes: Age, Gender, Education, Birthplace, Occupation, Salary

Pros and cons
Pros:
● Incremental
● Small data disturbance
● High data utility (measured as the median relative error for queries)
● ...
Cons:
● Preserving current statistics of attribute values – what if they change?
● What about continuous attributes (numbers)?
● ...

Q & I*
* Ideas
