Full statistical analyses with secure multi-party computation



SLIDE 1

Full statistical analyses with secure multi-party computation

Dan Bogdanov, Liina Kamm, Ville Sokk
dan@cyber.ee
http://sharemind.cyber.ee/

SLIDE 2

The Sharemind model

  • Input parties IP1, ..., IPk hold the inputs x1, ..., xk.
  • Computing parties CP1, CP2, CP3 hold the shares: CPj stores x1j, ..., xkj and computes the result share yj.
  • Result parties RP1, ..., RPl receive the result y.

Step 1: secret sharing and storage of inputs

Step 2: secure multi-party computation

Step 3: reconstruction of results

SLIDE 3

Secret sharing (simplified)

Secret value: 75

Two random shares: 53 and 38

Third share: 75 - 53 - 38 = 84 mod 100

Reconstruction: 53 + 38 + 84 = 75 mod 100
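The arithmetic on this slide can be reproduced with a few lines of Python. This is only a minimal sketch of additive secret sharing: the function names `share` and `reconstruct` and the toy modulus of 100 are assumptions taken from the slide's example, not Sharemind's actual API, and a real deployment would use a much larger modulus.

```python
import secrets

MODULUS = 100  # toy modulus from the slide; for illustration only


def share(secret, n_parties=3, modulus=MODULUS):
    """Split `secret` into n additive shares that sum to it modulo `modulus`."""
    # The first n-1 shares are uniformly random.
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares sum to the secret.
    shares.append((secret - sum(shares)) % modulus)
    return shares


def reconstruct(shares, modulus=MODULUS):
    """Recover the secret by adding all shares modulo `modulus`."""
    return sum(shares) % modulus


# The slide's example: random shares 53 and 38 force the third share to be 84.
third = (75 - 53 - 38) % MODULUS
assert third == 84
assert reconstruct([53, 38, third]) == 75

# A fresh sharing gives different random shares but the same reconstruction.
assert reconstruct(share(75)) == 75
```

Any two of the three shares are uniformly distributed and reveal nothing about the secret on their own; only the sum of all three does.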

SLIDE 4

MPC from secret sharing

Parties: P1, P2, P3

Inputs: x1, x2, x3

Computation: (y1, y2, y3) = f(x1, x2, x3)

Outputs: y1, y2, y3

All operations are composable.
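As a rough illustration of composability, the sketch below evaluates f(x, y) = 3 * (x + y) directly on additive shares. The helper names (`add_shared`, `scale_shared`) and the ring size 2**32 are assumptions made for this example and are not taken from the slides. Only addition and multiplication by a public constant are shown, because these need no communication between the parties; multiplying two secret values requires an interactive protocol that is not sketched here.

```python
import secrets

MODULUS = 2**32  # illustrative ring size, an assumption


def share(secret, n_parties=3, modulus=MODULUS):
    """Split `secret` into additive shares modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares


def reconstruct(shares, modulus=MODULUS):
    """Add all shares to recover the secret."""
    return sum(shares) % modulus


def add_shared(a_shares, b_shares, modulus=MODULUS):
    """Each party adds its own two shares locally."""
    return [(a + b) % modulus for a, b in zip(a_shares, b_shares)]


def scale_shared(shares, public_constant, modulus=MODULUS):
    """Each party multiplies its own share by a public constant."""
    return [(s * public_constant) % modulus for s in shares]


# Two secret inputs, each shared among the parties P1, P2, P3.
x = share(20)
y = share(22)

# The output of one operation is again a valid sharing,
# so it can be fed straight into the next operation.
z = scale_shared(add_shared(x, y), 3)
result = reconstruct(z)
print(result)  # 126
```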

SLIDE 5

Strengths / weaknesses

Strengths:

  • Easy to write code for. Developers apply privacy patterns on classical algorithms.
  • Hybrid execution model for balancing public and private computations.
  • Very high performance for arithmetic circuits.
  • Small storage overhead (3 times for 3 servers).

Weaknesses:

  • Requires three servers for best possible efficiency (works with 2 to n servers as well).
  • Performance profile not immediately intuitive.
  • Custom protocols may perform better in some cases.

SLIDE 6

Genome data and MPC

[Figure: Secure genome-wide association study workflow]

Workflow stages: data acquisition (genotype & phenotype), secure coding and storage (securely stored genotype & phenotype), case & control determination (case & control group index), secure statistical testing (SNP p<0.1), results of the study.

Data acquisition and secure storage:

  • Scenario 1: secure 23andMe. Wetlab and survey data, genotype (GATGAG…) and phenotype (age, diseases, ...), go to secure storage and processing.
  • Scenario 2: international consortium study. Gene bank 1 (donors D11, …, D1m) through gene bank n (donors Dn1, …, Dnm) send genotype/phenotype data to secure storage and processing.

Determining cases and controls:

  • Scenario 1: extended clinical study. The research institution provides a case/control index vector (based on available phenotypes).
  • Scenario 2: phenotype-based filtering. The research institution sends a filtering query on securely stored phenotypes, using the available phenotype information; the result is a securely computed case/control index vector.
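To make the statistical testing stage concrete, here is a plaintext sketch of a chi-square test on the allele counts of one SNP in cases and controls. This is an assumed choice of test for illustration (the slides do not say which test the figure refers to), it runs on open data rather than on secret-shared values, and the counts are hypothetical. The 0.1 threshold is the one shown in the workflow figure.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].

    Rows: cases, controls. Columns: counts of the two alleles of one SNP.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # degenerate table: a row or column is empty
    return n * (a * d - b * c) ** 2 / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical allele counts for one SNP (not data from the study).
cases = (10, 20)
controls = (20, 10)

chi2 = chi_square_2x2(*cases, *controls)
p = p_value_1df(chi2)
significant = p < 0.1  # threshold shown in the workflow figure
```

In a secure setting the counts, the statistic and the comparison with the threshold would all be computed on shares, and only the final result would be reconstructed.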

SLIDE 7

Application development

Inputs: description of the data analysis task, business logic, data model, UX requirements.

  • Secure application servers: Application Server package, SecreC language
  • End users (data owners, analysts etc.): end user applications, Controller library

SLIDE 8

Our competition entry

  • Task 2.1
      • Importer (C++/SecreC), ~200 lines of code
      • Analyzer (C++/SecreC), ~200 lines of code
      • Secure operations used: secure integer arithmetic, floating point arithmetic, including division.
  • Task 2.2
      • Importer (C++/SecreC), ~200 lines of code
      • Analyzer (C++/SecreC), ~300 lines of code
      • Secure operations used: secure integer arithmetic, shuffling, AES.

SLIDE 9

The Rmind tool

Rmind

SLIDE 10

The Rmind tool

Rmind

SLIDE 11

Features of Rmind

  • Data import: CSV, anything with custom importers
  • Descriptive statistics: stdev, var, cov, quantiles, histogram, frequency plots, heatmap
  • Quality assurance: filtering, outlier removal with median absolute deviation
  • Transformations: sorting, merging, aggregation
  • Testing: t-test, chi-square, Cochran-Armitage, transmission disequilibrium, Wilcoxon, Mann-Whitney
  • Multiple testing: Bonferroni correction, Benjamini-Hochberg procedure

  • Regressions: linear, logistic
  • We are continuously implementing new functions.
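Two of the listed features can be sketched in plaintext Python: outlier removal with the median absolute deviation (MAD) and the two multiple testing corrections. These are textbook versions operating on open data, not Rmind's implementation or command syntax. The function names and the cutoff of 3 MADs are assumptions chosen for this example.

```python
import statistics


def remove_outliers_mad(values, threshold=3.0):
    """Keep values whose distance from the median is at most threshold * MAD."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)  # no spread to measure against
    return [v for v in values if abs(v - med) <= threshold * mad]


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical measurements with one obvious outlier.
cleaned = remove_outliers_mad([10, 11, 9, 10, 12, 10, 50])

# Hypothetical p-values from four tests.
p_values = [0.01, 0.04, 0.03, 0.005]
p_bonferroni = bonferroni(p_values)
p_bh = benjamini_hochberg(p_values)
```

For the sample p-values, Bonferroni is the stricter correction: every Benjamini-Hochberg adjusted value is less than or equal to its Bonferroni counterpart.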
SLIDE 12

Legal situation

  • In January 2014, the Estonian Data Protection Agency cleared the use of Sharemind/Rmind for education records of Estonian students.
  • In January 2015, the Estonian Tax and Customs Board cleared the use of Sharemind/Rmind for analyzing tax records of working students.
  • We also have experience in forming contracts with all associated parties under European law.
  • The EU PRACTICE project published a legal analysis of the technology from a European perspective.

http://practice-project.eu/downloads/publications/D31.1-Risk-assessment-legal-status-PU-M12.pdf

SLIDE 13

Literature

  1. [K15] Liina Kamm. Privacy-preserving statistical analysis using secure multi-party computation. PhD thesis. University of Tartu. 2015. http://hdl.handle.net/10062/45343
  2. [BKLS14] Dan Bogdanov, Liina Kamm, Sven Laur, Ville Sokk. Rmind: a tool for cryptographically secure statistical analysis. Cryptology ePrint Archive, Report 2014/512. 2014. http://eprint.iacr.org/2014/512.pdf
  3. [KBLV13] Liina Kamm, Dan Bogdanov, Sven Laur, Jaak Vilo. A new way to protect privacy in large-scale genome-wide association studies. Bioinformatics 29 (7): 886-893, 2013. http://bioinformatics.oxfordjournals.org/content/29/7/886
  4. [B13] Dan Bogdanov. Sharemind: programmable secure computations with practical applications. PhD thesis. University of Tartu. 2013. http://hdl.handle.net/10062/29041

SLIDE 14

Acknowledgments

"The PRACTICE project has received funding from the European Union's Seventh Framework Programme ([FP7/2007-2013]) under grant agreement number ICT-609611."

The information in this document is provided "as is", and no guarantee or warranty is given that the information is fit for any particular purpose. The user thereof uses the information at its sole risk and liability.

Our entry to the iDASH Privacy & Security Workshop Secure Genome Analysis Competition was prepared with support from

http://practice-project.eu/