SLIDE 1

Full accounting for verifiable outsourcing

Riad S. Wahby⋆, Ye Ji◦, Andrew J. Blumberg†, abhi shelat‡, Justin Thaler△, Michael Walfish◦, and Thomas Wies◦

⋆Stanford University ◦New York University †The University of Texas at Austin ‡Northeastern University △Georgetown University

July 6th, 2017

SLIDES 2-5

Probabilistic proofs enable outsourcing

A client sends a program and inputs to a server; the server returns the outputs plus a short proof.

Approach: the server's response includes a short proof of correctness. [Babai85, GMR85, BCC86, BFLS91, FGLSS91, ALMSS92, AS92, Kilian92, LFKN92, Shamir92, Micali00, BG02, BS05, GOS06, BGHSV06, IKO07, GKR08, KR09, GGP10, Groth10, GLR11, Lipmaa11, BCCT12, GGPR13, BCCT13, Thaler13, KRR14, . . .]

Goal: outsourcing should be less expensive than simply executing the computation.

Built systems: SBW11 CMT12 SMBW12 TRMP12 SVPBBW12 SBVBPW13 VSBW13 PGHR13 BCGTV13 BFRSBW13 BFR13 DFKP13 BCTV14a BCTV14b BCGGMTV14 FL14 KPPSST14 FTP14 WSRHBW15 BBFR15 CFHKNPZ15 CTV15 KZMQCPPsS15 D-LFKP16 NT16 ZGKPP17 . . .
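A classic toy illustration of this verify-cheaply-versus-execute asymmetry is Freivalds' check for matrix multiplication (it is not one of the protocols cited above): checking a claimed product takes O(n²) work per trial, while recomputing it takes O(n³).

```python
import random

random.seed(0)  # deterministic for this demo

def freivalds_check(A, B, C, trials=20):
    """Probabilistic check that C == A x B: O(n^2) work per trial,
    versus O(n^3) to recompute the product. A wrong C survives each
    trial with probability at most 1/2."""
    n = len(A)
    for _ in range(trials):
        r = [random.randint(0, 1) for _ in range(n)]
        Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
        ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
        Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
        if ABr != Cr:
            return False  # certainly wrong
    return True  # almost certainly right

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert freivalds_check(A, B, [[19, 22], [43, 50]])      # correct product
assert not freivalds_check(A, B, [[19, 22], [43, 51]])  # off by one
```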

SLIDES 6-11

Do systems achieve this goal?

Verifier: can check the proof cheaply (asymptotically).
Prover: has massive overhead (≈10,000,000×).
Precomputation: proportional to computation size.

How do systems handle these costs?
Precomputation: amortize it over many instances.
Prover: assume each of its operations is >10^8× cheaper than the verifier's.

SLIDES 12-16

Our contribution

Giraffe: the first system to consider all costs and win.

In Giraffe, P really is 10^8× cheaper (per operation) than V! (Setting: building trustworthy hardware.)

Giraffe extends Zebra [WHGsW, Oakland16] with:

  • an asymptotically optimal proof protocol that improves on prior work [Thaler, CRYPTO13]
  • a compiler that generates optimized hardware designs from a subset of C

Bottom line: Giraffe makes outsourcing worthwhile (. . . sometimes).

SLIDE 17

Roadmap

  • 1. Verifiable ASICs
  • 2. Giraffe: a high-level view
  • 3. Evaluation
SLIDES 19-23

How can we build trustworthy hardware?

Example: a custom firewall chip for network packet processing whose manufacture we outsource to a third party.

Untrusted manufacturers can craft hardware Trojans. What if the chip's manufacturer inserts a back door?

Threat: incorrect execution of the packet filter. (Other concerns, e.g., secret state, are important but orthogonal.)

Today, the US DoD controls its supply chain with trusted foundries.

SLIDES 24-27

Trusted fabs are the only way to get strong guarantees

For example, stealthy Trojans can thwart post-fab detection [A2: Analog Malicious Hardware, Yang et al., Oakland16; Stealthy Dopant-Level Trojans, Becker et al., CHES13].

But trusted fabrication is not a panacea:

✗ Only 5 countries have cutting-edge fabs on-shore.
✗ Building a new fab takes $$$$$$ and years of R&D.
✗ Semiconductor scaling: chip area and energy grow with the square and cube, respectively, of transistor length ("critical dimension").
✗ So using an old fab means an enormous performance hit; e.g., India's best on-shore fab is 10^8× behind the state of the art.

Idea: outsource computations to untrusted chips.
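To make the scaling penalty concrete, here is a rough sketch under the model stated above (area ∝ L², energy ∝ L³ in the critical dimension L), applied to the 350 nm and 7 nm nodes used later in the evaluation. The power-law model is the slide's simplification; real process scaling is messier.

```python
# The slide's stated scaling model: chip area ~ L^2 and energy ~ L^3,
# where L is the transistor critical dimension. Illustrative only;
# real process scaling does not follow a clean power law.
def area_penalty(old_nm, new_nm):
    return (old_nm / new_nm) ** 2

def energy_penalty(old_nm, new_nm):
    return (old_nm / new_nm) ** 3

# Nodes used in the evaluation: trusted fab = 350 nm, untrusted fab = 7 nm.
assert area_penalty(350, 7) == 2500.0       # 50^2
assert energy_penalty(350, 7) == 125000.0   # 50^3
```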

SLIDES 28-32

Verifiable ASICs [WHGsW16]

A principal has a computation F and produces designs for two chips, P and V. An untrusted fab (fast) builds P; a trusted fab (slow) builds V. An integrator combines the two into one system: V receives the input x and returns the output y, while P supplies y together with a proof that y = F(x).
SLIDES 33-39

Can Verifiable ASICs be practical?

(Compare: V plus P, where P sends y and a proof that y = F(x), versus natively executing F on a trusted chip.)

V overhead: checking the proof is cheap.
P overhead: high compared to the cost of F... but P uses an advanced circuit technology. Prior work: V + P < F.
Precomputation: proportional to the cost of F. Prior work assumes this away; in fact, V + P + Precomp > F.
Our goal: V + P + Precomp < F.
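The accounting above amortizes precomputation over a batch of instances. A toy model of that inequality, with entirely hypothetical cost numbers (none of these figures come from the paper):

```python
# A toy cost-accounting model for "V + P + Precomp < F". All numbers
# below are hypothetical placeholders, not measurements from the paper.
def outsourcing_wins(cost_F, cost_V, cost_P, precomp, batch_size):
    """True iff amortized verifiable outsourcing beats native execution:
    V + P + Precomp/N < F, with precomputation amortized over N instances."""
    return cost_V + cost_P + precomp / batch_size < cost_F

# With heavy precomputation, small batches lose and large batches win.
assert not outsourcing_wins(cost_F=1.0, cost_V=0.1, cost_P=0.3,
                            precomp=100.0, batch_size=10)
assert outsourcing_wins(cost_F=1.0, cost_V=0.1, cost_P=0.3,
                        precomp=100.0, batch_size=1000)
```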

SLIDE 40

Roadmap

  • 1. Verifiable ASICs
  • 2. Giraffe: a high-level view
  • 3. Evaluation
SLIDES 41-45

Evolution of Giraffe's back-end

GKR08: the base protocol.
CMT12: reduces P and precomputation costs for all circuits.
Thaler13: reduces precomputation for structured circuits.
Giraffe: reduces P cost for structured circuits (plus optimizations for V; see paper).

Let's take a high-level look at how these optimizations work. (The following all use a nice simplification [Thaler15].)

SLIDES 46-49

GKR08 (a quick reminder)

(Setting: a layered arithmetic circuit of depth d with G gates per layer.)

For each layer of an arithmetic circuit, P and V engage in a sum-check protocol. In the first round, P computes (for q ∈ F^{log G}):

$$\sum_{h_0 \in \{0,1\}^{\log G}} \; \sum_{h_1 \in \{0,1\}^{\log G}} \left[ \widetilde{\mathrm{add}}(q, h_0, h_1)\bigl(\widetilde{V}(h_0) + \widetilde{V}(h_1)\bigr) + \widetilde{\mathrm{mul}}(q, h_0, h_1)\,\widetilde{V}(h_0)\cdot\widetilde{V}(h_1) \right]$$

This sum has 2^{2 log G} = G^2 terms. In total, P's work is O(poly(G)).

Precomputation is one evaluation each of $\widetilde{\mathrm{add}}$ and $\widetilde{\mathrm{mul}}$, costing O(poly(G)).
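For intuition, here is a minimal, self-contained sketch of the sum-check protocol itself over a prime field, with the prover and verifier folded into one loop. It is the generic protocol, not the layer-by-layer GKR instantiation above; the names `sumcheck`, `lagrange_eval`, and the toy polynomial are illustrative.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime; all arithmetic is in GF(P)

def boolean_points(n):
    """All points of the boolean cube {0,1}^n."""
    for i in range(2 ** n):
        yield tuple((i >> k) & 1 for k in range(n))

def lagrange_eval(ys, x):
    """Evaluate the polynomial with values ys at points 0..len(ys)-1, at x."""
    total = 0
    for i, yi in enumerate(ys):
        num, den = 1, 1
        for j in range(len(ys)):
            if j != i:
                num = num * ((x - j) % P) % P
                den = den * ((i - j) % P) % P
        total = (total + yi * num % P * pow(den, P - 2, P)) % P
    return total

def sumcheck(g, n, deg):
    """Run sum-check for claim = sum of g over {0,1}^n; deg bounds g's
    degree in each variable. Returns True iff the honest prover's
    messages are accepted."""
    claim = sum(g(*pt) for pt in boolean_points(n)) % P
    r = []  # the verifier's random challenges so far
    for rnd in range(n):
        # Prover: the univariate restriction h(X), sent as deg+1 evaluations.
        ys = []
        for x in range(deg + 1):
            s = 0
            for tail in boolean_points(n - rnd - 1):
                s = (s + g(*r, x, *tail)) % P
            ys.append(s)
        # Verifier: check h(0) + h(1) against the running claim.
        if (ys[0] + ys[1]) % P != claim:
            return False
        rj = random.randrange(P)
        claim = lagrange_eval(ys, rj)
        r.append(rj)
    # Final check: a single oracle query to g at the random point.
    return g(*r) % P == claim

# Toy polynomial with per-variable degree 1: g(a, b, c) = a*b + 2*c.
g = lambda a, b, c: (a * b + 2 * c) % P
assert sumcheck(g, 3, deg=1)
```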

SLIDES 50-54

CMT12: from polynomial to quasilinear

add(g_O, g_L, g_R) = 0 except when g_O is an addition gate with inputs g_L and g_R. (Figure: an example circuit with numbered gates, in which add(3, 2, 3) = 1 and add(· · ·) = 0 otherwise.)

This means we can rewrite P's sum in the first round as:

$$\sum_{(h_0,h_1)\in S_{\mathrm{add}}} \widetilde{\mathrm{add}}(q,h_0,h_1)\bigl(\widetilde V(h_0)+\widetilde V(h_1)\bigr) \;+\; \sum_{(h_0,h_1)\in S_{\mathrm{mul}}} \widetilde{\mathrm{mul}}(q,h_0,h_1)\,\widetilde V(h_0)\cdot\widetilde V(h_1)$$

G terms per round for 2 log G rounds: P's work is O(G log G).

Using a related trick, precomputing $\widetilde{\mathrm{add}}$ and $\widetilde{\mathrm{mul}}$ costs O(G) in total.
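To see why sparsity helps, here is a small sketch (not code from Giraffe): the multilinear extension of a wiring predicate can be evaluated by summing only over the gates that exist, instead of over every boolean triple. The helper `eq_mle` and the toy wiring are illustrative assumptions.

```python
import itertools, random

P = (1 << 61) - 1  # prime field modulus

def eq_mle(xs, ys):
    """Multilinear extension of equality: 1 on matching boolean points."""
    out = 1
    for x, y in zip(xs, ys):
        out = out * ((x * y + (1 - x) * (1 - y)) % P) % P
    return out

def add_dense(S_add, b, q, h0, h1):
    """Evaluate add~ by summing over all 2^(3b) boolean triples."""
    total = 0
    for trio in itertools.product([0, 1], repeat=3 * b):
        g, l, r = trio[:b], trio[b:2 * b], trio[2 * b:]
        if (g, l, r) in S_add:  # add(g, l, r) = 1 only on real wiring
            total = (total + eq_mle(q, g) * eq_mle(h0, l) % P * eq_mle(h1, r)) % P
    return total

def add_sparse(S_add, q, h0, h1):
    """Same value, but one term per gate in S_add."""
    total = 0
    for g, l, r in S_add:
        total = (total + eq_mle(q, g) * eq_mle(h0, l) % P * eq_mle(h1, r)) % P
    return total

# Toy wiring over b = 2 bits per index: two addition gates.
S_add = {((0, 0), (0, 1), (1, 0)), ((1, 1), (0, 0), (1, 1))}
rnd = lambda b: tuple(random.randrange(P) for _ in range(b))
q, h0, h1 = rnd(2), rnd(2), rnd(2)
assert add_dense(S_add, 2, q, h0, h1) == add_sparse(S_add, q, h0, h1)
```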

SLIDES 55-62

Thaler13: more structure, less precomputation

(Setting: a depth-d circuit composed of N identical subcircuits, each with G gates per layer.)

Idea: for a batch of identical subcircuits, $\widetilde{\mathrm{add}}$ and $\widetilde{\mathrm{mul}}$ can be "small": the wiring predicates do not depend on the subcircuit number, so, e.g., add(3, 2, 3) = 1 in every copy.

➔ Precomputation costs O(G), amortized over N copies!

Now P's sum in the first round is (for q′ ∈ F^{log N}):

$$\sum_{(h_0,h_1)\in S_{\mathrm{add}}} \widetilde{\mathrm{add}}(q,h_0,h_1) \sum_{h'\in\{0,1\}^{\log N}} \widetilde{\mathrm{eq}}(q',h')\bigl(\widetilde V(h',h_0)+\widetilde V(h',h_1)\bigr) \;+\; \sum_{(h_0,h_1)\in S_{\mathrm{mul}}} \widetilde{\mathrm{mul}}(q,h_0,h_1) \sum_{h'\in\{0,1\}^{\log N}} \widetilde{\mathrm{eq}}(q',h')\,\widetilde V(h',h_0)\cdot\widetilde V(h',h_1)$$

where eq(x, y) = 1 iff x = y. In words: for each gate, sum over each subcircuit.

NG terms per round in the first 2 log G rounds: P's work is Ω(NG log G).
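The reason the $\widetilde{\mathrm{eq}}$-weighted inner sum is the right gadget: summing any table of values against $\widetilde{\mathrm{eq}}(q', \cdot)$ over the boolean cube yields that table's multilinear extension at q′, and at boolean q′ it simply selects one entry. A small sketch of that identity (helper names are illustrative):

```python
import itertools, random

P = (1 << 61) - 1  # prime field modulus

def eq_mle(xs, ys):
    """Multilinear extension of equality: 1 on matching boolean points."""
    out = 1
    for x, y in zip(xs, ys):
        out = out * ((x * y + (1 - x) * (1 - y)) % P) % P
    return out

def mle_eval(table, point):
    """Evaluate the multilinear extension of `table` (indexed by boolean
    tuples) at an arbitrary field point, via the eq-weighted sum."""
    m = len(point)
    total = 0
    for h in itertools.product([0, 1], repeat=m):
        total = (total + eq_mle(point, h) * table[h]) % P
    return total

# A table over {0,1}^2, e.g., per-subcircuit values V(h').
table = {h: random.randrange(P) for h in itertools.product([0, 1], repeat=2)}
# At boolean points, the extension agrees with the table entry it selects.
for h in table:
    assert mle_eval(table, h) == table[h] % P
```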
SLIDES 63-71

Giraffe: leveraging structure to reduce P costs

Idea: arrange for the copies to "collapse" during the sum-check protocol. Rewriting the prior sum and changing the sum-check order:

$$\sum_{h'\in\{0,1\}^{\log N}} \widetilde{\mathrm{eq}}(q',h') \sum_{(h_0,h_1)\in S_{\mathrm{add}}} \widetilde{\mathrm{add}}(q,h_0,h_1)\bigl(\widetilde V(h',h_0)+\widetilde V(h',h_1)\bigr) \;+\; \sum_{h'\in\{0,1\}^{\log N}} \widetilde{\mathrm{eq}}(q',h') \sum_{(h_0,h_1)\in S_{\mathrm{mul}}} \widetilde{\mathrm{mul}}(q,h_0,h_1)\,\widetilde V(h',h_0)\cdot\widetilde V(h',h_1)$$

In words: for each subcircuit, sum over each gate. Because the sum-check now binds h′ first, the outer sum halves each round: in round 1, h′ ranges over {0,1}^{log N}; in round 2, over {0,1}^{log N−1}; in round 3, over {0,1}^{log N−2}; and so on.

P does (N + N/2 + N/4 + · · ·) · G + 2G log G = O(NG + G log G) work.

➔ Linear in the size of the computation when N > log G!
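A quick term-count comparison of the two orderings, for an example batch size and subcircuit width. This is illustrative accounting based on the per-round counts stated above, not code from Giraffe.

```python
import math

def thaler13_terms(N, G):
    # NG terms per round in the first 2*log2(G) rounds.
    return N * G * 2 * int(math.log2(G))

def giraffe_terms(N, G):
    # The h' sum halves each round: (N + N/2 + ... + 1) * G terms,
    # followed by 2*G*log2(G) terms for the per-gate rounds.
    halving = sum(N >> k for k in range(int(math.log2(N)) + 1))
    return halving * G + 2 * G * int(math.log2(G))

N, G = 1 << 10, 1 << 8  # N = 1024 copies, G = 256 gates per layer
assert giraffe_terms(N, G) == 528128           # ~2NG, linear in NG
assert giraffe_terms(N, G) < thaler13_terms(N, G)
```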

SLIDE 72

Roadmap

  • 1. Verifiable ASICs
  • 2. Giraffe: a high-level view
  • 3. Evaluation
SLIDES 73-75

Implementation

Giraffe is an end-to-end hardware generator:

  • a hardware design template: given a computation and chip parameters (technology, size, . . . ), it produces optimized hardware designs for P and V
  • a compiler for a subset of C: it produces the representation used by the design template

SLIDES 76-77

Evaluation questions

How does Giraffe perform on real-world computations?

  • 1. Curve25519 point multiplication
  • 2. Image matching

Goal: the total cost of V, P, and precomputation should be less than building F on a trusted chip.

SLIDES 78-81

Evaluation method

Baselines: Zebra; an implementation of F in the same technology as V.
Metric: total energy consumption.
Measurements: based on circuit synthesis and simulation, published chip designs, and CMOS scaling models. We charge for V, P, and communication; precomputation; and the PRNG.
Constraints: trusted fab = 350 nm; untrusted fab = 7 nm; 200 mm² max chip area; 150 W max total power.

(350 nm: 1997, the Pentium II era. 7 nm: ≈2018. That is a ≈20-year gap between trusted and untrusted fabs.)

SLIDES 82-83

Application #1: Curve25519 point multiplication

Curve25519: a commonly used elliptic curve. Point multiplication: a primitive, e.g., for ECDH.

(Figure: total energy cost in Joules, 0.01-100 on a log scale, lower is better, vs. log₂ N, the number of copies of the subcircuit, for Native, Giraffe, and Zebra.)

SLIDES 84-85

Application #2: Image matching

Image matching via the Fast Fourier Transform: a C implementation, compiled by Giraffe's front-end to V and P hardware designs, with no hand tweaking!

(Figure: total energy cost in Joules, 0.01-100 on a log scale, lower is better, vs. log₂ N, the number of copies of the subcircuit, for Native and Giraffe.)

SLIDES 86-92

Recap: is it practical?

✗ Giraffe is restricted to batched computations. To mitigate this, Giraffe's front-end includes two static analysis passes: slicing extracts only the parts of programs that can be efficiently outsourced, and squashing extracts batch-parallelism from serial computations.

✓ Giraffe's proof protocol and optimizations save orders of magnitude compared to prior work.

✓ Giraffe is the first system in the literature to account for all costs, and win.

Giraffe is a step, but much work remains!

https://giraffe.crypto.fyi
http://www.pepper-project.org