Low-Depth, Low-Size Circuits for Cryptographic Applications

SLIDE 1

Low-Depth, Low-Size Circuits for Cryptographic Applications

Joan Boyar* (1), Magnus Gausdal Find (2), René Peralta (2)

(1) University of Southern Denmark
(2) National Institute of Standards and Technology, USA

BFA 2017

Boyar, Find, Peralta Heuristic: Low-Depth, Low-Size Circuits BFA 2017 1 / 22

SLIDE 4

Circuits over GF(2)

Gate symbols: AND gates (× or ∧), XOR gates (+), XNOR gates (#).

[Figure: two circuits for MAJ(a,b,c).]

Both circuits compute the predicate MAJ(a,b,c) in size 4 and depth 3.
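The slide's two circuit drawings are not recoverable from this transcript. As a minimal sketch, here are two standard MAJ circuits that match the stated size 4 and depth 3; whether they are the ones drawn on the slide is an assumption. The wire names and the helper functions `evaluate`, `depth` and `and_count` are illustrative only.

```python
from itertools import product

# Each gate is (output wire, gate type, input wire, input wire).
# Assumed example 1: MAJ = ((a + b) AND (a + c)) + a, using one AND gate.
MAJ_ONE_AND = [
    ("t1", "XOR", "a", "b"),
    ("t2", "XOR", "a", "c"),
    ("t3", "AND", "t1", "t2"),
    ("out", "XOR", "t3", "a"),
]

# Assumed example 2: MAJ = (a AND b) + ((a + b) AND c), using two AND gates.
MAJ_TWO_ANDS = [
    ("t1", "AND", "a", "b"),
    ("t2", "XOR", "a", "b"),
    ("t3", "AND", "t2", "c"),
    ("out", "XOR", "t1", "t3"),
]

OPS = {
    "AND": lambda x, y: x & y,
    "XOR": lambda x, y: x ^ y,
    "XNOR": lambda x, y: 1 ^ x ^ y,
}


def evaluate(circuit, inputs):
    """Evaluate a straight-line circuit on a dict of input bits."""
    wires = dict(inputs)
    for out, op, x, y in circuit:
        wires[out] = OPS[op](wires[x], wires[y])
    return wires["out"]


def depth(circuit):
    """Length of the longest input-to-output path (inputs have depth 0)."""
    d = {}
    for out, _, x, y in circuit:
        d[out] = 1 + max(d.get(x, 0), d.get(y, 0))
    return d["out"]


def and_count(circuit):
    """Number of AND gates, i.e. the multiplicative size of this circuit."""
    return sum(1 for _, op, _, _ in circuit if op == "AND")


for circuit in (MAJ_ONE_AND, MAJ_TWO_ANDS):
    for a, b, c in product((0, 1), repeat=3):
        assert evaluate(circuit, {"a": a, "b": b, "c": c}) == int(a + b + c >= 2)
    print(len(circuit), depth(circuit), and_count(circuit))
```

Both circuits have the same size and depth but differ in the number of AND gates, which is the quantity the later slides call multiplicative complexity.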

SLIDE 6

Boolean Circuit Complexity

The (Boolean) circuit complexity of a function f is the number of gates necessary and sufficient to compute f.

Shannon-Lupanov bound: the circuit complexity of a predicate on n bits is about 2^n / n almost everywhere.

SLIDE 9

Multiplicative Complexity

The multiplicative complexity of a function f is the number of multiplications (ANDs) necessary and sufficient to compute f (over the basis AND, XOR, XNOR).

Almost all Boolean predicates on n bits have multiplicative complexity close to 2^(n/2) (i.e. about the square root of the total number of gates needed). [B., Peralta, Pochuev], [Nechiporuk]

Our thesis is that this observation can be used for Boolean circuit optimization.

SLIDE 13

Motivation

Why do we care?

1. Smaller chip area, less power.
   Lower depth, faster.

2. Multi-party computations:
   Communication complexity can depend (only) on the number of ANDs in the circuit.

3. Homomorphic computations:
   Performing computations on encrypted data, such as in the cloud. The multiplicative complexity can affect the number of bootstrappings.

SLIDE 15

An example function: AES S-Box

Advanced Encryption Standard (AES)

Block cipher: 128-bit blocks, 128-bit keys

10 rounds using 4 operations:
  SubBytes: nonlinear substitution step (S-Box)
  ShiftRows
  MixColumns
  AddRoundKey

SLIDE 17

AES S-Box

The S-Box has 8 inputs and 8 outputs.

Inversion in GF(2^8), followed by an affine transformation (linear, followed by some negations).

Can be done by table look-up:
  256 different inputs, each with 8 bits of output: 2048 bits.
  Large area: 16 S-Boxes in each round.
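As a reference point for the definition above, the following sketch computes the S-Box directly as inversion in GF(2^8) followed by the affine map, and builds the 256-entry look-up table. It uses the standard AES reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) and affine constant 0x63. The function names are illustrative, and this is a plain software reference, not the gate-level circuit discussed in the talk.

```python
def gf_mul(a, b):
    """Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def gf_inv(x):
    """Inverse in GF(2^8), computed as x^254; 0 is mapped to 0."""
    result, base, exponent = 1, x, 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, base)
        base = gf_mul(base, base)
        exponent >>= 1
    return result


def rotl8(x, k):
    """Rotate an 8-bit value left by k positions."""
    return ((x << k) | (x >> (8 - k))) & 0xFF


def sbox(x):
    """Inversion followed by the affine transformation (linear part, then XOR with 0x63)."""
    y = gf_inv(x)
    return y ^ rotl8(y, 1) ^ rotl8(y, 2) ^ rotl8(y, 3) ^ rotl8(y, 4) ^ 0x63


# The table look-up implementation: 256 entries of 8 bits each.
SBOX = [sbox(x) for x in range(256)]
TABLE_BITS = len(SBOX) * 8
print(hex(SBOX[0x00]), hex(SBOX[0x01]), TABLE_BITS)
```

The XOR with 0x63 is the "some negations" part: each set bit of the constant flips one output bit.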

SLIDE 19

AES S-Box

The S-Box has 8 inputs and 8 outputs.

Inversion in GF(2^8), followed by affine transformation.

Tower of fields constructions. Concentration on size:

  Wolkerstorfer, Oswald, Lamberger 2002: work over subfield GF(2^4)
  Satoh, Morioka, Takano, Munetoh 2001: within GF(2^4) use GF(2^2)
  Canright 2005: tried many different bases
  B., Peralta 2010: used Canright's base, 115 gates, depth 28 (improved to 113 gates by Calik; same technique, exploring all ties)

SLIDE 21

AES S-Box

The S-Box has 8 inputs and 8 outputs.

Inversion in GF(2^8), followed by affine transformation.

Tower of fields constructions. Depth:

  Canright 2005: depth 25 (≥ 125 gates)
  Nogami, Nekado, Toyota, Hongo, Morikawa 2010:
    choose mixed bases so that the top and bottom transformation matrices have ≤ 4 ones per row, so depth 2 for each
    depth 22, size 148
  B., Peralta 2012: depth 16, size 128
  this presentation: depth 16, size 125, more automated
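The "≤ 4 ones, so depth 2" step follows from computing each output of a linear transformation with a balanced tree of 2-input XOR gates. A minimal sketch, assuming all inputs arrive at depth 0 and no gates are shared between outputs; the function name and the example matrix are hypothetical.

```python
def xor_tree_depth(weight):
    """Depth of a balanced tree of 2-input XORs over `weight` inputs at depth 0."""
    # (weight - 1).bit_length() equals ceil(log2(weight)) for weight >= 1.
    return (weight - 1).bit_length()


# Hypothetical 3x8 binary matrix: each row lists which inputs are XORed together.
EXAMPLE_ROWS = [
    [1, 1, 0, 1, 0, 0, 1, 0],  # 4 ones
    [0, 1, 1, 0, 0, 0, 0, 0],  # 2 ones
    [1, 0, 0, 0, 1, 0, 1, 0],  # 3 ones
]

row_depths = [xor_tree_depth(sum(row)) for row in EXAMPLE_ROWS]
print(row_depths, max(row_depths))
```

A row with five or more ones would already need depth 3, which is why the bound of four ones per row matters.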

SLIDE 24

AES S-Box

Goal: minimize size (number of gates) and depth.

Technique:

1. Start with a circuit with small size (using previous techniques, for example [B., Matthews, Peralta 2013]).

2. Use techniques from automatic theorem proving to re-synthesize non-linear components into lower-depth constructions (reused from [B., Peralta 2012]).

3. Apply a randomized, greedy heuristic to re-synthesize linear components into lower-depth constructions, using a new See-Saw Method.

SLIDE 25

Circuit for the S-Box of AES

[Figure: S-Box circuit in three layers: 8 bits in → Top linear → 22 bits → Middle nonlinear → 18 bits → Bottom linear → 8 bits out.]

SLIDE 26

See-Saw Method

[Figure: 8 bits in (depth 0) → Top linear (27 gates, variable depth) → 22 bits → Middle nonlinear (63 gates, fixed) → 18 bits → Bottom linear (34 gates, variable depth) → 8 bits out (depth ≤ 19).]

Start: Total depth 19, size 124 gates.

SLIDE 27

See-Saw Method

Before: Top linear, 8 bits in (depth 0), 22 bits out, 27 gates, variable depth.

After processing: Top linear, 29 gates, inputs at depth 0, outputs at depth ≤ 3.

SLIDE 28

See-Saw Method

[Figure: 8 bits in (depth 0) → Top linear (29 gates, depth ≤ 3) → 22 bits → Middle nonlinear (63 gates, fixed) → 18 bits → Bottom linear (34 gates, variable depth) → 8 bits out (depth ≤ 18).]

Start: Total depth 19, size 124 gates.
Now: Total depth 18, size 126 gates.

SLIDE 29

See-Saw Method

Before: Bottom linear, 18 bits in (variable-depth inputs), 8 bits out, 34 gates, variable depth, outputs at depth ≤ 18.

After processing: Bottom linear, 18 bits in (variable-depth inputs), 8 bits out, 35 gates, variable depth, outputs at depth ≤ 16.

SLIDE 30

See-Saw Method

[Figure: 8 bits in (depth 0) → Top linear (29 gates, depth ≤ 3) → 22 bits → Middle nonlinear (63 gates, fixed) → 18 bits (variable-depth inputs) → Bottom linear (35 gates, variable depth) → 8 bits out (depth ≤ 16).]

Previous: Total depth 18, size 126 gates.
Now: Total depth 16, size 127 gates.

SLIDE 31

See-Saw Method

Before: Top linear, 29 gates, inputs at depth 0, outputs at depth ≤ 3.

After processing: Top linear, 27 gates, inputs at depth 0, outputs at depth ≤ 4.

SLIDE 33

See-Saw Method

[Figure: 8 bits in (depth 0) → Top linear (27 gates, depth ≤ 4) → 22 bits → Middle nonlinear (63 gates, fixed) → 18 bits (variable-depth inputs) → Bottom linear (35 gates, variable depth) → 8 bits out (depth ≤ 16).]

Previous: Total depth 16, size 127 gates.
Now: Total depth 16, size 125 gates.

Work on bottom linear to get all outputs at depth 16.
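The gate counts across the See-Saw steps can be checked with a few lines of bookkeeping. The figures below are the ones quoted on the slides; the step labels and variable names are only illustrative, and the sketch records the results of each step rather than performing any re-synthesis.

```python
MIDDLE_GATES = 63  # the nonlinear middle layer is fixed throughout

# (step label, top linear gates, bottom linear gates, total depth), as quoted on the slides
STEPS = [
    ("start", 27, 34, 19),
    ("top re-synthesized", 29, 34, 18),
    ("bottom re-synthesized", 29, 35, 16),
    ("top re-synthesized again", 27, 35, 16),
]

history = [
    (label, top + MIDDLE_GATES + bottom, total_depth)
    for label, top, bottom, total_depth in STEPS
]

for label, size, total_depth in history:
    print(f"{label}: size {size}, depth {total_depth}")
```

Each step trades a few gates in one linear layer for slack in depth, which the next step on the other layer can then use: the size goes 124, 126, 127, 125 while the depth drops from 19 to 16.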

SLIDE 39

Optimizing the linear components

[B., Matthews, Peralta 2013]

It is NP-hard to find the optimal linear program (circuit).
Unless P = NP there exists no ε-approximation scheme.
So our problem is intractable. Use heuristics.

Modify Paar's greedy heuristic to maintain feasibility for required max depth (given input depths).
Allow some cancellation, using preprocessing.
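For orientation, here is a minimal sketch of the plain, cancellation-free version of Paar's greedy heuristic: repeatedly pick the pair of variables that occurs together in the most outputs and replace it with a new XOR gate. This is not the authors' modified heuristic. The depth tie-break and the depth bookkeeping are assumptions added here to show where input depths would enter; the feasibility check for a required maximum depth and the preprocessing that allows cancellation are not implemented. The function names and the example matrix are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def paar_greedy(rows, n_inputs, input_depths=None):
    """rows: for each output, the set of input indices XORed together (non-empty)."""
    rows = [set(r) for r in rows]
    depth = list(input_depths) if input_depths is not None else [0] * n_inputs
    gates = []  # (new variable, operand, operand); every gate is an XOR
    while any(len(r) > 1 for r in rows):
        counts = Counter()
        for r in rows:
            counts.update(combinations(sorted(r), 2))
        # Most frequent pair first; ties go to the shallower pair (an assumption).
        i, j = min(counts, key=lambda p: (-counts[p], max(depth[p[0]], depth[p[1]]), p))
        new = len(depth)
        depth.append(1 + max(depth[i], depth[j]))
        gates.append((new, i, j))
        for r in rows:
            if i in r and j in r:
                r.difference_update((i, j))
                r.add(new)
    outputs = [next(iter(r)) for r in rows]
    return gates, outputs, depth


def evaluate(gates, outputs, x):
    """Run the XOR-only straight-line program on input bits x."""
    wires = list(x)
    for _, i, j in gates:
        wires.append(wires[i] ^ wires[j])
    return [wires[o] for o in outputs]


# Hypothetical example: y0 = x0+x1+x2, y1 = x0+x1+x3, y2 = x0+x1+x2+x3.
ROWS = [{0, 1, 2}, {0, 1, 3}, {0, 1, 2, 3}]
gates, outputs, depth = paar_greedy(ROWS, n_inputs=4)
print(len(gates), max(depth[o] for o in outputs))
```

On this example the greedy choice shares x0 + x1 across all three outputs and uses 4 XOR gates, where computing each output separately would use 7.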

SLIDE 44

Other results (polynomial multiplication)

Multiplication of degree 9 polynomials over GF(2):
  Starting from Bernstein's result, obtained same size, 155, but reduced depth from 9 to 6.
  Cenk, Hasan 2015: 155 gates, but depth 8.
  Find, Peralta 2016: 154 gates, and depth 9.
  Using the Find-Peralta nonlinear component, we achieved 154 gates in depth 7.

Multiplication of degree 12 polynomials over GF(2):
  Starting from Bernstein's result, improved from 256 gates and depth 9 to 255 gates and depth 8.
  Cenk, Hasan 2015: also 255 gates and depth 8.
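For a sense of scale, the sketch below multiplies two polynomials over GF(2) by the schoolbook method and counts the AND and XOR gates that a direct circuit would use. This baseline is not from the talk, and the function name is illustrative. The run with 10 coefficients assumes that "degree 9" on the slide means polynomials with 10 coefficients.

```python
def schoolbook_mul(a, b):
    """Multiply coefficient lists over GF(2) (lowest degree first), counting gates."""
    n, m = len(a), len(b)
    product_coeffs = [0] * (n + m - 1)
    started = [False] * (n + m - 1)
    ands = xors = 0
    for i in range(n):
        for j in range(m):
            partial = a[i] & b[j]
            ands += 1
            if started[i + j]:
                # Accumulating into an existing coefficient costs one XOR gate.
                product_coeffs[i + j] ^= partial
                xors += 1
            else:
                product_coeffs[i + j] = partial
                started[i + j] = True
    return product_coeffs, ands, xors


# (1 + x) * (1 + x) = 1 + x^2 over GF(2).
print(schoolbook_mul([1, 1], [1, 1]))

# Gate count for two polynomials with 10 coefficients each (assumed reading of "degree 9").
_, ands, xors = schoolbook_mul([1] * 10, [1] * 10)
print(ands, xors, ands + xors)
```

In general the schoolbook circuit for n coefficients uses n^2 ANDs and (n - 1)^2 XORs, so the gate counts quoted above are well below this baseline under that reading.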

SLIDE 48

Other results (multiplication in GF(2^n))

Multiplication in GF(2^8): Improved a result with 117 gates and depth 7 to 106 gates and depth 6.
Former result from Circuit Minimization Work: http://cs-www.cs.yale.edu/homes/peralta/CircuitStuff/CMT.html

Multiplication in GF(2^16): 374 gates and depth 8.
Used in a 16-bit S-box from [Kelly, Kaminsky, Kurdziel, Lukowiak, Radziszowski 2015], "Customizable sponge-based authenticated encryption using 16-bit S-boxes".
Reduced 1382 gates to 462.

SLIDE 49

Thank you for your attention.
