

slide-1
SLIDE 1

Attacking binary elliptic curves on a quantum computer

On quantum arithmetic and space-time trade-offs

Martin Roetteler, Microsoft Research

Based on joint work with Brittanney Amento and Rainer Steinwandt [arXiv.org: 1209.5491, 1209.6348, 1306.1161]

DIMACS Workshop on the Mathematics of Post-Quantum Cryptography

January 15, 2015

slide-2
SLIDE 2

Motivation

  • Analyze resources needed to implement Shor's algorithm
  • Focus: Computing dlogs over abelian groups
  • Possible circuit optimizations
  • Scaling of space (=#qubits) and time (=depth)?

Please ask questions during talk!

1/15/2015 2

  • M. Roetteler -- QuArC Group @ MSR
slide-3
SLIDE 3

Background: Quantum resources

slide-4
SLIDE 4

Quantum bits and registers


slide-5
SLIDE 5

Measurements


slide-6
SLIDE 6

Examples: local operations and CNOT


slide-7
SLIDE 7

Notation for unitary matrices


Wire = qubit

slide-8
SLIDE 8

Universality theorem


slide-9
SLIDE 9

Levels of abstraction


Many more levels down (FTQECC, q control) and up (prog lang)

slide-10
SLIDE 10

Operations on subspaces


slide-11
SLIDE 11

Controlled rotations


Remark: For $U = \mathrm{NOT}$, the gate $\Lambda_1(\mathrm{NOT})$ is the CNOT gate. The gate $\Lambda_2(\mathrm{NOT})$ is called the Toffoli gate.

slide-12
SLIDE 12

Discrete universal gate sets


Important universal gate set “Clifford + T” (for logical operations):

Consists of all Clifford operations (i.e., the group generated by $H_2$, CNOT and $\mathrm{diag}(1, i)$) and the “T gate” ($T = \mathrm{diag}(1, \omega_8)$). Can be shown to be universal, i.e., for any unitary $U$ and any given $\varepsilon > 0$, there exists an element $A$ in the Clifford+T group such that $\| U - A \| \le \varepsilon$.

  • This gate set arises naturally in the context of fault-tolerant quantum computing for several quantum codes, e.g., Steane code, surface code.
  • The T gate is usually implemented via a process called “magic state distillation”, which is very expensive. Much more expensive than Clifford gates.
  • Common metrics used to measure resources:
  • T-count = total number of T gates used in a circuit
  • T-depth = number of T-layers when a circuit is written as C T C … T C
  • #qubits = total number of qubits used, including “ancillas” (=scratch space)

Typically, single-qubit rotations account for most of the cost!

slide-13
SLIDE 13

Bounding resources: T gates


A useful factorization:

Lemma: If a unitary U can be implemented exactly over Clifford+T, then Λ(U) can also be implemented exactly. [arxiv.org:1206.0758]

This lemma can be used in some situations to avoid all errors due to single-qubit approximations.

Cost of controlled unitaries:

  • Tracking v = [#loc, #CNOT, #H, #P, #T]
  • From U to Λ(U): matrix-vector multiplication Mv.

[Matrix M: the layout was lost in extraction; the surviving entries, in extraction order, are 15 14 2 7 2 3 2 1 4 4 2 2 16 16 3 6 1 2]

slide-14
SLIDE 14

Solovay-Kitaev algorithm


Goal: Approximate unitaries by elements of a dense subgroup $G \le U(N)$.

Basic idea: Successive refining of a “net” using commutators.

Implementations:

  • [Kitaev, Shen, Vyalyi, AMS 2002]: $\log^{3+\delta}(1/\varepsilon)$ time, $\log^{3+\delta}(1/\varepsilon)$ length
  • [Dawson, Nielsen, quant-ph/0505030]: $\log^{2.71}(1/\varepsilon)$ time, $\log^{3.97}(1/\varepsilon)$ length
  • [Harrow, Recht, Chuang, quant-ph/0111031]: non-constructive, $\log(1/\varepsilon)$ length

[Image source: Nielsen/Chuang, CUP 2000]

slide-15
SLIDE 15

Single qubit gates: synthesis methods

Basic idea: Shown are all unitaries in $\langle H, T \rangle$ that are obtainable from a simple round-off procedure and have T-count ≤ 12. [Kliuchnikov/Maslov/Mosca 2012], [Selinger 2012]

[Slide concept by V. Kliuchnikov]

slide-16
SLIDE 16

Tools from the theory of reversible computing

slide-17
SLIDE 17

Classical circuits


  • Consider functions from n ≥ 1 bits to m ≥ 1 bits. We are interested in implementing functions by combinational circuits, i.e., circuits that do not make use of memory elements or feedback.
  • Universal families of gates exist, i.e., sets of elementary gates from which any circuit can be built.
  • We can compose gates together to make larger circuits.
  • Problem for quantum computing: many gates are not reversible!

[Figure: example gates, including an AND gate mapping inputs a, b to a ∧ b]

[Slide concept by M. Mosca, Waterloo]

slide-18
SLIDE 18

How to invert an irreversible operation?


slide-19
SLIDE 19

Reversible computation


slide-20
SLIDE 20

How to make circuits reversible?


Example: Replace each gate with a reversible one:

[Slide concept by M. Mosca, Waterloo]

slide-21
SLIDE 21
How to avoid garbage?

  • Replacing each gate with a reversible one works fine; however, it produces “garbage”, i.e., helper registers will be in a state different from 0 at the end.
  • While this is fine for reversible computing, it is bad for quantum computing (it will prevent interference).
  • There is a way out of this dilemma: the Bennett trick.

Idea: compute forward, copy the result, “uncompute” the garbage by running the computation backwards.
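The compute, copy, uncompute idea can be illustrated with a small classical simulation of reversible gates. This is only a sketch: the three-input AND function, the wire layout, and the helper names (`toffoli`, `cnot`, `and3_with_garbage`, `and3_bennett`) are illustrative assumptions and do not come from the slides.

```python
def toffoli(bits, c1, c2, t):
    """Reversible AND: flip wire t if wires c1 and c2 are both 1."""
    bits[t] ^= bits[c1] & bits[c2]


def cnot(bits, c, t):
    """Reversible copy onto a zeroed wire: flip wire t if wire c is 1."""
    bits[t] ^= bits[c]


def and3_with_garbage(a, b, c):
    """Compute a AND b AND c reversibly, leaving garbage on the ancilla."""
    # Wires: 0=a, 1=b, 2=c, 3=ancilla, 4=output (hypothetical layout).
    bits = [a, b, c, 0, 0]
    toffoli(bits, 0, 1, 3)  # ancilla <- a AND b (this is the garbage)
    toffoli(bits, 3, 2, 4)  # output  <- ancilla AND c
    return bits


def and3_bennett(a, b, c):
    """Same function, but cleaned up with compute / copy / uncompute."""
    # Wires: 0=a, 1=b, 2=c, 3=ancilla, 4=work output, 5=clean result.
    bits = [a, b, c, 0, 0, 0]
    forward = [(0, 1, 3), (3, 2, 4)]
    for gate in forward:              # compute forward
        toffoli(bits, *gate)
    cnot(bits, 4, 5)                  # copy the result
    for gate in reversed(forward):    # uncompute (Toffoli is self-inverse)
        toffoli(bits, *gate)
    return bits


if __name__ == "__main__":
    print(and3_with_garbage(1, 1, 0))  # ancilla (wire 3) is left dirty
    print(and3_bennett(1, 1, 0))       # wires 3 and 4 are back to 0
```

After the uncompute step every helper wire is 0 again, at the price of one extra output wire and roughly twice the gate count.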

slide-22
SLIDE 22

Uncomputing the garbage


Replace each gate with a reversible one, then run the gates backwards:

[Circuit diagram: $T_1, T_2, \dots, T_n$ followed by $T_n^{-1}, \dots, T_2^{-1}, T_1^{-1}$]
slide-23
SLIDE 23

The pebble game


Rules of the game: [Bennett, SIAM J. Comp., 1989]

  • n boxes, labeled i = 1, …, n
  • in each move, either add or remove a pebble
  • a pebble can be added or removed in i=1 at any time
  • a pebble can be added or removed in i>1 if and only if there is a pebble in i-1.

Example (boxes 1 to 4; # = move number, i = box touched in that move):

  #: 1 2 3 4 5 6 7
  i: 1 2 3 4 3 2 1

slide-24
SLIDE 24

The pebble game


Imposing resource constraints:

  • only a total of S pebbles are allowed
  • corresponds to reversible algorithm with at most S ancilla qubits

Example (n=3, S=3; boxes 1 to 4; # = move number, i = box touched in that move):

  #: 1 2 3 4 5 6 7 8 9
  i: 1 2 3 1 4 3 1 2 1

slide-25
SLIDE 25

Optimal pebbling strategies


Definition: Let X be a solution of the pebble game. Let T(X) be the number of steps and let S(X) be the number of pebbles. Define F(n,S) = min { T(X) : S(X) ≤ S }.

Table (small values of F):

[E.Knill, arxiv:math/9508218]
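Small values of such a minimum can be found by brute force. The sketch below runs a breadth-first search over pebble configurations. It assumes a particular convention that may differ from the indexing used in the table: there are `n` boxes, the game starts empty, and a solution ends with a single pebble in box `n`. The function name `min_pebble_moves` is hypothetical.

```python
from collections import deque


def min_pebble_moves(n, max_pebbles):
    """Fewest moves to reach 'only box n pebbled', never using more than
    max_pebbles pebbles at once; returns None if that is impossible."""
    start = frozenset()
    goal = frozenset([n])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return dist[state]
        for i in range(1, n + 1):
            # Box 1 is always free; box i > 1 needs a pebble in box i - 1.
            if i > 1 and (i - 1) not in state:
                continue
            nxt = state ^ {i}  # toggling box i adds or removes its pebble
            if len(nxt) > max_pebbles or nxt in dist:
                continue
            dist[nxt] = dist[state] + 1
            queue.append(nxt)
    return None


if __name__ == "__main__":
    for s in (2, 3, 4):
        print("boxes=4, pebbles<=%d:" % s, min_pebble_moves(4, s))
```

Under this convention, four boxes need 7 moves with four pebbles and 9 moves with three, matching the two example move sequences on the preceding slides; with only two pebbles the target cannot be reached.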

slide-26
SLIDE 26

Time-space tradeoffs

Let A be an algorithm with time complexity T and space complexity S.

  • Using the reversible pebble game, [Bennett, SIAM J. Comp. 1989] showed that for any ε>0 there is a reversible algorithm A’ with time complexity $O(T^{1+\varepsilon})$ and space complexity $O(S \ln(T))$.
  • Issue: one cannot simply take the limit ε→0. The space would grow in an unbounded way (as $O(\varepsilon 2^{1/\varepsilon} S \ln(T))$).
  • Improved analysis: [Levine, Sherman, SIAM J. Comp. 1990] showed that for any ε>0 there is a reversible algorithm A’ with time complexity $O(T^{1+\varepsilon}/S^{\varepsilon})$ and space complexity $O(S (1+\ln(T/S)))$.
  • Other time/space tradeoffs: [Buhrman, Tromp, Vitányi, ICALP’01]

Research topic: develop a “compiler” that takes a classical combinational circuit as input and translates it into a reversible circuit, with respect to various resource constraints.

slide-27
SLIDE 27

Shor

slide-28
SLIDE 28

Reducing factoring to period finding

  • Modular exponentiation: Let N be an integer and let a be in $\mathbb{Z}_N$. Modular exponentiation is the map $f(x) := a^x \bmod N$.
  • Fact: The map f can be implemented in O(poly(log N)) ops.
  • Fact: It can be shown that it can also be implemented efficiently on a quantum computer.
  • More facts:
    – Recall that the order of a is defined as the smallest positive integer r such that $a^r \equiv 1 \pmod N$.
    – The function $f(x) := a^x \bmod N$ is periodic with period r equal to the order of a, i.e., f(x) = f(x + r) for all x.
    – The problem of factoring N can be reduced to period finding for modular exponentiation f (for random a).

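The reduction from factoring to period finding can be sketched classically. In the sketch below the order is found by brute force, which stands in for the quantum period-finding step and is exponentially slow in general; the function names `order` and `factor_from_order` are illustrative assumptions, as is the choice N = 165, a = 2.

```python
from math import gcd


def order(a, N):
    """Smallest r >= 1 with a^r = 1 (mod N), found by brute force."""
    if gcd(a, N) != 1:
        raise ValueError("a must be coprime to N")
    r, y = 1, a % N
    while y != 1:
        y = (y * a) % N
        r += 1
    return r


def factor_from_order(a, N):
    """Try to split N using the order r of a; returns None for an unlucky a."""
    r = order(a, N)
    if r % 2 == 1:
        return None                 # odd order: pick another a
    h = pow(a, r // 2, N)           # h^2 = 1 (mod N)
    if h == N - 1:
        return None                 # h = -1 (mod N): pick another a
    return gcd(h - 1, N), gcd(h + 1, N)


if __name__ == "__main__":
    N, a = 165, 2
    r = order(a, N)
    print("order of", a, "mod", N, "is", r)
    print("f is r-periodic:",
          all(pow(a, x, N) == pow(a, x + r, N) for x in range(3 * r)))
    print("factors found:", factor_from_order(a, N))
```

When the order r is even and $a^{r/2} \not\equiv -1 \pmod N$, both greatest common divisors are nontrivial factors of N.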
slide-29
SLIDE 29


Setting up a periodic state

  • Observation: The function $f(x) = a^x \bmod N$ is periodic and has period length r, i.e., f(x) = f(x + r) for all inputs x.
  • Example: graph of the function $f(x) = 2^x \bmod 165$:

[Figure: plot of f(x) against x]

slide-30
SLIDE 30

Shor’s algorithm for period finding
slide-31
SLIDE 31

Period finding using coset states

slide-32
SLIDE 32


Discrete Fourier Transforms

slide-33
SLIDE 33

Discrete Fourier Transform (DFT/QFT)


slide-34
SLIDE 34

Quantum Fast Fourier Transform


slide-35
SLIDE 35

The Hidden Subgroup Problem


slide-36
SLIDE 36

Shor’s algorithm for dlogs:

Step 1: Create

$$\sum_{k \in \{0,1\}^n} |k_1, \dots, k_n\rangle \otimes \sum_{\ell \in \{0,1\}^n} |\ell_1, \dots, \ell_n\rangle \otimes |\mathcal{O}\rangle$$

by applying Hadamard gates to 2 registers of $n$ qubits; $n = \lceil \log \mathrm{ord}\, P \rceil$.

Step 2: For fixed generator $P$ and fixed target $Q \in \langle P \rangle$ compute the transformation that maps this state to

$$\sum_{k \in \{0,1\}^n} |k\rangle \otimes \sum_{\ell \in \{0,1\}^n} |\ell\rangle \otimes |kP + \ell Q\rangle$$

Step 3: Measure the 3rd register. Obtain a result $R$. Letting $Q = \alpha P$ and $R = \beta P$, we obtain a state corresponding to a “line”

$$\sum_{k,\ell \in \{0,1\}^n:\ k + \alpha\ell = \beta} |k\rangle \otimes |\ell\rangle \otimes |R\rangle = \sum_{\ell \in \{0,1\}^n} |\beta - \alpha\ell\rangle \otimes |\ell\rangle \otimes |R\rangle$$

Step 4: Apply $\mathrm{QFT} \otimes \mathrm{QFT}$ and measure to sample from the line $\{ (x, \alpha x) : x \in \{0, \dots, 2^n - 1\} \}$. If $x$ is a unit, we obtain $\alpha$.

slide-37
SLIDE 37

Visualizing Fourier duality


slide-38
SLIDE 38

Circuit for Shor’s dlog algorithm

Phase estimation circuit layout:

slide-39
SLIDE 39

Simple circuit optimizations

slide-40
SLIDE 40


Double & Add

Input: binary string $x_{n-1}, x_{n-2}, \dots, x_1, x_0$
Output: $x = \sum_i x_i 2^i = x_0 + 2(x_1 + 2 x_2 + \dots)$

Method 1 (“evaluate left-to-right”)
  x ← x_0
  for i = 1 … n − 1 do
    x ← x + 2^i x_i
  end for
  return x

Method 2 (“evaluate right-to-left”)
  x ← x_{n−1}
  for i = n − 2 … 0 do
    x ← 2x + x_i
  end for
  return x
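The two evaluation orders can be compared directly in a few lines. In this sketch the list `bits` holds the digits with `bits[i]` equal to $x_i$ (least significant digit first); the function names are illustrative, and the second loop runs down to index 0 so that $x_0$ is included.

```python
def eval_left_to_right(bits):
    """Method 1: accumulate the terms x_i * 2^i, starting from x_0."""
    x = bits[0]
    for i in range(1, len(bits)):
        x = x + (2 ** i) * bits[i]
    return x


def eval_right_to_left(bits):
    """Method 2: nested (Horner-style) evaluation, starting from x_{n-1}."""
    n = len(bits)
    x = bits[n - 1]
    for i in range(n - 2, -1, -1):
        x = 2 * x + bits[i]   # one doubling and one addition per digit
    return x


if __name__ == "__main__":
    bits = [1, 0, 1, 1]       # x_0 = 1, x_1 = 0, x_2 = 1, x_3 = 1
    print(eval_left_to_right(bits), eval_right_to_left(bits))
```

Method 1 needs the powers $2^i$ (precomputed multiples in the elliptic curve setting), while Method 2 only needs a doubling and an addition per step.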

slide-41
SLIDE 41


Rewriting the ECC dlog circuit

slide-42
SLIDE 42


Rewriting the ECC dlog circuit

Improvement 2: use Shamir’s trick to combine double & add for P and Q

slide-43
SLIDE 43

Double & Add: Shamir’s Trick

Saves 50% of the doublers
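The saving in doublings can be checked with a toy model. The sketch below is an assumption-laden stand-in: the additive group of integers modulo `mod` plays the role of the elliptic curve group, so "points" are plain integers, and the function names and doubling counters are hypothetical. It shows only the sharing of doublings, not a full Shamir trick with a precomputed P + Q.

```python
def separate_double_and_add(k, l, P, Q, n_bits, mod):
    """Compute k*P and l*Q with two independent double-and-add loops."""
    doublings = 0

    def scalar_mul(s, X):
        nonlocal doublings
        R = 0
        for i in range(n_bits - 1, -1, -1):
            R = (R + R) % mod          # doubling
            doublings += 1
            if (s >> i) & 1:
                R = (R + X) % mod      # addition
        return R

    result = (scalar_mul(k, P) + scalar_mul(l, Q)) % mod
    return result, doublings


def shamir_double_and_add(k, l, P, Q, n_bits, mod):
    """Compute k*P + l*Q in one loop that shares the doublings."""
    R, doublings = 0, 0
    for i in range(n_bits - 1, -1, -1):
        R = (R + R) % mod              # one doubling serves both scalars
        doublings += 1
        if (k >> i) & 1:
            R = (R + P) % mod
        if (l >> i) & 1:
            R = (R + Q) % mod
    return R, doublings


if __name__ == "__main__":
    args = (45, 77, 5, 7, 7, 1009)     # k, l, P, Q, n_bits, mod
    print(separate_double_and_add(*args))
    print(shamir_double_and_add(*args))
```

Both functions return the same group element, but the combined loop performs `n_bits` doublings instead of `2 * n_bits`.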

slide-44
SLIDE 44

More rewriting: Shamir’s trick

[Circuit diagram: Hadamard gates (H) on the control qubits, controlled +P and +Q additions, doublings (2x), and a final $\mathrm{QFT}_{2^{2n+2}}$]

slide-45
SLIDE 45


Semi-classical QFT

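[Circuit diagram: the double-and-add circuit with controlled +P and +Q additions and doublings (2x), where the final QFT is replaced by measurements. Equivalent protocol: a single control qubit with H, a controlled +P/+Q, Z(θ), H, and a measurement, acting on registers |A⟩ and |𝒪⟩]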

Saves a lot of qubits!

slide-46
SLIDE 46

Example: ECC point addition


[Bernstein, Lange: http://www.hyperelliptic.org/EFD/]

Consider an elliptic curve in short Weierstrass form over $GF(2^m)$. Adding 2 projective points $P_1 = (X_1, Y_1, Z_1)$ and $P_2 = (X_2, Y_2, Z_2)$ can be done with 12 $GF(2^m)$-mults (of which 9 are generic), 7 $GF(2^m)$-adds, and 1 squaring (madd-2008-bl):

slide-47
SLIDE 47

Complete binary Edwards curves

[Bernstein, Lange, Farashahi, 2008]: For $n \ge 3$ each ordinary binary elliptic curve is birationally equivalent to a complete binary Edwards curve: ($d_1, d_2 \in GF(2^n)$ with $\mathrm{Tr}(d_2) = 1$).

  • no projective closure needed
  • one formula to implement group law for all points
  • identity: (0,0)


Point addition / group law:

slide-48
SLIDE 48

Complete binary Edwards curves

Consider complete binary Edwards curve:

  • One can work projectively to avoid inversions.
  • Adding projective points $P_1 = (X_1, Y_1, Z_1)$ and $P_2 = (X_2, Y_2, Z_2)$ can be done with 21 $GF(2^m)$-mults (of which 17 are generic), 15 $GF(2^m)$-adds, and 1 squaring:


slide-49
SLIDE 49

Example: higher genus


Projective coordinates of the points require division at the end to make the representation unambiguous.

Some formulas require modular division.

[Bos, Costello, Hisil, Lauter, 2013]

slide-50
SLIDE 50

Quantum arithmetic

what is the problem? why is this non-trivial? who cares?

slide-51
SLIDE 51

Adders


This is a space optimized adder. Runs in T-depth 2n-1. Quite poor load factor, i.e., most qubits in the computation are idle. Explore time/space trade-offs.

slide-52
SLIDE 52

Controlled quantum adder

Resource estimate: $14n - 11$ Toffoli gates

[Draper, Kutin, Rains, Svore, 2004]

slide-53
SLIDE 53

Multipliers


Wallace tree multiplier. T-count of $n^2 + 4n \log_2(n)$ and T-depth $O(\log_2(n))$. Shown is an implementation in .qc/QCViewer of a circuit generated dynamically by a Haskell library.

slide-54
SLIDE 54

Division with remainder

slide-55
SLIDE 55

Time-space tradeoffs II

Adders for $n$-bit integers:

  • Low depth circuit:
    • [Draper, Kutin, Rains, Svore, quant-ph/0406142].
    • Depth $O(\log n)$; however, requires $O(n)$ ancillas.
    • In-place version exists. Easy to modify into a controlled adder.
  • Space optimized circuit:
    • [Cuccaro, Draper, Kutin, Moulton, quant-ph/0410184].
    • Can be used to implement in-place addition $x, y \mapsto x, x + y$ with only 1 additional ancilla qubit. Depth scales linearly with $n$.

Multipliers for $n$-bit integers:

  • Simple $O(n^2)$ “school” method using controlled adders. Disadvantage: circuit depth scales linearly with $n$. Improvement: Wallace tree in log depth.
  • Limitation: only out-of-place multipliers $x, y, 0 \mapsto x, y, x \cdot y$ known.

Arithmetic for modular exponentiation:

  • Computing $x \mapsto a^x \bmod N$ for fixed $a, N$ is relatively easy and can be done using $2n + 3$ qubits and $O(n^3)$ time: [Beauregard, quant-ph/0205095]

slide-56
SLIDE 56

Modular inverses:

Approaches based on Fermat’s little theorem

slide-57
SLIDE 57

Modular Inverse a la Fermat


Basic idea:

  • Let $p$ be prime, let $x \in \{1, \dots, p - 2\}$.
  • Recall that in any finite group $G$: $x^{|G|} = e$.
  • When applied to $GF(p)^{\times}$ this implies
  • $x^{p-1} \equiv 1 \pmod p$
  • Or in other words: $x^{p-2} \cdot x \equiv 1 \pmod p$
  • Or in other words: $x^{-1} \equiv x^{p-2} \pmod p$
  • That means we can compute the inverse by exponentiation of the (unknown) $x$ for the (known, fixed) exponent $p - 2$.
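As a classical sanity check, inversion by exponentiation can be sketched with a square-and-multiply loop over the bits of the fixed exponent. The function name `inverse_fermat` and the test prime 101 are illustrative assumptions. Because the exponent is known in advance, the sequence of operations does not depend on the input x.

```python
def inverse_fermat(x, p):
    """Return x^(p-2) mod p, which is the inverse of x modulo a prime p."""
    e = p - 2              # fixed, known exponent
    result = 1
    square = x % p         # holds x^(2^i) in round i
    while e:
        if e & 1:          # depends only on p, never on x
            result = (result * square) % p
        square = (square * square) % p
        e >>= 1
    return result


if __name__ == "__main__":
    p = 101
    print(all((x * inverse_fermat(x, p)) % p == 1 for x in range(1, p)))
```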
slide-58
SLIDE 58

Modular multiplier

[Figure: modular multiplier block MUL mapping inputs $x, y, z$ to outputs $x, y, z + xy$]

slide-59
SLIDE 59

Square & multiply by unrolling

[Figure: unrolled chain of MUL blocks producing the powers $x, x^2, x^4, x^8, x^{16}, \dots$]

Depth: $2n \times (\mathrm{depth}(\mathrm{MUL}) + 2\,\mathrm{depth}(\mathrm{ADD})) + n$
Width: $2n \times n$, i.e., $O(n^2)$

  • Here $n$ is the bit-size of $x$
  • Use binary representation of $p - 2$ to compute $x^{p-2}$
slide-60
SLIDE 60

Open problem: improvements?

[Figure: MUL block with register contents marked “?”]

Depth: $O(n^2)$ Width: $O(n)$

  • Partial success: using MUL and suitable permutations U we can compute the Chebyshev polynomials $T_n(x) \bmod p$.
  • Unclear whether they allow one to efficiently compute monomials $x^n$.

[Figure: alternating U and MUL blocks] Can we achieve this using a suitable initial configuration and a suitable U? It is unknown whether linear space can be achieved by this approach.

slide-61
SLIDE 61

Modular inverses:

Approaches based on the Euclidean algorithm

slide-62
SLIDE 62

Modular Inverse via GCD


Basic idea:

  • Let $p$ be prime, let $x \in \{1, \dots, p - 2\}$.
  • Compute the greatest common divisor (GCD) of p and x
  • … and find the linear representation of the GCD:

$a p + b x = \mathrm{GCD}(p, x) = 1$

  • This means that modulo p we have that $b x = 1$
  • In other words: $x^{-1} \bmod p = b$.

How to find a and b? → Extended Euclidean Algorithm
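A minimal classical sketch of the extended Euclidean algorithm, returning the coefficients of the linear representation, is given below. The function names are illustrative. Note that the number of loop iterations depends on the inputs, which is one reason a direct translation into a quantum circuit is not straightforward.

```python
def extended_gcd(p, x):
    """Return (g, a, b) with a*p + b*x == g == gcd(p, x)."""
    old_r, r = p, x
    old_a, a = 1, 0
    old_b, b = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r    # remainder sequence
        old_a, a = a, old_a - q * a    # coefficient of p
        old_b, b = b, old_b - q * b    # coefficient of x
    return old_r, old_a, old_b


def inverse_euclid(x, p):
    """Inverse of x modulo p, read off from the linear representation."""
    g, _, b = extended_gcd(p, x)
    if g != 1:
        raise ValueError("x is not invertible modulo p")
    return b % p


if __name__ == "__main__":
    g, a, b = extended_gcd(101, 37)
    print(g, a, b, a * 101 + b * 37)
    print(inverse_euclid(37, 101))
```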

slide-63
SLIDE 63

An Orwellian principle (?)

“Ignorance is Strength”

  • Any computation that a quantum computer carries out must be independent of the input data.
  • Reason: quantum programs must be able to run on superpositions of input data. If the execution flow of the program depended on the input in any way that makes 2 or more inputs distinguishable, this can lead to unwanted entanglement that destroys interference.
  • In the quantum context first studied by [Bernstein/Vazirani’93] → path synchronization technique for Quantum TMs.
  • Classically studied too: “Oblivious Turing Machines”
slide-64
SLIDE 64

Saeedi & Markov’s method

Uses binary Euclid.

Single round:

Summary:
  + Easy to circuitize
  + Depth scales as O(n log n)
  − But does not yield linear representation of GCD

[Saeedi, Markov arXiv:1304.7516]

slide-65
SLIDE 65

Shor for factoring vs ECC dlog

  • Suggests that quantum attacks on ECC/dlog can be done more efficiently than on RSA/factoring with a comparable level of security.
  • Circuits are somewhat non-trivial to implement and to lay out.
  • Only short Weierstrass forms considered; unclear how classical optimizations of point additions can be leveraged.
  • Leaves open how to optimize depth for Shor ECC.

[Proos, Zalka, quant-ph/0301141]

slide-66
SLIDE 66

Optimizing the circuit depth for the binary case

slide-67
SLIDE 67

Low-depth $GF(2^n)$-arithmetic

Design decision: polynomial basis representation

  • Addition: depth O(1)
  • Squaring: matrix-vector mult. → addition trees + “multi-fan-out CNOT w/ |0⟩-input”: O(log n)
  • Multiplication: Maslov et al.’s construction reduces to 3 matrix-vector multiplications; parallelization: depth O(log n)

Projective point addition: depth O(log n)

Note: all this is irrelevant for the large p case!

slide-68
SLIDE 68

Inversion: prior work

Beauregard et al. 2003, Kaye-Zalka 2004, Maslov et al. 2009 offer circuits for $GF(2^m)$-inversion:

Inversion: apply extended Euclidean algorithm in depth $O(m^2)$ using $2m + O(\log m)$ qubits.

We can actually do much better in the binary case and achieve poly-log scaling of depth!

slide-69
SLIDE 69

Ghost-bit basis representation

[Itoh-Tsujii 1989], [Silverman 1999]: If $f = 1 + x + \dots + x^m \in GF(2)[x]$ is irreducible, the maps

$$GF(2)[x]/(f) \longrightarrow GF(2)[x]/(x^{m+1}+1)$$

$$\sum_i a_i x^i + (f) \longmapsto \sum_i a_i x^i + (x^{m+1}+1)$$

$$\sum_i (a_i + a_m) x^i + (f) \longleftarrow a_0 x^0 + \dots + a_m x^m + (x^{m+1}+1)$$

allow us to move arithmetic to $GF(2)[x]/(x^{m+1}+1)$.

slide-70
SLIDE 70

Ghost-bit basis arithmetic

  • Addition: bit-wise ⊕ (i.e., depth 1 with CNOTs)
  • Multiplication: $(\sum_i a_i x^i) \cdot (\sum_i b_i x^i) = \sum_i (\sum_j a_j b_{(i-j) \bmod (m+1)}) \cdot x^i$
  • Squaring: $(\sum_i a_i x^i)^2 = \sum_i a_{\pi^{-1}(i)} \cdot x^i$ with $\pi(i) = 2 \cdot i \bmod (m+1)$

Squaring is a shuffle of the coefficient vector
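The claim that squaring is only a shuffle can be checked numerically. The sketch below represents an element by its list of m+1 bit coefficients, multiplies by cyclic convolution modulo 2, and compares the square against the index permutation i → 2·i mod (m+1). The function names and the choice m = 4 (so m+1 = 5) are illustrative assumptions.

```python
from itertools import product


def ghost_mul(a, b):
    """Multiply in GF(2)[x]/(x^(m+1)+1): cyclic convolution over GF(2)."""
    n = len(a)                       # n = m + 1 coefficients
    c = [0] * n
    for i in range(n):
        if a[i]:
            for j in range(n):
                c[(i + j) % n] ^= b[j]
    return c


def ghost_square(a):
    """Square by moving coefficient i to position 2*i mod (m+1)."""
    n = len(a)                       # n must be odd for this to be a shuffle
    c = [0] * n
    for i in range(n):
        c[(2 * i) % n] = a[i]
    return c


if __name__ == "__main__":
    m = 4
    ok = all(ghost_mul(list(v), list(v)) == ghost_square(list(v))
             for v in product((0, 1), repeat=m + 1))
    print("squaring equals the shuffle for all inputs:", ok)
```

In a circuit this means squaring costs no gates at all, only a relabeling of wires.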


slide-71
SLIDE 71

Gaussian normal basis of type T

Vector space basis $\{h, h^2, h^{2^2}, \dots, h^{2^{m-1}}\}$ of $GF(2^m)$; let $p = Tm+1$, $u \in GF(2^m)^*$ of order $T$, $F(2^i u^j \bmod p) = i$

  • Addition: bit-wise ⊕
  • Multiplication: $(\sum_i a_i \cdot h^{2^i}) \cdot (\sum_i b_i \cdot h^{2^i}) = \sum_i g_i \cdot h^{2^i}$ with $g_i = a_{F(1+1)+i} \cdot b_{F(p-1)+i} + \dots + a_{F(Tm-1+1)+i} \cdot b_{F(p-(Tm-1))+i}$
  • Squaring: $(\sum_i a_i \cdot h^{2^i})^2 = \sum_i a_{i-1 \,(\mathrm{mod}\, m)} \cdot h^{2^i}$

Squaring is a rotation of the coefficient vector

slide-72
SLIDE 72

Itoh-Tsujii inversion algorithm

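For $\alpha \in GF(2^m)^*$ let $\beta_i = \alpha^{2^i - 1}$. Then $\beta_1 = \alpha$, $\alpha^{-1} = (\beta_{m-1})^2$, and

$\beta_{i+j} = \beta_i \cdot (\beta_j)^{2^i}$. (*)

(1) write $m - 1 = 2^{k_1} + \dots + 2^{k_{HW(m-1)}}$ with $\lfloor \log_2(m-1) \rfloor = k_1 > \dots > k_{HW(m-1)} \ge 0$
(2) find $\beta_{2^0}, \beta_{2^1}, \dots, \beta_{2^{k_1}}$ applying (*) with $i = j$
(3) find $\beta_{2^{k_1}+2^{k_2}}, \dots, \beta_{2^{k_1}+\dots+2^{k_{HW(m-1)}}}$ $(= \beta_{m-1})$ with (*)

Total cost: $\lfloor \log_2(m-1) \rfloor + HW(m-1) - 1$ multiplications (+ squarings)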
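The three steps can be traced in a small classical sketch. The assumptions are loud here: the field is $GF(2^8)$ in polynomial basis with the reduction polynomial $x^8 + x^4 + x^3 + x + 1$ (0x11B), chosen only because it is a well-known irreducible polynomial and not because the slides use it; elements are Python integers (the field element is called `a` in the code); and the function names are hypothetical. Repeated squarings are done by plain multiplication for simplicity, and only general multiplications are counted.

```python
def gf_mul(a, b, m=8, poly=0x11B):
    """Multiply two elements of GF(2^m) in polynomial basis."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> m) & 1:
            a ^= poly                # reduce modulo the field polynomial
    return r


def frobenius(a, k):
    """Return a^(2^k) by k successive squarings."""
    for _ in range(k):
        a = gf_mul(a, a)
    return a


def itoh_tsujii_inverse(a, m=8):
    """Return (a^-1, number of general multiplications used)."""
    e = m - 1
    k1 = e.bit_length() - 1          # floor(log2(m - 1))
    mults = 0
    # Step (2): beta_{2^k} for k = 0..k1, using (*) with i = j.
    beta_pow2 = [a]                  # beta_1 = a
    for k in range(k1):
        b = beta_pow2[k]
        beta_pow2.append(gf_mul(b, frobenius(b, 1 << k)))
        mults += 1
    # Step (3): combine along the binary expansion of m - 1 using (*).
    acc, s = beta_pow2[k1], 1 << k1  # acc = beta_s
    for k in range(k1 - 1, -1, -1):
        if (e >> k) & 1:
            acc = gf_mul(acc, frobenius(beta_pow2[k], s))
            s += 1 << k
            mults += 1
    # Final squaring: a^-1 = (beta_{m-1})^2.
    return gf_mul(acc, acc), mults


if __name__ == "__main__":
    inv, mults = itoh_tsujii_inverse(0x53)
    print(hex(inv), mults, gf_mul(0x53, inv))
```

For m = 8 the count is $\lfloor \log_2 7 \rfloor + HW(7) - 1 = 2 + 3 - 1 = 4$ general multiplications, in line with the total cost stated above.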


slide-73
SLIDE 73

Inversion in depth $O(\log^2 m)$

(1) find $\beta_{2^0}, \beta_{2^1}, \dots, \beta_{2^{k_1}}$ from Itoh-Tsujii algorithm with $\lfloor \log_2(m-1) \rfloor$ “single-input” multipliers (squaring is free: permute control positions)
(2) find $\beta_{2^{k_1}+\dots+2^{k_{HW(m-1)}}}$ $(= \beta_{m-1})$ with $HW(m-1) - 1$ “ordinary” multipliers (not needed for $m = 2^n + 1$, e.g., a Fermat prime)
(3) Finally, $\alpha^{-1} = (\beta_{m-1})^2$, which is just a shuffle

slide-74
SLIDE 74

How not to compute k·P + l·Q …

Maslov et al.’s strategy: right-to-left double-and-add:

  R ← 0
  for i = 0 to n step 1
    if k_i = 1 then R ← R + 2^i·P
    if l_i = 1 then R ← R + 2^i·Q
  return R

(the multiples 2^i·P and 2^i·Q are precomputed)

  … yields depth O(n·log n) circuit
  … requires O(n) potentially different adder circuits

slide-75
SLIDE 75

Instead: Parallelized double-and-add

  • requires “multi-fan-out CNOT w/ |0⟩-input”
  • depth $O(\log^2 n)$, using general addition circuits

slide-76
SLIDE 76

Open problems

  • Can we adapt the methods to a 2D NN architecture?
  • Can square & multiply based ideas be modified to make them space efficient?
  • Can the “quantum-quantum” techniques based on the quantum Fourier transform (e.g., Draper adder) be applied to the modular inversion problem? Can we avoid modular inversions altogether?
  • Can we simplify the (Edwards) point addition circuits? Fewer T-gates, less T-depth, fewer qubits?
  • Use the resource estimates to obtain resource estimates for quantum attacks on ECC dlog for NIST curves and generalize this to Jacobians of hyperelliptic curves.