Faster Homomorphic Linear Transformations in HElib, Shai Halevi (PowerPoint PPT Presentation)

SLIDE 1

Faster Homomorphic Linear Transformations in HElib

Shai Halevi (IBM) Victor Shoup (IBM & NYU)

SLIDE 2

Fully Homomorphic Encryption allows for arbitrary computation on encrypted data
In this talk, the focus is on linear transformations . . . more specifically, applying a fixed, public linear transformation to a vector encrypted in the BGV (Brakerski-Gentry-Vaikuntanathan) cryptosystem
We present new algorithms and their implementation in HElib
We get speed-ups of up to ≈ 75×
One important application: bootstrapping
  ➪ in Chen and Han's new bootstrapping algorithm (Eurocrypt 2018), most of the time is spent performing a change of basis
  ➪ speed-up of up to ≈ 6× for bootstrapping as a whole

SLIDE 10

BGV encryption
R = Z[X]/(Φ_n(X))
Plaintext space: R_p := R/pR (p = small prime)
Ciphertext space: R_q := R/qR (n, p, q pairwise coprime)
Ciphertext: c̄ ∈ R_q^{2×1}
Secret key: s̄ = (1, s1) ∈ R_q^{2×1}, where s1 has small norm
Decryption: ⟨s̄, c̄⟩ = p·ε + m, where ε is the "noise"
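The decryption identity can be sketched with a toy example. The sketch below pretends the ring R is just Z (degree-0 "polynomials") and uses made-up parameter values; it only checks that ⟨s̄, c̄⟩ = p·ε + m mod q, so reducing the centered inner product mod p recovers m. It is not HElib code.

```python
# Toy sketch of the BGV decryption identity, pretending the ring R is just Z
# (degree-0 polynomials). Real BGV uses elements of Z[X]/(Phi_n(X)).
import random

random.seed(1)
q, p = 2**20 + 7, 17            # ciphertext and plaintext moduli (coprime)
s = 3                           # the "small" secret-key component s1
m = 11                          # plaintext message, 0 <= m < p

# Encrypt: pick randomness a and small noise e, and set c = (c0, c1) so that
# <(1, s), (c0, c1)> = c0 + c1*s = m + p*e (mod q).
a = random.randrange(q)
e = random.randrange(-5, 6)     # small "noise" eps
c0 = (m + p * e - a * s) % q
c1 = a

# Decrypt: inner product with sbar = (1, s), centered mod q, then mod p.
t = (c0 + c1 * s) % q
if t > q // 2:
    t -= q                      # lift to the centered representative
assert t == m + p * e           # the identity <sbar, cbar> = p*eps + m
recovered = t % p
assert recovered == m
```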

SLIDE 16

Representation of ciphertext space R_q
Coefficient representation vs. DoubleCRT representation:
  • q = q_1 ··· q_ℓ, where each q_i is a small prime such that Z_{q_i} contains nth roots of unity
  • A polynomial in R_q is reduced modulo each q_i, and then evaluated at the primitive nth roots of unity in Z_{q_i}
Addition of ciphertexts in DoubleCRT representation takes linear time . . . so does multiplication by a constant
Switching between DoubleCRT and coefficient representations: somewhat expensive (requires CRT and FFT)
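A minimal Python sketch of such a DoubleCRT-style table, assuming the toy parameters n = 4 and q = 5 · 13. For simplicity it works modulo X⁴ − 1 and evaluates at all powers of a root of unity, rather than modulo the cyclotomic polynomial at the primitive powers as in the slide; it then checks that addition acts entry-wise on the tables.

```python
# Minimal sketch of a DoubleCRT table for R_q with q = q1*q2 = 5*13 and n = 4.
# Each qi = 1 (mod 4), so Z_qi contains elements of multiplicative order 4.
# For simplicity we work modulo X^4 - 1 and evaluate at ALL powers of the
# root, rather than modulo the cyclotomic polynomial at the primitive powers.
n = 4
primes = [5, 13]
roots = {5: 2, 13: 5}           # an element of multiplicative order 4 mod qi

def double_crt(poly):
    """poly: n coefficients of an element of Z_q[X]/(X^n - 1).
    Returns, per small prime qi, its evaluations at the powers of the root."""
    return {qi: [sum(c * pow(roots[qi], i * k, qi) for k, c in enumerate(poly)) % qi
                 for i in range(n)]
            for qi in primes}

a, b = [3, 1, 4, 1], [2, 7, 1, 8]
da, db = double_crt(a), double_crt(b)
dsum = double_crt([x + y for x, y in zip(a, b)])
# Addition acts entry-wise on the tables (linear time), as the slide states:
for qi in primes:
    assert [(x + y) % qi for x, y in zip(da[qi], db[qi])] == dsum[qi]
```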

SLIDE 22

Multiplication and Key Switching
Multiplying two ciphertexts in DoubleCRT representation takes linear time
But . . . we get a ciphertext defined with respect to a different secret key
So . . . we include an encryption of this other key under the original key in the public parameters (called a "key switching matrix")
Using this, we can convert the product ciphertext to an equivalent one under the original key
Key switching is expensive: ☞ it requires conversions between coefficient and DoubleCRT representations

SLIDE 27

Plaintext space structure
Chinese Remainder Theorem: R_p = Z_p[X]/(Φ_n(X)) ≅ ∏_{i=1}^{h} Z_p[X]/(f_i(X)), where Φ_n(X) = ∏_{i=1}^{h} f_i(X)
Each f_i is irreducible of degree d = the multiplicative order of p mod n
So we have R_p ≅ (GF(p^d))^h   [d·h = φ(n)]
We can view the plaintext space as GF(p^d), and we can work on vectors of h plaintext "slots" in parallel
Reminiscent of vectorized or SIMD computation
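The slot parameters d and h can be computed directly from p and n. A small sketch, using p = 2, n = 15 as an assumed example (these values are not from the talk):

```python
# Sketch: compute d (degree of each slot) and h (number of slots) from p, n.
from math import gcd

def mult_order(p, n):
    """Multiplicative order of p modulo n; requires gcd(p, n) == 1."""
    assert gcd(p, n) == 1
    t, d = p % n, 1
    while t != 1:
        t, d = (t * p) % n, d + 1
    return d

def phi(n):
    """Euler's totient, by direct count (fine for small n)."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)

p, n = 2, 15
d = mult_order(p, n)            # order of 2 mod 15: 2, 4, 8, 1 -> d = 4
h = phi(n) // d                 # phi(15) = 8, so h = 2 slots
assert (d, h) == (4, 2)         # R_p ~ GF(2^4)^2
```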

SLIDE 32

Some useful automorphisms
Each j ∈ Z*_n defines an automorphism on R_p that sends X → X^j
Homomorphic evaluation: just apply X → X^j directly to R_q
  ☞ easy . . . but it requires "key switching"
This gives us a set of "rotations" that allow us to move data between "slots"

SLIDE 35

A simplified (but not very typical) setting: p ≡ 1 (mod n) ⟹ Φ_n(X) splits completely over Z_p
We have: R_p = Z_p[X]/(Φ_n(X)) ≅ GF(p)^h, where h = φ(n), via the isomorphism
  [f(X) mod Φ_n(X)] → [f(ω^i)]_{i∈Z*_n}
where ω ∈ Z*_p is a primitive nth root of unity
The automorphism X → X^j sends [f(ω^i)]_{i∈Z*_n} → [f(ω^{ij})]_{i∈Z*_n}
So slot i now holds what was in slot i·j: the automorphism permutes the slots
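A concrete check of this slot permutation, assuming the toy parameters p = 13, n = 4 (so Φ₄(X) = X² + 1 and ω = 5). These are illustrative choices, not values from the talk.

```python
# Check of the slot permutation in the simplified setting, with the toy
# parameters p = 13, n = 4: Phi_4(X) = X^2 + 1 splits mod 13, and w = 5 is a
# primitive 4th root of unity (5^2 = 25 = -1 mod 13). Slots are indexed by
# i in Z*_4 = {1, 3}, and slot i holds f(w^i).
p, w = 13, 5
slot_index = [1, 3]             # Z*_4

def slots(f):
    """f = [c0, c1]: coefficients of f(X) = c0 + c1*X in Z_13[X]/(X^2 + 1)."""
    return [(f[0] + f[1] * pow(w, i, p)) % p for i in slot_index]

def apply_aut(f, j):
    """The automorphism X -> X^j on Z_13[X]/(X^2 + 1), for j in {1, 3}."""
    # X^3 = -X (mod X^2 + 1), so j = 3 sends c0 + c1*X to c0 - c1*X.
    return [f[0], f[1] if j % 4 == 1 else (-f[1]) % p]

f = [7, 2]
before = slots(f)                       # [f(w), f(w^3)]
after = slots(apply_aut(f, 3))          # new slot i holds f(w^{3i})
assert after == [before[1], before[0]]  # the two slots swap
```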

SLIDE 39

General case: the available rotations are determined by the group structure of Z*_n/⟨p⟩
Structure theorem: Z*_n/⟨p⟩ ≅ Z_{n_1} × ··· × Z_{n_k}, where n_{i+1} | n_i for each i
Example: suppose Z*_n/⟨p⟩ ≅ Z_3 × Z_3. We have 9 slots arranged in a 3 × 3 array:

  0 1 2
  3 4 5
  6 7 8

We can rotate all the rows (simultaneously) by any amount, or all the columns (simultaneously) by any amount
More generally: we have a k-dimensional hypercube, with rotations in each dimension
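These simultaneous row and column rotations can be sketched on plain Python lists (nothing encrypted here; the grid values are just slot labels):

```python
# The 3x3 hypercube of slots from the slide, with simultaneous row rotations
# and simultaneous column rotations (plain Python lists, nothing encrypted).
slots = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]

def rotate_rows(grid, r):
    """Rotate every row right by r positions, all rows simultaneously."""
    m = len(grid[0])
    k = r % m
    return [row[m - k:] + row[:m - k] for row in grid]

def rotate_cols(grid, r):
    """Rotate every column down by r positions, all columns simultaneously."""
    rows = len(grid)
    return [grid[(i - r) % rows] for i in range(rows)]

assert rotate_rows(slots, 1) == [[2, 0, 1], [5, 3, 4], [8, 6, 7]]
assert rotate_cols(slots, 1) == [[6, 7, 8], [0, 1, 2], [3, 4, 5]]
```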

SLIDE 45

The main topic: computing GF(p^d)-linear maps
Input: an encrypted vector v with h slots in GF(p^d)
Output: L(v), for some fixed, public GF(p^d)-linear map L
  • Equivalently: M·v, where M ∈ GF(p^d)^{h×h}
SLIDE 47

An obvious approach (example, h = 3):

  M·v = (a11, a21, a31)·v1 + (a12, a22, a32)·v2 + (a13, a23, a33)·v3
      = (a11·v1, a21·v1, a31·v1) + (a12·v2, a22·v2, a32·v2) + (a13·v3, a23·v3, a33·v3)

Requires a "multibroadcast":

  (v1, v2, v3) → (v1, v1, v1), (v2, v2, v2), (v3, v3, v3)

  • can be done using O(h) rotations/mul-by-const
  • overkill
SLIDE 50

A better idea: Cannon [1969], Bernstein [2008]
Example, h = 3:

  M·v = (a11·v1, a22·v2, a33·v3) + (a12·v2, a23·v3, a31·v1) + (a13·v3, a21·v1, a32·v2)

The constants C0 = (a11, a22, a33), C1 = (a12, a23, a31), C2 = (a13, a21, a32) are constructed using CRT and converted to DoubleCRT . . . as a pre-computation
Total cost: h rotations (expensive), h mul-by-const (cheap)
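A plain-integer sketch of this diagonal method (the matrix and vector here are illustrative toy data; in the real setting the vector is encrypted and the constants are DoubleCRT objects):

```python
# Plain-integer sketch of the diagonal method: M*v = sum_i C_i (.) rho_i(v),
# where C_i[t] = M[t][(t+i) mod h] is the i-th generalized diagonal, rho_i
# rotates the vector left by i, and (.) is the entry-wise product.
def rho(v, i):
    i %= len(v)
    return v[i:] + v[:i]                 # rotate left by i positions

def diag_matvec(M, v):
    h = len(v)
    out = [0] * h
    for i in range(h):
        C_i = [M[t][(t + i) % h] for t in range(h)]   # would be precomputed
        out = [o + c * x for o, c, x in zip(out, C_i, rho(v, i))]
    return out

M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
v = [1, 0, 2]
expected = [sum(M[t][s] * v[s] for s in range(3)) for t in range(3)]
assert diag_matvec(M, v) == expected     # h rotations, h mul-by-const
```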

slide-51
SLIDE 51

A better idea: Cannon [1969], Bernstein [2008] Example: h = 3   11 12 13 21 22 23 31 32 33     1 2 3   =   111 222 333   +   122 233 311   +   133 211 322   The constants C0 = (11, 22, 33), C1 = (12, 23, 31), C2 = (13, 21, 32) constructed using CRT and converted to DoubleCRT . . . as a pre-computation Total cost: h rotations (expensive), h mul-by-const (cheap)

slide-52
SLIDE 52

A better idea: Cannon [1969], Bernstein [2008] Example: h = 3   11 12 13 21 22 23 31 32 33     1 2 3   =   111 222 333   +   122 233 311   +   133 211 322   The constants C0 = (11, 22, 33), C1 = (12, 23, 31), C2 = (13, 21, 32) constructed using CRT and converted to DoubleCRT . . . as a pre-computation Total cost: h rotations (expensive), h mul-by-const (cheap)

slide-53
SLIDE 53

A better idea: Cannon [1969], Bernstein [2008] Example: h = 3   11 12 13 21 22 23 31 32 33     1 2 3   =   111 222 333   +   122 233 311   +   133 211 322   The constants C0 = (11, 22, 33), C1 = (12, 23, 31), C2 = (13, 21, 32) constructed using CRT and converted to DoubleCRT . . . as a pre-computation Total cost: h rotations (expensive), h mul-by-const (cheap)

slide-54
SLIDE 54

A better idea: Cannon [1969], Bernstein [2008] Example: h = 3   11 12 13 21 22 23 31 32 33     1 2 3   =   111 222 333   +   122 233 311   +   133 211 322   The constants C0 = (11, 22, 33), C1 = (12, 23, 31), C2 = (13, 21, 32) constructed using CRT and converted to DoubleCRT . . . as a pre-computation Total cost: h rotations (expensive), h mul-by-const (cheap)

slide-55
SLIDE 55

A better idea: Cannon [1969], Bernstein [2008] Example: h = 3   11 12 13 21 22 23 31 32 33     1 2 3   =   111 222 333   +   122 233 311   +   133 211 322   The constants C0 = (11, 22, 33), C1 = (12, 23, 31), C2 = (13, 21, 32) constructed using CRT and converted to DoubleCRT . . . as a pre-computation Total cost: h rotations (expensive), h mul-by-const (cheap)

slide-56
SLIDE 56

An even better idea: baby-step/giant-step
Let ρ^i(v) denote rotation of v by i positions
We want to compute L(v) = Σ_{i∈[h]} C_i · ρ^i(v) for constants C_0, . . . , C_{h−1}
Observation: ρ is an automorphism on the plaintext space R_p

  L(v) = Σ_{i∈[h]} C_i · ρ^i(v)
       = Σ_{j∈[f]} Σ_{k∈[g]} C_{j+fk} · ρ^{j+fk}(v),   where f, g ≈ √h
       = Σ_{k∈[g]} ρ^{fk}( Σ_{j∈[f]} C′_{j+fk} · ρ^j(v) ),   where C′_{j+fk} := ρ^{−fk}(C_{j+fk})

SLIDE 62

Baby-step/giant-step algorithm:
  1. for each j ∈ [f]: compute v_j ← ρ^j(v)                // baby steps
  2. for each k ∈ [g]: compute w_k ← Σ_{j∈[f]} C′_{j+fk} · v_j
  3. compute w ← Σ_{k∈[g]} ρ^{fk}(w_k)                     // giant steps
Cost:
  • Step 1: ≈ √h rotations
  • Step 2: ≈ h mul-by-const
  • Step 3: ≈ √h rotations
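The three steps above can be sketched over plain integer vectors; this checks the baby-step/giant-step regrouping against a direct matrix-vector product (toy data, not HElib code):

```python
# Sketch of the baby-step/giant-step matrix-vector algorithm over plain
# integer vectors. rho(v, i) is rotation by i; C_i are the generalized
# diagonals of M; f, g ~ sqrt(h) with f*g >= h.
import math

def rho(v, i):
    i %= len(v)
    return v[i:] + v[:i]                            # rotate left by i

def bsgs_matvec(M, v):
    h = len(v)
    f = math.isqrt(h)
    g = -(-h // f)                                  # ceil(h / f)
    C = [[M[t][(t + i) % h] for t in range(h)] for i in range(h)]
    # Pre-computation: C'_{j+f*k} = rho^{-f*k}(C_{j+f*k})
    Cp = {i: rho(C[i], -(f * (i // f))) for i in range(h)}
    baby = [rho(v, j) for j in range(f)]            # ~ sqrt(h) rotations
    out = [0] * h
    for k in range(g):                              # ~ sqrt(h) giant steps
        w_k = [0] * h
        for j in range(f):
            i = j + f * k
            if i >= h:
                break
            w_k = [a + c * b for a, c, b in zip(w_k, Cp[i], baby[j])]
        out = [a + b for a, b in zip(out, rho(w_k, f * k))]
    return out

M = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
v = [1, 2, 0, 1]
expected = [sum(M[t][s] * v[s] for s in range(4)) for t in range(4)]
assert bsgs_matvec(M, v) == expected
```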
SLIDE 66

An even more better idea(?)
  . . . "if 2√h rotations are good, then a single rotation is better"

Anatomy of a homomorphic rotation

We want to apply a rotation ρ to an encrypted vector v. The ciphertext is a pair (c0, c1) ∈ R_q^{2×1}

A) Raw automorphism step (cheap): c′_j ← ρ(c_j) for j = 0, 1

B) Key switching, part 1 – break into digits (expensive): decompose c′_1 as c′_1 = Σ_k d′_k·R_k, where the R_k's are integer constants and each "digit" d′_k has small norm
   ☞ requires DoubleCRT/coefficient conversion

C) Key switching, part 2 – apply key switching matrix (cheap): compute the ciphertext (c′_0 + c′′_0, c′′_1), where c′′_j = Σ_k d′_k·A_jk and the A_jk's are pre-computed DoubleCRT objects

slide-67
SLIDE 67

An even more better idea(?)

  • r . . . “if 2
  • h rotations are good, then a single rotation is better”

Anatomy of a homomorphic rotation

We want to apply a rotation ρ to an encrypted vector  The ciphertext is a pair (c0, c1) ∈ R2×1

q

A) Raw automorphism step (cheap): c′

j ← ρ(cj) for j = 0,1

B) Key Switching, part 1 – break into digits (expensive): decompose c′

1 as c′ 1 =

  • k d′

kRk, where the Rk’s are integer

constants and each “digit” d′

k has small norm ☞ requires DoubleCRT/coefficient conversion

C) Key Switching, part 2 – apply key switching matrix (cheap): compute the ciphertext (c′

0 + c′′ 0 , c′′ 1 ), where c′′ j =

  • k d′

kAjk and the

Ajk’s are pre-computed DoubleCRT objects

slide-68
SLIDE 68

An even more better idea(?)

  • r . . . “if 2
  • h rotations are good, then a single rotation is better”

Anatomy of a homomorphic rotation

We want to apply a rotation ρ to an encrypted vector  The ciphertext is a pair (c0, c1) ∈ R2×1

q

A) Raw automorphism step (cheap): c′

j ← ρ(cj) for j = 0,1

B) Key Switching, part 1 – break into digits (expensive): decompose c′

1 as c′ 1 =

  • k d′

kRk, where the Rk’s are integer

constants and each “digit” d′

k has small norm ☞ requires DoubleCRT/coefficient conversion

C) Key Switching, part 2 – apply key switching matrix (cheap): compute the ciphertext (c′

0 + c′′ 0 , c′′ 1 ), where c′′ j =

  • k d′

kAjk and the

Ajk’s are pre-computed DoubleCRT objects

slide-74
SLIDE 74

The idea: re-factor this three-step process ✸ Basically, just swap steps (A) and (B), using the fact that ρ is an automorphism that does not change the norm by very much

A′) Key Switching, part 1 – break into digits (expensive): decompose the original c1 as c1 = Σk dk Rk

B′) Raw automorphism step (cheap): c′0 ← ρ(c0) and d′k ← ρ(dk) for each k

C) Key Switching, part 2 – apply key switching matrix (cheap): exactly the same as above: compute (c′0 + c′′0, c′′1), where c′′j = Σk d′k Ajk

Why is this better? . . . we can perform step (A′) just once for many rotations ρ
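The swap can be checked in the same kind of toy model (again, illustrative names and parameters, not HElib internals). Because the stand-in automorphism permutes coefficients, it commutes exactly with digit extraction; in the real scheme the point is that it preserves the digits' small norm up to a modest factor.

```python
# Toy check of the refactor: decompose ONCE (step A'), then apply each
# rotation to the digits (step B').  q, BASE, and the cyclic-shift
# automorphism are illustrative stand-ins, not HElib internals.
q, BASE, NUM_DIGITS = 2**30, 2**10, 3

def rotate(c, amt):                      # cheap raw automorphism
    amt %= len(c)
    return c[amt:] + c[:amt]

def decompose(c):                        # expensive: needs coefficient form
    digits, rest = [], list(c)
    for _ in range(NUM_DIGITS):
        digits.append([x % BASE for x in rest])
        rest = [x // BASE for x in rest]
    return digits

def recompose(digits):                   # stand-in for the step-C combination
    out = [0] * len(digits[0])
    for k, d in enumerate(digits):
        out = [(o + x * BASE**k) % q for o, x in zip(out, d)]
    return out

c1 = [123456789, 7, q - 1, 42]
digits = decompose(c1)                   # A': hoisted -- done only once
for amt in range(1, 4):                  # many rotations reuse the same digits
    rotated = [rotate(d, amt) for d in digits]       # B': cheap, per rotation
    assert recompose(rotated) == rotate(c1, amt)     # same result as A-then-B
```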

slide-80
SLIDE 80

We call this idea "hoisting" (optimizing compilers are said to "hoist" invariant computations out of a loop)

So . . . given an encryption of v, we can compute an encryption of ρ^i(v) for every i ∈ [h] with just one expensive step and h cheap steps

Application to matrix multiplication:

• on the one hand . . . faster than the basic method (which takes h rotations)

• on the other hand . . . may be slower than the BS/GS method for large h

but on the other hand . . . we can combine hoisting and BS/GS

baby steps: for each j ∈ [f] compute vj ← ρ^j(v); hoist out these rotations; save a factor of 2 (2√h −→ √h rotations)
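The rotation counts can be seen in a plaintext sketch of the BS/GS diagonal method. This is illustrative only (toy matrix, toy cyclic-shift rotation): in HElib the vector is encrypted and the diagonals are public plaintext constants. The f baby-step rotations counted here are exactly the ones that hoisting serves with a single digit decomposition.

```python
# Plaintext sketch of matrix-vector multiply by diagonals with baby-step /
# giant-step, counting rotations.  Illustrative only: in HElib the vector
# is encrypted and the diagonals are public plaintext constants.
rotations = 0

def rot(v, amt):
    """Ciphertext rotation (the counted, costly resource)."""
    global rotations
    rotations += 1
    amt %= len(v)
    return v[amt:] + v[:amt]

def shift(v, amt):
    """Plaintext rotation of a public diagonal: precomputed, free."""
    amt %= len(v)
    return v[amt:] + v[:amt]

def diag(M, i):
    """i-th generalized diagonal: diag(M, i)[r] = M[r][(r+i) mod n]."""
    n = len(M)
    return [M[r][(r + i) % n] for r in range(n)]

def matvec_bsgs(M, v, f):
    n = len(M)
    g = n // f                      # h = n = f*g diagonals in total
    # baby steps: f rotations of v -- the ones hoisting would share
    baby = [rot(v, j) for j in range(f)]
    out = [0] * n
    for k in range(g):              # giant steps: one rotation each
        acc = [0] * n
        for j in range(f):
            d = shift(diag(M, f * k + j), -f * k)    # pre-rotated diagonal
            acc = [a + x * y for a, x, y in zip(acc, d, baby[j])]
        out = [o + x for o, x in zip(out, rot(acc, f * k))]
    return out

n = 16
M = [[(r * n + c) % 7 + 1 for c in range(n)] for r in range(n)]
v = list(range(1, n + 1))
expected = [sum(M[r][c] * v[c] for c in range(n)) for r in range(n)]
assert matvec_bsgs(M, v, 4) == expected
# f + g = 8 counted rotations vs. h = 16 for the basic method (the rot-by-0
# calls counted here are free in practice, so the real count is lower still)
assert rotations == 8
```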
slide-89
SLIDE 89

See paper for more details and other improvements:
➪ More efficient handling of "problematic" dimensions in the hypercube
➪ Saving space: drastic reduction in the number of "key switching matrices" without too much loss in speed

Questions?
