Curve25519, Curve41417, E-521
D. J. Bernstein
University of Illinois at Chicago & Technische Universiteit Eindhoven

Curve25519 mod $p = 2^{255} - 19$: $y^2 = x^3 + 486662x^2 + x$.
Equivalent to Edwards curve $x^2 + y^2 = 1 + (1 - 1/121666)x^2y^2$.
Curve41417 mod $2^{414} - 17$: $x^2 + y^2 = 1 + 3617x^2y^2$.
E-521 mod $2^{521} - 1$: $x^2 + y^2 = 1 - 376014x^2y^2$.

Curve25519

Introduced in ECC 2005 talk and PKC 2006 paper “New Diffie–Hellman speed records.”
Main features listed in paper:
“extremely high speed”;
“no time variability”;
32-byte secret keys;
32-byte public keys;
“free key validation”;
“short code”.
The big picture: Minimize tensions between speed, simplicity, security.

Tension: a neutral example

How will implementors compute $a/b \bmod p$?
Many books recommend Euclid. Passes interoperability tests.
But variable time, presumably a security problem.

Defense 1: Encourage implementors to use $ab^{p-2}$. Simpler than Euclid, fast enough.
But maybe implementor finds it simplest to use a Euclid library, and wants the Euclid speed.
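
Defense 1 can be sketched in a few lines of Python. This is only an illustration, assuming the Curve25519 prime from the title slide; the names `inv_fermat` and `div_mod` are hypothetical, and Python's big-integer `pow` is not itself guaranteed to run in constant time.

```python
# Sketch: division modulo a prime via Fermat's little theorem.
P = 2**255 - 19  # the Curve25519 prime


def inv_fermat(b, p=P):
    """Return b^(p-2) mod p, which equals 1/b mod p whenever b is nonzero mod p."""
    # The exponent p-2 is a public constant, so the pattern of squarings and
    # multiplications does not depend on the (possibly secret) value b.
    return pow(b, p - 2, p)


def div_mod(a, b, p=P):
    """Compute a/b mod p as a * b^(p-2) mod p (hypothetical helper name)."""
    return (a * inv_fermat(b, p)) % p
```

Unlike Euclid, the sequence of operations here depends only on the public exponent. Note that `b = 0` simply yields 0 rather than raising an error.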

Defense 2: Encourage implementors to use tools to verify constant-time behavior, e.g. 2010 Langley “ctgrind”; 2013 Almeida–Barbosa–Pinto–Vieira.

Defense 3: Encourage implementors to use fractions (e.g., “projective coordinates”). Then Euclid speedup is negligible.

Defense 4: Choose curves that naturally avoid all divisions. Seems incompatible with ECC.

The good news: curve choice can resolve other tensions.
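
Defense 3 can be illustrated with plain modular fractions. The sketch below assumes the Curve25519 prime and uses hypothetical helper names (`frac_add`, `frac_mul`, `frac_to_int`). Each value is carried as a (numerator, denominator) pair, so intermediate steps need only multiplications, and a single inversion happens at the very end.

```python
# Sketch: carry values as fractions (n, d) mod p and divide only once at the end.
P = 2**255 - 19


def frac_add(f, g, p=P):
    """Add fractions f = (n1, d1) and g = (n2, d2) modulo p without dividing."""
    (n1, d1), (n2, d2) = f, g
    return ((n1 * d2 + n2 * d1) % p, (d1 * d2) % p)


def frac_mul(f, g, p=P):
    """Multiply two fractions modulo p without dividing."""
    (n1, d1), (n2, d2) = f, g
    return ((n1 * n2) % p, (d1 * d2) % p)


def frac_to_int(f, p=P):
    """Convert n/d to a single residue; this is the only inversion performed."""
    n, d = f
    return (n * pow(d, p - 2, p)) % p
```

With one inversion per whole computation instead of one per step, the cost gap between Euclid and exponentiation matters much less.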

Constant-time Curve25519

Imitate hardware in software.
Allocate constant number of bits for each integer.
Always perform arithmetic on all bits. Don’t skip bits.

e.g. If you’re adding $a$ to $b$, with 255 bits allocated for $a$ and 255 bits allocated for $b$: allocate 256 bits for $a + b$.
e.g. If you’re multiplying $a$ by $b$, with 256 bits allocated for $a$ and 256 bits allocated for $b$: allocate 512 bits for $ab$.
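
A rough sketch of this allocation rule, assuming 32-bit limbs as the storage unit (the slide counts bits; limbs are an assumption made here for illustration, as are all function names). Output sizes depend only on input sizes, and every limb is visited even when it is zero.

```python
# Sketch: fixed-size limb arithmetic whose work depends on sizes, not on values.
LIMB_BITS = 32
LIMB_MASK = (1 << LIMB_BITS) - 1


def to_limbs(x, nlimbs):
    """Store x in exactly nlimbs little-endian limbs, leading zeros included."""
    assert 0 <= x < 1 << (LIMB_BITS * nlimbs)
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(nlimbs)]


def from_limbs(limbs):
    """Recombine little-endian limbs into one integer."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_fixed(a, b):
    """Add equal-length limb arrays; the result always has one extra limb."""
    assert len(a) == len(b)
    out, carry = [], 0
    for x, y in zip(a, b):  # every limb is processed, zero or not
        s = x + y + carry
        out.append(s & LIMB_MASK)
        carry = s >> LIMB_BITS
    out.append(carry)
    return out


def mul_fixed(a, b):
    """Schoolbook product; the result always has len(a) + len(b) limbs."""
    out = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            t = out[i + j] + x * y + carry
            out[i + j] = t & LIMB_MASK
            carry = t >> LIMB_BITS
        out[i + len(b)] = carry
    return out
```

Two 8-limb (256-bit) inputs always give a 16-limb (512-bit) product, mirroring the second example on the slide.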

If (e.g.) 600 bits allocated for $c$: Replace $c$ with $19q + r$ where $r = c \bmod 2^{255}$, $q = \lfloor c/2^{255} \rfloor$. Allocate 350 bits for $19q + r$. This is the same modulo $p$.
Repeat same compression: 350 bits → 256 bits. Small enough for next mult.

To completely reduce 256 bits mod $p$, do two iterations of constant-time conditional sub.
One conditional sub: replace $c$ with $c - (1 - s)p$ where $s$ is sign bit in $c - p$.
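
The compression and the conditional subtraction can be sketched as follows. The function names `compress`, `cond_sub` and `freeze` are hypothetical, and Python integers only model the arithmetic; they do not provide real constant-time guarantees.

```python
# Sketch: reduction modulo p = 2^255 - 19 without data-dependent branches.
P = 2**255 - 19
MASK255 = 2**255 - 1


def compress(c):
    """Replace c with 19*q + r, where r = c mod 2^255 and q = floor(c / 2^255).

    Since 2^255 is congruent to 19 modulo p, the result equals c modulo p.
    """
    q = c >> 255
    r = c & MASK255
    return 19 * q + r


def cond_sub(c, p=P):
    """One conditional subtraction: c - (1 - s)*p, with s the sign bit of c - p."""
    # For 0 <= c < 2^256, an arithmetic shift of c - p by 256 gives 0 or -1.
    s = ((c - p) >> 256) & 1
    return c - (1 - s) * p


def freeze(c, p=P):
    """Fully reduce a value below 2^256 using two conditional subtractions."""
    return cond_sub(cond_sub(c, p), p)
```

Two subtractions suffice because $2^{256} = 2p + 38$, so any 256-bit value is less than $3p$.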

Constant-time NIST P-256

NIST P-256 prime $p$ is $2^{256} - 2^{224} + 2^{192} + 2^{96} - 1$.
ECDSA standard specifies reduction procedure given an integer “$A$ less than $p^2$”:
Write $A$ as $(A_{15}, A_{14}, A_{13}, A_{12}, A_{11}, A_{10}, A_9, A_8, A_7, A_6, A_5, A_4, A_3, A_2, A_1, A_0)$, meaning $\sum_i A_i 2^{32i}$.
Define $T; S_1; S_2; S_3; S_4; D_1; D_2; D_3; D_4$ as
$(A_7, A_6, A_5, A_4, A_3, A_2, A_1, A_0)$;
$(A_{15}, A_{14}, A_{13}, A_{12}, A_{11}, 0, 0, 0)$;
$(0, A_{15}, A_{14}, A_{13}, A_{12}, 0, 0, 0)$;
$(A_{15}, A_{14}, 0, 0, 0, A_{10}, A_9, A_8)$;
$(A_8, A_{13}, A_{15}, A_{14}, A_{13}, A_{11}, A_{10}, A_9)$;
$(A_{10}, A_8, 0, 0, 0, A_{13}, A_{12}, A_{11})$;
$(A_{11}, A_9, 0, 0, A_{15}, A_{14}, A_{13}, A_{12})$;
$(A_{12}, 0, A_{10}, A_9, A_8, A_{15}, A_{14}, A_{13})$;
$(A_{13}, 0, A_{11}, A_{10}, A_9, 0, A_{15}, A_{14})$.

Compute $T + 2S_1 + 2S_2 + S_3 + S_4 - D_1 - D_2 - D_3 - D_4$.
Reduce modulo $p$ “by adding or subtracting a few copies” of $p$.
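
The word shuffling above can be checked with a short Python sketch. The helper names (`words32`, `pack`, `p256_terms`, `p256_reduce`) are hypothetical. The final step uses Python's `%` purely to confirm the arithmetic; it stands in for the "few copies" adjustment and is not a constant-time implementation.

```python
# Sketch: the P-256 word-based reduction, checked against ordinary "%".
P256 = 2**256 - 2**224 + 2**192 + 2**96 - 1


def words32(a):
    """Split an integer below 2^512 into 32-bit words A[0..15], A[0] least significant."""
    return [(a >> (32 * i)) & 0xFFFFFFFF for i in range(16)]


def pack(*ws):
    """Combine eight 32-bit words, listed most significant first as on the slide."""
    value = 0
    for w in ws:
        value = (value << 32) | w
    return value


def p256_terms(a):
    """Return T + 2*S1 + 2*S2 + S3 + S4 - D1 - D2 - D3 - D4 (not yet reduced)."""
    A = words32(a)
    T = pack(A[7], A[6], A[5], A[4], A[3], A[2], A[1], A[0])
    S1 = pack(A[15], A[14], A[13], A[12], A[11], 0, 0, 0)
    S2 = pack(0, A[15], A[14], A[13], A[12], 0, 0, 0)
    S3 = pack(A[15], A[14], 0, 0, 0, A[10], A[9], A[8])
    S4 = pack(A[8], A[13], A[15], A[14], A[13], A[11], A[10], A[9])
    D1 = pack(A[10], A[8], 0, 0, 0, A[13], A[12], A[11])
    D2 = pack(A[11], A[9], 0, 0, A[15], A[14], A[13], A[12])
    D3 = pack(A[12], 0, A[10], A[9], A[8], A[15], A[14], A[13])
    D4 = pack(A[13], 0, A[11], A[10], A[9], 0, A[15], A[14])
    return T + 2 * S1 + 2 * S2 + S3 + S4 - D1 - D2 - D3 - D4


def p256_reduce(a):
    """Reduce a < p^2 modulo p; '%' replaces the add/subtract-copies step here."""
    return p256_terms(a) % P256
```

The unreduced sum lies strictly between $-4 \cdot 2^{256}$ and $6 \cdot 2^{256}$, which is why only "a few copies" of $p$ are needed, though how many depends on the input.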

What is “a few copies”? A loop? Variable time.

Correct but quite slow: conditionally add $4p$, conditionally add $2p$, conditionally add $p$, conditionally sub $4p$, conditionally sub $2p$, conditionally sub $p$.

Delay until end of computation? Trouble: “$A$ less than $p^2$”.

Even worse: what about platforms where $2^{32}$ isn’t best radix?

The Montgomery ladder

    x2,z2,x3,z3 = 1,0,x1,1
    for i in reversed(range(255)):
        bit = 1 & (n >> i)
        x2,x3 = cswap(x2,x3,bit)
        z2,z3 = cswap(z2,z3,bit)
        x3,z3 = ((x2*x3-z2*z3)^2,
                 x1*(x2*z3-z2*x3)^2)
        x2,z2 = ((x2^2-z2^2)^2,
                 4*x2*z2*(x2^2+A*x2*z2+z2^2))
        x2,x3 = cswap(x2,x3,bit)
        z2,z3 = cswap(z2,z3,bit)
    return x2*z2^(p-2)
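
The listing above treats `^` as exponentiation on field elements and leaves `cswap` undefined. A plain-Python rendering might look like the sketch below. It assumes the Curve25519 parameters from the title slide, a hypothetical arithmetic `cswap`, and explicit reduction mod `p`; Python integers do not give real constant-time behavior.

```python
# Sketch: the x-only Montgomery ladder in plain Python, assuming Curve25519 parameters.
P = 2**255 - 19
A = 486662


def cswap(a, b, bit):
    """Swap a and b when bit == 1, using masking instead of a branch."""
    mask = -bit              # 0 when bit == 0, all ones (-1) when bit == 1
    t = mask & (a ^ b)
    return a ^ t, b ^ t


def ladder(n, x1, p=P):
    """Return the x-coordinate of n*Q, where Q has x-coordinate x1."""
    x2, z2, x3, z3 = 1, 0, x1, 1
    for i in reversed(range(255)):
        bit = 1 & (n >> i)
        x2, x3 = cswap(x2, x3, bit)
        z2, z3 = cswap(z2, z3, bit)
        # Differential addition, then doubling; both use the pre-update x2, z2.
        x3, z3 = ((x2 * x3 - z2 * z3) ** 2 % p,
                  x1 * (x2 * z3 - z2 * x3) ** 2 % p)
        x2, z2 = ((x2 ** 2 - z2 ** 2) ** 2 % p,
                  4 * x2 * z2 * (x2 ** 2 + A * x2 * z2 + z2 ** 2) % p)
        x2, x3 = cswap(x2, x3, bit)
        z2, z3 = cswap(z2, z3, bit)
    # Single division at the end, done as x2 * z2^(p-2).
    return x2 * pow(z2, p - 2, p) % p
```

A quick sanity check is the Diffie–Hellman property: `ladder(a, ladder(b, 9))` should equal `ladder(b, ladder(a, 9))` for scalars `a`, `b` below $2^{255}$.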
Simple; fast; always computes scalar multiplication
when ❆2 4 is non-square. With some extra lines can compute (①❀ ②) output given (①❀ ②) input. But simpler to use just ①, as proposed by 1985 Miller.
SLIDE 49 10
The Montgomery ladder
x2,z2,x3,z3 = 1,0,x1,1 for i in reversed(range(255)): bit = 1 & (n >> i) x2,x3 = cswap(x2,x3,bit) z2,z3 = cswap(z2,z3,bit) x3,z3 = ((x2*x3-z2*z3)^2, x1*(x2*z3-z2*x3)^2) x2,z2 = ((x2^2-z2^2)^2, 4*x2*z2*(x2^2+A*x2*z2+z2^2)) x2,x3 = cswap(x2,x3,bit) z2,z3 = cswap(z2,z3,bit) return x2*z2^(p-2)
11
Simple; fast; always computes scalar multiplication on y^2 = x^3 + Ax^2 + x when A^2 - 4 is non-square.
With some extra lines can compute (x, y) output given (x, y) input. But simpler to use just x, as proposed by 1985 Miller.
Adaptations to NIST curves are much slower; not as simple; not proven to always work.
Other scalar-mult methods: proven but much more complex.
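The non-square condition on A^2 - 4 can be checked with Euler's criterion. This is a minimal sketch, assuming p = 2^255 - 19 and A = 486662 for Curve25519; is_nonsquare and ladder_condition_holds are hypothetical names introduced for illustration.

```python
# Euler's criterion: for an odd prime p and a not divisible by p,
# a^((p-1)/2) mod p is 1 if a is a square and p-1 if it is a non-square.
P = 2**255 - 19
A = 486662

def is_nonsquare(a, p=P):
    """True when a is a non-zero non-square modulo the odd prime p."""
    return pow(a % p, (p - 1) // 2, p) == p - 1

# Condition under which the slide says the ladder always works.
ladder_condition_holds = is_nonsquare(A*A - 4)
```

The check costs one modular exponentiation and only needs to run once per curve, not per key.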
“Hey, you forgot to check that x1 is on the curve!”
No need to check. Curve25519 is twist-secure.
“This textbook tells me to start the Montgomery ladder from the top bit set in n!”
(Exploited in, e.g., 2011 Brumley–Tuveri “Remote timing attacks are still practical”.)
The Curve25519 DH function takes 2^254 ≤ n < 2^255, so this is still constant-time.
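One way to guarantee 2^254 ≤ n < 2^255 is to fix the top bits of a 32-byte secret before running the ladder. This is a minimal sketch, assuming little-endian secrets and the usual Curve25519 convention of also clearing the low three bits. The slide states only the range, and clamp is a hypothetical helper name.

```python
def clamp(secret: bytes) -> int:
    """Map 32 secret bytes to a scalar n with 2^254 <= n < 2^255, n % 8 == 0."""
    if len(secret) != 32:
        raise ValueError("expected 32 bytes")
    n = int.from_bytes(secret, "little")
    n &= ~7                 # clear the low 3 bits (assumed convention)
    n &= (1 << 255) - 1     # clear bit 255
    n |= 1 << 254           # set bit 254, so the top set bit is always the same
    return n
```

Because bit 254 is always set, a ladder that starts from the top set bit runs the same number of steps as one that always runs 255 iterations.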
Subsequent developments

More Curve25519 implementations:
2007 Gaudry–Thomé: tuned for Core 2, Athlon 64.
2009 Costigan–Schwabe: Cell.
2011 Bernstein–Duif–Lange–Schwabe–Yang: Nehalem etc.
2012 Bernstein–Schwabe: NEON.
2014 Langley–Moon: various newer Intel chips.
2014 Mahé–Chauvet: GPUs.
2014 Sasdrich–Güneysu: FPGAs.
2011 Bernstein–Duif–Lange–Schwabe–Yang: Ed25519, reusing Curve25519 for signatures.
2013 Bernstein–Janssen–Lange–Schwabe: TweetNaCl.
2014 Chen–Hsu–Lin–Schwabe–Tsai–Wang–Yang–Yang: “Verifying Curve25519 software.”
http://en.wikipedia.org/wiki/Curve25519#Notable_uses lists Apple’s iOS, OpenSSH, TextSecure, Tor, et al.
Much longer list maintained by Nicolai Brown (IANIX).
2013.08: Silent Circle requests non-NIST curve at higher security level.
Bernstein–Lange: Curve41417. Now Silent Circle’s default.
Bernstein–Lange, independently Hamburg, independently Aranha–Barreto–Pereira–Ricardini: E-521.
More options hurt simplicity; do they really help security?
Note that typical claims regarding AES-ECC “balance” disregard multiple users; lucky attacks; quantum attacks.