SLIDE 3 1
Small cryptographic bytecode
Bernstein
elaborating on an idea from Adam Langley
2
“Line search”: trying to find minimum of function f defined on x-line.
e.g. “Bisection”, trying to find minimum in interval [x0, x1]: Replace interval with either [x0, (x0+x1)/2] or [(x0+x1)/2, x1]; try to make sensible choice. Iterate many times.
Can try to reduce #iterations using smarter models of f: see, e.g., “secant method”.
Harder when f varies more.
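A minimal sketch of the bisection idea on this slide, assuming f is unimodal on [x0, x1]. The slide does not say how to make the "sensible choice"; comparing f slightly left and right of the midpoint is one possible rule, chosen here for illustration. The name bisect_min and the parameters iterations and h are hypothetical.

```python
def bisect_min(f, x0, x1, iterations=50, h=1e-7):
    """Approximate the minimizer of f on [x0, x1] by repeated halving.

    Assumption of this sketch: f is unimodal on the interval.
    """
    for _ in range(iterations):
        mid = (x0 + x1) / 2
        # "Sensible choice" (illustrative rule, not from the slides):
        # if f is lower just left of mid, keep the left half; else the right.
        if f(mid - h) < f(mid + h):
            x1 = mid
        else:
            x0 = mid
    return (x0 + x1) / 2


if __name__ == "__main__":
    # Example: f(x) = (x - 2)^2 has its minimum at x = 2.
    print(bisect_min(lambda x: (x - 2) ** 2, 0.0, 5.0))
```

Each iteration halves the interval, so the number of iterations grows only with the logarithm of the desired precision; the secant-style models mentioned on the slide try to do better by using more information about f.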
SLIDE 10
3
How to find minimum of function f defined on (x, y)-plane?
“Gradient descent”: Starting from (x0, y0), try to figure out direction where f decreases fastest.
Could do line search to find minimum in that direction. Then find a new direction.
Better: Step down that direction. Then find a new direction.
Silly: Line search in x direction; line search in y direction; repeat.
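The two strategies on this slide can be sketched side by side. Everything below is an illustrative assumption rather than anything from the slides: the test function f (a convex quadratic with minimum at (10/7, -6/7)), the fixed step size, the finite-difference gradient, and the ternary search used as the one-dimensional line search.

```python
def f(x, y):
    # Hypothetical test function: convex quadratic, minimum at (10/7, -6/7).
    return (x - 1) ** 2 + 2 * (y + 0.5) ** 2 + x * y


def gradient(f, x, y, h=1e-6):
    # Central-difference estimate of the gradient of f at (x, y).
    dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dx, dy


def gradient_descent(f, x, y, step=0.1, iterations=200):
    # "Better": step down the direction of fastest decrease, then re-aim.
    for _ in range(iterations):
        dx, dy = gradient(f, x, y)
        x, y = x - step * dx, y - step * dy
    return x, y


def line_search(g, lo, hi, iterations=100):
    # Ternary search for the minimizer of a unimodal 1-D function g.
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if g(a) < g(b):
            hi = b
        else:
            lo = a
    return (lo + hi) / 2


def alternating_line_searches(f, x, y, sweeps=30, lo=-10.0, hi=10.0):
    # "Silly": line search in x direction; line search in y direction; repeat.
    for _ in range(sweeps):
        x = line_search(lambda t: f(t, y), lo, hi)
        y = line_search(lambda t: f(x, t), lo, hi)
    return x, y


if __name__ == "__main__":
    print(gradient_descent(f, 0.0, 0.0))
    print(alternating_line_searches(f, 0.0, 0.0))
```

On a well-behaved function like this one both approaches reach the minimum; the alternating version only ever moves parallel to an axis, which is the restriction the rest of the talk compares to separate code and compiler optimization.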
SLIDE 16
4
Keccak optimization
Goal: Fastest C code for Keccak on Cortex-M4 CPU core.
You start with simple C code implementing Keccak.
You compile it; see how fast it is; modify it to try to make it faster; repeat; eventually stop trying.
You publish your fastest code. Maybe lots of people use it, and care about its speed.
SLIDE 24
5
Compiler writer learns about your Keccak Cortex-M4 C code.
Compiles it; sees how fast it is. Modifies compiler to try to make the compiled code faster. Repeats; eventually stops trying.
Publishes a new compiler version.
Later: Maybe you try the new compiler. Whole process repeats.
You treat compiler as constant. Compiler treats code as constant.
SLIDE 32
6
Define f(x, y) as time taken by code x with compiler y.
x0: initial code. y0: initial compiler.
You try to minimize f(x, y0). x1: new code from this line search in x direction.
Compiler writer tries to minimize f(x1, y). y1: new compiler from this line search in y direction.
This whole approach is silly.
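Written as an iteration, the back-and-forth described on this slide looks roughly like the following. This is an idealized restatement: it assumes each party finds an exact minimizer, whereas the slides only say each "tries to" minimize and eventually stops.

```latex
\begin{align*}
x_1 &= \arg\min_{x} f(x, y_0) && \text{(you, with the compiler held fixed)}\\
y_1 &= \arg\min_{y} f(x_1, y) && \text{(compiler writer, with the code held fixed)}\\
x_2 &= \arg\min_{x} f(x, y_1), \quad y_2 = \arg\min_{y} f(x_2, y), \quad \dots
\end{align*}
```

Under that assumption the times are non-increasing, f(x0, y0) ≥ f(x1, y0) ≥ f(x1, y1) ≥ f(x2, y1) ≥ ..., but every move is restricted to one axis at a time.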
SLIDE 39
7
min{f(x, y)} is the time taken by fastest Keccak Cortex-M4 asm.
Slowly bouncing between x-line searches, y-line searches is a silly way to approach this min.
Clearly min can be achieved by many different pairs (x, y). Which pair is easiest to find?
Generalize from C to other languages: which language makes min easiest to find?
Why did goal say “C code”? End user doesn’t need C.
SLIDE 47
8
Does end user need Cortex-M4?
CPU designer learns about your Keccak Cortex-M4 asm. Modifies the CPU design to try to make this code faster. Repeats; eventually stops trying.
Years later, sells a new CPU. You reoptimize for this CPU.
Sometimes CPUs try extending or replacing instruction set, but this is poorly coordinated with programmers, compiler writers.
SLIDE 53
9
Generalize f(x, y) definition: f(x, y) is time taken by code x on platform y.
If compiler y on code x produces asm y(x) for Cortex-M4: f(x, y) = f(y(x), Cortex-M4).
Without the CPU changing: Minimize f(a, Cortex-M4). Search for (x, y) with y(x) = a.
Typical CPU designer: View a as a constant; try to minimize f(a, y). Silly optimization approach.
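One way to read the reduction on this slide, with x ranging over C code, y over compilers targeting Cortex-M4, and a over Cortex-M4 asm. The symbol a* (a fastest asm) is introduced here for illustration, and \text assumes the amsmath package.

```latex
\[
\min_{x,\,y} f(x, y)
  \;=\; \min_{x,\,y} f\bigl(y(x), \text{Cortex-M4}\bigr)
  \;\ge\; \min_{a} f(a, \text{Cortex-M4}),
\]
```

with equality as soon as some pair (x, y) satisfies y(x) = a*. So the hard part is minimizing over a; any (x, y) that reproduces the best a is then equally good, which is why the slide asks only to "search for" such a pair.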
SLIDE 54
8
end user need Cortex-M4? designer learns about your Keccak Cortex-M4 asm. difies the CPU design to make this code faster. eats; eventually stops trying. later, sells a new CPU. reoptimize for this CPU. Sometimes CPUs try extending replacing instruction set, but poorly coordinated with rogrammers, compiler writers.
9
Generalize f (x; y) definition: f (x; y) is time taken by code x on platform y. If compiler y on code x produces asm y(x) for Cortex-M4: f (x; y) = f (y(x); Cortex-M4). Without the CPU changing: Minimize f (a; Cortex-M4). Search for (x; y) with y(x) = a. Typical CPU designer: View a as a constant; try to minimize f (a; y). Silly optimization approach. “I know I’ve develop that computes This circuit
10
“I know the minimum! I’ve developed the fastest circuit that computes Keccak. This circuit is my CPU.”
Wait a minute: “CPU” concept is more restrictive than “chip”.
Perspective of CPU designer: This chip can do anything! People want this chip to support SHA-1, SHA-2, SHA-3, SHAmir; all sorts of block ciphers; public-key cryptosystems; non-cryptographic computations.
11
Adding fast Keccak circuit (“Keccak coprocessor”) to CPU adds area to CPU.
Adding fast coprocessors for desired mix of operations adds even more area to CPU.
For same CPU area, obtain much better throughput by building many copies of original CPU core without these coprocessors.
Fast Keccak chip is special case. Doesn’t reflect general case.
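The area argument can be sketched with back-of-the-envelope arithmetic. Every number below is a hypothetical assumption (arbitrary area units, a coprocessor as large as the core, a 10x speedup on a quarter of the workload), and the time-per-job formula is a simple Amdahl-style model rather than anything from the talk.

```python
def plain_throughput(area_budget, core_area):
    """Jobs per unit time from as many plain cores as fit (1 job per core per unit time)."""
    return area_budget // core_area


def coprocessor_throughput(area_budget, core_area, coproc_area, fraction, speedup):
    """Same area budget, but every core carries coprocessors.

    `fraction` of each job is accelerated by `speedup`; the rest runs as before.
    """
    copies = area_budget // (core_area + coproc_area)
    time_per_job = (1 - fraction) + fraction / speedup
    return copies / time_per_job


# Hypothetical numbers, for illustration only.
AREA_BUDGET = 120
CORE_AREA = 10
COPROC_AREA = 10
FRACTION = 0.25
SPEEDUP = 10

plain = plain_throughput(AREA_BUDGET, CORE_AREA)
with_coproc = coprocessor_throughput(
    AREA_BUDGET, CORE_AREA, COPROC_AREA, FRACTION, SPEEDUP
)
print(plain, round(with_coproc, 2))
```

With these inputs the plain cores win. A much smaller coprocessor, or a workload dominated by the accelerated operation, can flip the comparison, which is why the mix of operations matters.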
12
CPU designer’s metric: What is best performance for a specified mix of operations within a particular CPU area?
CPU designer is much more likely to consider incorporating a small Keccak coprocessor.
“So we should design the smallest Keccak circuit?”
Maybe, but will this extreme be faster than using existing CPU instructions without coprocessor?
13
Intel typically designs quite large CPU cores: 32KB L1 data cache, 32KB L1 instruction cache, several fast multipliers, many different instructions, etc.
“So it’s small cost for Intel to add instruction-set extension for my favorite crypto!”
Yes, but even smaller benefit for Intel’s mix of operations.
14
Intel did add instruction for 1 round of AES.
How many parallel S-boxes are in an AES-round coprocessor?
Can be 16: big; fast.
8: smaller but slower.
4: even smaller but slower.
...
1: probably not worthwhile compared to skipping coprocessor and using other CPU instructions.
An instruction for 4 rounds of SHA-256 is in a few Intel CPUs.
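A rough sketch of the S-box tradeoff, assuming the only cost that varies is the SubBytes step: an AES round applies one S-box lookup to each of the 16 state bytes, so k parallel S-boxes need ceil(16/k) passes. Counting area as "number of S-boxes" and ignoring the rest of the round are simplifying assumptions, not figures from the talk.

```python
import math

STATE_BYTES = 16  # AES state: 16 bytes, one S-box lookup per byte per round


def sbox_passes(parallel_sboxes):
    """Sequential passes needed to apply the S-box to all 16 state bytes."""
    return math.ceil(STATE_BYTES / parallel_sboxes)


# (S-boxes built, passes per round); area is assumed proportional to S-box count.
tradeoff = [(k, sbox_passes(k)) for k in (16, 8, 4, 2, 1)]

for sboxes, passes in tradeoff:
    print(f"{sboxes:2d} S-boxes: {passes:2d} passes per round")
```

In this model area times passes stays constant, so the small end looks free. In practice control logic and the other round steps add fixed overhead, which is one reason the single-S-box extreme may not beat ordinary CPU instructions.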
15
Lightweight crypto
Frequent claim in literature, where X might be
- Keccak;
- any secure hash;
- a secure cipher; ...
“Resource-constrained IoT devices need the smallest circuit for X.”
Even if speed is acceptable, who will use smallest X circuit? Why should minimum area for X give minimum area for IoT+X?
16
An idea from Adam Langley
Consider a device that receives public keys from trusted sources; receives data supposedly signed under these public keys; verifies these signatures. e.g. an SSL client.
Painful historical event: all clients needed upgrades to support new hash functions since old functions were broken.
17
A public key is a signature-verification program in a limited language.
Langley’s idea: Replace this language with a full programming language.
Then can upgrade hash function (or upgrade to post-quantum signatures!) by changing public keys, with no changes to clients.
Same for public-key encryption systems: public key is program.
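A toy illustration of "public key is a program", not Langley's actual design. The four-opcode stack language, `run_key`, and `make_key` are all hypothetical. The scheme is a one-bit Lamport-style one-time signature: the key holds the hashes of two secrets, and the signature on bit b is secret b. The hash functions here are built-in opcodes of the interpreter; in a full programming language the hash itself would be written inside the key.

```python
import hashlib


def run_key(program, message_bit, signature):
    """Client side: run a public-key program; True means 'signature accepted'."""
    stack = []
    for op, *args in program:
        if op == "sig":          # push the candidate signature
            stack.append(signature)
        elif op == "hash":       # replace top of stack with its digest
            stack.append(hashlib.new(args[0], stack.pop()).digest())
        elif op == "select":     # push the constant chosen by the message bit
            stack.append(args[message_bit])
        elif op == "eq":         # compare the two topmost entries
            b, a = stack.pop(), stack.pop()
            stack.append(a == b)
        else:
            raise ValueError(f"unknown op {op!r}")
    return stack == [True]


def make_key(hash_name, secret0, secret1):
    """Signer side: build the public-key program for a chosen hash function."""
    def digest(data):
        return hashlib.new(hash_name, data).digest()
    return [("sig",), ("hash", hash_name),
            ("select", digest(secret0), digest(secret1)), ("eq",)]


# Hypothetical secrets, for illustration only.
secret0, secret1 = b"secret for bit 0", b"secret for bit 1"

old_key = make_key("sha256", secret0, secret1)
new_key = make_key("sha3_256", secret0, secret1)  # hash upgraded inside the key

old_ok = run_key(old_key, 1, secret1)
new_ok = run_key(new_key, 1, secret1)
forged = run_key(new_key, 1, secret0)
print(old_ok, new_ok, forged)
```

The same `run_key` verifies both keys: the move from SHA-256 to SHA3-256 happens entirely in the key, which is the point of the idea. A real deployment would need a far richer language and a real signature scheme.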
18
Say verification device is a chip of area A. How small can public keys be?
Have to consider, e.g., size of a SHA-256 program, size of a Keccak program, etc.
Similar question to optimizing total size of a CPU with a SHA-256 instruction, a Keccak instruction, etc.
Not the usual code-size question. Change the language!
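One way to picture "change the language" is as a small search: pick which operations become built-in instructions of the verifier so that it still fits in area A while the keys that call them shrink. All sizes below are hypothetical, in arbitrary units, and the exhaustive search over subsets is only a sketch for two candidate instructions.

```python
from itertools import combinations

# Hypothetical sizes; illustration only.
BASE_INTERPRETER_AREA = 100                      # verifier with no hash instructions
BUILTIN_AREA = {"sha256": 40, "keccak": 60}      # extra area per built-in instruction
PROGRAM_BYTES = {"sha256": 300, "keccak": 250}   # key bytes if written in base bytecode
CALL_BYTES = 2                                   # key bytes to invoke a built-in


def key_bytes(needed, builtins):
    """Size of a key that needs the hashes in `needed`, given the built-ins."""
    return sum(CALL_BYTES if h in builtins else PROGRAM_BYTES[h] for h in needed)


def best_language(area_limit, needed):
    """Return (key bytes, verifier area, built-ins) minimizing key size within area."""
    best = None
    names = sorted(BUILTIN_AREA)
    for r in range(len(names) + 1):
        for builtins in combinations(names, r):
            area = BASE_INTERPRETER_AREA + sum(BUILTIN_AREA[b] for b in builtins)
            if area > area_limit:
                continue
            candidate = (key_bytes(needed, builtins), area, builtins)
            if best is None or candidate < best:
                best = candidate
    return best


small_chip = best_language(100, ["keccak"])
medium_chip = best_language(150, ["sha256"])
large_chip = best_language(200, ["sha256", "keccak"])
print(small_chip, medium_chip, large_chip)
```

The sketch leaves out the interesting middle ground, where the language gains smaller general-purpose instructions that shorten many hash programs at once instead of one dedicated instruction per hash.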