SLIDE 1 1
Can cryptographic software be fixed?
Bernstein
2
Bob’s laptop screen:
From: Alice
Thank you for your […] We received many interesting papers, and unfortunately your […]
Bob assumes this message is something Alice actually sent. But today’s “security” systems fail to guarantee this property. Attacker could have modified or forged the message.
3
Systems are too complex.
e.g. Firefox 60 (May 2018) code: 4582680 lines in cpp files, 3093398 lines in h files, 2623454 lines in c files, etc.
Every line in this code has full control over user messages.
Critical vulnerabilities fixed in 61: CVE-2018-12359, “Buffer overflow using computed size of canvas element”; CVE-2018-12360, “Use-after-free when using focus()”; CVE-2018-12361, “Integer overflow in SwizzleData”.
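Two of the CVE titles above involve a size computation that overflows. The sketch below is a hypothetical illustration of that bug class, not the actual Firefox code: it models a 32-bit size variable in Python and shows how a width × height × bytes-per-pixel product can wrap to a small value. The function names and the 4-bytes-per-pixel figure are assumptions chosen for the example.

```python
MASK32 = 0xFFFFFFFF  # models a 32-bit unsigned size variable


def buffer_size_32bit(width, height, bytes_per_pixel=4):
    """Size as a 32-bit multiplication would compute it (wraps silently)."""
    return (width * height * bytes_per_pixel) & MASK32


def buffer_size_checked(width, height, bytes_per_pixel=4):
    """Same computation, but refuse sizes that do not fit in 32 bits."""
    size = width * height * bytes_per_pixel
    if size > MASK32:
        raise OverflowError("image dimensions too large")
    return size


# Hypothetical attacker-chosen dimensions.
width, height = 65536, 65536
true_size = width * height * 4                # bytes the pixel loop will write
wrapped = buffer_size_32bit(width, height)    # bytes the allocator is asked for
print(true_size, wrapped)
```

If code allocates `wrapped` bytes and then writes `true_size` bytes, the result is a buffer overflow. The checked variant rejects the input instead.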
4
Trusted computing base (TCB)
TCB: portion of computer system that is responsible for enforcing the users’ security policy.
Security policy for this talk: If message is displayed on Bob’s screen as “From: Alice” then message is from Alice.
If TCB works correctly, then message is guaranteed to be from Alice, no matter what the rest of the system does.
5
Examples of attack strategies:
1. Attacker uses buffer overflow in a device driver to control Linux kernel on Alice’s laptop.
2. Attacker uses buffer overflow in a web browser to control disk files on Bob’s laptop.
Device driver is in the TCB. Web browser is in the TCB. CPU is in the TCB. Etc.
Massive TCB has many bugs, including many security holes. Any hope of fixing this?
6
Classic security strategy:
Rearchitect computer systems to have a much smaller TCB.
Carefully audit the TCB.
e.g. Bob runs many VMs:
[Diagram: VM A (Alice data), VM C (Charlie data), · · ·]
TCB stops each VM from touching data in other VMs.
Browser in VM C isn’t in TCB. Can’t touch data in VM A, if TCB works correctly.
Alice also runs many VMs.
7
Focus of this talk: Cryptography
How does Bob’s laptop know that incoming network data is from Alice’s laptop?
Cryptographic solution: Message-authentication codes.
[Diagram: Alice’s message → (k) → authenticated message → untrusted network → authenticated message → (k) → Alice’s message]
[Diagram, second build: Alice’s message → (k) → authenticated message → untrusted network → modified message → (k) → “Alert: forgery!”]
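The slides do not name a particular message-authentication code. As a minimal sketch, assuming HMAC-SHA256 from the Python standard library stands in for the MAC, the two diagram outcomes look like this. The function names `authenticate` and `verify` and the demo key are illustrative choices, not from the talk.

```python
import hashlib
import hmac

TAG_LEN = 32  # bytes in an HMAC-SHA256 tag


def authenticate(k: bytes, message: bytes) -> bytes:
    """Alice's side: append a tag computed under the shared secret k."""
    tag = hmac.new(k, message, hashlib.sha256).digest()
    return message + tag


def verify(k: bytes, packet: bytes) -> bytes:
    """Bob's side: return the message, or raise if the tag does not match."""
    if len(packet) < TAG_LEN:
        raise ValueError("Alert: forgery!")
    message, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(k, message, hashlib.sha256).digest()
    # compare_digest avoids leaking where the first mismatch occurs
    if not hmac.compare_digest(tag, expected):
        raise ValueError("Alert: forgery!")
    return message


# Placeholder key for the demo only; a real k must be random and secret.
k = bytes(range(32))
packet = authenticate(k, b"Thank you for your paper")
```

`verify(k, packet)` returns the original message. Flipping any bit of `packet` on the untrusted network makes `verify` raise instead, provided the attacker does not know k.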
8
Important for Alice and Bob to share the same secret k.
What if attacker was spying on their communication of k?
Solution 1: Public-key encryption.
[Diagram labels: k, private key a, network, network, public key aG, k]
9
Solution 2: Public-key signatures.
[Diagram labels: m, signed message, network, network, signed message, m]
Fantasy world: software for authentication/encryption/sigs is small and carefully audited ⇒ no cryptographic security failures.
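For Solution 1, the labels “private key a” and “public key aG” match the usual Diffie-Hellman pattern, although the diagram itself does not survive in this text. The sketch below is a toy version under stated assumptions: it uses exponentiation modulo a prime (g^a mod p) in place of the elliptic-curve scalar multiplication aG, and the parameters `P` and `G` are illustrative, not secure choices.

```python
import hashlib
import secrets

# Toy public parameters (assumption, NOT secure choices): a Mersenne prime
# modulus and a small base. Real systems use vetted elliptic-curve groups.
P = 2**127 - 1
G = 3


def keygen():
    """Return (private key a, public key G^a mod P)."""
    a = secrets.randbelow(P - 3) + 2  # private exponent in [2, P-2]
    return a, pow(G, a, P)


def shared_key(my_private, their_public):
    """Hash the shared group element into a 32-byte secret k."""
    s = pow(their_public, my_private, P)
    return hashlib.sha256(s.to_bytes(16, "big")).digest()


a, A = keygen()  # Alice: keeps a, sends A over the network
b, B = keygen()  # Bob: keeps b, sends B over the network
k_alice = shared_key(a, B)
k_bob = shared_key(b, A)
```

Both sides end up with the same k, while a passive spy sees only the public keys. An attacker who can replace public keys in transit is a separate problem, which is where Solution 2 comes in.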
10
Real world: Cryptographic part of the TCB is huge.
Many implementations of many cryptographic primitives.
Most complications are for speed.
e.g. February 2018: Google adds NSA’s Speck cipher to Linux kernel using hand-written asm for ARM Cortex-A7 processors.
August 2018: Google switches from Speck to ChaCha12, again using hand-written assembly.
Why not ChaCha20? Speed.
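ChaCha12 and ChaCha20 differ only in how many rounds they run, so the speed gap comes directly from the round count. A minimal portable sketch, assuming the RFC 8439 state layout (32-bit counter, 96-bit nonce), which is not necessarily the layout the kernel code uses; it is nothing like the hand-written assembly mentioned above.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(s, a, b, c, d):
    """ChaCha quarter-round on four words of the state list s, in place."""
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32
    s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32
    s[b] = rotl32(s[b] ^ s[c], 7)


def chacha_block(key, counter, nonce, rounds=20):
    """One 64-byte keystream block; rounds=20 is ChaCha20, rounds=12 is ChaCha12."""
    assert len(key) == 32 and len(nonce) == 12 and rounds % 2 == 0
    init = list(struct.unpack("<4I", b"expand 32-byte k"))
    init += struct.unpack("<8I", key)
    init.append(counter & MASK32)
    init += struct.unpack("<3I", nonce)
    s = init[:]
    for _ in range(rounds // 2):
        # column round
        quarter_round(s, 0, 4, 8, 12)
        quarter_round(s, 1, 5, 9, 13)
        quarter_round(s, 2, 6, 10, 14)
        quarter_round(s, 3, 7, 11, 15)
        # diagonal round
        quarter_round(s, 0, 5, 10, 15)
        quarter_round(s, 1, 6, 11, 12)
        quarter_round(s, 2, 7, 8, 13)
        quarter_round(s, 3, 4, 9, 14)
    # feed-forward: add the initial state back in
    return struct.pack("<16I", *((x + y) & MASK32 for x, y in zip(s, init)))
```

Calling `chacha_block(key, counter, nonce, rounds=12)` does 6 double-rounds instead of 10, so roughly 60% of the round work of ChaCha20 per block.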
11
Keccak (SHA-3) team maintains “Keccak Code Package” with >20 optimized implementations of Keccak: AVX2, NEON, etc.
Includes “parallel Keccak”: many further implementations.
Why not portable C code using “optimizing” compiler? Slower.
Another example: many different primitives in NIST competition for post-quantum public-key cryptography. (See next talk.)
Some overlap in implementations, but still huge volume of code.
12
Often people still complain about cryptographic performance. e.g. NIST, May 2018: “we’d really like to see more platform-
- ptimized implementations”.
⇒ More and more software. Many security failures from incorrect computations: e.g., CVE-2017-3732, CVE-2017-3736, CVE-2017-3738 in OpenSSL.
SLIDE 61 11
Keccak (SHA-3) team maintains “Keccak Code Package” with >20 optimized implementations
- f Keccak: AVX2, NEON, etc.
Includes “parallel Keccak”: many further implementations. Why not portable C code using “optimizing” compiler? Slower. Another example: many different primitives in NIST competition for post-quantum public-key
- cryptography. (See next talk.)
Some overlap in implementations, but still huge volume of code.
12
Often people still complain about cryptographic performance.
e.g. NIST, May 2018: “we’d really like to see more platform-optimized implementations”.
⇒ More and more software.
Many security failures from incorrect computations: e.g., CVE-2017-3732, CVE-2017-3736, CVE-2017-3738 in OpenSSL.
Many security failures from variable-time computations: e.g., CVE-2018-0495, CVE-2018-0737, CVE-2018-5407 in OpenSSL.
13
Timing attacks
Large portion of CPU hardware: optimizations depending on addresses of memory locations.
Consider data caching, instruction caching, parallel cache banks, store-to-load forwarding, branch prediction, etc.
Many attacks (e.g. TLBleed from 2018 Gras–Razavi–Bos–Giuffrida) show that this portion of the CPU has trouble keeping secrets.
14
Typical literature on this topic:
Understand this portion of CPU. But details are often proprietary, not exposed to security review.
Try to push attacks further. This becomes very complicated.
Tweak the attacked software to try to stop the known attacks.
For researchers: This is great!
For auditors: This is a nightmare. Many years of security failures. No confidence in future security.
15
The “constant-time” solution: Don’t give any secrets to this portion of the CPU.
(1987 Goldreich, 1990 Ostrovsky: Oblivious RAM; 2004 Bernstein: domain-specific for better speed)
TCB analysis: Need this portion of the CPU to be correct, but don’t need it to keep secrets. Makes auditing much easier.
Good match for attitude and experience of CPU designers: e.g., Intel issues errata for correctness bugs, not for information leaks.
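A minimal sketch of what "don't give secrets to this portion of the CPU" can mean for a table lookup. The 16-entry table, the function names and the masking trick are illustrative assumptions, not code from the talk. The first function uses the secret as a load address; the second reads every entry and selects with a mask, so the sequence of addresses does not depend on the secret.

```c
#include <stdint.h>

/* Leaky (hypothetical example): the load address depends on the secret
   index, so caches, TLBs, etc. get to see it. */
uint32_t lookup_leaky(const uint32_t table[16], uint32_t secret_index)
{
    return table[secret_index & 15];
}

/* Constant-time sketch: touch all 16 entries in a fixed order and keep
   only the wanted one via a mask. No secret-dependent address or branch
   appears in the source. */
uint32_t lookup_ct(const uint32_t table[16], uint32_t secret_index)
{
    uint32_t result = 0;
    for (uint32_t i = 0; i < 16; ++i) {
        uint32_t diff = i ^ (secret_index & 15);  /* 0 only when i matches */
        /* (diff - 1) computed in 64 bits underflows only for diff == 0,
           so the top 32 bits are all ones exactly in that case. */
        uint32_t mask = (uint32_t)(((uint64_t)diff - 1) >> 32);
        result |= table[i] & mask;
    }
    return result;
}
```

The price is 16 loads instead of 1. As with any C-level sketch, the compiled output still has to be inspected, since a compiler is free to rewrite the loop.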
16
Case study: Constant-time sorting
Subroutine in (e.g.) BIG QUAKE, Classic McEliece, GeMSS, Gravity-SPHINCS, LEDAkem, LEDApkc, NTRU Prime, Round2: sort array of secret integers. e.g. sort 768 32-bit integers.
Typical sorting algorithms (merge sort, quicksort, etc.) choose load/store addresses based on secret data. Usually also branch based on secret data.
How to sort secret data without any secret addresses?
17
Foundation of solution: a comparator sorting 2 integers.
Inputs x, y; outputs min{x, y}, max{x, y}.
Easy constant-time exercise in C.
Warning: C standard allows compiler to break the solution.
Even easier exercise in asm.
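One possible way to do the comparator exercise in C is sketched below. The function name `minmax` matches the call used in the sorting code of this deck, but this body is an assumption, not the author's implementation. It avoids branches and secret-dependent addresses by computing a swap mask from the sign of a 64-bit difference.

```c
#include <stdint.h>

/* Hypothetical comparator sketch: afterwards *x holds the minimum and
   *y the maximum of the two inputs. */
void minmax(int32_t *x, int32_t *y)
{
    int32_t a = *x, b = *y;
    /* The difference of two 32-bit values fits in 64 bits, so no signed
       overflow; the top bit is 1 exactly when b < a. */
    uint64_t d = (uint64_t)((int64_t)b - (int64_t)a);
    int32_t mask = -(int32_t)(d >> 63);  /* -1 (all ones) if b < a, else 0 */
    int32_t t = (a ^ b) & mask;          /* a ^ b when swapping, else 0 */
    *x = a ^ t;
    *y = b ^ t;
}
```

This is where the warning about the C standard applies: nothing stops a compiler from recognizing the pattern and emitting a conditional branch, so the generated assembly has to be checked for each compiler and target.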
18
Combine comparators into a sorting network for more inputs.
Example of a sorting network:
19
Positions of comparators in a sorting network are independent of the input. Naturally constant-time.
But remember all the people complaining about speed: e.g., “We would be happy to hear that fixed weight sampling is efficient on a variety of platforms ... We have not yet been convinced that this is the case.”
$(n^2 - n)/2$ comparators in bubble sort produce complaints about performance as n increases.
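To make the bubble-sort count concrete, here is a small sketch of bubble sort written as a sorting network. The comparator body and the function names are illustrative assumptions. The loop bounds depend only on n, so the comparator positions are fixed in advance, and the returned count is $(n^2 - n)/2$.

```c
#include <stdint.h>

/* Hypothetical branch-free comparator: *x = min, *y = max. */
static void minmax(int32_t *x, int32_t *y)
{
    int32_t a = *x, b = *y;
    uint64_t d = (uint64_t)((int64_t)b - (int64_t)a);  /* top bit set iff b < a */
    int32_t mask = -(int32_t)(d >> 63);
    int32_t t = (a ^ b) & mask;
    *x = a ^ t;
    *y = b ^ t;
}

/* Bubble sort as a sorting network: every pass compares adjacent
   positions 0..end-1 regardless of the data. Returns the number of
   comparators applied. */
long long bubble_network_sort(int32_t *x, long long n)
{
    long long count = 0;
    for (long long end = n - 1; end > 0; --end) {
        for (long long i = 0; i < end; ++i) {
            minmax(x + i, x + i + 1);
            ++count;
        }
    }
    return count;
}
```

For the 768-integer case mentioned in the case study this is 294528 comparators, which is the kind of cost that triggers the speed complaints.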
20
void int32_sort(int32 *x,int64 n)
{
  int64 t,p,q,i;
  if (n < 2) return;
  t = 1;
  while (t < n - t) t += t;
  for (p = t;p > 0;p >>= 1) {
    for (i = 0;i < n - p;++i)
      if (!(i & p))
        minmax(x+i,x+i+p);
    for (q = t;q > p;q >>= 1)
      for (i = 0;i < n - q;++i)
        if (!(i & p))
          minmax(x+i+p,x+i+q);
  }
}
21
Previous slide: C translation of 1973 Knuth “merge exchange”, which is a simplified version of 1968 Batcher “odd-even merge” sorting networks.
$\approx n(\log_2 n)^2/4$ comparators. Much faster than bubble sort.
Warning: many other descriptions of Batcher’s sorting networks require n to be a power of 2.
Also, Wikipedia says “Sorting networks ... are not capable of handling arbitrarily large inputs.”
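The comparator counts can be checked directly. The sketch below reuses the loop structure of the `int32_sort` code but only counts how many times the comparator would be called; the function names are hypothetical. It works for any n, not only powers of 2.

```c
/* Counts the comparators that the merge-exchange loop structure applies
   to an n-element input (same loops as int32_sort, with the minmax
   calls replaced by a counter). */
long long merge_exchange_comparators(long long n)
{
    long long t, p, q, i, count = 0;
    if (n < 2) return 0;
    t = 1;
    while (t < n - t) t += t;
    for (p = t; p > 0; p >>= 1) {
        for (i = 0; i < n - p; ++i)
            if (!(i & p)) ++count;
        for (q = t; q > p; q >>= 1)
            for (i = 0; i < n - q; ++i)
                if (!(i & p)) ++count;
    }
    return count;
}

/* Comparators in the bubble-sort network: (n^2 - n)/2. */
long long bubble_comparators(long long n)
{
    return (n * n - n) / 2;
}
```

For n = 4 the count is 5, the familiar five-comparator network on four inputs. Calling both functions with n = 768 shows the gap between the two networks for the input size used in the case study.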
22
This constant-time sorting code
↓ vectorization (for Haswell)
Constant-time sorting code included in 2017 Bernstein–Chuengsatiansup–Lange–van Vredendaal “NTRU Prime” software release
↓ revamped for higher speed
“djbsort” constant-time sorting code
23
The slowdown for constant time
Massive fast-sorting literature. Includes several efforts to optimize sorting using AVX2 instructions on modern Intel CPUs: e.g. 2015 Gueron–Krasnov quicksort.
Haswell (titan0) cycles, n = 768:
25608 stdsort
21844 herf
18548 oldavx2 (2017 BCLvV)
15136 krasnov
 6596 avx2 (2018 djbsort)
No slowdown. New speed records!
24
How can an $n(\log n)^2$ algorithm beat standard $n \log n$ algorithms?
Answer: well-known trends in CPU design, reflecting fundamental hardware costs of various operations.
Every cycle, Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers.
Loading a 32-bit integer from a random address: much slower.
Conditional branch: much slower.
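To make the "8 min + 8 max" figure concrete, here is a portable sketch of one comparator step applied to 8 independent pairs at once. The function name minmax8 and the LANES constant are hypothetical. The assumption, which depends on the compiler and its flags, is that a loop of this shape can be turned into one vector min and one vector max (on AVX2, the vpminsd and vpmaxsd instructions). Nothing in the C language guarantees that, so the generated code has to be inspected.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8  /* a 256-bit vector register holds 8 32-bit integers */

/* Compare-exchange 8 independent pairs: afterwards lo[k] <= hi[k] for every k.
   The loop bounds and the addresses touched do not depend on the data. */
void minmax8(int32_t *lo, int32_t *hi)
{
  int k;
  assert(lo && hi);
  for (k = 0; k < LANES; ++k) {
    int32_t a = lo[k], b = hi[k];
    lo[k] = a < b ? a : b;  /* one "min" per lane */
    hi[k] = a < b ? b : a;  /* one "max" per lane */
  }
}
```

A sorting network is built only from steps like this one, which is why it avoids both operations the slide calls much slower: loads from data-dependent addresses and data-dependent branches.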
25
Verification
Sorting software is in the TCB. Does it work correctly?
Test the sorting software on many random inputs, increasing inputs, decreasing inputs. Seems to work.
But are there occasional inputs where this sorting software fails to sort correctly?
History: Many security problems involve occasional inputs where TCB works incorrectly.
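The testing step described above can be sketched as a small harness. All names here (passes_tests, reference_sort, broken_sort) are hypothetical, and the C library's qsort stands in as the reference. The sort under test is passed as a function pointer. Such a harness can catch a failure but cannot show that none exists, which is the slide's point.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef void (*sort_fn)(int32_t *x, long long n);

static int cmp_int32(const void *a, const void *b)
{
  int32_t u = *(const int32_t *)a, v = *(const int32_t *)b;
  return (u > v) - (u < v);
}

/* Reference (not constant-time): the C library's qsort. */
void reference_sort(int32_t *x, long long n)
{
  if (n > 0) qsort(x, (size_t)n, sizeof x[0], cmp_int32);
}

/* Deliberately wrong sort for demonstration: never moves the last element. */
void broken_sort(int32_t *x, long long n)
{
  reference_sort(x, n - 1);
}

static uint32_t next_random(uint32_t *state)  /* xorshift32 generator */
{
  uint32_t s = *state;
  s ^= s << 13; s ^= s >> 17; s ^= s << 5;
  return *state = s;
}

/* kind 0: increasing, kind 1: decreasing, kind 2: pseudo-random. */
static void fill(int32_t *x, long long n, int kind, uint32_t *state)
{
  long long i;
  for (i = 0; i < n; ++i) {
    if (kind == 0) x[i] = (int32_t)i;
    else if (kind == 1) x[i] = (int32_t)(n - i);
    else x[i] = (int32_t)(next_random(state) & 0x7fffffff) - 0x40000000;
  }
}

/* Returns 1 if sort matches the reference on all three input
   families for every length below max_n, else 0. */
int passes_tests(sort_fn sort, long long max_n)
{
  uint32_t state = 2463534242u;
  long long n;
  int kind, ok = 1;
  int32_t *x, *y;
  assert(max_n > 0);
  x = malloc((size_t)max_n * sizeof *x);
  y = malloc((size_t)max_n * sizeof *y);
  assert(x && y);
  for (n = 0; n < max_n; ++n)
    for (kind = 0; kind < 3; ++kind) {
      fill(x, n, kind, &state);
      memcpy(y, x, (size_t)n * sizeof *x);
      sort(x, n);
      reference_sort(y, n);
      if (memcmp(x, y, (size_t)n * sizeof *x) != 0) ok = 0;
    }
  free(x);
  free(y);
  return ok;
}
```

Here passes_tests(reference_sort, 64) returns 1 and passes_tests(broken_sort, 64) returns 0, because the decreasing inputs expose the unmoved last element.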
26
For each used n (e.g., 768):
C code
↓ normal compiler
machine code
↓ symbolic execution
fully unrolled code
↓ new peephole optimizer
unrolled min-max code
↓ new sorting verifier
yes, code works
27
Symbolic execution: use existing “angr” library, with tiny new patches for eliminating byte splitting, adding a few missing vector instructions.
Peephole optimizer: recognize instruction patterns equivalent to min, max.
Sorting verifier: decompose DAG into merging networks. Verify each merging network using generalization of 2007 Even–Levi–Litman, correction of 1990 Chung–Ravikumar.
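For intuition about what it means to verify a min-max network on all inputs, here is a brute-force sketch based on the zero-one principle: a comparator network sorts every input if and only if it sorts every input made of 0s and 1s. This is a different and much weaker technique than the merging-network decomposition named above. It needs 2^n trials, so it only works for small n and is hopeless for n = 768. The function names are hypothetical, and the comparator schedule is the merge-exchange loop from the earlier slide.

```c
#include <assert.h>
#include <stdint.h>

/* Apply one comparator to a 0-1 input packed into the bits of v:
   position i receives the min (AND), position j receives the max (OR). */
static uint32_t comparator01(uint32_t v, int i, int j)
{
  uint32_t a = (v >> i) & 1, b = (v >> j) & 1;
  v &= ~((UINT32_C(1) << i) | (UINT32_C(1) << j));
  v |= (a & b) << i;
  v |= (a | b) << j;
  return v;
}

/* Run the merge-exchange comparator schedule on a packed 0-1 input. */
static uint32_t network01(uint32_t v, int n)
{
  int t, p, q, i;
  if (n < 2) return v;
  t = 1;
  while (t < n - t) t += t;
  for (p = t; p > 0; p >>= 1) {
    for (i = 0; i < n - p; ++i)
      if (!(i & p)) v = comparator01(v, i, i + p);
    for (q = t; q > p; q >>= 1)
      for (i = 0; i < n - q; ++i)
        if (!(i & p)) v = comparator01(v, i + p, i + q);
  }
  return v;
}

/* Returns 1 if the n-input network sorts all 2^n zero-one inputs, else 0. */
int sorts_all_01_inputs(int n)
{
  uint32_t v, limit, adjacent;
  assert(n >= 0 && n <= 20);
  limit = UINT32_C(1) << n;
  adjacent = (limit >> 1) - 1;  /* positions 0..n-2, each with an upper neighbour */
  for (v = 0; v < limit; ++v) {
    uint32_t out = network01(v, n);
    /* out of order: a 1 at position i directly below a 0 at position i+1 */
    if (out & ~(out >> 1) & adjacent) return 0;
  }
  return 1;
}
```

The exponential cost of this check is one reason a verifier for large n has to exploit the structure of the network instead of enumerating inputs.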
28
Current djbsort release, verified AVX2 code and verified portable code: https://sorting.cr.yp.to
Includes the sorting code; automatic build-time tests; simple benchmarking program; verification tools.
Web site shows how to use the verification tools.
Next release planned: verified ARM NEON code.
29
The future
I don’t think there is a fundamental tension between
- crypto performance,
- stopping timing attacks,
- making sure software works.
See the sorting example.
Firefox has already deployed verified constant-time software for Curve25519+ChaCha20+Poly1305.
I’m working on easier verification, post-quantum code, faster code.