Can cryptographic software be fixed? From: D. J. Bernstein - PowerPoint PPT Presentation



slide-1
SLIDE 1

1

Can cryptographic software be fixed?

  • D. J. Bernstein

2

Bob’s laptop screen:

From: Alice
Thank you for your submission. We received many interesting papers, and unfortunately your

Bob assumes this message is something Alice actually sent. But today’s “security” systems fail to guarantee this property. Attacker could have modified or forged the message.

slide-7
SLIDE 7

3

Systems are too complex. e.g. Firefox 60 (May 2018) code: 4582680 lines in cpp files, 3093398 lines in h files, 2623454 lines in c files, etc. Every line in this code has full control over user messages. Critical vulnerabilities fixed in 61: CVE-2018-12359, “Buffer overflow using computed size of canvas element”; CVE-2018-12360, “Use-after-free when using focus()”; CVE-2018-12361, “Integer overflow in SwizzleData”.
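
The “computed size” bug class behind CVEs like these can be sketched in a few lines. This is a hypothetical illustration, not the actual Firefox code: a size computed in 32-bit C arithmetic wraps around, so the program allocates a tiny buffer and then writes far more data into it.

```python
# Hypothetical sketch (not the actual Firefox code) of the bug class behind
# CVE-2018-12359/12361: a size computed in 32-bit C arithmetic wraps around,
# so the program allocates a tiny buffer and then overflows it.

MASK32 = 0xFFFFFFFF  # emulate C's 32-bit unsigned wraparound

def computed_size(width: int, height: int) -> int:
    """Bytes for a 4-bytes-per-pixel canvas, as 32-bit code would compute it."""
    return (width * height * 4) & MASK32

# Honest dimensions: the computed size equals the real size.
assert computed_size(1024, 768) == 1024 * 768 * 4

# Attacker-chosen dimensions: the true size is 2^32 + 64 bytes, but the
# 32-bit computation wraps to 64 -- the allocation is 64 bytes, and the
# subsequent pixel writes overflow it.
w, h = 4, 0x10000004
true_size = w * h * 4
assert true_size == 2**32 + 64
assert computed_size(w, h) == 64
```

In C the wraparound happens silently; the explicit mask above only makes it visible.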


slide-13
SLIDE 13

4

Trusted computing base (TCB): portion of computer system that is responsible for enforcing the users’ security policy. Security policy for this talk: If message is displayed on Bob’s screen as “From: Alice” then message is from Alice. If TCB works correctly, then message is guaranteed to be from Alice, no matter what the rest of the system does.


slide-20
SLIDE 20

5

Examples of attack strategies:

  • 1. Attacker uses buffer overflow in a device driver to control Linux kernel on Alice’s laptop.
  • 2. Attacker uses buffer overflow in a web browser to control disk files on Bob’s laptop.

Device driver is in the TCB. Web browser is in the TCB. CPU is in the TCB. Etc. Massive TCB has many bugs, including many security holes. Any hope of fixing this?


slide-28
SLIDE 28

6

Classic security strategy: Rearchitect computer systems to have a much smaller TCB. Carefully audit the TCB. e.g. Bob runs many VMs: VM A (Alice data), VM C (Charlie data), · · · . TCB stops each VM from touching data in other VMs. Browser in VM C isn’t in TCB. Can’t touch data in VM A, if TCB works correctly. Alice also runs many VMs.


slide-33
SLIDE 33

7

Focus of this talk: Cryptography. How does Bob’s laptop know that incoming network data is from Alice’s laptop? Cryptographic solution: Message-authentication codes.

Alice’s message + k → authenticated message → untrusted network → authenticated message + k → Alice’s message
modified message + k → “Alert: forgery!”
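
The message-authentication-code flow on this slide can be sketched with Python’s standard hmac module. HMAC-SHA-256 is a stand-in here; the slide does not name a specific MAC.

```python
import hmac, hashlib

k = b"shared secret key between Alice and Bob"

def authenticate(message: bytes, key: bytes) -> bytes:
    """Alice's side: append a 32-byte MAC tag to the message."""
    tag = hmac.new(key, message, hashlib.sha256).digest()
    return message + tag

def verify(authenticated: bytes, key: bytes) -> bytes:
    """Bob's side: recompute the tag over the message; reject on mismatch."""
    message, tag = authenticated[:-32], authenticated[-32:]
    expected = hmac.new(key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("Alert: forgery!")
    return message

sent = authenticate(b"Thank you for your submission.", k)

# Untampered network: Bob recovers Alice's message.
assert verify(sent, k) == b"Thank you for your submission."

# Attacker modifies the message in transit: verification fails.
forged = b"Congratulations!" + sent[len(b"Congratulations!"):]
try:
    verify(forged, k)
except ValueError as e:
    assert str(e) == "Alert: forgery!"
```

Only someone holding the same secret k can produce a tag that verifies, which is exactly the property the diagram relies on.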

slide-37
SLIDE 37

8

Important for Alice and Bob to share the same secret k. What if attacker was spying on their communication of k?

slide-43
SLIDE 43

8

Solution 1: Public-key encryption.
k + public key aG → ciphertext → network → ciphertext + private key a → k

9

Solution 2: Public-key signatures.
m + private key a → signed message → network → signed message + public key aG → m

Fantasy world: software for authentication/encryption/sigs is small and carefully audited ⇒ no cryptographic security failures.
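
The “private key a, public key aG” notation suggests scalar multiplication of a curve point G. A toy sketch of the same idea in a multiplicative group (g^a mod p standing in for aG) shows how both sides end up with the same k while the network only sees public values. The parameters below are far too small and structured for real security; this is illustration only.

```python
import secrets

# Toy group parameters -- illustration only, no real security.
p = 2**127 - 1   # a Mersenne prime, standing in for real group parameters
g = 3            # generator analogue of the curve point G

a = secrets.randbelow(p - 2) + 1   # Bob's private key a
aG = pow(g, a, p)                  # Bob's public key "aG" (here g^a mod p)

# Alice, knowing only aG, picks an ephemeral secret and derives a shared
# value; Bob derives the same value from Alice's public value and his a.
e = secrets.randbelow(p - 2) + 1
eG = pow(g, e, p)                  # Alice's ephemeral public value
k_alice = pow(aG, e, p)            # what Alice uses to protect k
k_bob = pow(eG, a, p)              # what Bob computes on his side

assert k_alice == k_bob            # same k on both sides; the network
                                   # only ever saw aG and eG
```

A spy who recorded aG and eG would still have to solve a discrete-logarithm-style problem to learn k, which is the point of moving beyond a directly transmitted secret.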

slide-49
SLIDE 49

10

Real world: Cryptographic part of the TCB is huge. Many implementations of many cryptographic primitives. Most complications are for speed. e.g. February 2018: Google adds NSA’s Speck cipher to Linux kernel using hand-written asm for ARM Cortex-A7 processors. August 2018: Google switches from Speck to ChaCha12, again using hand-written assembly. Why not ChaCha20? Speed.
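
ChaCha12 and ChaCha20 differ only in round count; both are built from the same quarter-round. A straightforward, unoptimized rendering, checked against the test vector in RFC 8439 section 2.1.1, looks like this; portable code of this shape is exactly what loses to hand-written assembly on speed:

```python
# ChaCha quarter-round on 32-bit words (RFC 8439, section 2.1).
# ChaCha20 applies 20 rounds built from this operation; ChaCha12 applies 12.

MASK32 = 0xFFFFFFFF

def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32

def quarter_round(a: int, b: int, c: int, d: int):
    a = (a + b) & MASK32; d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32; d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32; b = rotl32(b ^ c, 7)
    return a, b, c, d

# Test vector from RFC 8439, section 2.1.1.
assert quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567) == \
       (0xEA2A92F4, 0xCB1CF8CE, 0x4581472E, 0x5881C4BB)
```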

slide-55
SLIDE 55

10

Real world: Cryptographic part of the TCB is huge. Many implementations of many cryptographic primitives.

Most complications are for speed. e.g. February 2018: Google adds NSA’s Speck cipher to Linux kernel using hand-written asm for ARM Cortex-A7 processors. August 2018: Google switches from Speck to ChaCha12, again using hand-written assembly. Why not ChaCha20? Speed.

11

Keccak (SHA-3) team maintains “Keccak Code Package” with >20 optimized implementations of Keccak: AVX2, NEON, etc. Includes “parallel Keccak”: many further implementations. Why not portable C code using “optimizing” compiler? Slower.

Another example: many different primitives in NIST competition for post-quantum public-key cryptography. (See next talk.) Some overlap in implementations, but still huge volume of code.

12

Often people still complain about cryptographic performance. e.g. NIST, May 2018: “we’d really like to see more platform-optimized implementations”. ⇒ More and more software.

Many security failures from incorrect computations: e.g., CVE-2017-3732, CVE-2017-3736, CVE-2017-3738 in OpenSSL.

Many security failures from variable-time computations: e.g. CVE-2018-0495, CVE-2018-0737, CVE-2018-5407 in OpenSSL.
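Variable-time comparison is the classic shape of this failure class: an early-exit loop leaks, through timing, how far the inputs agree. A minimal constant-time alternative (a common sketch; illustrative, not the code behind these CVEs):

```c
#include <stdint.h>
#include <stddef.h>

/* Constant-time equality check for n-byte buffers.
   Unlike memcmp, it never exits early: it always touches all
   n bytes, so its timing does not depend on where the first
   difference occurs. Returns 1 if equal, 0 otherwise. */
static int ct_isequal(const uint8_t *a, const uint8_t *b, size_t n)
{
  uint8_t diff = 0;
  for (size_t i = 0; i < n; ++i)
    diff |= a[i] ^ b[i];          /* accumulate all differences */
  /* diff == 0 exactly when the buffers match */
  return (int)(1 & ((uint32_t)(diff - 1) >> 8));
}
```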

13

Timing attacks. Large portion of CPU hardware: optimizations depending on addresses of memory locations. Consider data caching, instruction caching, parallel cache banks, store-to-load forwarding, branch prediction, etc.

Many attacks (e.g. TLBleed from 2018 Gras–Razavi–Bos–Giuffrida) show that this portion of the CPU has trouble keeping secrets.

14

Typical literature on this topic: Understand this portion of CPU. But details are often proprietary, not exposed to security review. Try to push attacks further. This becomes very complicated. Tweak the attacked software to try to stop the known attacks.

For researchers: This is great! For auditors: This is a nightmare. Many years of security failures. No confidence in future security.

15

The “constant-time” solution: Don’t give any secrets to this portion of the CPU. (1987 Goldreich, 1990 Ostrovsky: Oblivious RAM; 2004 Bernstein: domain-specific for better speed)

TCB analysis: Need this portion of the CPU to be correct, but don’t need it to keep secrets. Makes auditing much easier.

Good match for attitude and experience of CPU designers: e.g., Intel issues errata for correctness bugs, not for information leaks.
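One way to see what “don’t give any secrets to this portion of the CPU” means in code: a table lookup at a secret index reads every entry and selects the wanted one arithmetically, so every load address is public. A minimal sketch (illustrative, not from the talk):

```c
#include <stdint.h>
#include <stddef.h>

/* Read table[secret_index] without using the secret as an address:
   scan all entries and mask in the one whose position matches.
   Every address touched is public; only data values are secret.
   (The i == secret_index comparison is assumed to compile
   branch-free; stricter versions build the mask with arithmetic.) */
static uint32_t ct_lookup(const uint32_t *table, size_t n,
                          size_t secret_index)
{
  uint32_t result = 0;
  for (size_t i = 0; i < n; ++i) {
    /* mask = all-ones iff i == secret_index, else all-zeros */
    uint32_t mask = (uint32_t)0 - (uint32_t)(i == secret_index);
    result |= table[i] & mask;
  }
  return result;
}
```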

16

Case study: Constant-time sorting. Subroutine in (e.g.) BIG QUAKE, Classic McEliece, GeMSS, Gravity-SPHINCS, LEDAkem, LEDApkc, NTRU Prime, Round2: sort array of secret integers. e.g. sort 768 32-bit integers.

Typical sorting algorithms (merge sort, quicksort, etc.) choose load/store addresses based on secret data. Usually also branch based on secret data. How to sort secret data without any secret addresses?

17

Foundation of solution: a comparator sorting 2 integers: (x, y) → (min{x, y}, max{x, y}).

Easy constant-time exercise in C. Warning: C standard allows compiler to break the solution. Even easier exercise in asm.
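One plausible C sketch of this comparator (a common branchless pattern; the slide’s own code is not shown, and as the warning says, the C standard does not guarantee a compiler preserves the branchlessness):

```c
#include <stdint.h>

/* Constant-time comparator: afterwards *a = min, *b = max.
   No secret-dependent branches or addresses. The 64-bit
   subtraction cannot overflow for 32-bit inputs; assumes the
   usual two's-complement arithmetic right shift of a signed
   value (implementation-defined in C, near-universal). */
static void minmax(int32_t *a, int32_t *b)
{
  int64_t x = *a, y = *b;
  int64_t c = (y - x) >> 63;   /* c = -1 if y < x, else 0 */
  int64_t d = (x ^ y) & c;     /* swap mask: x^y if y < x, else 0 */
  *a = (int32_t)(x ^ d);
  *b = (int32_t)(y ^ d);
}
```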

18

Combine comparators into a sorting network for more inputs. Example of a sorting network: [figure: wires carry the inputs across fixed comparator positions; diagram not captured in this transcript].
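For concreteness, a 4-input sorting network needs only five comparators, applied in a fixed order regardless of the data (sketch, not from the slides; minmax is the 2-integer comparator from above):

```c
#include <stdint.h>

/* Constant-time comparator as before: *a = min, *b = max.
   Assumes two's-complement arithmetic right shift. */
static void minmax(int32_t *a, int32_t *b)
{
  int64_t x = *a, y = *b;
  int64_t d = ((y - x) >> 63) & (x ^ y);  /* swap mask iff y < x */
  *a = (int32_t)(x ^ d);
  *b = (int32_t)(y ^ d);
}

/* A 4-input sorting network: 5 comparators at positions that are
   independent of the input values, hence naturally constant-time. */
static void sort4(int32_t x[4])
{
  minmax(&x[0], &x[1]);
  minmax(&x[2], &x[3]);
  minmax(&x[0], &x[2]);   /* min of the two mins is the overall min */
  minmax(&x[1], &x[3]);   /* max of the two maxes is the overall max */
  minmax(&x[1], &x[2]);   /* order the middle pair */
}
```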


void int32_sort(int32 *x,int64 n)
{
  int64 t,p,q,i;
  if (n < 2) return;
  t = 1;
  while (t < n - t) t += t;
  for (p = t;p > 0;p >>= 1) {
    for (i = 0;i < n - p;++i)
      if (!(i & p))
        minmax(x+i,x+i+p);
    for (q = t;q > p;q >>= 1)
      for (i = 0;i < n - q;++i)
        if (!(i & p))
          minmax(x+i+p,x+i+q);
  }
}
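The minmax() that int32_sort calls is not shown on the slide. For the constant-time property it must itself avoid data-dependent branches; one plausible branch-free definition (an assumption for illustration, not djbsort's actual code):

```c
#include <stdint.h>

typedef int32_t int32;
typedef int64_t int64;

/* Branch-free compare-exchange: leaves min in *a, max in *b.
   Widening to 64 bits makes x - y overflow-free; the shift assumes
   arithmetic right shift (true on mainstream compilers).
   Sketch only: the slides do not show djbsort's real minmax. */
static void minmax(int32 *a, int32 *b)
{
  int64 x = *a, y = *b;
  int64 mask = (x - y) >> 63;              /* all-ones iff x < y */
  *a = (int32)((x & mask) | (y & ~mask));  /* min(x, y) */
  *b = (int32)((x & ~mask) | (y & mask));  /* max(x, y) */
}
```

Because neither the comparison result nor the stores go through a branch, the executed instruction sequence, and hence the timing, is independent of the data.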

slide-104
SLIDE 104

Previous slide: C translation of 1973 Knuth “merge exchange”, which is a simplified version of 1968 Batcher “odd-even merge” sorting networks. ≈ n(log₂ n)²/4 comparators. Much faster than bubble sort.

Warning: many other descriptions of Batcher’s sorting networks require n to be a power of 2. Also, Wikipedia says “Sorting networks … are not capable of handling arbitrarily large inputs.”

slide-108
SLIDE 108

Constant-time sorting code included in 2017 Bernstein–Chuengsatiansup–Lange–van Vredendaal “NTRU Prime” software release. New: “djbsort”, this constant-time sorting code revamped for higher speed, with vectorization (for Haswell).

slide-115
SLIDE 115

The slowdown for constant time? There is a massive fast-sorting literature, including several efforts to optimize sorting using AVX2 instructions on modern Intel CPUs: e.g., 2015 Gueron–Krasnov quicksort.

Haswell (titan0) cycles, n = 768:

  25608 stdsort
  21844 herf
  18548 oldavx2 (2017 BCLvV)
  15136 krasnov
   6596 avx2 (2018 djbsort)

No slowdown. New speed records!

slide-119
SLIDE 119


How can an n(log n)² algorithm beat standard n log n algorithms?

slide-122
SLIDE 122

Answer: well-known trends in CPU design, reflecting fundamental hardware costs of various operations.

Every cycle, a Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers. Loading a 32-bit integer from a random address: much slower. Conditional branch: much slower.
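Those 8+8 per-cycle min/max operations correspond to 256-bit vector instructions operating on eight 32-bit lanes at once. A compiler targeting AVX2 can typically reach them by auto-vectorizing a branch-free loop; a hypothetical sketch of the pattern (plain C, so it also compiles without AVX2):

```c
#include <stdint.h>

/* Element-wise compare-exchange over two arrays, written without
   data-dependent control flow. A vectorizing compiler can typically map
   each iteration's ternaries onto packed min/max instructions
   (e.g., 8 lanes of 32-bit integers per AVX2 instruction), rather than
   onto the much slower conditional branches. */
static void minmax_arrays(int32_t *lo, int32_t *hi, int64_t n)
{
  for (int64_t i = 0; i < n; ++i) {
    int32_t a = lo[i], b = hi[i];
    int32_t smaller = b < a ? b : a;  /* min of the pair */
    int32_t larger  = b < a ? a : b;  /* max of the pair */
    lo[i] = smaller;
    hi[i] = larger;
  }
}
```

This is the shape that makes min/max-based sorting networks so friendly to modern CPUs: no random loads, no branches, just wide streams of min/max.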

slide-127
SLIDE 127

Verification

Sorting software is in the TCB. Does it work correctly? Test the sorting software on many random inputs, increasing inputs, decreasing inputs. Seems to work.

But are there occasional inputs where this sorting software fails to sort correctly? History: many security problems involve occasional inputs where the TCB works incorrectly.
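The random/increasing/decreasing testing described here can be sketched as a harness that cross-checks a candidate sorter against the C library's qsort. This is a hypothetical harness for illustration (djbsort ships its own build-time tests):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static int cmp32(const void *a, const void *b)
{
  int32_t x = *(const int32_t *)a, y = *(const int32_t *)b;
  return (x > y) - (x < y);
}

/* Cross-check a candidate sorter against qsort on one input.
   Returns 1 on agreement, 0 on mismatch. Sketch: n is bounded by 1024. */
static int check_once(void (*sorter)(int32_t *, int64_t),
                      const int32_t *in, int64_t n)
{
  int32_t got[1024], want[1024];
  memcpy(got, in, (size_t)n * sizeof *in);
  memcpy(want, in, (size_t)n * sizeof *in);
  sorter(got, n);
  qsort(want, (size_t)n, sizeof *want, cmp32);
  return memcmp(got, want, (size_t)n * sizeof *got) == 0;
}

/* Random, increasing, and decreasing inputs, as on the slide. */
static int check_many(void (*sorter)(int32_t *, int64_t), int64_t n)
{
  int32_t buf[1024];
  for (int trial = 0; trial < 100; ++trial) {
    for (int64_t i = 0; i < n; ++i)
      buf[i] = (int32_t)(rand() - RAND_MAX / 2);
    if (!check_once(sorter, buf, n)) return 0;
  }
  for (int64_t i = 0; i < n; ++i) buf[i] = (int32_t)i;        /* increasing */
  if (!check_once(sorter, buf, n)) return 0;
  for (int64_t i = 0; i < n; ++i) buf[i] = (int32_t)(n - i);  /* decreasing */
  return check_once(sorter, buf, n);
}
```

The slide's point stands: passing such a harness shows the code "seems to work", but says nothing about rare inputs it never generated.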

slide-131
SLIDE 131

For each used n (e.g., 768):

  C code
  → (normal compiler) machine code
  → (symbolic execution) fully unrolled code
  → (new peephole optimizer) unrolled min-max code
  → (new sorting verifier) yes, code works

slide-137
SLIDE 137

Symbolic execution: use the existing “angr” library, with tiny new patches for eliminating byte splitting and adding a few missing vector instructions.

Peephole optimizer: recognize instruction patterns equivalent to min, max.

Sorting verifier: decompose the DAG into merging networks. Verify each merging network using a generalization of 2007 Even–Levi–Litman, a correction of 1990 Chung–Ravikumar.
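The verifier above avoids brute force by decomposing into merging networks. A much simpler (but exponential) baseline is the classical zero-one principle: a comparator network sorts all inputs if and only if it sorts all 2ⁿ zero-one inputs. A sketch for small networks — this is not the Even–Levi–Litman method, just the textbook check:

```c
#include <stdint.h>

/* A comparator network as a fixed list of (lo, hi) wire-index pairs. */
typedef struct { int lo, hi; } comparator;

/* Zero-one principle: the network of m comparators sorts every length-n
   input iff it sorts every 0-1 input. Exponential in n (requires n < 32),
   so only usable for small networks. */
static int sorts_all(const comparator *net, int m, int n)
{
  for (uint32_t bits = 0; bits < (1u << n); ++bits) {
    int x[32];
    for (int i = 0; i < n; ++i) x[i] = (bits >> i) & 1;
    for (int c = 0; c < m; ++c) {        /* apply each comparator in order */
      int a = x[net[c].lo], b = x[net[c].hi];
      x[net[c].lo] = a < b ? a : b;
      x[net[c].hi] = a < b ? b : a;
    }
    for (int i = 0; i + 1 < n; ++i)
      if (x[i] > x[i + 1]) return 0;     /* an unsorted 0-1 input: not a sorter */
  }
  return 1;
}
```

The slide's verifier exists precisely because this 2ⁿ check is hopeless at n = 768; decomposing into merging networks makes each piece tractable.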

slide-141
SLIDE 141

Current djbsort release, verified AVX2 code and verified portable code: https://sorting.cr.yp.to

Includes the sorting code; automatic build-time tests; simple benchmarking program; verification tools. The web site shows how to use the verification tools.

Next release planned: verified ARM NEON code.

slide-145
SLIDE 145

The future

I don’t think there is a fundamental tension between

  • crypto performance,
  • stopping timing attacks,
  • making sure software works.

See the sorting example. Firefox has already deployed verified constant-time software for Curve25519+ChaCha20+Poly1305. I’m working on easier verification, post-quantum code, faster code.