Sorting integer arrays: security, speed, and verification — PowerPoint PPT Presentation


SLIDE 1

Sorting integer arrays: security, speed, and verification

  • D. J. Bernstein

University of Illinois at Chicago, Ruhr-University Bochum

SLIDE 2

Bob’s laptop screen:

  From: Alice
  Thank you for your submission.
  We received many interesting
  papers, and unfortunately your

Bob assumes this message is something Alice actually sent. But today’s “security” systems fail to guarantee this property. Attacker could have modified or forged the message.
SLIDE 3

Trusted computing base (TCB)

TCB: portion of computer system that is responsible for enforcing the users’ security policy.

Bob’s security policy for this talk: If message is displayed on Bob’s screen as “From: Alice” then message is from Alice. If TCB works correctly, then message is guaranteed to be from Alice, no matter what the rest of the system does.

SLIDE 4

Examples of attack strategies:

  • 1. Attacker uses buffer overflow in a device driver to control Linux kernel on Alice’s laptop.
  • 2. Attacker uses buffer overflow in a web browser to control disk files on Bob’s laptop.

Device driver is in the TCB. Web browser is in the TCB. CPU is in the TCB. Etc. Massive TCB has many bugs, including many security holes. Any hope of fixing this?

SLIDE 5

Classic security strategy: Rearchitect computer systems to have a much smaller TCB. Carefully audit the TCB.

e.g. Bob runs many VMs:

  VM A: Alice data   VM C: Charlie data   · · ·

TCB stops each VM from touching data in other VMs. Browser in VM C isn’t in TCB. Can’t touch data in VM A, if TCB works correctly. Alice also runs many VMs.

SLIDE 6

Cryptography

How does Bob’s laptop know that incoming network data is from Alice’s laptop? Cryptographic solution: Message-authentication codes.

  Alice’s message + k → authenticated message → untrusted network → authenticated message + k → Alice’s message
  Alice’s message + k → authenticated message → untrusted network → modified message + k → “Alert: forgery!”
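The MAC flow on this slide can be sketched with Python’s standard hmac module. The tag length, hash function, key, and message below are illustrative choices, not details from the talk:

```python
import hmac
import hashlib

def authenticate(k: bytes, message: bytes) -> bytes:
    # Alice: attach a tag computed from the shared secret k.
    tag = hmac.new(k, message, hashlib.sha256).digest()
    return message + tag

def verify(k: bytes, authenticated: bytes) -> bytes:
    # Bob: recompute the tag with the same k; reject on mismatch.
    message, tag = authenticated[:-32], authenticated[-32:]
    expected = hmac.new(k, message, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("Alert: forgery!")
    return message

k = b"shared-secret-k"
wire = authenticate(k, b"From: Alice ...")
assert verify(k, wire) == b"From: Alice ..."

# Attacker modifies the message on the untrusted network:
forged = b"From: Alice (forged)" + wire[-32:]
# verify(k, forged) raises ValueError("Alert: forgery!")
```

Note that verification works only because Alice and Bob hold the same k, which is exactly the requirement the next slide turns to.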

SLIDE 7

Important for Alice and Bob to share the same secret k. What if attacker was spying on their communication of k?

Solution 1: Public-key encryption.

  k + public key aG → ciphertext → network → ciphertext + private key a → k
slide-33
SLIDE 33

6

Cryptography does Bob’s laptop know incoming network data Alice’s laptop? Cryptographic solution: Message-authentication codes. Alice’s message

  • k
  • authenticated message

untrusted network

  • dified message
  • “Alert: forgery!”

k

  • 7

Important for Alice and Bob to share the same secret k. What if attacker was spying

  • n their communication of k?

Solution 1: Public-key encryption. k private key a

  • ciphertext
  • public key aG

network

  • ciphertext

network

  • public key aG
  • k
  • Solution

Public-key m

  • signed message
  • signed message
  • m
slide-34
SLIDE 34

6

laptop know network data laptop? solution: Message-authentication codes. message k

  • message

untrusted network message rgery!” k

  • 7

Important for Alice and Bob to share the same secret k. What if attacker was spying

  • n their communication of k?

Solution 1: Public-key encryption. k private key a

  • ciphertext
  • public key aG

network

  • ciphertext

network

  • public key aG
  • k
  • Solution 2:

Public-key signatures. m

  • signed message

network

  • signed message

m

slide-35
SLIDE 35

6

know data des. k work k

7

Important for Alice and Bob to share the same secret k. What if attacker was spying

  • n their communication of k?

Solution 1: Public-key encryption. k private key a

  • ciphertext
  • public key aG

network

  • ciphertext

network

  • public key aG
  • k
  • Solution 2:

Public-key signatures. m

  • a
  • signed message

network

  • aG

net

  • signed message
  • aG
  • m
slide-36
SLIDE 36

7

Important for Alice and Bob to share the same secret k. What if attacker was spying

  • n their communication of k?

Solution 1: Public-key encryption. k private key a

  • ciphertext
  • public key aG

network

  • ciphertext

network

  • public key aG
  • k
  • 8

Solution 2: Public-key signatures. m

  • a
  • signed message

network

  • aG

network

  • signed message
  • aG
  • m
slide-37
SLIDE 37

7

Important for Alice and Bob to share the same secret k. What if attacker was spying

  • n their communication of k?

Solution 1: Public-key encryption. k private key a

  • ciphertext
  • public key aG

network

  • ciphertext

network

  • public key aG
  • k
SLIDE 8

Solution 2: Public-key signatures.

  m + private key a → signed message → network → signed message + public key aG → m

No more shared secret k but Alice still has secret a. Cryptography requires TCB to protect secrecy of keys, even if user has no other secrets.

SLIDE 9

Constant-time software

Large portion of CPU hardware: optimizations depending on addresses of memory locations. Consider data caching, instruction caching, parallel cache banks, store-to-load forwarding, branch prediction, etc.

Many attacks show that this portion of the CPU has trouble keeping secrets. e.g. RIDL: 2019 Schaik–Milburn–Österlund–Frigo–Maisuradze–Razavi–Bos–Giuffrida.
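The discipline this motivates: neither branches nor memory addresses may depend on secrets. A sketch of a secret-dependent branch versus branchless selection — Python is used here only to show the shape of the idea; Python itself makes no timing guarantees, and real constant-time code is written in C or assembly:

```python
# Secret-dependent branch: which path runs (and which instructions are
# fetched and predicted) depends on the secret bit -- observable timing.
def select_leaky(secret_bit: int, x: int, y: int) -> int:
    if secret_bit:
        return x
    return y

# Branchless selection: turn the secret bit into an all-ones or all-zeros
# mask, then combine; the same instructions execute either way.
def select_ct(secret_bit: int, x: int, y: int) -> int:
    mask = -secret_bit            # 0 -> 0, 1 -> -1 (all ones)
    return (x & mask) | (y & ~mask)

assert select_ct(1, 7, 9) == 7
assert select_ct(0, 7, 9) == 9
```

This mask-and-combine pattern is the basic building block of constant-time software.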

SLIDE 10

Typical literature on this topic: Understand this portion of CPU. But details are often proprietary, not exposed to security review. Try to push attacks further. This becomes very complicated. Tweak the attacked software to try to stop the known attacks.

For researchers: This is great! For auditors: This is a nightmare. Many years of security failures. No confidence in future security.

11

The “constant-time” solution: Don’t give any secrets to this portion of the CPU. (1987 Goldreich, 1990 Ostrovsky: Oblivious RAM; 2004 Bernstein: domain-specific for better speed)
TCB analysis: Need this portion of the CPU to be correct, but don’t need it to keep secrets. Makes auditing much easier.
Good match for attitude and experience of CPU designers: e.g., Intel issues errata for correctness bugs, not for information leaks.


12

Case study: Constant-time sorting
Serious risk within 10 years: Attacker has quantum computer breaking today’s most popular public-key crypto (RSA and ECC; e.g., finding a given aG).
2017: Hundreds of people submit 69 complete proposals to international competition for post-quantum crypto standards.
Subroutine in some submissions: sort array of secret integers. e.g. sort 768 32-bit integers.

13


How to sort secret data without any secret addresses?
Typical sorting algorithms—merge sort, quicksort, etc.—choose load/store addresses based on secret data. Usually also branch based on secret data.
One submission to competition: “Radix sort is used as constant-time sorting algorithm.” Some versions of radix sort avoid secret branches. But data addresses in radix sort still depend on secrets.
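The address leak in radix sort shows up in its counting pass. The sketch below is illustrative only (the name `radix_count8` and its interface are invented for this example, not taken from any submission): the loop has no secret-dependent branch, yet the index into `count[]` is a secret digit, so which cache line gets touched depends on the secret.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative counting pass of a radix sort over one 8-bit digit.
   No branch depends on secret data, but the store address
   &count[digit] does: a cache-timing attacker watching which lines
   of count[] are touched learns the secret digits. */
void radix_count8(const int32_t *x, size_t n, int shift, size_t count[256])
{
  for (size_t j = 0; j < 256; ++j) count[j] = 0;
  for (size_t i = 0; i < n; ++i) {
    uint8_t digit = (uint8_t)((uint32_t)x[i] >> shift);  /* secret value */
    count[digit]++;  /* memory address selected by the secret */
  }
}
```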

14

Foundation of solution: a comparator sorting 2 integers.
(x; y) → (min{x; y}; max{x; y})
Easy constant-time exercise in C. Warning: C standard allows compiler to screw this up. Even easier exercise in asm.
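One way the C exercise might look is the branchless sketch below (an illustration, not the talk's actual code; per the warning above, a compiler is free to turn this back into a branch, which is why the asm version is preferable):

```c
#include <stdint.h>

/* Constant-time comparator: afterwards *x = min, *y = max.
   The 64-bit subtraction cannot overflow, and the right shift turns
   the sign bit into an all-zeros/all-ones mask. (Right-shifting a
   negative value is implementation-defined in C; common compilers
   implement it as an arithmetic shift.) */
static void minmax(int32_t *x, int32_t *y)
{
  int32_t a = *x, b = *y;
  int64_t diff = (int64_t)b - (int64_t)a;
  int32_t mask = (int32_t)(diff >> 63);  /* -1 if a > b, else 0 */
  int32_t swap = mask & (a ^ b);
  *x = a ^ swap;
  *y = b ^ swap;
}
```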


15

Combine comparators into a sorting network for more inputs. Example of a sorting network:
[sorting-network diagram]
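The network diagram itself is not recoverable from this transcript. As a stand-in illustration (not necessarily the network pictured on the slide), here is a 5-comparator network for 4 inputs; `sort4` is an invented name, and `minmax` here is a plain branching comparator, since only the network structure is being illustrated:

```c
#include <stdint.h>

/* Plain comparator; a constant-time build would swap in the
   branchless version. */
static void minmax(int32_t *x, int32_t *y)
{
  if (*x > *y) { int32_t t = *x; *x = *y; *y = t; }
}

/* A 5-comparator sorting network for 4 inputs.
   The comparator positions are fixed in advance,
   independent of the data. */
void sort4(int32_t x[4])
{
  minmax(&x[0], &x[2]);  minmax(&x[1], &x[3]);  /* stage 1 */
  minmax(&x[0], &x[1]);  minmax(&x[2], &x[3]);  /* stage 2 */
  minmax(&x[1], &x[2]);                         /* stage 3 */
}
```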


16

Positions of comparators in a sorting network are independent of the input. Naturally constant-time.
But (n² − n)/2 comparators produce complaints about performance as n increases.
Speed is a serious issue in the post-quantum competition. “Cost” is evaluation criterion; “we’d like to stress this once again on the forum that we’d really like to see more platform-optimized implementations”; etc.
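For concreteness, a network with exactly (n² − n)/2 comparators can be sketched as a selection-style schedule that visits every pair once (illustrative code; `sort_quadratic` is an invented name, and the branching `minmax` stands in for the constant-time comparator):

```c
#include <stdint.h>

/* Plain comparator; a constant-time build would use the
   branchless version. */
static void minmax(int32_t *x, int32_t *y)
{
  if (*x > *y) { int32_t t = *x; *x = *y; *y = t; }
}

/* Naive sorting network touching every pair (i, j) once:
   exactly n*(n-1)/2 comparators, all at fixed, data-independent
   positions. Correct for every input, but quadratic in n. */
void sort_quadratic(int32_t *x, int64_t n)
{
  for (int64_t i = 0; i < n; ++i)
    for (int64_t j = i + 1; j < n; ++j)
      minmax(&x[i], &x[j]);
}
```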

17

void int32_sort(int32 *x,int64 n)
{
  int64 t,p,q,i;
  if (n < 2) return;
  t = 1;
  while (t < n - t) t += t;
  for (p = t;p > 0;p >>= 1) {
    for (i = 0;i < n - p;++i)
      if (!(i & p))
        minmax(x+i,x+i+p);
    for (q = t;q > p;q >>= 1)
      for (i = 0;i < n - q;++i)
        if (!(i & p))
          minmax(x+i+p,x+i+q);
  }
}


18

Previous slide: C translation of 1973 Knuth “merge exchange”, which is a simplified version of 1968 Batcher “odd-even merge” sorting networks. ≈ n(log₂ n)²/4 comparators. Much faster than bubble sort.
Warning: many other descriptions of Batcher’s sorting networks require n to be a power of 2. Also, Wikipedia says “Sorting networks … are not capable of handling arbitrarily large inputs.”
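To see the merge-exchange code from slide 17 in action, it can be wrapped into a self-contained sketch: here `int32`/`int64` are assumed to be typedefs for the C99 fixed-width types, and `minmax` is a plain branching comparator standing in for the constant-time one, since this harness only checks correctness.

```c
#include <stdint.h>

typedef int32_t int32;
typedef int64_t int64;

/* Plain comparator; a real build would use the constant-time version. */
static void minmax(int32 *a, int32 *b)
{
  if (*a > *b) { int32 t = *a; *a = *b; *b = t; }
}

/* 1973 Knuth merge exchange, as on slide 17: works for any n,
   not just powers of 2. */
void int32_sort(int32 *x, int64 n)
{
  int64 t, p, q, i;
  if (n < 2) return;
  t = 1;
  while (t < n - t) t += t;   /* t = smallest power of 2 with 2t >= n */
  for (p = t; p > 0; p >>= 1) {
    for (i = 0; i < n - p; ++i)
      if (!(i & p))
        minmax(x + i, x + i + p);
    for (q = t; q > p; q >>= 1)
      for (i = 0; i < n - q; ++i)
        if (!(i & p))
          minmax(x + i + p, x + i + q);
  }
}
```

Because the sequence of comparator positions depends only on n, substituting the branchless `minmax` makes the whole routine constant-time without touching the control flow.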


19

This constant-time sorting code, with vectorization (for Haswell):
  • Constant-time sorting code included in 2017 Bernstein–Chuengsatiansup–Lange–van Vredendaal “NTRU Prime” software release.
  • New: djbsort, constant-time sorting code revamped for higher speed.


20

The slowdown for constant time
How much speed did we lose by refusing to use variable-time quicksort, radix sort, etc.?
Cycles on Intel Haswell CPU core to sort n = 768 32-bit integers:
26948  stdsort (variable-time)
22812  herf (variable-time)
17748  krasnov (variable-time)
16980  ipp 2019.5 (variable-time)
12672  sid1607 (variable-time)
 5964  djbsort (constant-time)
No slowdown. New speed records!


21

How can an n(log n)2 algorithm beat standard n log n algorithms?

slide-104
SLIDE 104

20

The slowdown for constant time How much speed did we lose by refusing to use variable-time quicksort, radix sort, etc.? Cycles on Intel Haswell CPU core to sort n = 768 32-bit integers: 26948 stdsort (variable-time) 22812 herf (variable-time) 17748 krasnov (variable-time) 16980 ipp 2019.5 (variable-time) 12672 sid1607 (variable-time) 5964 djbsort (constant-time) No slowdown. New speed records!

21

How can an n(log n)2 algorithm beat standard n log n algorithms? Answer: well-known trends in CPU design, reflecting fundamental hardware costs

  • f various operations.
slide-105
SLIDE 105

20

The slowdown for constant time How much speed did we lose by refusing to use variable-time quicksort, radix sort, etc.? Cycles on Intel Haswell CPU core to sort n = 768 32-bit integers: 26948 stdsort (variable-time) 22812 herf (variable-time) 17748 krasnov (variable-time) 16980 ipp 2019.5 (variable-time) 12672 sid1607 (variable-time) 5964 djbsort (constant-time) No slowdown. New speed records!

21

How can an n(log n)2 algorithm beat standard n log n algorithms? Answer: well-known trends in CPU design, reflecting fundamental hardware costs

  • f various operations.

Every cycle, Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers.

slide-106
SLIDE 106

20

The slowdown for constant time How much speed did we lose by refusing to use variable-time quicksort, radix sort, etc.? Cycles on Intel Haswell CPU core to sort n = 768 32-bit integers: 26948 stdsort (variable-time) 22812 herf (variable-time) 17748 krasnov (variable-time) 16980 ipp 2019.5 (variable-time) 12672 sid1607 (variable-time) 5964 djbsort (constant-time) No slowdown. New speed records!

21

How can an n(log n)2 algorithm beat standard n log n algorithms? Answer: well-known trends in CPU design, reflecting fundamental hardware costs

  • f various operations.

Every cycle, Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers. Loading a 32-bit integer from a random address: much slower. Conditional branch: much slower.

slide-107
SLIDE 107

20

slowdown for constant time much speed did we lose sing to use variable-time quicksort, radix sort, etc.?

  • n Intel Haswell CPU core

n = 768 32-bit integers: stdsort (variable-time) herf (variable-time) krasnov (variable-time) ipp 2019.5 (variable-time) sid1607 (variable-time) djbsort (constant-time)

  • wdown. New speed records!

21

How can an n(log n)2 algorithm beat standard n log n algorithms? Answer: well-known trends in CPU design, reflecting fundamental hardware costs

  • f various operations.

Every cycle, Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers. Loading a 32-bit integer from a random address: much slower. Conditional branch: much slower. Verification Sorting s Does it w Test the random inputs, decreasing

slide-108
SLIDE 108

20

for constant time eed did we lose use variable-time sort, etc.? Haswell CPU core 32-bit integers: (variable-time) riable-time) (variable-time) 2019.5 (variable-time) (variable-time) (constant-time) New speed records!

21

How can an n(log n)2 algorithm beat standard n log n algorithms? Answer: well-known trends in CPU design, reflecting fundamental hardware costs

  • f various operations.

Every cycle, Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers. Loading a 32-bit integer from a random address: much slower. Conditional branch: much slower. Verification Sorting software is Does it work corre Test the sorting soft random inputs, increasing decreasing inputs.

slide-109
SLIDE 109

20

constant time lose riable-time CPU core integers: riable-time) riable-time) riable-time) riable-time) riable-time) (constant-time) records!

21

How can an n(log n)2 algorithm beat standard n log n algorithms? Answer: well-known trends in CPU design, reflecting fundamental hardware costs

  • f various operations.

Every cycle, Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers. Loading a 32-bit integer from a random address: much slower. Conditional branch: much slower. Verification Sorting software is in the TCB. Does it work correctly? Test the sorting software on random inputs, increasing inputs, decreasing inputs. Seems to

slide-110
SLIDE 110

21

How can an n(log n)² algorithm beat standard n log n algorithms?

Answer: well-known trends in CPU design, reflecting fundamental hardware costs of various operations.

Every cycle, a Haswell core can do 8 “min” ops on 32-bit integers + 8 “max” ops on 32-bit integers. Loading a 32-bit integer from a random address: much slower. A conditional branch: much slower.


22

Verification

Sorting software is in the TCB. Does it work correctly?

Test the sorting software on many random inputs, increasing inputs, decreasing inputs. Seems to work.

But are there occasional inputs where this sorting software fails to sort correctly? History: many security problems involve occasional inputs where the TCB works incorrectly.


23

For each used n (e.g., 768):

  C code
    ↓ normal compiler
  machine code
    ↓ symbolic execution
  fully unrolled code
    ↓ new peephole optimizer
  unrolled min-max code
    ↓ new sorting verifier
  “yes, code works”

24

Symbolic execution: use the existing angr.io toolkit, with several tiny new patches for eliminating byte splitting and adding a few missing vector instructions.

Peephole optimizer: recognize instruction patterns equivalent to min, max.

Sorting verifier: decompose the DAG into merging networks. Verify each merging network using a generalization of 2007 Even–Levi–Litman, a correction of 1990 Chung–Ravikumar.

25

Current djbsort release (verified fast int32 on AVX2; verified portable int32; fast uint32; fast float32): sorting.cr.yp.to

Includes the sorting code; automatic build-time tests; a simple benchmarking program; verification tools. The web site shows how to use the verification tools.

Next release planned: verified ARM NEON code.