SLIDE 1

Software Benchmarking of the 2nd round CAESAR Candidates

Ralph Ankele¹, Robin Ankele²

¹ Royal Holloway, University of London, UK
² University of Oxford, UK

September 27, 2016 Directions in Authenticated Ciphers - Nagoya, Japan

Ralph Ankele - Royal Holloway, University of London Software Benchmarking of the 2nd round CAESAR Candidates slide 1 /39

SLIDE 2

Motivation¹

Use Case 1: Lightweight applications (resource-constrained environments)

Use Case 2: High-performance applications

  • critical: efficiency on 64-bit CPUs (servers) and/or dedicated hardware
  • desirable: efficiency on 32-bit CPUs (small smartphones)
  • desirable: constant time when the message length is constant
  • message sizes: usually long (more than 1024 bytes), sometimes shorter

Use Case 3: Defense in depth

¹ CAESAR use cases, posted to the CAESAR mailing list (16 July 2016) by Dan J. Bernstein: https://groups.google.com/forum/#!topic/crypto-competitions/DLv193SPSDc

SLIDE 3

Overview

  • 1. Classification of the 2nd round CAESAR Candidates
  • 2. Software Optimizations
  • 3. Benchmarking Framework
  • 4. Results
  • 5. Conclusions

SLIDE 4

Classification of the 2nd round CAESAR Candidates

SLIDE 5

CAESAR competition

CAESAR Round 2 candidates

ACORN AEGIS AES-COPA AES-JAMBU AES-OTR AEZ Ascon CLOC Deoxys ELmD HS1-SIV ICEPOLE Joltik Ketje Keyak MORUS Minalpher NORX OCB OMD PAEQ POET PRIMATEs SCREAM SHELL SILC STRIBOB Tiaoxin TriviA-ck π-Cipher

SLIDE 6

Type

  • Block Cipher: 15
  • Compression Function: 1
  • Permutations: 2
  • Stream Cipher: 4
  • Sponge Construction: 8

SLIDE 7

Underlying Primitive

  • AES: 10
  • Others: 9
  • AES Round: 3
  • SPN: 3
  • Keccak: 3
  • LRX: 2
  • ARX: 1
  • SHA2: 1
  • Dedicated Block Cipher: 1
  • Dedicated Stream Cipher: 1
  • Dedicated Permutation: 1

SLIDE 8

Parallel Encryption/Decryption

  • Fully/Fully: 14
  • No/No: 10
  • Partly/Partly: 5
  • Fully/No: 1

SLIDE 9

Online Encryption/Decryption

  • Fully/Fully: 27
  • No/No: 3

Online: the encryption of a message block M_i depends only on the preceding message blocks M_1, ..., M_{i-1} (in addition to M_i itself).
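The online property can be illustrated with a toy construction. The sketch below is not one of the CAESAR candidates and is not secure (it has no authentication at all); the function names, the 16-byte block size and the use of SHA-256 as a pad generator are all assumptions made for illustration. It only shows that each ciphertext block can be emitted before later message blocks are known.

```python
import hashlib

BLOCK = 16  # block size in bytes (an assumption for this toy)

def _pad_block(key, nonce, prefix):
    """Pad for one block, derived from key, nonce and all earlier message blocks."""
    return hashlib.sha256(key + nonce + prefix).digest()[:BLOCK]

def toy_online_encrypt(key, nonce, message):
    """Encrypt block by block; block i is produced before block i+1 is read."""
    assert len(message) % BLOCK == 0, "toy example: whole blocks only"
    out = bytearray()
    for i in range(0, len(message), BLOCK):
        pad = _pad_block(key, nonce, message[:i])  # depends on M_1 .. M_{i-1} only
        block = message[i:i + BLOCK]
        out += bytes(a ^ b for a, b in zip(block, pad))
    return bytes(out)

key, nonce = b"k" * 16, b"n" * 12
m1 = b"A" * 16 + b"B" * 16 + b"C" * 16
m2 = b"A" * 16 + b"B" * 16 + b"Z" * 16  # only the last block differs
c1 = toy_online_encrypt(key, nonce, m1)
c2 = toy_online_encrypt(key, nonce, m2)
print(c1[:32] == c2[:32])  # True: earlier ciphertext blocks ignore later message blocks
print(c1[32:] == c2[32:])  # False: the last block reflects the changed message block
```

An offline scheme, by contrast, needs the whole message before it can output the first ciphertext block.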

SLIDE 10

Inverse Free

  • Yes: 19
  • No: 10

SLIDE 11

Security Proof

  • Yes: 24
  • No: 6

SLIDE 12

Nonce-Misuse Resistance

  • None: 16
  • Longest Common Prefix (Online Ciphers): 7
  • Max (Offline Ciphers): 2
  • Intermediate: 1

Longest common prefix: for repeated nonces, an adversary can observe the longest common prefix of the messages.

Max: repeating a nonce only leaks whether a message was repeated.
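The two leakage notions can be contrasted with a pair of toy constructions. Both are hypothetical sketches, not any CAESAR candidate, and neither is secure: the online toy also leaks the XOR of the first differing blocks, which is more than the longest-common-prefix notion allows. The helper names, the 16-byte block size and the use of SHA-256 are assumptions for illustration.

```python
import hashlib

BLOCK = 16  # block size in bytes (an assumption for this toy)

def _prf(*parts):
    """Length-prefixed SHA-256 over the given byte strings (stand-in for a PRF)."""
    h = hashlib.sha256()
    for p in parts:
        h.update(len(p).to_bytes(4, "big") + p)
    return h.digest()

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def toy_online(key, nonce, msg):
    """Online toy: the pad for block i depends on key, nonce and earlier blocks."""
    return b"".join(_xor(msg[i:i + BLOCK], _prf(key, nonce, msg[:i]))
                    for i in range(0, len(msg), BLOCK))

def toy_offline(key, nonce, msg):
    """Offline (two-pass) toy: an IV computed over the whole message seeds every pad."""
    iv = _prf(key, nonce, msg)[:BLOCK]
    body = b"".join(_xor(msg[i:i + BLOCK], _prf(key, iv, i.to_bytes(8, "big")))
                    for i in range(0, len(msg), BLOCK))
    return iv + body

def common_prefix_blocks(c1, c2):
    """Number of leading ciphertext blocks that are identical."""
    n = 0
    while (n + 1) * BLOCK <= min(len(c1), len(c2)) and \
            c1[n * BLOCK:(n + 1) * BLOCK] == c2[n * BLOCK:(n + 1) * BLOCK]:
        n += 1
    return n

key, nonce = b"k" * 16, b"n" * 12      # the nonce is deliberately reused
m1 = b"A" * 16 + b"B" * 16 + b"C" * 16
m2 = b"A" * 16 + b"B" * 16 + b"Z" * 16
online_leak = common_prefix_blocks(toy_online(key, nonce, m1), toy_online(key, nonce, m2))
offline_leak = common_prefix_blocks(toy_offline(key, nonce, m1), toy_offline(key, nonce, m2))
print(online_leak)   # 2: the shared two-block prefix is visible in the ciphertexts
print(offline_leak)  # 0: different messages give unrelated ciphertexts
print(toy_offline(key, nonce, m1) == toy_offline(key, nonce, m1))  # True: only repetition shows
```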

SLIDE 13

Software Optimizations

SLIDE 14

Software Optimizations

  • AES New Instructions: 12
  • Streaming SIMD Extensions: 9
  • No Software Optimization: 7
  • Advanced Vector Extensions: 6
  • Dedicated Processor Optimizations: 4
  • NEON: 4

SLIDE 15

AES-New Instructions

Instructions

  • Introduced with the Intel® Westmere microarchitecture (2010)
  • Consists of 6 new instructions that are implemented in hardware
  • Four instructions for encryption/decryption (AESENC, AESENCLAST, AESDEC, AESDECLAST)
  • Two instructions for the key schedule (AESKEYGENASSIST, AESIMC)

Performance

  • 10 times faster for parallel modes (e.g. CTR)
  • 2-3 times faster for non-parallel modes (e.g. CBC)

Security

  • Improved security against side-channel attacks [Gue12]

SLIDE 18

Streaming SIMD Extensions

Instructions

  • Vector-mode operations that enable parallel execution of one instruction on multiple data
  • 16 × 128-bit registers (xmm0-xmm15)
  • Expanded over Intel® processor generations to include SSE2, SSE3/SSSE3 and SSE4

Image: https://software.intel.com/sites/default/files/37208.gif

SLIDE 19

Advanced Vector Extensions

Instructions

  • Introduced with the Intel® Sandy Bridge microarchitecture
  • Widens the 16 128-bit SSE registers to 256 bits (ymm0-ymm15)
  • Support for three-operand non-destructive operations (two-operand instructions, e.g. A = A + B, are replaced by three-operand instructions, e.g. A = B + C)
  • AVX2 instructions expand integer vector types and vector shift operations

Performance

  • AVX is 1.8 times faster than the fastest SSE4.2 instructions [Len14]
  • AVX2 is 2.8 times faster than the fastest SSE4.2 instructions [Len14]

SLIDE 21

NEON

Instructions

  • Advanced SIMD instructions for ARM processors, available since the Cortex-A microarchitecture
  • 32 × 64-bit registers (dual view: 16 × 128-bit registers)

Performance

  • 2-8 times performance boost [neo]

Image: http://www.arm.com/assets/images/NEON_ISA.jpg

SLIDE 23

Benchmarking Framework

SLIDE 24

High Resolution Methods for CPU Timing Information

High Resolution Timers

  • HPET (High Precision Event Timer)
  • QueryPerformanceCounter
  • time() and clock() POSIX functions
  • TSC (Time Stamp Counter)

Time Stamp Counter

  • 64-bit model-specific register (MSR) containing the number of cycles since the last reset
  • Read out with the RDTSC instruction
  • Use the CPUID instruction to guard against out-of-order execution
  • Our framework uses RDTSCP [Pao10], which is an optimised RDTSC + CPUID

SLIDE 26

Benchmarking Framework

SUPERCOP [Ber16]

  • System for Unified Performance Evaluation Related to Cryptographic Operations and Primitives
  • Uses the Time Stamp Counter as timer (with RDTSC)

BRUTUS [Saa16]

  • Small codebase, rapid testing cycle
  • Uses clock() as timer

Our Framework

  • Simple, focused only on Authenticated Encryption schemes
  • Optimized Time Stamp Counter access (i.e. RDTSCP) for accurate timing measurements [Pao10]
  • Reduction of noise using single-user mode, averaging and median

SLIDE 29

Measurement Setup

Machine                    CPU                     Microarchitecture
MacBook Pro (Early 2011)   Intel® Core i5-2415M    Sandy Bridge
Dell Latitude E7470        Intel® Core i5-6300U    Skylake

  • Compiler:
      • clang compiler version 6.1.0 (clang-602.0.53)
      • gcc compiler version 5.4.0 (5.4.0-6ubuntu1-16.04.2)
  • Compiler flags: -Ofast -fno-stack-protector -march=native
  • Operating system in single-user mode to get rid of noise (e.g. context switches)
  • Each reported value is the median of 91 averaged timings, where each averaged timing is the mean of 200 measurements [KR11]
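The median-of-averages procedure can be sketched as follows. This is a minimal illustration, not the authors' framework: Python cannot issue RDTSCP, so `time.perf_counter_ns` is swapped in as the timer and the result is in nanoseconds rather than cycles. The function name `median_of_averages`, its parameters and the SHA-256 workload are assumptions made for the example.

```python
import hashlib
import statistics
import time

def median_of_averages(run_once, outer=91, inner=200, timer=time.perf_counter_ns):
    """Return the median of `outer` averages, each taken over `inner` timed calls."""
    averages = []
    for _ in range(outer):
        total = 0
        for _ in range(inner):
            start = timer()
            run_once()
            total += timer() - start
        averages.append(total / inner)  # averaging smooths per-call jitter
    return statistics.median(averages)  # the median discards outlier batches

# Stand-in workload (assumption): hash a 1500-byte buffer instead of running an AE scheme.
payload = bytes(1500)
ns = median_of_averages(lambda: hashlib.sha256(payload).digest())
print(f"{ns / len(payload):.2f} ns/byte")
```

Averaging inside each batch reduces the effect of timer granularity, while taking the median across batches keeps a single disturbed batch (for example, one hit by an interrupt) from shifting the reported value.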

SLIDE 30

Benchmarking Settings and Real-World Use Cases

Table: Real-world use case settings for our benchmarking.

Message Size   Associated Data Size²   Comments
1 byte         5 bytes                 one keystroke (e.g. SSH)
16 bytes       5 bytes                 small payload
557 bytes      5 bytes                 average IP packet size³
1.5 kB         5 bytes                 Ethernet MTU, TLS
16 kB          5 bytes                 max TCP packet size
1 MB           5 bytes                 file upload

² http://netsekure.org/2010/03/tls-overhead
³ http://slaptijack.com/networking/average-ip-packet-size

SLIDE 31

Results

SLIDE 32

Comparison of all CAESAR 2nd Round Candidates

Figure: Performance (cpb) against message length (128 to 2048 bytes) for the following implementations:

acorn128v2_opt, aeadaes256ocbtaglen128v1_opt, aegis128l_aesnib, aes128gcmv1_openssl, aes128n8t8clocv2_aesni, aes128n8t8silcv2_aesni, aes128otrpv3_nip7m1, aescopav2_ref, aesjambuv2_aesni, aezv4_aesni, ascon128av11_opt64, deoxysneq256128v13_aesni, elmd600v2_ref, hs1sivlov1_ref, icepole128av2_ref, joltikneq6464v13_ref, ketjesrv1_reference, minalpherv11_ref, morus1280256v1_avx2, norx6441_ymm, omdsha512k512n256tau256v2_avx1, paeq64_aesni, pi64cipher256v2_goptv, poetv2aes4_ni, primatesv1gibbon80_ref, scream10v3_sse, seakeyakv2_SandyBridge, shellaes128v2d4n80_ref, stribob192r2_ssse3, tiaoxinv2_nim, trivia0v2_sse4

SLIDE 33

Comparison of all Block Cipher based schemes

Figure: Performance (cpb) against message length (128 to 2048 bytes) for the following implementations:

aeadaes256ocbtaglen128v1_opt, aegis128l_aesnib, aes128gcmv1_openssl, aes128n8t8clocv2_aesni, aes128n8t8silcv2_aesni, aes128otrpv3_nip7m1, aescopav2_ref, aesjambuv2_aesni, aezv4_aesni, deoxysneq256128v13_aesni, elmd600v2_ref, joltikneq6464v13_ref, poetv2aes4_ni, scream10v3_sse, shellaes128v2d4n80_ref, tiaoxinv2_nim

SLIDE 34

Comparison of all Sponge based schemes

Figure: Performance (cpb) against message length (128 to 2048 bytes) for the following implementations:

aes128gcmv1_openssl, ascon128av11_opt64, icepole128av2_ref, ketjesrv1_reference, norx6441_ymm, pi64cipher256v2_goptv, primatesv1gibbon80_ref, seakeyakv2_SandyBridge, stribob192r2_ssse3

SLIDE 35

Comparison of all Stream Cipher based schemes

Figure: Performance (cpb) against message length (128 to 2048 bytes) for the following implementations:

acorn128v2_opt, aes128gcmv1_openssl, hs1sivlov1_ref, morus1280256v1_avx2, trivia0v2_sse4
SLIDE 36

Comparison of all Permutation based schemes

Figure: Performance (cpb) against message length (128 to 2048 bytes) for the following implementations:

aes128gcmv1_openssl, minalpherv11_ref, paeq64_aesni
SLIDE 37

Comparison of all Compression Function based schemes

Figure: Performance (cpb) against message length (128 to 2048 bytes) for the following implementations:

aes128gcmv1_openssl, omdsha512k512n256tau256v2_avx1

SLIDE 38

Comparison in the TLS setting

Figure: Performance (cpb) per implementation, fastest first.

Implementation                    Performance (cpb)
tiaoxinv2_nim                     0.38
aegis128l_aesnic                  0.4
aezv4_aesni                       0.68
aes128otrpv3_nip7m2               0.82
morus1280256v1_avx2               1.07
aeadaes128ocbtaglen128v1_opt      1.26
deoxysneq128128v13_aesni          1.37
poetv2aes4_ni                     1.6
aes128gcmv1_openssl               1.8
norx6441_ymm                      2.43
aes128n12t8clocv2_aesni           2.57
aes128n8t8silcv2_aesni            2.62
lakekeyakv2_generic64             4.12
paeq80_aesni                      4.56
ascon128av11_opt64                6.25
aesjambuv2_aesni                  6.31
hs1sivlov1_ref                    7.11
pi64cipher128v2_goptv             7.14
trivia0v2_sse4                    9.01
acorn128v2_opt                    9.05
scream10v3_sse                    9.26
icepole128av2_ref                 9.45
omdsha512k128n128tau128v2_sse4    18.05
stribob192r2_ssse3                22.47
ketjesrv1_reference               36.76
shellaes128v2d8n80_ref            37.31
aescopav2_ref                     105.68
minalpherv11_ref                  543.38
primatesv1gibbon80_ref            1309.54
joltikeq12864v13_ref              1757.25

SLIDE 39

Comparison in the SSH setting

Figure: Performance (cpb) per implementation, fastest first.

Implementation                    Performance (cpb)
aegis128l_aesnic                  44.82
aezv4_aesni                       49.46
tiaoxinv2_nim                     50.06
aes128otrpv3_nip7m2               54.47
aesjambuv2_aesni                  61.53
aes128n12t8clocv2_aesni           67.45
ascon128av11_opt64                71.4
aes128n8t8silcv2_aesni            80.5
morus1280256v1_avx2               100.4
deoxysneq128128v13_aesni          101.41
poetv2aes4_ni                     130
paeq80_aesni                      144.74
norx6441_ymm                      158.96
aeadaes128ocbtaglen128v1_opt      191.06
aes128gcmv1_openssl               213.41
lakekeyakv2_generic64             235.17
hs1sivlov1_ref                    256.33
trivia0v2_sse4                    257.01
acorn128v2_opt                    391.85
stribob192r2_ssse3                440.1
ketjesrv1_reference               475.58
pi64cipher128v2_goptv             489.86
icepole128av2_ref                 646.63
scream10v3_sse                    712.97
aescopav2_ref                     954.52
omdsha512k128n128tau128v2_sse4    1121.2
shellaes128v2d8n80_ref            1267.94
minalpherv11_ref                  3879.76
joltikeq12864v13_ref              6969.07
primatesv1gibbon80_ref            7487.21

SLIDE 40

Currently fastest cipher (Software)

Figure: Tiaoxin v2.0 (SSE and AES-NI optimized). Panels: performance (cpb) against message length (128 to 2048 bytes); performance (cycles/byte) over associated data length and message length (128 to 2048 bytes each); performance (cpb) per message size, listed below.

Message Size (bytes)   Performance (cpb)
1                      50.06
16                     13.93
557                    0.71
1500                   0.38
16000                  0.2
1000000                0.19
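The per-message-size figures for Tiaoxin v2.0 can be converted back into approximate total cycles per message, which shows why very short messages have a high cpb. The sketch assumes cpb is total cycles divided by the message length alone (the 5 bytes of associated data are not counted); the slides do not state the exact convention, and the names `TIAOXIN_CPB` and `total_cycles` are illustrative.

```python
# (message size in bytes, cpb) pairs as listed for Tiaoxin v2.0 on the slide
TIAOXIN_CPB = [(1, 50.06), (16, 13.93), (557, 0.71),
               (1500, 0.38), (16000, 0.2), (1000000, 0.19)]

def total_cycles(message_len, cpb):
    """Assumption: cpb = total cycles / message length, associated data not counted."""
    return cpb * message_len

for size, cpb in TIAOXIN_CPB:
    print(f"{size:>8} bytes  {cpb:>6} cpb  ~{total_cycles(size, cpb):.0f} cycles")
```

Under that assumption a 1-byte message costs roughly 50 cycles in total and a 1500-byte message roughly 570 cycles, so the large cpb for short messages reflects fixed per-message work (initialization, tag generation) rather than slow bulk processing.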

SLIDE 41

Conclusions

SLIDE 42

Conclusions

  • New framework to benchmark Authenticated Encryption ciphers
      • Very simple, focused only on AE ciphers
      • Time Stamp Counter (with optimized RDTSCP instruction)
      • Reduction of noise during measurements
  • Comparison of CAESAR 2nd round Candidates
      • TLS setting
      • SSH setting
  • 23 out of 30 ciphers offer at least one optimization

SLIDE 43

Further Work

SLIDE 44

Questions?

Thank you for your attention!

SLIDE 45

References I

[Ber16] Daniel J. Bernstein. SUPERCOP. https://bench.cr.yp.to/supercop.html, 2016.

[Gue12] Shay Gueron. Intel® Advanced Encryption Standard (AES) New Instructions Set. Technical report, Intel Corporation, 2012.

[KR11] Ted Krovetz and Phillip Rogaway. The Software Performance of Authenticated-Encryption Modes, pages 306–327. Springer, 2011.

[Len14] Gregory Lento. Optimizing Performance with Intel® Advanced Vector Extensions. http://www.intel.com/content/www/us/en/benchmarks/performance-xeon-e5-v3-advanced-vector-extensions-paper.html, 2014.

[neo] Neon.

SLIDE 46

References II

[Pao10] Gabriele Paoloni. How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures. Technical report, Intel Corporation, 2010.

[Saa16] Markku-Juhani Saarinen. The BRUTUS automatic cryptanalytic framework. Journal of Cryptographic Engineering, 6(1):75–82, 2016.
