SLIDE 1

Automated Synthesis of Adversarial Workloads for Network Functions

Luis Pedrosa, Rishabh Iyer, Arseniy Zaostrovnykh, Jonas Fietz, Katerina Argyraki

Network Architecture Laboratory

SLIDE 2

Software NFs

The good:

  • The flexibility of software
  • The software development cycle

The bad:

  • The reliability of software
  • Inconsistent performance

The ugly:

  • Adversarial traffic / DoS / slowdowns

SLIDE 3

We need better tools...

Dynamic analysis: profiling

  • Reasons about known inputs
  • Helps find root cause / debug
  • Only as good as the inputs used

SLIDE 4

We need better tools...

Static analysis

  • Reasons about potential inputs in the abstract
  • Over-approximating: WCET (worst-case execution time)
  • Under-approximating: adversarial inputs

[Figure: latency axis (not to scale) with the marks Typical, Adversarial, WCET, and MAX.]

SLIDE 5

CASTAN – Cycle Approximating Symbolic Timing Analysis for NFs

Statically analyze NF

  • Analyze code
  • Generate PCAP file with adversarial workload

Exploit

  • The CPU cache hierarchy
  • Algorithmic complexity

It works!

  • Increased NF latency up to 3×

SLIDE 6

Outline

  • Introduction
  • SymbEx in a Nutshell
  • CASTAN
  • Evaluation
  • Conclusion

SLIDE 7

SymbEx in a Nutshell

Procedure

  • Interpret code with symbolic values

    int var = input();  // α
    return var++;       // var is now α + 1

  • J. C. King, Symbolic Execution and Program Testing, 1976

SLIDE 10

SymbEx in a Nutshell

Procedure

  • Interpret code with symbolic values
  • Fork execution on symbolic conditions
  • Keep track of path constraints
  • SMT solver finds concrete inputs

    int var = input();  // α
    if (var >= 0) {
      return var;       // if α >= 0: α, e.g. α = 0
    } else {
      return -var;      // if α < 0: -α, e.g. α = -1
    }
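The procedure on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Path` class, `explore_abs`, and `solve` are hypothetical names, the two paths are written out by hand rather than derived from the code, and a brute-force scan over a small integer range stands in for a real SMT solver.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Path:
    """One explored path: its branch constraints and its symbolic return value."""
    constraints: List[Callable[[int], bool]]
    result: Callable[[int], int]
    label: str


def explore_abs() -> List[Path]:
    """Fork on the symbolic condition `var >= 0`: one path per branch outcome."""
    return [
        Path([lambda a: a >= 0], lambda a: a, "alpha >= 0"),  # then-branch returns alpha
        Path([lambda a: a < 0], lambda a: -a, "alpha < 0"),   # else-branch returns -alpha
    ]


def solve(path: Path, domain=range(-8, 8)) -> Optional[int]:
    """Stand-in for an SMT solver: first value in `domain` meeting all constraints."""
    for candidate in domain:
        if all(check(candidate) for check in path.constraints):
            return candidate
    return None  # no witness in the searched domain


if __name__ == "__main__":
    for path in explore_abs():
        witness = solve(path)
        print(path.label, "-> input", witness, "returns", path.result(witness))
```

Each path carries its own constraint list, so a concrete input found for one path is guaranteed to drive execution down that path and no other.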

SLIDE 11

SymbEx in a Nutshell

Challenges

Path Explosion!

  • Typically exponential # of paths in the # of branches
  • Unbounded with loops
  • Impractical to SymbEx exhaustively

SLIDE 12

SymbEx in a Nutshell

Mitigation

Can’t do everything: prioritize!

Directed Symbolic Execution

  • Prioritize executing relevant paths over others
  • Graph search with heuristic
  • Try to reach a bug / increase coverage / etc.
  • Stop the symbolic execution engine (SEE) when satisfied (or impatient)
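The "graph search with heuristic" idea can be sketched as a best-first search over execution states. This is a generic illustration, not the search any particular engine implements: `directed_search`, the `score` callback, and the `budget` parameter (the "impatience" limit) are all hypothetical names.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple


def directed_search(
    successors: Dict[str, List[str]],
    start: str,
    is_goal: Callable[[str], bool],
    score: Callable[[str], float],
    budget: int = 1000,
) -> Optional[List[str]]:
    """Best-first search: always expand the pending state with the highest score."""
    counter = 0  # tie-breaker so the heap never has to compare paths
    frontier: List[Tuple[float, int, List[str]]] = [(-score(start), counter, [start])]
    while frontier and budget > 0:
        budget -= 1
        _, _, path = heapq.heappop(frontier)
        state = path[-1]
        if is_goal(state):
            return path
        for nxt in successors.get(state, []):
            counter += 1
            heapq.heappush(frontier, (-score(nxt), counter, path + [nxt]))
    return None  # gave up: budget exhausted or nothing left to explore


if __name__ == "__main__":
    graph = {"entry": ["fast", "slow"], "fast": ["exit"], "slow": ["loop"], "loop": ["bug"]}
    scores = {"entry": 0, "fast": 1, "slow": 5, "loop": 6, "exit": 0, "bug": 10}
    print(directed_search(graph, "entry", lambda s: s == "bug", scores.__getitem__))
```

The heuristic only orders the work; with an unlimited budget the search still visits every reachable state, so the quality of `score` decides how soon a useful path shows up.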

SLIDE 13

CASTAN

Overview

Generate adversarial NF workloads

  • Packet sequence ⇒ more CPU cycles / packet

  • Under-approximate: not WCET
  • Largely automated

SLIDE 14

CASTAN

Approach

Exploits performance variation

  1. CPU cache: +DRAM accesses
  2. Algorithmic complexity: +instructions
  3. Hashing: reverse to expose internals

SLIDE 15

CASTAN

Attacking the CPU Cache

Symbolic Pointers

  • Index into memory with packet: array[packet.dst_addr]
  • Find packets ⇒ memory addresses ⇒ DRAM access

CPU Cache Model

  • Simple 1-tier model of the LLC
  • Models contention, associativity, write-back
  • Empirical contention set model
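To make the contention idea concrete, the sketch below models a tiny set-associative cache with LRU replacement and picks array indices whose entries collide in one cache set. Everything here is an assumption for illustration: the geometry (64-byte lines, 16 sets, 2 ways), the LRU policy, and the address-to-set mapping are toy choices, whereas the slides describe an empirically derived contention set model for a real LLC.

```python
from collections import OrderedDict
from typing import Dict, Iterable, List

LINE_BITS = 6   # 64-byte cache lines (assumed)
SET_BITS = 4    # 16 sets (toy size, assumed)
WAYS = 2        # 2-way set associative (toy size, assumed)


def cache_set(addr: int) -> int:
    """Set index of an address in the toy cache."""
    return (addr >> LINE_BITS) & ((1 << SET_BITS) - 1)


def count_misses(addrs: Iterable[int]) -> int:
    """Replay accesses through an LRU set-associative cache and count misses."""
    sets: Dict[int, OrderedDict] = {}
    misses = 0
    for addr in addrs:
        line = addr >> LINE_BITS
        lru = sets.setdefault(cache_set(addr), OrderedDict())
        if line in lru:
            lru.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            lru[line] = True
            if len(lru) > WAYS:
                lru.popitem(last=False)  # evict the least recently used line
    return misses


def contention_set(base: int, entry_size: int, entries: int, target_set: int) -> List[int]:
    """Array indices whose entries fall into `target_set`."""
    return [i for i in range(entries) if cache_set(base + i * entry_size) == target_set]


if __name__ == "__main__":
    colliding = contention_set(base=0, entry_size=64, entries=256, target_set=0)
    adversarial = [i * 64 for i in colliding[:WAYS + 1]] * 10
    benign = [i * 64 for i in range(WAYS + 1)] * 10
    print("adversarial misses:", count_misses(adversarial))
    print("benign misses:", count_misses(benign))
```

Cycling through one more colliding line than the set has ways defeats LRU on every access, which is the effect a packet sequence would aim for when the array index comes from a packet field.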

SLIDE 16

CASTAN

Attacking Algorithmic Complexity

Maximize Instructions / Packet

  • Find packets ⇒ longer code paths

Guide SymbEx with a Heuristic

  • Maximize cycles w/o inducing breadth-first search
  • Estimate cycles / packet

[Figure: two diagrams, each labelled "Receive Packet".]

SLIDE 17

CASTAN

Attacking Algorithmic Complexity

CFG Distance Heuristic

  • max(successors) + cost(current)
  • cost = cycles, conservatively assuming an L1 hit
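For a control-flow graph without loops, the recurrence max(successors) + cost(current) can be evaluated with a memoized recursion. The sketch below assumes a hypothetical six-block CFG and made-up per-block cycle costs; `max_remaining_cost` is an illustrative name, not part of CASTAN.

```python
from functools import lru_cache
from typing import Dict, List


def max_remaining_cost(cfg: Dict[str, List[str]], cost: Dict[str, int]) -> Dict[str, int]:
    """For each node of an acyclic CFG: cost(node) + max over its successors."""

    @lru_cache(maxsize=None)
    def dist(node: str) -> int:
        succ = cfg.get(node, [])
        return cost[node] + (max(dist(s) for s in succ) if succ else 0)

    return {node: dist(node) for node in cost}


if __name__ == "__main__":
    # Hypothetical CFG of a lookup-then-forward packet handler.
    cfg = {
        "recv": ["lookup"],
        "lookup": ["hit", "miss"],
        "hit": ["send"],
        "miss": ["insert"],
        "insert": ["send"],
        "send": [],
    }
    cost = {"recv": 1, "lookup": 2, "hit": 1, "miss": 1, "insert": 3, "send": 1}
    print(max_remaining_cost(cfg, cost))
```

A search guided by these values prefers the branch whose remaining path is potentially the most expensive, here the `miss` side of `lookup`.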

SLIDE 21

CASTAN

Attacking Algorithmic Complexity

Handling Loops

  • Distance vector algorithm
  • Limit repeats to 2 (unrolls loops once)
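The effect of the repeat limit can be illustrated on a small loop. Note the substitution: the sketch below uses a plain depth-first enumeration with a per-path visit counter, not the distance-vector formulation named on the slide, and it is exponential in the worst case. The function name, the CFG, and the costs are hypothetical.

```python
from typing import Dict, List, Optional


def bounded_longest_path(
    cfg: Dict[str, List[str]],
    cost: Dict[str, int],
    node: str,
    max_visits: int = 2,
    visits: Optional[Dict[str, int]] = None,
) -> Optional[int]:
    """Max cost from `node` to an exit, visiting each node at most `max_visits` times.

    Returns None when no exit is reachable within the visit limit.
    """
    visits = dict(visits or {})          # copy: counts are per path, not global
    if visits.get(node, 0) >= max_visits:
        return None                      # limit hit: prune this continuation
    visits[node] = visits.get(node, 0) + 1
    successors = cfg.get(node, [])
    if not successors:
        return cost[node]                # exit node ends the path
    tails = [bounded_longest_path(cfg, cost, s, max_visits, visits) for s in successors]
    feasible = [t for t in tails if t is not None]
    return cost[node] + max(feasible) if feasible else None


if __name__ == "__main__":
    # Hypothetical loop: head -> body -> head, with an exit edge from head.
    cfg = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
    cost = {"entry": 1, "head": 1, "body": 5, "exit": 1}
    print(bounded_longest_path(cfg, cost, "entry", max_visits=2))
```

With a limit of 2 the loop head is seen twice, so the body contributes exactly one iteration to the estimate, which keeps the value finite without ignoring the loop altogether.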

SLIDE 27

CASTAN

Handling Hash Functions

SymbExing hash functions is hard

  • Complex expression / path explosion
  • Reason about hash value, without computing it?

Havocing

  • Annotate and disable hash function
  • Assign hash value a new symbol
  • Analyze data structure internals unencumbered
  • Find packet ⇒ hash value ⇒ expected behavior

SLIDE 28

CASTAN

Handling Hash Functions

[Figure: pipeline relating Packets, Hash Inputs, and Hashes, with the steps Solve Hashes, Reverse Hashes, and Solve Packets.]
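The "reverse hashes" step can be pictured as a search for hash inputs that produce a hash value chosen earlier by the analysis. The sketch below is a toy under stated assumptions: CRC32 is only a stand-in for whatever hash an NF uses, the 64-bucket table and the 10.0.0.0/16 port-80 candidate space are invented, and exhaustive enumeration is used here purely for simplicity.

```python
import zlib
from typing import Iterator, List, Tuple

BUCKETS = 64  # toy hash table size (assumed)

Flow = Tuple[int, int]  # (dst_addr, dst_port), a simplified hash input


def bucket_of(flow: Flow) -> int:
    """Stand-in hash: CRC32 of the packed fields, reduced to a bucket index."""
    dst_addr, dst_port = flow
    key = dst_addr.to_bytes(4, "big") + dst_port.to_bytes(2, "big")
    return zlib.crc32(key) % BUCKETS


def candidate_flows() -> Iterator[Flow]:
    """Enumerate hypothetical flows: every host in 10.0.0.0/16, port 80."""
    base = 10 << 24
    for host in range(1 << 16):
        yield (base + host, 80)


def reverse_hash(target_bucket: int, wanted: int) -> List[Flow]:
    """Brute-force search for `wanted` flows that land in `target_bucket`."""
    found: List[Flow] = []
    for flow in candidate_flows():
        if bucket_of(flow) == target_bucket:
            found.append(flow)
            if len(found) == wanted:
                break
    return found


if __name__ == "__main__":
    target = bucket_of((10 << 24, 80))  # the bucket the analysis "solved" for
    flows = reverse_hash(target, wanted=50)
    print(len(flows), "flows collide in bucket", target)
```

Once the hash inputs are known, turning them into packets is a matter of filling in the remaining header fields, which corresponds to the last step in the figure.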

SLIDE 29

Evaluation

Setup

Network Measurement Campaign

  • E2E latency / throughput
  • Intel Xeon E5-2667v2 3.3GHz
  • 25.6MB LLC / 32GB RAM
  • Intel 82599ES 10Gb NICs

[Figure: Tester and DUT.]

SLIDE 31

Evaluation

NFs

11 NF Implementations

3 types, different data structures

  • NAT: Unbalanced Tree, Red-Black Tree, Hash Ring, Hash Table
  • LB: Unbalanced Tree, Red-Black Tree, Hash Ring, Hash Table
  • LPM: Hierarchical Lookup (DPDK), Single Lookup, Patricia Trie

[Figure annotations: Algorithmic Complexity, Cache.]

SLIDE 32

Evaluation

Workloads

Baseline

  • NOP

Adversarial

  • CASTAN (~50 flows)
  • Manual (~50 flows)

Random

  • UniRand (1M flows)
  • Zipf (100k pkts, 6.7k flows)
  • UniRand CASTAN (# flows = CASTAN)
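The two random baselines can be approximated with the standard library. This sketch is not the generator used in the evaluation: the slide does not state the Zipf exponent or how flow identifiers are drawn, so `s`, the 32-bit ID space, and both function names are assumptions, and the sketch makes no attempt to reproduce the 6.7k distinct flows reported above.

```python
import random
from typing import List


def uniform_flows(n_flows: int, seed: int = 0) -> List[int]:
    """One packet per draw, flow IDs drawn uniformly from the 32-bit space."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(n_flows)]


def zipf_packets(n_packets: int, n_flows: int, s: float = 1.0, seed: int = 0) -> List[int]:
    """Packets whose flow ranks follow a Zipf law: P(rank k) proportional to 1 / k**s."""
    rng = random.Random(seed)
    weights = [1.0 / (k ** s) for k in range(1, n_flows + 1)]
    return rng.choices(range(n_flows), weights=weights, k=n_packets)


if __name__ == "__main__":
    packets = zipf_packets(n_packets=100_000, n_flows=10_000)
    print("distinct flows seen:", len(set(packets)))
    print("uniform sample:", uniform_flows(3))
```

A skewed workload touches a few flows very often and most flows rarely, which is why it stresses caches and data structures differently from a uniform one with the same packet count.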

SLIDE 34

Evaluation

LPM / Single Lookup Table

[Figure: CDF plot, annotated 3×.]

  • CASTAN induces DRAM accesses
  • 3× latency
  • ≃ UniRand; 2×10⁵ fewer flows

SLIDE 35

Evaluation

LPM / Single Lookup Table

[Figure: plot annotated -19%.]

SLIDE 37

Evaluation

NAT / Unbalanced Tree

[Figure: CDF plot, annotated 1.7×.]

  • CASTAN skews the tree
  • +70% latency / -7% throughput
  • ≃ Manual; without intuition

SLIDE 38

Conclusion

CASTAN

  • Attacks complexity, CPU cache, hash functions
  • Little developer input

Adversarial Workloads

  • ≃ Manual when available
  • > Uniform random for same number of flows
  • Up to +201% latency / -19% throughput

SLIDE 39

Find out more!

  • Look for our poster!
  • Get the source and more: https://pedrosa.2y.net/Projects/CASTAN

SLIDE 40

Backup Slides

SLIDE 41

Cache Structure

[Figure: address bit layout. Field widths shown: 3 bits, 6 bits, 6 bits, 15 bits, 34 bits. Field labels shown: byte offset, L1d line, L2 line, L3 slice, 1GB page offset, 1GB page index.]

SLIDE 42

Latency Deviation from NOP

SLIDE 43

Throughput

SLIDE 44

LPM / Single Lookup Table

SLIDE 45

NAT / Unbalanced Tree

SLIDE 46

NAT / Hash Ring

SLIDE 47

NAT / Red-Black Tree

SLIDE 48

NAT / Hash Table

SLIDE 49

LPM / Hierarchical Lookup (DPDK)

SLIDE 50

LPM / Patricia Trie

SLIDE 51

LB / Unbalanced Tree

SLIDE 52

LB / Red-Black Tree

SLIDE 53

LB / Hash Ring

SLIDE 54

LB / Hash Table