SLIDE 1

Approximate Reduction of Finite Automata for High-Speed Network Intrusion Detection

Milan Češka, Vojtěch Havlena, Lukáš Holík, Ondřej Lengál, Tomáš Vojnar

Brno University of Technology Czech Republic

18 April 2018 (TACAS’18)

Češka, Havlena, Holík, Lengál, Vojnar | Approximate Reduction of Finite Automata | TACAS’18
SLIDE 6

Main Points

reduction of nondeterministic finite automata (NFAs)
the reduction does NOT preserve language
BUT guarantees maximum error
w.r.t. a probabilistic distribution
application in high-speed network intrusion detection

SLIDE 7

Computer Network Intrusion Detection

recently a large number of security incidents, e.g.

◮ WannaCry

  • ransomware, 1 G$

◮ Spectre & Meltdown

  • security vulnerabilities in Intel CPUs

exploits often spread via networks

◮ these attacks can be detected

SLIDE 8

Computer Network Intrusion Detection

[Figure: a malicious user on the Internet sends "EVIL" packets towards the local network; the NIDS sits between the Internet and the local network]

NIDS = Network Intrusion Detection System

SLIDE 10

Computer Network Intrusion Detection

SNORT

◮ popular NIDS
◮ RegExes to describe attacks

/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*DROP TABLE/
/^HTTP\/1\.[01] 404[\x00-\xff]*(admin|wordpress)/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*admin:admin/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*admin:password/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*YWRtaW46cGFzc3dvcmQ/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*YWRtaW46YWRtaW4/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*\/bin\/sh/
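To make the role of such signatures concrete, here is a minimal Python sketch that checks a payload against the first rule above. It is only an illustration: Python's `re` module has no `\V` escape, so `[^\r\n]` is used as an approximation, and the function name and sample payloads are hypothetical.

```python
import re

# Assumption: [^\r\n] stands in for PCRE's \V, which Python's `re` lacks.
# The pattern is otherwise the first signature from the slide.
DROP_TABLE = re.compile(
    rb"^POST HTTP/1\.[01]\r\n([^\r\n]+\r\n)*\r\n[\x00-\xff]*DROP TABLE"
)


def is_suspicious(payload: bytes) -> bool:
    """Return True if the payload matches the DROP TABLE signature."""
    return DROP_TABLE.search(payload) is not None


# Hypothetical payloads, shaped to fit the signature's header layout.
attack = b"POST HTTP/1.1\r\nHost: example.test\r\n\r\nq=1; DROP TABLE users"
benign = b"POST HTTP/1.1\r\nHost: example.test\r\n\r\nq=1"

print(is_suspicious(attack))
print(is_suspicious(benign))
```

A software matcher like this is what the following slides argue cannot keep up at 100 Gbps, which motivates compiling the rules to automata in hardware.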
SLIDE 14

Computer Network Intrusion Detection

[Figure: NIDS on a 100 Gbps link]

High-speed networks

◮ 100 Gbps, 400 Gbps

100 Gbps: max. ∼150 Mpkt/s (100 Gbps / (84 B × 8 bit/B))

◮ cf. 56 kbps dial-up: max. ∼80 pkt/s
◮ ∼10 GB/s (of data)

consider a 4 GHz CPU

◮ 0.4 cycle/B
◮ ∼27 cycles/pkt

  • cf. DRAM latency ∼100 cycles

no hope for SW solutions
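The arithmetic behind these figures can be reproduced with a short Python sketch. It assumes that 84 B is the on-wire size of a minimum Ethernet frame (64 B frame plus 8 B preamble and 12 B inter-frame gap); the function name is hypothetical.

```python
def per_packet_budget(link_bps: float, cpu_hz: float, wire_bytes: int = 84):
    """Return (packets/s, CPU cycles per byte, CPU cycles per packet).

    wire_bytes is the assumed on-wire size of the smallest packet.
    """
    pkts_per_s = link_bps / (wire_bytes * 8)
    bytes_per_s = link_bps / 8
    return pkts_per_s, cpu_hz / bytes_per_s, cpu_hz / pkts_per_s


pkts, cycles_per_byte, cycles_per_pkt = per_packet_budget(100e9, 4e9)
print(f"{pkts / 1e6:.1f} Mpkt/s")            # 148.8 Mpkt/s
print(f"{cycles_per_byte:.2f} cycles/B")     # 0.32 cycles/B
print(f"{cycles_per_pkt:.1f} cycles/pkt")    # 26.9 cycles/pkt

dialup_pkts, _, _ = per_packet_budget(56e3, 4e9)
print(f"{dialup_pkts:.0f} pkt/s")            # 83 pkt/s
```

The exact byte rate at 100 Gbps is 12.5 GB/s, which gives 0.32 cycles per byte; the slide's 0.4 follows from rounding the rate down to ∼10 GB/s. Either way the budget is far below one DRAM access per packet.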

SLIDE 18

HW-accelerated NIDS

cooperation with ANT@FIT
using a COMBO-100G accelerator card

◮ FPGA Xilinx Virtex-7 H580T

used as a pre-filter

[Figure: the NIDS pre-filter reduces 100 Gbps of input traffic to <1 Gbps]

SLIDE 24

HW-accelerated NIDS

RegEx matching in HW

◮ NFAs!
◮ smaller than DFAs
◮ but still too big (even after language-preserving reduction)
◮ many units in parallel
◮ http-backdoor.pcre: 38.4 Gbps
◮ language non-preserving reduction

SLIDE 29

Distance of NFAs

Language non-preserving NFA reduction A → A_red:

trivial solutions not satisfactory
need to quantify the error

◮ distance of NFAs

Distance of NFAs:

Jaccard distance, Cesàro-Jaccard distance
Levenshtein (edit) distance

◮ not suitable for languages

. . .

not suitable!

◮ distribution of network packets is not uniform

SLIDE 34

Distance of NFAs

Probabilistic distance of NFAs:

various packets have different likelihood

◮ e.g. Pr(HTTP) > Pr(Gopher)
◮ e.g. Pr_HTTP(GET) > Pr_HTTP(POST)

probabilistic automaton

[Figure: probabilistic automaton P with states q0 (0.2), q1 (0.35), q2 (0.1), initial weights 0.42 and 0.58, and transitions labelled a (0.5), c (0.3), b (0.4), b (0.25), c (0.9)]

Represents $\Pr_P : \Sigma^* \to [0, 1]$

$\Pr_P(abc) = 0.42 \cdot 0.5 \cdot 0.4 \cdot 0.3 \cdot 0.1 + 0.42 \cdot 0.5 \cdot 0.25 \cdot 0.9 \cdot 0.1$
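The sum-over-paths computation above can be sketched in Python as a forward pass over the word. The automaton below is hypothetical (the slide's exact topology cannot be recovered from the transcript): each state's outgoing transition probabilities plus its acceptance probability sum to 1.

```python
# Hypothetical probabilistic automaton over {a, b}; not the one on the slide.
INIT = {0: 1.0}                      # initial distribution over states
ACCEPT = {0: 0.2, 1: 0.4}            # probability of stopping in each state
TRANS = {                            # (state, symbol) -> [(target, probability)]
    (0, "a"): [(0, 0.3)],
    (0, "b"): [(1, 0.5)],
    (1, "a"): [(1, 0.6)],
}


def word_probability(word: str) -> float:
    """Sum, over all runs on `word`, of initial * transitions * acceptance."""
    weights = dict(INIT)
    for symbol in word:
        nxt = {}
        for state, weight in weights.items():
            for target, prob in TRANS.get((state, symbol), []):
                nxt[target] = nxt.get(target, 0.0) + weight * prob
        weights = nxt
    return sum(weight * ACCEPT[state] for state, weight in weights.items())


print(word_probability("ab"))   # 1.0 * 0.3 * 0.5 * 0.4
print(word_probability("bb"))   # no run on "bb", so probability 0
```

Keeping one weight per state, instead of enumerating runs, is what makes the cost linear in the word length.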

SLIDE 38

Distance of NFAs

Probabilistic distance of NFAs:

$\mathrm{dist}_P(A, A_{red}) = \Pr_P(L(A) \mathbin{\triangle} L(A_{red})) = \Pr_P(L(A)) + \Pr_P(L(A_{red})) - 2\Pr_P(L(A) \cap L(A_{red}))$

where $\triangle$ denotes symmetric difference

Theorem

Computing $\Pr_P(L(A))$ is PSPACE-complete. If A is unambiguous, it is in PTIME.

Proof.

PSPACE-hardness: reduction from NFA universality (PSPACE-complete):

  let $\forall w \in \Sigma^* : \Pr_P(w) > 0$
  check $\Pr_P(L(A)) = 1$

Upper bounds:

  PTIME: product of A and P, then a system of linear equations
  PSPACE: on-the-fly determinization of A combined with the PTIME procedure above (std. composition)
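The PTIME case can be illustrated with a small Python sketch, under loudly stated assumptions: A is given as a DFA (deterministic, hence unambiguous), P is a hypothetical two-state probabilistic automaton, and the linear system over the product states is solved with a hand-written Gauss-Jordan elimination. None of the names below come from the talk or from the APPREAL tool.

```python
from itertools import product

# Hypothetical probabilistic automaton P over {a, b}.
P_INIT = {0: 1.0}
P_ACCEPT = {0: 0.2, 1: 0.4}
P_TRANS = {0: [("a", 0, 0.3), ("b", 1, 0.5)], 1: [("a", 1, 0.6)]}

# Hypothetical DFA A: words that contain at least one 'b'.
A_STATES = [0, 1]
A_INIT = 0
A_FINAL = {1}
A_DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1}


def solve(mat, rhs):
    """Gauss-Jordan elimination with partial pivoting (assumes a unique solution)."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(mat, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def language_probability() -> float:
    """Pr_P(L(A)) via the product of P and A and the system (I - M) x = b."""
    states = list(product(P_ACCEPT, A_STATES))
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    mat = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    rhs = [0.0] * n
    for (p, q), i in index.items():
        if q in A_FINAL:
            rhs[i] = P_ACCEPT[p]            # stop here and be accepted by A
        for symbol, p_next, prob in P_TRANS.get(p, []):
            q_next = A_DELTA.get((q, symbol))
            if q_next is not None:
                mat[i][index[(p_next, q_next)]] -= prob
    x = solve(mat, rhs)
    return sum(w * x[index[(p, A_INIT)]] for p, w in P_INIT.items())


print(language_probability())
```

For this toy instance the words containing a `b` have total probability 0.5 / 0.7 = 5/7, which is what the solver returns up to floating-point error. Determinism of A matters: with an ambiguous NFA, a word with several accepting runs would be counted more than once.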

SLIDE 42

Pr-driven NFA Reduction

Probability-driven NFA Reduction

2 optimization problems:

◮ size-driven (n): A → A_red s.t. |A_red| ≤ n and dist_P(A, A_red) minimal
◮ error-driven (ε): A → A_red s.t. dist_P(A, A_red) ≤ ε and |A_red| minimal

Theorem

Determining existence of A_red s.t. dist_P(A, A_red) ≤ ε and |A_red| ≤ n is PSPACE-complete.

not easier than finding minimal NFA
an enumerative algorithm not practical

  • prob. (bi-)simulations don’t work
SLIDE 46

Pr-driven NFA Reduction

Practical reductions:

based on removing states and transitions
2 approaches:

self-loop reduction

[Figure: NFA with states q0, ..., q6 and transitions labelled a, b, c; two states carry Σ self-loops]

pruning reduction

[Figure: NFA with states q0, ..., q6 and transitions labelled a, b, c]
SLIDE 49

Self-Loop Reduction

Self-Loop Reduction

introduces self-loops ⇒ over-approximating

Theorem

Given n and ε, determining whether there exists A_red with n states and error ≤ ε obtained from A by adding self-loops is PSPACE-complete.

practical greedy algorithm to select states to add self-loops
redundant states removed
labelling: approximates the error

[Figure: NFA with states q0, ..., q6 and transitions labelled a, b, c, with two Σ self-loops; each state q is annotated with its label β_{P,A}(q)]
SLIDE 52

Pruning Reduction

Pruning Reduction:

removes states ⇒ under-approximating

Theorem

Given n and ε, determining whether there exists A_red with n states and error ≤ ε obtained from A by removing states is PSPACE-complete.

practical greedy algorithm to select states to remove
labelling: approximates the error

[Figure: NFA with states q0, ..., q6 and transitions labelled a, b, c; each state q is annotated with its label θ_{P,A}(q)]
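The general shape of a label-driven greedy selection can be sketched in Python as follows. This is a generic illustration, not the authors' algorithm: it assumes each label is the probability of the words that have an accepting run through that state, so the sum of the removed labels bounds the lost probability by the union bound. The label values, the `protected` parameter, and the function name are hypothetical.

```python
def greedy_prune(labels, target_size, protected=()):
    """Pick states to remove, cheapest label first, until target_size remain.

    Returns the removed states and the sum of their labels, which is an
    upper bound on the error under the assumption stated above.
    """
    removable = sorted(
        (state for state in labels if state not in protected),
        key=labels.get,
    )
    n_remove = max(0, len(labels) - target_size)
    removed = removable[:n_remove]
    return removed, sum(labels[state] for state in removed)


# Hypothetical labels; q0 is treated as the initial state and never removed.
labels = {"q0": 0.9, "q1": 0.001, "q2": 0.3, "q3": 0.0002, "q4": 0.05}
removed, error_bound = greedy_prune(labels, target_size=3, protected={"q0"})
print(removed)
print(error_bound)
```

With these numbers the two cheapest states, q3 and q1, are removed. Sorting once keeps the selection cheap; the expensive part in practice is computing the labels, which matches the time(label) versus time(APP) split reported in the case studies.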
SLIDE 58

Results

case studies from SNORT

◮ targeting attacks over HTTP
◮ self-loop reduction

model of network traffic: probabilistic automaton P_HTTP

◮ structure constructed manually
◮ probabilities learnt using real traffic (∼243 kpkt from ∼30 GiB)

RABIT (R. Mayr) used for exact NFA reduction

◮ simulation-based

synthesis for Xilinx Virtex-7

◮ reporting #LUTs (look-up tables)

tool APPREAL

◮ APProximate REduction of Automata and Languages
◮ https://github.com/vhavlena/appreal

[Badge: TACAS Artifact Evaluation (AEC): Consistent * Complete * Well Documented * Easy to Reuse * Evaluated]

SLIDE 59

Results — case study 1

http-malicious.pcre

/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*DROP TABLE/
/^HTTP\/1\.[01] 404[\x00-\xff]*(admin|wordpress)/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*admin:admin/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*admin:password/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*YWRtaW46cGFzc3dvcmQ/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*YWRtaW46YWRtaW4/
/^POST HTTP\/1\.[01]\r\n(\V+\r\n)*\r\n[\x00-\xff]*\/bin\/sh/

Before Pr reduction:

|A_mal| = 249 states
|A^RED_mal| = 98 states
time(label) = 39 s
time(APP) < 1 s
LUT(A^RED_mal) = 382

k     |A^APP_mal|   |A′_mal|   Error (label)   Error (P_HTTP)   Error (traffic)   LUTs
0.1   9             9          0.0704          0.0704           0.0685            -
0.2   19            19         0.0677          0.0677           0.0648            -
0.3   29            26         0.0279          0.0278           0.0598            154
0.4   39            36         0.0032          0.0032           0.0008            -
0.5   49            44         2.8e-05         2.8e-05          4.1e-06           -
0.6   58            49         8.7e-08         8.7e-08          0.0               224
0.8   78            75         2.4e-17         2.4e-17          0.0               297

SLIDE 60

Results — case study 2

http-attacks.pcre

/calendar(|[-_]admin)\.pl[\x00-\xff]*/Ui
/db4web_c(\.exe)?\/.*(\.\.[\#\/]|[a-z]\:)[\x00-\xff]*/smiU
/evtdump\x3f.*?\x2525[^\x20]*?\x20HTTP[\x00-\xff]*/i
/instancename=[^&\x3b\r\n]{10}[\x00-\xff]*/smi
/itemid=\d*[^\d\&\;\r\n][\x00-\xff]*/i
/^GET\s+[^\x20]*\x2Ewm[zd][\x00-\xff]*/smi
/mstshash\s*\x3d\s*Administr[\x00-\xff]*/smi
/SILC\x2d\d\x2e\d[\x00-\xff]*/smi

Before Pr reduction:

|A_att| = 142 states
|A^RED_att| = 112 states
time(label) = 28 min
time(APP) ≈ 1 s

k     |A^APP_att|   |A′_att|   Error (label)   Error (P_HTTP)   Error (traffic)
0.2   22            14         1.0             0.8341           0.2313
0.3   33            24         0.081           0.0770           0.0067
0.4   44            37         0.0005          0.0005           0.0010
0.5   56            49         3.3e-06         3.3e-06          0.0010
0.6   67            61         1.2e-09         1.2e-09          8.7e-05
0.7   78            72         4.8e-12         4.8e-12          1.2e-05
0.9   100           93         3.7e-16         1.1e-15          0.0

SLIDE 61

Results — case study 3

http-backdoor.pcre

/000File\s+is\s+executed\x2E\x2E\x2E/smi
/^000Ok\s+echter\s+server\s+\?/smi
/^001\xACOptix\s+Pro\s+v\d+\x2E\d+\s+Connected\s+Successfully\x21/smi
/^100013Agentsvr\x5E\x5EMerlin/smi
/^666\d+\xFF\d+\xFF\d+\xFF\d+\xFF\d+\xFF\d+\xFF\d+\xFF/smi
/^A-311 Death welcome/smi
/^answer\x00{6}NetControl\x2EServer\s+\d+\x2E\d+\s+\x22The\s+UNSEEN\x22\s+Project/smi
[... 42 more lines ...]

Before Pr reduction:

|A_bd| = 1,352 states
|A^RED_bd| = ?? states
time(label) = 20 min
time(APP) ≈ 1.5 min
LUT(A^RED_bd) = 2,266

k     |A^APP_bd|   |A′_bd|   Error (label)   Error (traffic)   LUTs
0.1   135          8         1.0             0.997             202
0.2   270          111       0.0012          0.0631            579
0.3   405          233       3.4e-08         0.0003            894
0.4   540          351       1.0e-12         0.0003            1063
0.5   676          473       1.2e-17         0.0               1249
0.7   946          739       8.3e-30         0.0               1735
0.9   1216         983       1.3e-52         0.0               2033

SLIDE 62

Results — case study 4

Real impact on COMBO-100G (Xilinx Virtex-7 H580T)

http-malicious.pcre

◮ LUT(A^RED_mal) = 382

http-backdoor.pcre

◮ LUT(A^RED_bd) = 2,266

available LUTs = 15,000

Speed      LUTs   A^RED_mal speed   A′_mal error   A^RED_bd speed   A′_bd error
100 Gbps   937    100 Gbps                         38.4 Gbps        3.4e-18
400 Gbps   238    250 Gbps          8.7e-8         38.4 Gbps        1

SLIDE 63

Future Work

Future work:

learning of prob. automaton
different automaton models (e.g. delayed input DFA)
better cost function

SLIDE 65

Summary

reduction of nondeterministic finite automata (NFAs)
the reduction does NOT preserve language
BUT guarantees maximum error
w.r.t. a probabilistic distribution
application in high-speed network intrusion detection

◮ obtained significant speed improvement w/ small error

THANK YOU!
