A Network Coding Approach to IP Traceback - PowerPoint PPT Presentation



slide-1
SLIDE 1

A Network Coding Approach to IP Traceback

Pegah Sattari, Minas Gjokas, Athina Markopoulou EECS, UC Irvine

slide-2
SLIDE 2

Outline

  • Background on Traceback
  • Main idea PPM+NC
  • Practical PPM+NC
  • Simulation Results
  • Conclusion and future work
slide-3
SLIDE 3

Where is malicious traffic coming from?

[Figure: network diagram with attackers, legitimate users, attack gateways, an access router, and the victim.]

Goal: traceback source and path of attack

slide-4
SLIDE 4

Prior Work on Traceback

  • Early ideas [Burch and Cheswick 1999]
  • Send specialized (ICMP) packets [Bellovin et al. 2001]
  • Routers keep logs of all packets [Snoeren et al. 2001] …
  • Packet Marking

– routers mark packets with information about their ID; the victim uses the marks of several packets to reconstruct the path
– [Savage et al. 2001]: probabilistically mark fragments of IP addresses
– Authentication + hashing [Song et al. 2001], [Yaar et al. 05], adjusting marking probability, …

  • Algebraic Traceback

– [Dean et al. 2002]: encodes the information of n routers on the attack path as coefficients of a polynomial of degree n-1.
– [Das et al. 2010]: tracks changes in a single path, network coding

  • Information theoretical [Adler 2002]

– studied the tradeoff of #bits vs. #packets
slide-5
SLIDE 5

Traceback via Probabilistic Packet Marking (PPM)

[Figure: attack path A, R1, R2, …, Rd-1, Rd, with packets carrying router marks such as R2, Rd-1 and Rd.]

slide-6
SLIDE 6

Outline

  • Background on Traceback
  • Main idea

– Problem statement
– PPM+NC

  • Practical PPM+NC
  • Simulation Results
  • Conclusion and future work
slide-7
SLIDE 7

Main Idea

Problem Statement

[Figure: attack path A, R1, R2, …, Rd-1, Rd, annotated with Pm(1), Pm(2), …, Pm(d-1), Pm(d).]

  • Probabilistic Packet Marking (PPM):

– Routers probabilistically mark packets with (partial) information about their address.
– The goal of PPM is to enable the victim to recover d router IDs after receiving a sufficient number of packets.
– PPM+NC tries to achieve the same goal with fewer packets, by appropriately choosing the marking scheme at intermediate routers.

slide-8
SLIDE 8

Main Idea

PPM+NC

  • PPM is essentially a coupon collector’s problem

– Collect all router ids {Rd, Rd-1, …, R2, R1}
– A coupon collector’s problem with unequal probabilities:

  • The further a router is from the victim, the less likely it is that its mark survives without being overwritten as the packet moves along the path.

  • NC helps the coupon collector problem:

– NC increases the chance of getting an innovative coupon
– equally likely coupons: E[X] reduces from Θ(d log d) to Θ(d)
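The coupon-collector claim above can be checked numerically. The sketch below is a rough illustration, not the authors' simulation: it assumes d equally likely coupons, and it models coded coupons as uniformly random combinations over GF(2) (the slides use F4 and F256), so the function names and the field choice are assumptions made for brevity. A coded draw counts as innovative when it raises the rank of the combinations collected so far.

```python
import random

def coupons_needed(d, rng):
    """Draw uniformly from d coupon types until every type has been seen."""
    seen = set()
    draws = 0
    while len(seen) < d:
        seen.add(rng.randrange(d))
        draws += 1
    return draws

def coded_coupons_needed(d, rng):
    """Draw random GF(2) combinations of d coupons until they reach rank d."""
    pivots = {}   # leading-bit position -> reduced combination (as a bitmask)
    draws = 0
    while len(pivots) < d:
        vec = rng.getrandbits(d)      # random 0/1 coefficient per coupon
        draws += 1
        while vec:                    # reduce against the basis found so far
            lead = vec.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = vec    # innovative: rank grows by one
                break
            vec ^= pivots[lead]
    return draws

def average(fn, d, trials, seed=0):
    """Average number of draws over `trials` independent runs."""
    rng = random.Random(seed)
    return sum(fn(d, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    d = 30
    harmonic = sum(1 / i for i in range(1, d + 1))
    print("plain coupons:", average(coupons_needed, d, 500))
    print("theory d*H_d :", d * harmonic)
    print("coded coupons:", average(coded_coupons_needed, d, 500))
```

For d=30 the plain collector should average close to d·H_d (about 120 draws), while the coded collector should stay within a few draws of d.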

slide-9
SLIDE 9

Main Idea

PPM+NC cont’d

[Figure: packet mark made of the linear combination ∑ci·Ri and the random coefficients c1, c2, …, ck.]

  • Router i:

– instead of marking with its own id “Ri”, picks a random coefficient “ci”, and adds ci•Ri to the existing mark.

  • Victim:

– instead of the ids themselves, it receives random linear combinations of router ids (∑ ci•Ri)
– solves a system of equations and finds the ids.
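The router and victim roles above can be sketched end to end. This is an idealized toy, not the practical scheme: it assumes 8-bit router ids, arithmetic in GF(2^8) with the reduction polynomial 0x11B (the slides do not name one), and a mark that carries one coefficient per router with no bit limit. All function names are hypothetical.

```python
import random

def gf_mul(a, b):
    """Multiply in GF(2^8), reducing by x^8+x^4+x^3+x+1 (0x11B, an assumption)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def gf_inv(a):
    """Inverse of a nonzero element, computed as a^254."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def make_mark(router_ids, rng):
    """Each router i picks a random c_i and adds c_i * R_i to the mark."""
    coeffs, combo = [], 0
    for rid in router_ids:
        c = rng.randrange(256)
        coeffs.append(c)
        combo ^= gf_mul(c, rid)       # addition in GF(2^8) is XOR
    return coeffs, combo

def solve(rows):
    """Gauss-Jordan elimination; each row is [c_1, ..., c_d, combo]."""
    d = len(rows[0]) - 1
    rows = [r[:] for r in rows]
    for col in range(d):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None               # not enough innovative marks yet
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, x) for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [x ^ gf_mul(factor, y)
                           for x, y in zip(rows[r], rows[col])]
    return [rows[i][d] for i in range(d)]

def traceback(router_ids, seed=0):
    """Collect marks until the victim can solve for all router ids."""
    rng = random.Random(seed)
    rows = []
    while True:
        coeffs, combo = make_mark(router_ids, rng)
        rows.append(coeffs + [combo])
        if len(rows) >= len(router_ids):
            ids = solve(rows)
            if ids is not None:
                return ids, len(rows)
```

Calling `traceback([10, 200, 33, 7])` returns the recovered ids together with the number of marks the victim needed, which is at least the path length.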

slide-10
SLIDE 10

Main Idea

PPM+NC for a single path

[Figure: average number of packets vs. path length, showing simulations and models for PPM and for PPM+NC.]

Setup:

  • path length d=1…31, field F4, p=1/25, 500 realizations.
  • Metric of interest: number of marks X needed to reconstruct the attack path

Observations:

  • E[X_PPM+NC] < E[X_PPM]
  • Models perfectly agree with simulation

slide-11
SLIDE 11

Main Idea

Multiple-path scenario as the union of multiple paths

  • Typically, DDoS attacks are distributed:

[Figure: attack tree rooted at victim V; R1 at distance 1, R2 and R3 at distance 2, R4 to R7 at distance 3, R8 to R15 at distance 4; attackers A1 to A8.]

  • The attack path from {Ai} is the ordered list of routers between {Ai} and V that the attack packet has gone through.

slide-12
SLIDE 12

Outline

  • DDoS and Traceback
  • Main idea
  • Practical PPM+NC

– Practical constraints
– Marking procedure
– Reconstruction procedure
– Processing costs

  • Simulation results
  • Conclusion and future work
slide-13
SLIDE 13

Practical PPM+NC

Practical Constraints

  • Limited number of bits (16 ID + 1 flag = 17)

– Mark with fragments of IP addresses
– f=4 fragments (of 8 bits each), 2-bit fragment offset, k=3 coefficients of b=2 bits each, distance=1 bit. Total: 17 bits.
– 8 bits used for the linear combination, 2 bits for each coefficient.

  • Spoofing by the attacker

– Probabilistically overwrite the previous mark
– Distance field (approximate traceback)

  • Identifying nodes vs. reconstructing the attack graph

– Distance field
– Markings from consecutive routers

slide-14
SLIDE 14

Practical PPM+NC

Marking Procedure

  • Each router probabilistically decides whether to overwrite or not.
  • If overwrite:

– zero out the field and mark with a fragment of the router ID.

  • If not overwrite & there is space:

– add to the combination of the same fragment
– increase distance field
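The marking rules above can be sketched as a per-router function. The slides leave several details open, so the following are assumptions: coefficients are nonzero 2-bit values treated as elements of GF(2^8) with polynomial 0x11B, the overwriting router picks the fragment offset uniformly, a packet with no coefficients yet counts as unmarked, and the distance is kept as a plain counter rather than the 1-bit field. The names `Mark`, `fragment` and `mark_packet` are hypothetical.

```python
import random
from dataclasses import dataclass, field

K = 3   # coefficient slots per mark (k=3 on the slides)
F = 4   # 8-bit fragments per 32-bit router ID (f=4 on the slides)

def gf_mul(a, b):
    """Multiply in GF(2^8); the reduction polynomial 0x11B is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

@dataclass
class Mark:
    combo: int = 0                               # 8-bit linear combination
    offset: int = 0                              # fragment index, 0..F-1
    coeffs: list = field(default_factory=list)   # at most K coefficients
    distance: int = 0                            # routers added since overwrite

def fragment(router_id, offset):
    """Fragment `offset` of a 32-bit ID, counted from the most significant byte."""
    return (router_id >> (8 * (F - 1 - offset))) & 0xFF

def mark_packet(mark, router_id, rng, p=1 / 25):
    """Return the mark a router leaves on a packet that arrived with `mark`."""
    if rng.random() < p:
        # Overwrite: zero out the field, then mark with one fragment of the ID.
        offset = rng.randrange(F)
        coeff = rng.randrange(1, 4)
        combo = gf_mul(coeff, fragment(router_id, offset))
        return Mark(combo, offset, [coeff], 0)
    if mark.coeffs and len(mark.coeffs) < K:
        # Not overwriting and there is space: add to the combination of the
        # same fragment and increase the distance field.
        coeff = rng.randrange(1, 4)
        combo = mark.combo ^ gf_mul(coeff, fragment(router_id, mark.offset))
        return Mark(combo, mark.offset, mark.coeffs + [coeff], mark.distance + 1)
    return mark   # unmarked packet or no space left: leave it unchanged
```

Passing `p=1.0` or `p=0.0` forces the overwrite or the add branch, which is convenient for stepping a packet through a short path by hand.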

slide-15
SLIDE 15

Practical PPM+NC

Tradeoff in the packet header

[Figure: header layout with the fields: linear combination ∑ci·$R_i^j$, random coefficients c1, c2, …, ck, fragment offset, dist.]

$R_i^j$: the jth fragment of Ri.

  • We want both parts to be as large as possible:

– A linear combination of larger fragments.
– A linear combination of as many fragments of IP addresses as possible (random coefficients).

  • There is always an optimal k that minimizes #packets. For bit budget 17, it is k = 3 (our selection).

slide-16
SLIDE 16

Practical PPM+NC

Tradeoff in the packet header, cont’d

[Figure: average number of packets vs. number of coefficients (1 to 10), one curve per bit budget from 16 to 25.]

  • Best choice: 8 bits for fragments (f=4), 2 bits for fragment offset, 3 coefficients (k=3), of 2 bits each (b=2), 1 bit for distance.

  • 17 bits in total, within the bit-budget.
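The 17-bit budget can be made concrete with a small pack/unpack pair. Only the field widths come from the slides (8 + 2 + 3×2 + 1 = 17); the order of the fields inside the word and the function names are assumptions for illustration.

```python
def pack_mark(combo, offset, coeffs, dist):
    """Pack a mark into 17 bits: combo(8) | offset(2) | 3 x coeff(2) | dist(1)."""
    if not (0 <= combo < 256 and 0 <= offset < 4 and dist in (0, 1)):
        raise ValueError("field out of range")
    if len(coeffs) != 3 or any(not 0 <= c < 4 for c in coeffs):
        raise ValueError("need exactly three 2-bit coefficients")
    word = combo
    word = (word << 2) | offset
    for c in coeffs:
        word = (word << 2) | c
    return (word << 1) | dist

def unpack_mark(word):
    """Inverse of pack_mark; returns (combo, offset, coeffs, dist)."""
    dist = word & 1
    word >>= 1
    coeffs = []
    for _ in range(3):
        coeffs.append(word & 3)
        word >>= 2
    coeffs.reverse()                  # fields were read last-to-first
    offset = word & 3
    word >>= 2
    return word & 0xFF, offset, coeffs, dist
```

A round trip such as `unpack_mark(pack_mark(0xAB, 2, [1, 0, 3], 1))` gives back the original fields, and the largest packed value is 2**17 - 1.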
slide-17
SLIDE 17

Practical PPM+NC

Reconstruction Procedure – Single Path

– Once the victim receives the packet P, it forms:

$c_L R_L^j + c_{L-1} R_{L-1}^j + c_{L-2} R_{L-2}^j = P.\mathrm{linearCombination}$

– The unknowns are the fragments of the IP addresses: $R_i^j$, i=1…d, j=1…f
– The victim can solve the system of linear equations after receiving d·f innovative packets
– Use fragment offset to order fragments of same router ID (same distance)
– Path consists of router IDs ordered by distance

slide-18
SLIDE 18

Practical PPM+NC

Reconstruction Procedure, cont’d

  • Multiple paths:

– Multiple routers at the same distance from the victim.
– Need to distinguish equations coming from different paths.

[Figure: attack tree rooted at victim V; R1 at distance 1, R2 and R3 at distance 2, R4 to R7 at distance 3, R8 to R15 at distance 4; attackers A1 to A8.]

  • E.g., victim receives 2 packets from distance=4
  • One from R8,R4,R2, the other from R15,R7,R3
  • Do they belong to the same triplet or not?!

slide-19
SLIDE 19

Practical PPM+NC

Reconstruction Procedure, cont’d

  • Two solutions:

1. Use 8 bits (TOS field) to store a checksum that helps identify a triplet of marking routers

  • E.g., each router pre-computes a hash of its IP address
  • The fewer bits we use, the larger the probability of collision

2. Assume the victim has knowledge of the map of its upstream routers [Song et al., Yaar et al.].

  • Given the distance value, fragment offset, and random coefficients, the victim tries all possible triplets in the map and picks the one that matches.
  • Does not even solve a system of linear equations
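The second solution can be sketched as a search over the upstream map. Everything concrete here is hypothetical: the map is the binary tree from the earlier figure (parent of Rn taken to be R(n // 2)), router addresses are invented as 10.0.n.1, and arithmetic is in GF(2^8) with polynomial 0x11B. Because the combination is only 8 bits wide, more than one triplet could match a single packet in general, so the sketch returns every match.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8); the reduction polynomial 0x11B is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def fragment(router_id, offset):
    """8-bit fragment `offset` of a 32-bit ID (0 = most significant byte)."""
    return (router_id >> (8 * (3 - offset))) & 0xFF

def combine(triplet, ids, offset, coeffs):
    """Linear combination that this triplet of routers would have produced."""
    value = 0
    for coeff, router in zip(coeffs, triplet):
        value ^= gf_mul(coeff, fragment(ids[router], offset))
    return value

def matching_triplets(parent, ids, far_routers, offset, coeffs, combo):
    """Try every triplet that starts at the marked distance; keep the matches."""
    matches = []
    for far in far_routers:
        mid = parent[far]
        near = parent[mid]
        triplet = (far, mid, near)
        if combine(triplet, ids, offset, coeffs) == combo:
            matches.append(triplet)
    return matches

# Hypothetical upstream map: R1 is next to the victim, R8..R15 are at distance 4.
parent = {f"R{n}": f"R{n // 2}" for n in range(2, 16)}
ids = {f"R{n}": (10 << 24) | (n << 8) | 1 for n in range(1, 16)}   # 10.0.n.1
far_routers = [f"R{n}" for n in range(8, 16)]

# A packet marked by R8, R4, R2 on fragment offset 2 with coefficients 3, 1, 2.
offset, coeffs = 2, [3, 1, 2]
combo = combine(("R8", "R4", "R2"), ids, offset, coeffs)
matches = matching_triplets(parent, ids, far_routers, offset, coeffs, combo)
```

With these invented addresses the only matching triplet is (R8, R4, R2); no linear system is solved, only one comparison per candidate triplet.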
slide-20
SLIDE 20

Practical PPM+NC

Cost

  • Benefit of the PPM+NC approach:

– Reconstruct the paths after receiving a smaller number of marked packets

  • Cost of the PPM+NC approach:

– increased computational complexity and processing time.

  • Need to generate more random numbers, both for the marking decision and for the random coefficients:

– only when there is space
– can be pre-computed and used for all packets

  • Routers need to compute linear combinations in F256

– can be done quickly using a transition (log) table

  • Victim needs to solve a system of linear equations or to try addresses against a given linear combination
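The log-table remark above can be illustrated as follows. This is a generic sketch of table-based multiplication in GF(2^8), assuming the reduction polynomial 0x11B and the generator 3 (neither is specified on the slides); the table and function names are hypothetical. A bitwise multiply is included only to cross-check the tables.

```python
def gf_mul_slow(a, b):
    """Bitwise (shift-and-XOR) multiplication in GF(2^8), polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result

def build_tables():
    """Precompute exp/log tables for the generator 3."""
    exp = [0] * 510          # doubled so that LOG[a] + LOG[b] never wraps
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        doubled = x << 1     # x * 2 ...
        if doubled & 0x100:
            doubled ^= 0x11B
        x = doubled ^ x      # ... plus x gives x * 3
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log

EXP, LOG = build_tables()

def gf_mul_table(a, b):
    """Multiply with two table lookups and one integer addition."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]
```

The tables are built once (512 bytes of log plus a doubled exp table in this layout), after which each product in a linear combination costs a few lookups instead of a bit loop.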

slide-21
SLIDE 21

Outline

  • DDoS and Traceback
  • Main idea
  • Practical PPM+NC
  • Simulation Results
  • Conclusion and future work
slide-22
SLIDE 22

Simulation Results

paths vs. trees

[Figure: two panels, "Single path, d=1…31" and "Binary tree, 3…127 nodes".]

  • Fair comparison against modified FMS [Savage et al. 2001], such that it uses 17 bits + TTL-based distance.
  • p=1/25, 500 realizations
slide-23
SLIDE 23

Simulation Results

power-law graphs

Setup:

  • BRITE topology generator
  • Router-only mode, GLP model, preferential connectivity, incremental growth, random node placement.
  • #links added per new node=2
  • generated a 150 node graph, extracted a tree out of it, and tried different #attackers.
  • p=1/25, 500 realizations.

slide-24
SLIDE 24

Outline

  • DDoS and Traceback
  • Main idea

– Problem statement
– PPM+NC

  • Practical PPM+NC

– Practical constraints
– Marking procedure
– Reconstruction procedure
– Processing costs

  • Simulation results
  • Conclusion and Future work
slide-25
SLIDE 25

Conclusion

  • A network coding-based approach to PPM: marking packets with random linear combinations of router IDs, instead of individual IDs.
  • Implemented the idea in practice, taking into account the bit limitations and other constraints.
  • Simulated several attack scenarios. Showed it significantly reduces the number of required packets.

slide-26
SLIDE 26

NC + other PPM Schemes

  • NC-based marking is orthogonal to and can be combined with:

– hashing-based PPM
– authentication schemes
– adjusted probabilities

slide-27
SLIDE 27

Future Work

inter-path coding for multipath traceback

  • When network coding is deployed in the network

– use one mark f(R1, R2, R3)
– instead of two g(R1, R3), h(R2, R3)

[Figure: routers R1, R2 and R3.]

  • Potential Benefits

– Can signal coding point
– Can distinguish among paths
– Can signal the distance

  • Connections with the work on topology inference + network coding

slide-28
SLIDE 28

Thank you!

{psattari, athina}@uci.edu