A Digital Fountain Approach to Reliable Distribution of Bulk Data



SLIDE 1

A Digital Fountain Approach to Reliable Distribution of Bulk Data

John Byers, ICSI
Michael Luby, ICSI
Michael Mitzenmacher, Compaq SRC
Ashu Rege, ICSI

SLIDE 2

Application: Software Distribution

  • New release of widely used software.
  • Hundreds of thousands of clients or more.
  • Bulk data: tens or hundreds of MB
  • Heterogeneous clients:

– Modem users: hours
– Well-connected users: minutes

SLIDE 3

Primary Objectives

  • Scale to vast numbers of clients

– No ARQs or NACKs
– Minimize use of network bandwidth

  • Minimize overhead at receivers:

– Computation time
– Useless packets

  • Compatibility

– Networks: Internet, satellite, wireless
– Scheduling policies, i.e. congestion control

SLIDE 4

Impediments

  • Packet loss

– wired networks: congestion
– satellite networks, mobile receivers

  • Receiver heterogeneity

– packet loss rates
– end-to-end throughput

  • Receiver access patterns

– asynchronous arrivals and departures
– overlapping access intervals

SLIDE 5

Digital Fountain

Can recover file from any set of k encoding packets.

[Figure: source (k packets) → encoding stream → transmission → received message (k packets). Encoding and decoding are both labeled "Instantaneous".]
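To make the fountain idea concrete, here is a minimal sketch that is not the scheme from these slides. It swaps in a random linear fountain over GF(2): each encoding packet is the XOR of a random subset of the source packets, and the receiver solves for the source by Gaussian elimination. Unlike the ideal fountain, it may need slightly more than k packets. The names `fountain` and `Receiver`, the integer packet representation, and the loss pattern are all assumptions made for illustration.

```python
import random


def fountain(source, seed=0):
    """Endless stream of encoding packets for a list of integer source packets.

    Each packet is (mask, payload): bit i of mask is set when source[i]
    was XORed into payload.
    """
    rng = random.Random(seed)
    k = len(source)
    while True:
        mask = rng.randrange(1, 1 << k)  # random non-empty subset of sources
        payload = 0
        for i in range(k):
            if (mask >> i) & 1:
                payload ^= source[i]
        yield mask, payload


class Receiver:
    """Collects packets until the k source packets can be solved for."""

    def __init__(self, k):
        self.k = k
        self.rows = {}  # highest set bit of a mask -> (mask, payload)

    def add(self, mask, payload):
        """Reduce a packet against stored rows; keep it if it adds information."""
        while mask:
            top = mask.bit_length() - 1
            if top not in self.rows:
                self.rows[top] = (mask, payload)
                return True
            row_mask, row_payload = self.rows[top]
            mask ^= row_mask
            payload ^= row_payload
        return False  # linearly dependent on earlier packets: a useless packet

    def done(self):
        return len(self.rows) == self.k

    def decode(self):
        """Back-substitute from the lowest pivot upward."""
        solved = [0] * self.k
        for top in range(self.k):
            mask, value = self.rows[top]
            for i in range(top):
                if (mask >> i) & 1:
                    value ^= solved[i]
            solved[top] = value
        return solved


source = [0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48]  # k = 8 toy packets
receiver = Receiver(len(source))
received = 0
for n, (mask, payload) in enumerate(fountain(source)):
    if n % 3 == 0:
        continue  # simulated loss: the receiver never sees this packet
    receiver.add(mask, payload)
    received += 1
    if receiver.done():
        break

print(received, receiver.decode() == source)
```

The receiver does not care which packets were lost or when it started listening; it only counts how many useful packets have arrived. That is the property the slide asks of a digital fountain.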

SLIDE 6

Digital Fountain Solution

[Figure: transmission of the file along a time axis from 0 hours to 5 hours, with the download intervals of User 1 and User 2.]

SLIDE 7

Is FEC Inherently Bad?

  • Faulty Reasoning

– FEC adds redundancy
– Redundancy increases congestion and losses
– More losses necessitate more transmissions
– FEC consumes more overall bandwidth

  • But…

– Each and every packet can be useful to all clients
– Each client consumes minimum bandwidth possible
– FEC consumes less overall bandwidth by compressing bandwidth across clients

SLIDE 8

DF Solution Features

  • Users can initiate the download at their discretion.
  • Users can continue the download seamlessly after a temporary interruption.

  • Tolerates moderate packet loss.
  • Low server load - simple protocol.
  • Scales well.
  • Low network load.
SLIDE 9

Approximating a Digital Fountain

[Figure: source (k packets) → encoding stream → (1 + c) k received packets → message (k packets), with labels "Encoding Time" and "Decoding Time".]

SLIDE 10

Approximating a DF: Performance Measures

  • Time Overhead:

– Time to decode (or encode) as a function of k.

  • Decoding Inefficiency:

– (packets needed to decode) / k
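As a worked illustration of this ratio, write n for the number of packets a receiver needs before decoding succeeds (n is a symbol introduced here; c follows the (1 + c) k notation of the previous slide). Pairing k = 16,000 with the average inefficiency of 1.055 that later slides report for Tornado Z, purely as an example:

```latex
\[
\text{decoding inefficiency} \;=\; \frac{n}{k} \;=\; 1 + c,
\qquad
n \;=\; 1.055 \times 16{,}000 \;=\; 16{,}880 .
\]
```

Under those assumed numbers, the receiver would collect about 880 packets beyond the 16,000 that an ideal fountain would need.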

SLIDE 11

Work on Erasure Codes

  • Standard Reed-Solomon Codes

– Dense systems of linear equations.
– Poor time overhead (quadratic in k)
– Optimal decoding inefficiency of 1

  • Tornado Codes [LMSSS ‘97]

– Sparse systems of equations.
– Fast encoding and decoding (linear in k)
– Suboptimal decoding inefficiency

SLIDE 12

Tornado Z: Encoding Structure

[Figure: layered graph with k = 16,000 source-data nodes followed by redundancy nodes, joined by two irregular bipartite graphs. Stretch factor = 2.]

SLIDE 13

Encoding/Decoding Process

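[Figure: bipartite graph joining source nodes a, b, c, d, e, f, g, h to four redundancy nodes, each the XOR of its neighbors:]

a ⊕ b ⊕ f
a ⊕ b ⊕ c ⊕ d ⊕ g
c ⊕ e ⊕ g ⊕ h
b ⊕ d ⊕ e ⊕ f ⊕ g ⊕ h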
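The XOR relations on this slide can be exercised with a small sketch. The four check equations below are read off the slide's diagram. The function names, the use of integers as packet contents, and the simple "peeling" decode loop (repeatedly use any redundancy node with exactly one missing neighbor) are assumptions for illustration, not the authors' implementation.

```python
from functools import reduce
from operator import xor

# Check equations from the slide: each redundancy node is the XOR of the
# listed source nodes.
CHECKS = [
    ("a", "b", "f"),
    ("a", "b", "c", "d", "g"),
    ("c", "e", "g", "h"),
    ("b", "d", "e", "f", "g", "h"),
]


def encode(source):
    """Return one redundancy value per check equation."""
    return [reduce(xor, (source[name] for name in check)) for check in CHECKS]


def decode(received, redundancy):
    """Recover erased source nodes where possible.

    received   maps node name -> value for source packets that arrived.
    redundancy maps check index -> value for redundancy packets that arrived.
    Returns every source value that is known once no check can make progress.
    """
    known = dict(received)
    progress = True
    while progress:
        progress = False
        for index, value in redundancy.items():
            missing = [name for name in CHECKS[index] if name not in known]
            if len(missing) == 1:
                # XOR out the known neighbors; what remains is the missing one.
                for name in CHECKS[index]:
                    if name in known:
                        value ^= known[name]
                known[missing[0]] = value
                progress = True
    return known


source = {name: ord(name) for name in "abcdefgh"}
redundancy = dict(enumerate(encode(source)))

# Erase three source packets; all four redundancy packets arrive.
received = {name: v for name, v in source.items() if name not in "cef"}
print(decode(received, redundancy) == source)
```

Each recovery step costs only as many XORs as the check has neighbors, which is why sparse graphs give the fast decoding the earlier slide attributes to Tornado codes.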

SLIDE 14

Timing Comparison

Encoding time, 1K packets

Size     Reed-Solomon   Tornado Z
250 K    4.6 sec.       0.11 sec.
500 K    19 sec.        0.18 sec.
1 MB     93 sec.        0.29 sec.
2 MB     442 sec.       0.57 sec.
4 MB     30 min.        1.01 sec.
8 MB     2 hrs.         1.99 sec.
16 MB    8 hrs.         3.93 sec.

Decoding time, 1K packets

Size     Reed-Solomon   Tornado Z
250 K    2.06 sec.      0.18 sec.
500 K    8.4 sec.       0.24 sec.
1 MB     40.5 sec.      0.31 sec.
2 MB     199 sec.       0.44 sec.
4 MB     13 min.        0.74 sec.
8 MB     1 hr.          1.28 sec.
16 MB    4 hrs.         2.27 sec.

Tornado Z: Average inefficiency = 1.055
Both codes: Stretch factor = 2
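The growth rates stated earlier (quadratic in k for Reed-Solomon, linear in k for Tornado) can be checked against the encoding times on this slide. Each file size is double the previous one, so a quadratic cost should roughly quadruple per step and a linear cost should roughly double. The sketch below only does that arithmetic. Converting the slide's rounded "30 min.", "2 hrs." and "8 hrs." entries to seconds is an assumption, and the helper name is made up for illustration.

```python
# Encoding times from the slide, in seconds, for 250 K, 500 K, 1 MB, ..., 16 MB.
# The last three Reed-Solomon values are the slide's rounded entries
# ("30 min.", "2 hrs.", "8 hrs.") converted to seconds.
RS_ENCODE = [4.6, 19, 93, 442, 30 * 60, 2 * 3600, 8 * 3600]
TORNADO_ENCODE = [0.11, 0.18, 0.29, 0.57, 1.01, 1.99, 3.93]


def doubling_ratios(times):
    """Ratio of each time to the one before it (file size doubles each step)."""
    return [later / earlier for earlier, later in zip(times, times[1:])]


print("Reed-Solomon:", [round(r, 2) for r in doubling_ratios(RS_ENCODE)])
print("Tornado Z:   ", [round(r, 2) for r in doubling_ratios(TORNADO_ENCODE)])
```

The Reed-Solomon ratios stay at 4 or a little above, while the Tornado Z ratios stay between about 1.6 and 2, consistent with the quadratic and linear claims.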

SLIDE 15

Cyclic Interleaving

[Figure: File → Blocks → Encoded Blocks → Interleaved Encoding; the transmission repeats the encoding (Encoding Copy 1, Encoding Copy 2). The figure also carries the label "Tornado Encoding".]

SLIDE 16

Cyclic Interleaving: Drawbacks

  • The Coupon Collector’s Problem

– Waiting for packets from the last blocks
– More blocks: faster decoding, larger inefficiency

[Figure labels: T, B blocks]
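The coupon-collector effect can be seen in a small simulation. This is a sketch under simplifying assumptions that are not taken from the slides: each arriving packet belongs to a uniformly random block, a block decodes once it has received as many packets as it has source packets (as with an optimal per-block code), packets within a block are never duplicated, and packets for already complete blocks are wasted. All function and parameter names are hypothetical.

```python
import random


def interleaved_inefficiency(num_blocks, block_size, rng):
    """Packets received until every block is decodable, divided by k.

    k = num_blocks * block_size is the number of source packets in the file.
    """
    counts = [0] * num_blocks
    incomplete = num_blocks
    received = 0
    while incomplete:
        block = rng.randrange(num_blocks)  # block the next arriving packet belongs to
        received += 1
        counts[block] += 1  # counts past block_size are wasted packets
        if counts[block] == block_size:
            incomplete -= 1
    return received / (num_blocks * block_size)


def average_inefficiency(num_blocks, block_size, trials=200, seed=0):
    """Mean inefficiency over independent simulated receivers."""
    rng = random.Random(seed)
    total = sum(
        interleaved_inefficiency(num_blocks, block_size, rng) for _ in range(trials)
    )
    return total / trials


for blocks in (1, 5, 50):
    print(blocks, round(average_inefficiency(blocks, block_size=20), 3))
```

With a single block every packet is useful, so the inefficiency is exactly 1. As the number of blocks grows, the receiver spends more time waiting for the last few blocks to fill, which is the drawback the slide names.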

SLIDE 17

Scalability over File Size

[Figure: Decoding Inefficiency, 500 Receivers, p = 0.1. x-axis: File Size, KB (100 to 10000); y-axis: Decoding Inefficiency (1 to 1.6). Series: Interleaved, T = 20 (Max., Avg.); Interleaved, T = 50 (Max., Avg.); Tornado Z (Max., Avg.).]

SLIDE 18

Scalability over Receivers

[Figure: Decoding Inefficiency on a 1MB File, p = 0.1. x-axis: Receivers (1 to 10000); y-axis: Decoding Inefficiency (1 to 1.8). Series: Interleaved, T = 20; Interleaved, T = 50; Tornado Z.]

SLIDE 19

Digital Fountain Prototype

  • Built on top of IP Multicast.
  • Tolerating heterogeneity:

– Layered multicast
– Congestion control [VRC ‘98]

  • Experimental results over MBONE.
SLIDE 20

Research Directions

  • Other applications for digital fountains

– Dispersity routing
– Accessing data from multiple mirror sites in parallel

  • Improving the codes
  • Implementation and deployment

– Scale to large number of clients
– Network interactions