The Power of Prediction: Cloud Bandwidth and Cost Reduction


slide-1
SLIDE 1

The Power of Prediction: Cloud Bandwidth and Cost Reduction

Eyal Zohar, Israel Cidon (Technion)
Osnat (Ossi) Mokryn (Tel-Aviv College)

slide-2
SLIDE 2

Traffic Redundancy Elimination (TRE)

Traffic redundancy stems from downloading the same or similar information items. We found around 70% redundancy in end-client traffic when compared with past traffic and local files.

  • SIGCOMM 2011
slide-3
SLIDE 3

TRE Importance

Moving to the cloud => higher e2e traffic. Cloud users pay for traffic used in practice => incentive to use TRE.

[Figure: cloud provider, cloud user (pay for use), and end-user application, with TRE applied to the cloud traffic.]
slide-4
SLIDE 4

How TRE Works

The server parses the outgoing byte stream into content-based chunks and signs each chunk with SHA-1.

[Figure: a byte stream is cut at Anchors 1 to 4, located with a rolling hash, into Chunks 1 to 3; each chunk gets a SHA-1 signature (Sign. 1 to Sign. 3). Insertion example: new bytes turn Chunk 2 into Chunk 2', while Chunks 1 and 3 are unchanged.]
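The chunking and signing step can be sketched as follows. This is a minimal illustration, not the scheme of any particular TRE system: the rolling hash (a plain sum over a sliding window), the window size, and the anchor mask are hypothetical choices made to keep the example short. It shows why content-based anchors localize the effect of an insertion: chunks away from the new bytes keep their SHA-1 signatures.

```python
import hashlib
import random

WINDOW = 16         # hypothetical rolling-hash window, in bytes
ANCHOR_MASK = 0x3F  # hypothetical: anchor where the low 6 hash bits are all 1


def chunk(stream: bytes) -> list[bytes]:
    """Cut the stream after every byte where the windowed hash hits the mask."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(stream):
        rolling += byte                    # toy rolling hash: sum of the window
        if i >= WINDOW:
            rolling -= stream[i - WINDOW]  # drop the byte that left the window
        if i + 1 >= WINDOW and (rolling & ANCHOR_MASK) == ANCHOR_MASK:
            chunks.append(stream[start:i + 1])
            start = i + 1
    if start < len(stream):
        chunks.append(stream[start:])      # trailing bytes after the last anchor
    return chunks


def signatures(stream: bytes) -> set[str]:
    """SHA-1 signature of every chunk in the stream."""
    return {hashlib.sha1(c).hexdigest() for c in chunk(stream)}


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(4096))
modified = original[:2000] + b"NEW BYTES" + original[2000:]  # insertion example

# Chunks of the modified stream whose signatures were never seen before.
changed = signatures(modified) - signatures(original)
print(f"{len(changed)} of {len(chunk(modified))} chunks carry new signatures")
```

Because the hash depends only on the last `WINDOW` bytes, anchors realign shortly after the inserted bytes, so only the chunks around the insertion point change.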
slide-5
SLIDE 5

Problems in Existing Solutions

In the cloud environment:

  1. High processing costs in the cloud.
  2. Scalability – the server must remember each client.
  3. Elasticity – unaware of data from other sources.
  4. Do not handle long-term repeats (days/weeks).

[Figure: one receiver connected to Server 1 and Server 2.]
slide-6
SLIDE 6

Our Solution: PACK (Predictive ACK)

  • Redundancy is detected by the client.
  • Repeats appear in chains.
  • The client tries to match incoming chunks with a previously received chain or a local file.
  • The client sends the server predictions of the future data.
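A minimal sketch of the chain idea, assuming a hypothetical in-memory store that maps each chunk signature to the signature that followed it in earlier traffic. The class and method names are illustrative and are not taken from the PACK implementation.

```python
import hashlib
from typing import Optional


class ChainStore:
    """Hypothetical client-side store of previously received chunk chains."""

    def __init__(self) -> None:
        self.next_sig: dict[str, str] = {}    # signature -> signature that followed it
        self.last_sig: Optional[str] = None   # signature of the most recent chunk

    def record(self, chunk: bytes) -> str:
        """Remember a received chunk and link it to the chunk before it."""
        sig = hashlib.sha1(chunk).hexdigest()
        if self.last_sig is not None:
            self.next_sig[self.last_sig] = sig
        self.last_sig = sig
        return sig

    def predict(self, chunk: bytes, depth: int = 2) -> list[str]:
        """If the chunk is known, return the signatures expected to follow it."""
        predictions: list[str] = []
        sig = hashlib.sha1(chunk).hexdigest()
        while sig in self.next_sig and len(predictions) < depth:
            sig = self.next_sig[sig]
            predictions.append(sig)
        return predictions


store = ChainStore()
for c in (b"chunk-1", b"chunk-2", b"chunk-3"):
    store.record(c)                    # first download stores the chain 1 -> 2 -> 3

predicted = store.predict(b"chunk-1")  # a later download starts with chunk 1 again
```

Here `predicted` holds the signatures of chunks 2 and 3, which the client would offer to the server as its prediction of the data to come.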
slide-7
SLIDE 7

PACK: The Client Prediction

Each prediction:

  1. TCP seq. – no server parsing
  2. Hint – spares unnecessary SHA-1
  3. SHA-1 signature

[Figure: the received stream chunks are matched against a stored chain of chunks (Sign. 1 to Sign. 3); the prediction for a chunk carries its TCP seq., a last-byte hint, and its SHA-1 signature.]
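One way to picture the three fields of a prediction is the sketch below. The `Prediction` dataclass, its field names, and `make_prediction` are assumptions made for illustration; the slides do not give the wire format. The `length` field in particular is an added assumption, included so that the receiver of the prediction knows which byte range it covers. The hint is taken to be the last byte of the predicted chunk, as the figure labels suggest.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    tcp_seq: int      # where the predicted chunk starts in the TCP stream
    length: int       # assumed field: length of the predicted chunk in bytes
    hint: int         # last byte of the predicted chunk (cheap pre-check)
    signature: bytes  # SHA-1 digest of the predicted chunk


def make_prediction(stored_chunk: bytes, next_seq: int) -> Prediction:
    """Build a prediction for data the client expects at TCP offset next_seq."""
    return Prediction(
        tcp_seq=next_seq,
        length=len(stored_chunk),
        hint=stored_chunk[-1],
        signature=hashlib.sha1(stored_chunk).digest(),
    )


# The client found this chunk in its store and expects it next in the stream.
p = make_prediction(b"previously received chunk", next_seq=1000)
```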

slide-8
SLIDE 8

PACK: Server Operation

The server compares the hint with the last byte of the data to sign. Upon a hint match it performs the expensive SHA-1. PACK saves the cloud's computational effort in the absence of redundancy.

First receiver-based TRE: the server does not parse. It signs with >99% confidence.

[Figure: message exchange between client and server. The client looks up a chain (chunks 1, 2, 3) in local storage, sends the prediction "2,3?", and the server confirms it.]
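The server-side check can be sketched as follows, assuming the server holds its outgoing bytes in a buffer indexed by TCP sequence offset. The function name, its parameters, and the buffer model are hypothetical; only the order of the checks (cheap hint comparison first, SHA-1 only on a hint match) comes from the slide.

```python
import hashlib


def check_prediction(outgoing: bytes, base_seq: int, tcp_seq: int,
                     length: int, hint: int, signature: bytes) -> bool:
    """Return True if the predicted chunk matches the data about to be sent."""
    start = tcp_seq - base_seq
    end = start + length
    if start < 0 or length <= 0 or end > len(outgoing):
        return False                   # prediction falls outside the buffered data
    if outgoing[end - 1] != hint:
        return False                   # last-byte hint differs: skip the SHA-1
    return hashlib.sha1(outgoing[start:end]).digest() == signature


data = b"header|previously received chunk|trailer"
known = b"previously received chunk"
ok = check_prediction(
    data,
    base_seq=5000,
    tcp_seq=5000 + data.index(known),
    length=len(known),
    hint=known[-1],
    signature=hashlib.sha1(known).digest(),
)
```

When the data is not redundant, most predictions are rejected by the one-byte comparison, so the server rarely pays for a SHA-1 computation that would fail.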
slide-9
SLIDE 9

PACK Benefits

Minimizes processing costs induced by TRE.

– Signs with SHA-1 only in the presence of redundancy.

Receiver-based end-to-end TRE => suitable for cloud server elasticity and client mobility.

– Does not require the server to continuously maintain clients’ status.

slide-10
SLIDE 10

Server Effort Experiment

Several data-sets in 3 modes: baseline no-TRE, PACK and a sender-based TRE.

[Figure: single-server cloud operational cost (100% = without TRE system; axis 0% to 140%) versus redundancy elimination ratio (0% to 50%), for PACK and an EndRE-like sender-based TRE.]

25%-30% redundancy: common to many data-sets

slide-11
SLIDE 11

YouTube Redundancy

Traces of 40k clients, captured at an ISP. Found 30% end-to-end (personal) redundancy.

[Figure: all YouTube traffic (Gbps, 0.0 to 3.0) and PACK TRE removed redundancy (0% to 35%) over time (24 hours).]
slide-12
SLIDE 12

Long-Term TRE

Social network: eliminated 30% with a one-hour cache and 75% with a long-term cache.

[Figure: average redundancy of daily traffic (0% to 80%) versus days since start (1 to 33), for caches of 1 hour, 24 hours, and unlimited duration.]

slide-13
SLIDE 13

Cloud Email Redundancy

Gmail account with 1,000 Inbox messages. Found 32% static redundancy (higher when messages are read multiple times).

[Figure: traffic volume per month (MB, 50 to 300), January to December, split into redundant and non-redundant traffic.]
slide-14
SLIDE 14

Implementation

Linux with Netfilter Queue, 25k lines of C and Java, available for download. Receiver-sender protocol is embedded in the TCP Options field. Transparent use at both sides.

slide-15
SLIDE 15

Processing Effort in the Client

  • Laptop experiment: PACK-related CPU consumption is ~4% when playing HD video (9 Mbps with 30% redundancy).
  • Smartphone experiment: PACK consumes ~3% of the battery power when processing 1 GB of video (avg. monthly data plan).
  • Virtual traffic saves the client the need to chunk or sign.

slide-16
SLIDE 16

New Chunking Algorithm

  • Most existing solutions use Rabin fingerprint.
slide-17
SLIDE 17

New Chunking Algorithm

[Figure: a 64-bit value aligned against stream bytes n, n-1, ..., n-47, with Mask = 00 00 8A 31 10 58 30 80.]
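The figure gives the 64-bit width, the byte positions n down to n-47, and the mask, but not the update rule. The sketch below assumes a shift-left-by-one-bit and XOR update on a 64-bit value, with an anchor declared when all mask bits are set; treat that rule, and the names `find_anchors`, `WINDOW`, and `UINT64`, as assumptions rather than as the slide's content. Under this rule the byte at position n-k lands k bits higher, so only the last 48 bytes can reach the mask's highest bit (bit 47), and the 13 set bits of the mask give roughly one anchor per 2^13 bytes on random data.

```python
import random

MASK = 0x00008A3110583080  # mask from the slide: 13 bits set, highest is bit 47
WINDOW = 48                # bytes n .. n-47 can influence the masked bits
UINT64 = (1 << 64) - 1     # keep the running value within 64 bits


def find_anchors(stream: bytes) -> list[int]:
    """Return the offsets just past each anchor byte."""
    anchors, value = [], 0
    for i, byte in enumerate(stream):
        value = ((value << 1) & UINT64) ^ byte  # assumed update: shift, then XOR
        if i + 1 >= WINDOW and (value & MASK) == MASK:
            anchors.append(i + 1)
    return anchors


rng = random.Random(1)
data = bytes(rng.getrandbits(8) for _ in range(1 << 16))
anchors = find_anchors(data)  # chunk boundaries for 64 KB of random bytes
```

Compared with a Rabin fingerprint, an update of this kind needs only a shift, an XOR, and a mask test per byte, which is one plausible reason for preferring it on the client.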

slide-18
SLIDE 18

Summary

  • Current TRE solutions may not reduce cloud cost.
  • PACK is the first receiver-based TRE – leverages the power of prediction.
  • Minimizes processing costs induced by TRE.
  • Suitable for cloud server migration and client mobility.
  • Implementation is available for download.