The Power of Prediction: Cloud Bandwidth and Cost Reduction
Eyal Zohar, Israel Cidon, Osnat (Ossi) Mokryn
Technion, Tel-Aviv College
Traffic Redundancy Elimination (TRE)
Traffic redundancy stems from downloading the same or similar information items. We found around 70% redundancy in end-client traffic when compared with past traffic and local files.
- SIGCOMM 2011
TRE Importance
Moving to the cloud => higher e2e traffic. Cloud users pay for traffic used in practice => incentive to use TRE.
[Figure: cloud provider, cloud user (pay for use) and end-user application; TRE is applied to the cloud traffic.]
How TRE Works
[Figure: a byte stream split at Anchors 1-4 into Chunks 1-3. Anchors are found with a rolling hash; each chunk gets a SHA-1 signature (Sign. 1-3).]
The server parses the outgoing stream into content-based chunks and signs each chunk with SHA-1.
[Figure: insertion example. New bytes inserted into Chunk 2 yield Chunk 2'; Chunk 1 and Chunk 3 are unchanged.]
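The chunk-and-sign step can be sketched as follows. This is only an illustration under stated assumptions: the polynomial rolling hash, the names `split_into_chunks` and `sign_chunks`, and the constants `WINDOW`, `BASE`, `MOD` and `MASK` are hypothetical choices made for this example, not the parameters of any particular TRE system.

```python
import hashlib
from typing import List

# Illustrative parameters (assumptions, not taken from the slides).
WINDOW = 16                      # bytes covered by the rolling hash
BASE = 257                       # polynomial base
MOD = (1 << 31) - 1              # modulus for the rolling hash
MASK = 0x3F                      # anchor when the low 6 bits are zero
BASE_POW_WINDOW = pow(BASE, WINDOW, MOD)


def split_into_chunks(data: bytes) -> List[bytes]:
    """Cut data after every anchor; anchors depend only on the last WINDOW bytes."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Slide the window: add the new byte, drop the byte that left the window.
        h = (h * BASE + byte) % MOD
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * BASE_POW_WINDOW) % MOD
        # An anchor closes the current chunk.
        if i + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes form the last chunk
    return chunks


def sign_chunks(chunks: List[bytes]) -> List[str]:
    """SHA-1 signature of every chunk, as hex strings."""
    return [hashlib.sha1(c).hexdigest() for c in chunks]
```

Because each anchor depends only on the bytes inside the window, inserting new bytes changes the signatures of the chunks near the insertion point only; chunks that end before it, or start far enough after it, keep their signatures.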
Problems in Existing Solutions
In the cloud environment:
1. High processing costs in the cloud.
2. Scalability: the server must remember each client.
3. Elasticity: the server is unaware of data the client received from other sources.
4. They do not handle long-term repeats (days/weeks).
[Figure: one receiver served by Server 1 and Server 2.]
Our Solution: PACK (Predictive ACK)
Redundancy detection is done by the client.
Repeats appear in chains.
The client tries to match incoming chunks with a previously received chain or a local file.
Upon a match, it sends the server predictions of the future data.
PACK: The Client Prediction
Each prediction carries:
1. TCP seq.: no server parsing is needed.
2. Hint: spares unnecessary SHA-1 computations.
3. SHA-1 signature.
[Figure: a prediction for a chunk holds its TCP seq., a last-byte hint and its SHA-1 signature.]
[Figure: stream Chunks 1-3 with SHA-1 signatures Sign. 1-3, aligned with a stored chain of chunks; labels: Received, Prediction.]
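A minimal sketch of how a client might turn a stored chain into a prediction. The class `ChunkStore`, its methods and the `length` field are hypothetical names introduced for this illustration; the slide lists only the TCP sequence number, the hint and the SHA-1 signature, so the explicit length is an assumption.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Prediction:
    tcp_seq: int      # where the predicted chunk starts in the TCP stream
    length: int       # assumption: not listed on the slide
    hint: int         # last byte of the predicted chunk
    signature: bytes  # SHA-1 of the predicted chunk


class ChunkStore:
    """Hypothetical client-side store of previously received chunk chains."""

    def __init__(self) -> None:
        self._chunks: Dict[bytes, bytes] = {}  # signature -> chunk data
        self._next: Dict[bytes, bytes] = {}    # signature -> next signature in chain

    def remember_chain(self, chunks: List[bytes]) -> None:
        """Record non-empty chunks in the order they were received."""
        sigs = [hashlib.sha1(c).digest() for c in chunks]
        for sig, chunk in zip(sigs, chunks):
            self._chunks[sig] = chunk
        for cur, nxt in zip(sigs, sigs[1:]):
            self._next[cur] = nxt

    def predict_after(self, received: bytes, next_seq: int) -> Optional[Prediction]:
        """If the received chunk is part of a known chain, predict its successor."""
        nxt_sig = self._next.get(hashlib.sha1(received).digest())
        if nxt_sig is None:
            return None  # unknown chunk or end of chain: nothing to predict
        chunk = self._chunks[nxt_sig]
        return Prediction(next_seq, len(chunk), chunk[-1], nxt_sig)
```

The sketch predicts one chunk ahead for brevity; following `_next` repeatedly would yield predictions for a longer run of the chain.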
PACK: Server Operation
The server compares the hint with the last byte of the data to sign. Upon a hint match, it performs the expensive SHA-1 computation. PACK saves the cloud's computational effort in the absence of redundancy.
First receiver-based TRE: the server does not parse. It signs with >99% confidence.
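[Figure: client, server and the client's local storage. The client matches received chunks against the stored chain 1, 2, 3, sends the prediction "2,3?" and the server confirms it ("2,3 V").]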
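The hint-then-sign check on the server side can be sketched as below. The names `Prediction`, `Verdict` and `check_prediction`, the explicit `length` field and the flat `outgoing` buffer are assumptions made for this illustration; TCP sequence-number wraparound is ignored for simplicity.

```python
import hashlib
from typing import NamedTuple


class Prediction(NamedTuple):
    tcp_seq: int      # start of the predicted chunk in the TCP stream
    length: int       # assumption: the predicted chunk's length in bytes
    hint: int         # predicted last byte of the chunk
    signature: bytes  # predicted SHA-1 of the chunk


class Verdict(NamedTuple):
    matched: bool        # True if the outgoing data equals the prediction
    sha1_computed: bool  # True if the server had to run SHA-1


def check_prediction(outgoing: bytes, base_seq: int, pred: Prediction) -> Verdict:
    """Compare a client prediction with buffered outgoing data.

    outgoing[0] is assumed to carry TCP sequence number base_seq.
    """
    start = pred.tcp_seq - base_seq
    end = start + pred.length
    if start < 0 or pred.length <= 0 or end > len(outgoing):
        return Verdict(False, False)  # predicted range is not in the buffer
    # Cheap test first: a one-byte comparison against the hint.
    if outgoing[end - 1] != pred.hint:
        return Verdict(False, False)  # no SHA-1 spent on a likely mismatch
    # Hint matched: only now pay for the SHA-1 computation.
    digest = hashlib.sha1(outgoing[start:end]).digest()
    return Verdict(digest == pred.signature, True)
```

On a match the server could send a short acknowledgement in place of the chunk; when the hint differs, it skips the SHA-1 work entirely, which is the saving described above.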
PACK Benefits
Minimizes processing costs induced by TRE.
– Signs with SHA-1 only in the presence of redundancy.
Receiver-based end-to-end TRE => suitable for cloud server elasticity and client mobility.
– Does not require the server to continuously maintain clients’ status.
Server Effort Experiment
Several data-sets in 3 modes: baseline no-TRE, PACK and a sender-based TRE.
[Figure: single-server cloud operational cost (100% = without a TRE system) versus redundancy elimination ratio, for PACK and an EndRE-like sender-based TRE.]
25%-30% redundancy: common to many data-sets
YouTube Redundancy
[Figure: all YouTube traffic (Gbps) and the redundancy removed by PACK TRE (%) over 24 hours.]
Traces of 40k clients, captured at an ISP. Found 30% end-to-end (personal) redundancy.
Long-Term TRE
Social network: eliminated 30% with a one-hour cache and 75% with a long-term cache.
[Figure: average redundancy of daily traffic versus days since start, for cache durations of 1 hour, 24 hours and unlimited.]
Cloud Email Redundancy
Gmail account with 1,000 Inbox messages. Found 32% static redundancy (higher when messages are read multiple times).
[Figure: traffic volume per month (MB), January to December, split into redundant and non-redundant traffic.]
Implementation
Linux with Netfilter Queue, 25k lines of C and Java, available for download. Receiver-sender protocol is embedded in the TCP Options field. Transparent use at both sides.
Processing Effort in the Client
Laptop experiment: PACK-related CPU consumption is ~4% when playing HD video (9 Mbps with 30% redundancy).
Smartphone experiment: PACK consumes ~3% of the battery power when processing 1 GB of video (avg. monthly data plan).
Virtual traffic saves the client the need to chunk or sign.
New Chunking Algorithm
Most existing solutions use the Rabin fingerprint.
[Figure: a 64-bit rolling value covering stream bytes n down to n-47, tested against Mask = 00 00 8A 31 10 58 30 80.]
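One way to read the figure is a 64-bit value that is shifted left by one bit and XORed with each incoming byte, with an anchor declared when all mask bits are set. The sketch below follows that reading. The update rule and the name `find_anchors` are assumptions, since the slide shows only the 64-bit width, the 48-byte span (n to n-47) and the mask value.

```python
from typing import List

MASK = 0x00008A3110583080   # mask value shown on the slide
WINDOW = 48                 # bytes n .. n-47 in the figure
U64 = (1 << 64) - 1         # keep the rolling value within 64 bits


def find_anchors(data: bytes) -> List[int]:
    """Return the indices of bytes at which a chunk boundary (anchor) falls.

    Assumed update rule: shift left by one bit, then XOR in the new byte.
    """
    anchors = []
    longval = 0
    for i, byte in enumerate(data):
        longval = ((longval << 1) ^ byte) & U64
        # Only test once a full window of bytes has been processed.
        if i + 1 >= WINDOW and (longval & MASK) == MASK:
            anchors.append(i)
    return anchors
```

Under this reading each byte costs one shift, one XOR and one masked comparison, with no table lookups or multiplications. The highest mask bit is bit 47, so the test depends only on the last 48 bytes. The mask has 13 set bits, so on roughly uniform data an anchor would be expected about once every 2^13 = 8192 bytes.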
Summary
Current TRE solutions may not reduce cloud cost. PACK is the first receiver-based TRE – leverages the power of prediction. Minimizes processing costs induced by TRE. Suitable for cloud server migration and client mobility. Implementation is available for download.