Flash Crowds in an Open CDN, IMC 2011 (Short Paper), Patrick Wendell

SLIDE 1

Going Viral: Flash Crowds in an Open CDN

Patrick Wendell, U.C. Berkeley
Michael J. Freedman, Princeton University

IMC 2011 (Short Paper)

SLIDE 2

What is a Flash Crowd?

  • “Slashdot Effect”, “Going Viral”
  • Exponential surge in request rate

(precisely defined in paper)

SLIDE 3

Key Questions

  • What are primary drivers of flash crowds?
  • How effective is cache cooperation during crowds against CDNs?
  • How quickly do we need to provision resources to meet crowd traffic?

SLIDE 4

CoralCDN

  • Network of ~300 distributed caching proxies

(Diagram: HTTP Clients, CoralCDN Proxies, Origin Server)

SLIDE 5

CoralCDN

  • Network of ~300 distributed caching proxies

(Diagram: HTTP Clients, CoralCDN Proxies, Origin Server)

  • 1. Local cache
  • 2. Peer cache
  • 3. Origin fetch
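
The three-step lookup order above can be sketched in a few lines of Python. This is a minimal illustration, not CoralCDN's actual implementation: the `Proxy` class, its `peers` list, and the `fetch_origin` callback are all hypothetical names, and scanning every peer is a simplification (a real deployment would more likely consult an index to locate a peer holding the object).

```python
from typing import Callable, Dict, List, Optional, Tuple


class Proxy:
    """Hypothetical caching proxy used only to illustrate the lookup order."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, bytes] = {}   # assumed unbounded, no expiry
        self.peers: List["Proxy"] = []

    def lookup_local(self, url: str) -> Optional[bytes]:
        return self.cache.get(url)

    def get(self, url: str, fetch_origin: Callable[[str], bytes]) -> Tuple[bytes, str]:
        # 1. Local cache
        body = self.lookup_local(url)
        if body is not None:
            return body, "local"
        # 2. Peer cache (simplified: ask each peer in turn)
        for peer in self.peers:
            body = peer.lookup_local(url)
            if body is not None:
                self.cache[url] = body
                return body, "peer"
        # 3. Origin fetch
        body = fetch_origin(url)
        self.cache[url] = body
        return body, "origin"


a, b = Proxy("a"), Proxy("b")
a.peers, b.peers = [b], [a]
origin = lambda url: b"content of " + url.encode()

print(a.get("http://example.com/x", origin)[1])  # origin
print(b.get("http://example.com/x", origin)[1])  # peer
print(b.get("http://example.com/x", origin)[1])  # local
```

Only the first request reaches the origin; the second is served from a peer and the third from the local cache.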
SLIDE 6

The Data

  • Complete CoralCDN trace over 4 years
  • 33 Billion HTTP requests
  • Per-request logging

– <Time, URL, client IP, proxy IP, content cached?, ...>

SLIDE 7

Finding Crowds

Source Data: 33 Billion HTTP Requests
Crowd Detection: 3,553 Crowds
Pruning Misuse: 2,501 Crowds

SLIDE 8

Crowd Sources

SLIDE 9

Common Referrers

Referrer           # Crowds
digg.com           123
reddit.com         20
stumbleupon.com    15
google.com         11
facebook.com       10
dugmirror.com      8
duggback.com       4
twitter.com        4

SLIDE 13

CDN Caching Strategies

SLIDE 14

Cooperation in Caching

(Diagram panels: Fully Cooperative Caching, Greedy Caching)

SLIDE 15

Benefits of Cooperation?

  • Depends how clients distribute over proxies
  • Depends how many objects a crowd contains

(Diagram: requests for one object, "GET A, GET A", vs. requests for several objects, "GET A, GET B, GET A, GET B")
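
To make the two dependencies concrete, the sketch below counts origin fetches for a request stream under each strategy. It assumes "greedy" means every proxy goes straight to the origin on a local miss and "fully cooperative" means a proxy first checks all peers; caches are unbounded and never expire. The function name and the `(proxy, URL)` request format are illustrative, not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Request = Tuple[str, str]  # (proxy name, URL); hypothetical format


def origin_fetches(requests: List[Request], cooperative: bool) -> int:
    """Count origin fetches for a request stream under one caching strategy."""
    caches: Dict[str, Set[str]] = {}
    fetches = 0
    for proxy, url in requests:
        local = caches.setdefault(proxy, set())
        if url in local:
            continue  # local hit
        # Cooperative: any other proxy holding the object can serve it.
        in_peer = cooperative and any(
            url in cached for name, cached in caches.items() if name != proxy
        )
        if not in_peer:
            fetches += 1  # miss everywhere we looked: go to the origin
        local.add(url)
    return fetches


# One object, clients spread over four proxies: cooperation helps.
spread = [("p1", "A"), ("p2", "A"), ("p3", "A"), ("p4", "A")]
print(origin_fetches(spread, cooperative=False))  # 4
print(origin_fetches(spread, cooperative=True))   # 1

# Two objects, all clients on one proxy: cooperation changes nothing.
concentrated = [("p1", "A"), ("p1", "B"), ("p1", "A"), ("p1", "B")]
print(origin_fetches(concentrated, cooperative=False))  # 2
print(origin_fetches(concentrated, cooperative=True))   # 2
```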

SLIDE 16

Clients Use Many Proxies

  • Clients globally distributed, even during crowds
  • Most caches participate in most crowds

Very few large, concentrated crowds

SLIDE 17

Crowds Contain Many Objects

URLs Per Crowd      # Crowds
[0, 10)             348
[10, 100)           708
[100, 1,000)        766
[1,000, 10,000)     548
10,000+             131

SLIDE 18

Benefits from Cooperation

(Histogram, x-axis: Absolute Hit Rate Improvement; bar values in axis order: 4%, 40%, 16%, 9%, 8%, 8%, 8%, 4%, 2%, 0%, 0%)

  • 56% of crowds: some improvement
  • 40% of crowds: major improvement

SLIDE 19

Provisioning Resources For Crowds

SLIDE 20

Examples of Resource Provisioning

  • CDN: static content

– Expand cache set for particular domain
– Ω(Seconds)

  • Cloud Computing Platform: dynamic service

– Spin up new VM instances
– Ω(Minutes)

  • If you squint, these are similar problems

SLIDE 21

Required Resource Spin-up Time

Spin-up        % Crowds Underprovisioned
10 Minutes     75%
1 Minute       50%
10 Seconds     10%

1-2 Minutes on EC2
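
Why spin-up delay matters so much can be illustrated with a toy model. This is not the paper's definition of "underprovisioned": it simply assumes capacity at step t is sized to `headroom` times the demand observed `spinup_steps` earlier, and flags a crowd if demand ever exceeds that capacity. The function name and both parameters are assumptions made for illustration.

```python
from typing import List


def underprovisioned(rates: List[float], spinup_steps: int, headroom: float = 2.0) -> bool:
    """Toy model: capacity lags demand by `spinup_steps` samples."""
    for t, demand in enumerate(rates):
        # Capacity available now was requested based on older demand.
        seen = rates[max(0, t - spinup_steps)]
        if demand > headroom * seen:
            return True
    return False


# Request rate doubling every step, as in an exponential surge.
surge = [10, 20, 40, 80, 160]
print(underprovisioned(surge, spinup_steps=1))  # False
print(underprovisioned(surge, spinup_steps=2))  # True
```

With 2x headroom, a one-step lag just keeps up with a doubling rate, while a two-step lag already falls behind.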
SLIDE 22

Conclusions

  • What are primary drivers of flash crowds?

– Aggregators and portals, but also social/search

  • How effective is cache cooperation during crowds against CDNs?

– Large benefit for 40% of crowds

  • How fast do we need to provision resources during crowds?

– Likely require sub-minute responsiveness

SLIDE 23

Questions?

cs.berkeley.edu/~pwendell

SLIDE 24

Extra Slides / Charts

SLIDE 25

Actual Spin-up Times on EC2

SLIDE 26

How Fast is Fast?

SLIDE 27

Origin Hits Saved by Cooperation

SLIDE 28

Bursty Redirection

SLIDE 29

Clients Distributed Widely

SLIDE 30

Detecting Crowds

  • 1. Rapid surge in request rate: r_{i+1} > 2 r_i for several i
  • 2. High rate of traffic relative to inferred capacity: r_max > 20 r_avg
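
The two conditions above translate fairly directly into code. The sketch below is one possible reading, not the paper's detector: it assumes "several i" means a run of consecutive doublings (default 3), that both conditions must hold, and that r_avg is a long-run baseline rate supplied by the caller. `is_flash_crowd`, `baseline_avg`, and `min_doublings` are hypothetical names.

```python
from typing import List


def longest_doubling_run(rates: List[float]) -> int:
    """Length of the longest run of consecutive samples with r[i+1] > 2*r[i]."""
    best = run = 0
    for prev, cur in zip(rates, rates[1:]):
        run = run + 1 if cur > 2 * prev else 0
        best = max(best, run)
    return best


def is_flash_crowd(
    rates: List[float],
    baseline_avg: float,       # assumed long-run average rate (r_avg)
    min_doublings: int = 3,    # assumed meaning of "several"
    peak_factor: float = 20.0,
) -> bool:
    if not rates:
        return False
    surge = longest_doubling_run(rates) >= min_doublings   # condition 1
    high = max(rates) > peak_factor * baseline_avg         # condition 2
    return surge and high


print(is_flash_crowd([5, 12, 30, 70, 150, 140], baseline_avg=4))  # True
print(is_flash_crowd([5, 6, 5, 7], baseline_avg=4))               # False
```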

SLIDE 31

Crowd Mitigation/Insurance

Content Mostly Static: Caching CDNs
Content Mostly Dynamic: Scalable Storage and Computation