Selective Early Request Termination for Busy Internet Services

Selective Early Request Termination for Busy Internet Services. Jingyu Zhou and Tao Yang, Ask.com and University of California, Santa Barbara


  1. Selective Early Request Termination for Busy Internet Services
     Jingyu Zhou and Tao Yang
     Ask.com
     University of California, Santa Barbara

  2. Multi-tier Internet Services
     [Figure: architecture diagram. Components shown: firewall/Web switch, query frontends, query caches, local-area network, index servers (partitions 1, 2, and 3), doc servers]

  3. Multi-thread Programming Model for Request Processing
     • Multi-threaded service tier
       – E.g., Apache, IIS, BEA WebLogic, and Neptune
     [Figure: threads 1 through N, each performing: get request, process request, send result]

  4. Problem Statement
     • Service-level agreement
       – E.g., 99% of requests within 1 s
     • A QoS challenge to be met during
       – Flash-crowd type of high request rate
       – Size distribution shift: the percentage of long requests increases
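
The service-level agreement on this slide can be read as a simple check over measured latencies. The sketch below is only an illustration: the function name `meets_sla` and its parameters are hypothetical and do not come from the presentation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if at least required_percent of the n
 * latency samples finished within deadline_ms, else 0. Integer arithmetic
 * avoids rounding surprises at the boundary (e.g. exactly 99 of 100).
 * An empty sample set trivially passes.
 */
int meets_sla(const double *latencies_ms, size_t n,
              double deadline_ms, unsigned required_percent)
{
    assert(latencies_ms != NULL || n == 0);
    size_t within = 0;
    for (size_t i = 0; i < n; i++) {
        if (latencies_ms[i] <= deadline_ms) {
            within++;
        }
    }
    return within * 100 >= (size_t)required_percent * n;
}
```

With the slide's example, a call such as `meets_sla(samples, n, 1000.0, 99)` asks whether 99% of the sampled requests completed within one second.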

  5. Motivating Example: Size Distribution Shift
     • Settings
       – 50 requests/s
       – Two types of requests: 5 ms and 500 ms
       – Long requests vary from 0.1% to 10%
     • Results
       – Significant throughput loss
       – Order-of-magnitude increase in response time
       – Admission control alone isn’t enough
     [Figure: throughput (requests/s) and mean response time (ms) vs. percentage of long requests]
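
A back-of-the-envelope calculation shows why a small share of long requests matters. The symbols $f$ (fraction of long requests) and $\bar{s}$ (mean service demand per request) are introduced here for illustration only, and the estimate ignores queueing and concurrency effects.

$$
\bar{s}(f) = (1 - f)\cdot 5\,\text{ms} + f \cdot 500\,\text{ms}
$$

$$
\bar{s}(0.001) = 4.995 + 0.5 = 5.495\,\text{ms}, \qquad
\bar{s}(0.1) = 4.5 + 50 = 54.5\,\text{ms}
$$

At 50 requests/s, the offered work therefore grows from about $50 \times 5.495\,\text{ms} \approx 0.27$ s to $50 \times 54.5\,\text{ms} \approx 2.7$ s of processing per second of wall-clock time, roughly a tenfold increase, even though only one request in ten is long.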

  6. Current Techniques
     • Admission control
       – Response time feedback (e.g., SEDA, Quorum)
       – Bounding request queue length (e.g., Neptune)
       – Policing TCP SYN packets (e.g., [Voigt’01])
     • Adaptive service degradation
       – E.g., reduce image quality
     • Size-based scheduling
       – Only for static content
       – File size as estimator

  7. SERT Idea & Challenges
     • Idea
       – Request-aware: differentiate long and short requests
       – Early termination: abort long requests during overload
     • Challenges
       – Detect long/short dynamic requests
       – Adaptive selection of termination threshold
       – Resource accounting for safety
       – Simplicity in programming

  8. SERT Architecture
     [Figure: architecture diagram. Components shown: threshold controller, request queue, timer & terminator, thread pool, resource accounting module, termination handler, and resources (lock, file, memory, ...). Labeled interactions: set/cancel timer, terminate, resource access, invoke termination handler]

  9. Resource Accounting
     • Targets a class of requests that are
       – Read-only
       – Stateless
     • Resources
       – Memory: track heaps and memory-mapped areas
       – Locks: use an integer counter
       – Sockets & file descriptors
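
The slide says only that locks are tracked with an integer counter. One plausible reading, sketched below purely as an assumption, is a per-thread count of held locks that a terminator could consult before aborting a request. The names `tracked_lock`, `tracked_unlock`, and `safe_to_terminate` are hypothetical and are not part of SERT.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical per-thread count of mutexes currently held by this thread. */
static _Thread_local int held_locks = 0;

/* Wrapper around pthread_mutex_lock that counts successful acquisitions. */
int tracked_lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    if (rc == 0) {
        held_locks++;
    }
    return rc;
}

/* Wrapper around pthread_mutex_unlock that counts successful releases. */
int tracked_unlock(pthread_mutex_t *m)
{
    assert(held_locks > 0);
    int rc = pthread_mutex_unlock(m);
    if (rc == 0) {
        held_locks--;
    }
    return rc;
}

/* Assumed policy: a thread holding no locks can be aborted safely. */
int safe_to_terminate(void)
{
    return held_locks == 0;
}
```

Whether SERT defers termination or releases locks in its handler is not stated on the slide; the sketch only illustrates the bookkeeping.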

  10. Threshold Controller Adjusts Termination Threshold
      • Ideas
        – During light load, allow requests to execute longer: large threshold
        – During heavy load, terminate earlier: small threshold
        – Load index p is throughput loss
      • Formula
        – Threshold = LB + F(p) × (UB − LB), where the timeout range is [LB, UB] and

$$
F(p) =
\begin{cases}
1 & p \le LW \\
\left( \dfrac{HW - p}{HW - LW} \right)^{\alpha} & LW < p < HW \\
0 & p \ge HW
\end{cases}
$$
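
The threshold formula can be sketched in a few lines of C: F(p) is 1 when p is at or below LW, 0 when p is at or above HW, and ((HW − p)/(HW − LW)) raised to α in between. The function names are hypothetical, LW and HW are taken here to be low and high watermarks on the load index (the slide does not spell this out), and α is restricted to a non-negative integer to keep the example dependency-free; a real-valued α would call `pow()` from `<math.h>` instead.

```c
#include <assert.h>

/* Scaling factor F(p) for load index p between watermarks lw and hw. */
double threshold_scale(double p, double lw, double hw, unsigned alpha)
{
    assert(hw > lw);
    if (p <= lw) {
        return 1.0;               /* light load: use the full range */
    }
    if (p >= hw) {
        return 0.0;               /* heavy load: fall back to the lower bound */
    }
    double base = (hw - p) / (hw - lw);
    double result = 1.0;
    for (unsigned i = 0; i < alpha; i++) {
        result *= base;           /* integer power, an assumption of this sketch */
    }
    return result;
}

/* Threshold = LB + F(p) * (UB - LB), with the timeout range [lb, ub]. */
double termination_threshold(double p, double lb, double ub,
                             double lw, double hw, unsigned alpha)
{
    return lb + threshold_scale(p, lw, hw, alpha) * (ub - lb);
}
```

The result moves from UB down to LB as throughput loss rises from LW to HW, and a larger α makes the threshold drop faster once the load index passes LW.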

  11. Implementation & Usage
      • Intercept GLIBC/Pthread functions
        – Memory, Pthread locks, etc.
      • POSIX signal for terminations
      • Use sigsetjmp()/siglongjmp()
      • Neptune middleware uses SERT APIs
      • Applications link the SERT library with no code changes
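
The combination of a POSIX signal with sigsetjmp()/siglongjmp() can be illustrated with a small standalone sketch. It is not SERT's implementation: the names `rollback_point`, `on_terminate`, and `run_request` are hypothetical, SIGUSR1 is an arbitrary choice of signal, and `raise()` stands in for the timer thread that would deliver the signal in a real system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical rollback point shared with the signal handler. */
static sigjmp_buf rollback_point;

/* Signal handler: abandon the current request and jump back. */
static void on_terminate(int signum)
{
    (void)signum;
    siglongjmp(rollback_point, 1);
}

/* Returns 1 if the "request" was aborted by the signal, 0 if it completed. */
int run_request(int simulate_timeout)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_terminate;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    /* Second argument 1 saves the signal mask, so it is restored on return. */
    if (sigsetjmp(rollback_point, 1) != 0) {
        return 1;                 /* arrived here via siglongjmp */
    }
    if (simulate_timeout) {
        raise(SIGUSR1);           /* stand-in for the timer thread's signal */
    }
    return 0;                     /* request ran to completion */
}
```

Saving the signal mask in sigsetjmp() matters here: without it, the signal would stay blocked after the jump out of the handler and a later termination could not be delivered.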

  12. SERT APIs
      • Start timer thread and set signal type
        extern int SERT_init_timer(int signum);
      • Start & end of a request
        extern void SERT_start();
        extern void SERT_end();
      • Set timeout value and controller parameters
        extern void SERT_set_args(struct sert_arg *);
      • Set the rollback point
        extern void SERT_register_rollbackpoint(void *);

  13. A Pseudo-code Example
      void worker() {
        while (1) {
          Request *request = get_request();
          sigjmp_buf env;
          if (sigsetjmp(env, 1) == 0) {
            SERT_register_rollbackpoint(&env);
          } else {
            /* longjmp back; resources have already been deallocated */
            continue;
          }
          SERT_start();
          process_request(request);
          SERT_end();
          send_result(request);
        }
      }

  14. Experimental Settings
      • Hardware
        – 9 dual PIII 1.4 GHz machines
        – Each has 4 GB RAM, 10K RPM SCSI disk
        – Fast Ethernet
      • Applications from Ask.com
        – Index matching: find web pages containing keywords; heavy-tailed; 2.1 GB warm data in memory
        – Ranking: rank page importance; exponential; in memory

      App.          Ave. (ms)   90% (ms)   Max. (ms)
      Index Match   23.6        46         2,732
      Ranking       93          212        14,035

  15. Size Distribution Shift
      [Figure: throughput and response time (s) over time (s) for AC and SERT, plotted with the request rate; the pattern shift begins at 30 s and ends at 155 s]
      • During the shift, about 10% of requests are 500+ ms
      • SERT
        – 209.1% higher throughput
        – 54.7% response time reduction

  16. Ranking Service Evaluation
      [Figure: mean response time (ms) and throughput loss percent vs. load (%) for AC and SERT, in two panels: underloaded and overloaded]

  17. Evaluation of Threshold Controller for Ranking Service
      [Figure: throughput loss percent and response time (ms) vs. load (%); legend: 15, 3.0, 0.5, Adapt]
      • Adaptive controller vs. fixed thresholds of 0.5 s, 3.0 s, and 15 s

  18. Evaluation of Threshold Controller for Index Matching
      [Figure: throughput loss percent and response time (ms) vs. load (%); legend: 8.0, 3.0, 1.5, Adapt]
      • Adaptive controller vs. fixed thresholds of 1.5 s, 3.0 s, and 8.0 s

  19. Related Work
      • Real-time database systems [Kuo’00, Lin’90, Shu’94]
        – Higher-priority transaction aborts lower ones
        – UNDO/REDO log for recovery
      • Recoverable memory libraries
        – Recoverable virtual memory [Saty.’94], Rio Vista [Lowell’97]
        – Application modifications needed
      • Process checkpointing and rollback
        – Fault tolerance [Li’90], program replay [Srinivasan’04], and debugging [Qin’05]

  20. Conclusions
      • Contribution: an early termination scheme for busy Internet services
        – Dynamically select termination threshold
        – Safely terminate requests early
        – Provide API for multi-threaded services
      • Future work
        – Perform cooperative early termination across different nodes and tiers

  21. Questions?

  22. CDF of Response Time during Size Distribution Shift
      • E.g., completed within one second
        – SERT 81.7%
        – AC 45.3%
