
GlobeTP: Template-Based Database Replication for Scalable Web Applications

Tobias Groothuyse, Swaminathan Sivasubramanian, and Guillaume Pierre. In Proceedings of WWW 2007, May 8-12, 2007, Banff, Alberta, Canada.


  1. GlobeTP: Template-Based Database Replication for Scalable Web Applications
     Tobias Groothuyse, Swaminathan Sivasubramanian, and Guillaume Pierre. In Proceedings of WWW 2007, May 8-12, 2007, Banff, Alberta, Canada.
     Dina Adel Said, dsaid@vt.edu

  2. Problem Definition
     • How to provide a scalable infrastructure for hosting dynamically generated web content?
     • Past solutions:
       1. Cache generated pages.
       2. Distribute the computation across multiple application servers.
       3. Cache the results of DB queries.
     • Problem: the bottleneck resides in the throughput of the origin DB.

  3. Problem Definition (cont.)
     • Solution: use DB replication.
     • Problem: it does not scale linearly because all update, delete, and insert (UDI) queries must be performed on every DB replica.
     • Past solutions:
       1. Increase the throughput of each individual server.
       2. Partial replication.

  4. Partial Replication
     • Past solutions:
       – Relying on the application programmer: Gao et al. [2003].
       – GlobeDB: Sivasubramanian et al. [2005].
         ∗ Record-level replication granularity.
         ∗ Provides excellent query latency.
         ∗ A central server maintains all the updates, then sends batch updates to the other servers.
         ∗ Does not improve throughput because the central server becomes a bottleneck.

  5. GlobeTP: Template-Based Solution
     • The queries issued by a web application typically belong to a small number of query templates.
     • Query template: a parameterized SQL query whose parameters are passed at run time.
     • Knowing these templates, table placements are selected to ensure maximum throughput and reasonable latency.
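To make the notion of a query template concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `book` table, its columns, and the template IDs `Q1` and `U1` are hypothetical and do not come from the paper.

```python
import sqlite3

# Hypothetical templates: the SQL text is fixed, only the parameter
# values change at run time.
TEMPLATES = {
    "Q1": "SELECT title FROM book WHERE id = ?",             # read template
    "U1": "UPDATE book SET stock = stock - ? WHERE id = ?",  # UDI template
}

def run(conn, qid, params):
    """Execute the template identified by qid with run-time parameters."""
    return conn.execute(TEMPLATES[qid], params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, stock INTEGER)")
conn.execute("INSERT INTO book VALUES (1, 'Example', 10)")

run(conn, "U1", (3, 1))          # UDI query: decrement stock of book 1 by 3
titles = run(conn, "Q1", (1,))   # read query: fetch the title of book 1
stock = conn.execute("SELECT stock FROM book WHERE id = 1").fetchone()[0]
```

Because every query is an instance of a known template, the set of tables each query can touch is known before the application runs.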

  6. Models
     • Application Model:
       – The application programmer is required to specify the application templates explicitly.
     • System Model:

  7. Main problems to consider
     1. Cluster identification: ensure that, for every query template, the table placement leaves at least one server able to execute it.
     2. Consider all the defined templates, read or UDI, and determine the placement that provides the maximum throughput.
     3. Define a load-balancing algorithm that distributes read queries efficiently.

  8. Data Placement: Cluster Identification
     • Goal: determine the sets of tables that need to be replicated together so that the templates function correctly, while minimizing the number of servers that must execute each UDI query.
     • Characterize each query template by:
       1. Whether it is a read or a UDI query.
       2. The set of tables that it accesses.
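The characterization above can be sketched as follows. This is one plausible reading, not the paper's algorithm: it assumes a read template can run only on a server holding all of its tables, and that a UDI template must run on every server holding any table it touches. The template IDs, table names, and the placement are hypothetical.

```python
# Hypothetical templates: qid -> (is_udi, tables accessed).
TEMPLATES = {
    "Q1": (False, frozenset({"book"})),
    "Q2": (False, frozenset({"book", "author"})),
    "U1": (True, frozenset({"book"})),
}

# Hypothetical placement: server -> tables it holds.
PLACEMENT = {
    "s1": {"book", "author"},
    "s2": {"book"},
    "s3": {"orders"},
}

def read_clusters(templates):
    """Distinct table sets that some read template needs on a single server."""
    clusters = {tables for is_udi, tables in templates.values() if not is_udi}
    return sorted(clusters, key=sorted)

def servers_for(qid, templates, placement):
    """Servers that can run a read template, or must apply a UDI template."""
    is_udi, tables = templates[qid]
    if is_udi:
        # Assumption: every replica of a touched table applies the update.
        return sorted(s for s, held in placement.items() if held & tables)
    # Assumption: a read needs all of its tables on the same server.
    return sorted(s for s, held in placement.items() if tables <= held)
```

With this placement, `Q2` can run only on `s1`, while `U1` must be applied on both `s1` and `s2`, which is the cost that cluster identification tries to keep low.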

  9. Data Placement: Load Analysis
     • Determines the load received by each cluster.
     • The load on table clusters depends on:
       – Whether the query is a read or a UDI query.
       – The frequency of template occurrence.
       – The computational complexity of executing the query:
         ∗ Use DB system tools to estimate the actual execution time.
         ∗ Run the query in a live system.
     • Determines the load on DB servers (read or UDI query).
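A minimal sketch of how these factors could combine into a per-server load estimate. It assumes that a template's work is its frequency times its per-query cost, that read work is split evenly over the servers able to run it, and that UDI work is paid in full by every server holding a touched table. These modeling choices and all numbers are illustrative assumptions, not taken from the paper.

```python
def server_loads(templates, placement):
    """Estimate the load of each server under a given table placement."""
    loads = {server: 0.0 for server in placement}
    for t in templates:
        work = t["freq"] * t["cost"]  # total work generated by this template
        if t["udi"]:
            # Assumption: each replica of a touched table applies every update.
            targets = [s for s, held in placement.items() if held & t["tables"]]
            share = work
        else:
            # Assumption: reads are spread evenly over capable servers.
            targets = [s for s, held in placement.items() if t["tables"] <= held]
            share = work / len(targets) if targets else 0.0
        for s in targets:
            loads[s] += share
    return loads

# Hypothetical workload and placement.
templates = [
    {"udi": False, "tables": {"book"}, "freq": 100, "cost": 2.0},
    {"udi": True, "tables": {"book"}, "freq": 10, "cost": 5.0},
    {"udi": False, "tables": {"orders"}, "freq": 20, "cost": 1.0},
]
placement = {"s1": {"book"}, "s2": {"book"}, "s3": {"orders"}}
loads = server_loads(templates, placement)
```

In this example the read work on `book` is halved by replication, but the UDI work is not, which illustrates why replicating update-heavy tables widely does not pay off.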

  10. Data Placement: Cluster Placement
     • Determines the placement of the clusters across the set of DB servers such that the load on each replica is minimized.
     • Exhaustive search costs $O(2^{N \cdot T} / N!)$, where $T$ is the number of tables and $N$ is the number of nodes.
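To give a feel for how quickly exhaustive search becomes impractical, the short calculation below evaluates the bound for a few sizes. It assumes the garbled expression on the slide reads 2^(N·T) / N!; the function name and the sample sizes are hypothetical.

```python
from math import factorial

def placement_search_space(num_tables, num_nodes):
    """Evaluate 2^(N*T) / N! for T tables and N nodes (assumed reading of the bound)."""
    return 2 ** (num_nodes * num_tables) / factorial(num_nodes)

# Growth of the search space for a few hypothetical sizes.
sizes = {(t, n): placement_search_space(t, n) for t, n in [(2, 2), (3, 2), (10, 4)]}
```

Even ten tables on four nodes give tens of billions of candidate placements under this reading, so a heuristic search is the practical choice.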

  11. Query Routing
     • Round Robin (RR): efficient if all incoming queries have the same cost.
     • RR-QID: RR by query ID
       – Each query template is identified by its QID.
       – Each QID has a queue associated with the set of DB servers that can serve it.
       – Queries in each queue are dispatched in RR fashion.
     • Cost-based routing
       – When a query arrives, the query router estimates the current load on each DB server.
       – The query is scheduled on the least-loaded DB server that can serve it.
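The two template-aware policies can be sketched as below. The class names, the way load is estimated (the summed cost of queries still outstanding on a server), and the sample server sets are assumptions made for illustration, not the paper's implementation.

```python
from itertools import cycle

class RRQIDRouter:
    """Round robin per query ID: one rotating queue of capable servers per QID."""

    def __init__(self, servers_by_qid):
        self._queues = {qid: cycle(servers) for qid, servers in servers_by_qid.items()}

    def route(self, qid):
        return next(self._queues[qid])

class CostBasedRouter:
    """Send each query to the least-loaded server able to serve its QID."""

    def __init__(self, servers_by_qid, cost_by_qid):
        self.servers_by_qid = servers_by_qid
        self.cost_by_qid = cost_by_qid
        # Assumption: load is the summed cost of outstanding queries.
        self.load = {s: 0.0 for servers in servers_by_qid.values() for s in servers}

    def route(self, qid):
        server = min(self.servers_by_qid[qid], key=lambda s: self.load[s])
        self.load[server] += self.cost_by_qid[qid]
        return server

    def done(self, qid, server):
        """Call when a query finishes so the load estimate drops again."""
        self.load[server] -= self.cost_by_qid[qid]

servers = {"Q1": ["s1", "s2"], "Q2": ["s2"]}
rr = RRQIDRouter(servers)
cb = CostBasedRouter(servers, {"Q1": 1.0, "Q2": 5.0})
```

Plain RR-QID ignores that an expensive `Q2` may already occupy `s2`, whereas the cost-based router steers subsequent `Q1` queries to `s1` until the loads even out.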

  12. Experiments
     • Compare GlobeTP with full DB replication using:
       – TPC-W: a standard e-commerce benchmark.
       – RUBBoS: a bulletin-board benchmark modeled after slashdot.org.

  13. Experiments (cont.)
     • Query latency distributions using 4 servers.

  14. Experiments (cont.)
     • Maximum achievable throughputs with 90% of queries processed within 100 ms.
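The throughput criterion above can be expressed as a small check over measured latencies. The function names, the inclusive treatment of the 100 ms limit, and the sample measurements are assumptions for illustration only; they are not results from the paper.

```python
def meets_target(latencies_ms, limit_ms=100.0, fraction=0.9):
    """True if at least `fraction` of the queries finished within `limit_ms`."""
    if not latencies_ms:
        return False
    within = sum(1 for latency in latencies_ms if latency <= limit_ms)
    return within >= fraction * len(latencies_ms)

def max_throughput(runs):
    """Highest offered rate whose latencies still meet the target, or None."""
    passing = [rate for rate, latencies in runs.items() if meets_target(latencies)]
    return max(passing, default=None)

# Hypothetical measurements: offered rate (queries/s) -> observed latencies (ms).
runs = {
    100: [10] * 10,
    200: [20] * 9 + [150],
    300: [50] * 8 + [200, 300],
}
best = max_throughput(runs)
```

Here the 300 queries/s run fails because only 8 of its 10 queries finish in time, so the maximum achievable throughput under this criterion is the 200 queries/s run.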

  15. Advantages
     • Easily coupled with a distributed DB query cache.
     • Does not require any modification of the application itself.

  16. Disadvantages
     • Does not support transactions; however, they can be implemented through the query router.
     • Limited by the table-level granularity of partial replication.
     • Fault tolerance issues.
     • Does not take into consideration the long-term load variations that must be expected when operating a popular dynamic web site.

  17. References
     Lei Gao, Mike Dahlin, Amol Nayate, Jiandan Zheng, and Arun Iyengar. Application specific data replication for edge services. In WWW ’03: Proceedings of the 12th International Conference on World Wide Web, 449–460, Budapest, Hungary. 2003. ISBN 1-58113-680-3.
     Swaminathan Sivasubramanian, Gustavo Alonso, Guillaume Pierre, and Maarten van Steen. GlobeDB: autonomic data replication for web applications. In WWW ’05: Proceedings of the 14th International Conference on World Wide Web, 33–42, Chiba, Japan. 2005. ISBN 1-59593-046-9.

  18. Thank you
     dsaid@vt.edu
