StreamGlobe: Adaptive Query Processing and Optimization in Streaming P2P Environments


  1. Lehrstuhl Informatik III: Datenbanksysteme. StreamGlobe: Adaptive Query Processing and Optimization in Streaming P2P Environments. A. Kemper, R. Kuntschke, and B. Stegmaier. TU München – Fakultät für Informatik, Lehrstuhl III: Datenbanksysteme. http://www-db.in.tum.de/research/projects/StreamGlobe

  2. Outline (09/05/2006)
     - Motivation
     - StreamGlobe
     - The StreamGlobe Approach
     - Architecture Overview
     - Current and Future Research
     - Conclusion

  3. Exemplary Initial Situation
     - Network
       - Consists of peers
       - Given or grown topology
     - Data sources
       - Provide XML data streams
       - Possibly infinite streams (e.g., sensor measurements)
     - User requests
       - Continuous queries
       - Query language: XQuery
       - Registered at a peer
     - [Figure: network diagram; labels: WLAN, A, B, Request a, Request ab, Request a]

  4. General Traditional Approach
     1. Register requests
     2. Establish data transfer → Peers may connect arbitrarily
     3. Process / execute requests
     4. Routing of streams → Map streams to network
     - [Figure: network diagram; labels: A, B, Request a, Request ab, Request a]

  5. General Traditional Approach (ctd.)
     - Drawbacks
       1. Transmission of useless data
       2. Redundant transmissions
       3. Multiple request evaluation
     - Result: network congestion and processing overhead
     - [Figure: network diagram with the drawbacks marked 1, 2, 3; labels: A, B, Request a, Request ab, Request a]

  6. Why StreamGlobe?
     - Other systems / previous work (e.g., Cougar, TelegraphCQ, multicast techniques)
       - Focus on specific aspects (e.g., query optimization)
       - Tailored to specific domains
     - StreamGlobe
       - Contribution is the combination of techniques: in-network query processing combined with routing
       - Constitutes a generic infrastructure
       - Independent of domain
       - Efficient data stream transformation and distribution

  7. Outline
     - Motivation
     - StreamGlobe
     - The StreamGlobe Approach
     - Architecture Overview
     - Current and Future Research
     - Conclusion

  8. The StreamGlobe Approach
     - Intelligent routing
       - Multicast routing techniques
     - Data stream clustering
       - Push query execution into the network
       - Multi-query optimization
     - Reduce network traffic
     - Avoid redundant transmissions
     - Reduce processing cost
     - [Figure: network diagram; labels: A, B, streams ab and a, Request a, Request a, Request ab]
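
The traffic saving that slide 8 attributes to avoiding redundant transmissions can be illustrated with a small, self-contained sketch. It is not StreamGlobe code: the topology `GRAPH`, the peer names, and the helpers `shortest_path` and `link_loads` are hypothetical, and the model simply counts how many copies of one stream cross each link when every subscriber gets its own copy versus when copies are shared along common path segments, as a multicast-style scheme would do.

```python
from collections import deque


def shortest_path(graph, src, dst):
    """Breadth-first search; returns the list of peers from src to dst."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    raise ValueError(f"no path from {src} to {dst}")


def link_loads(graph, source, subscribers, share):
    """Return a dict mapping each used link to the number of stream copies on it."""
    loads = {}
    for subscriber in subscribers:
        path = shortest_path(graph, source, subscriber)
        for link in zip(path, path[1:]):
            # Shared routing sends one copy per link;
            # the naive scheme sends one copy per subscriber.
            loads[link] = 1 if share else loads.get(link, 0) + 1
    return loads


# Hypothetical topology: source S, inner peers P1..P3, subscribers U1..U3.
GRAPH = {
    "S": ["P1"],
    "P1": ["S", "P2", "P3"],
    "P2": ["P1", "U1", "U2"],
    "P3": ["P1", "U3"],
    "U1": ["P2"],
    "U2": ["P2"],
    "U3": ["P3"],
}

naive = link_loads(GRAPH, "S", ["U1", "U2", "U3"], share=False)
shared = link_loads(GRAPH, "S", ["U1", "U2", "U3"], share=True)
print(sum(naive.values()), sum(shared.values()))  # prints "9 6"
```

In this toy topology the link from `S` to `P1` carries three copies in the naive case and one in the shared case, which is the kind of redundancy the slide refers to.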

  9. Basic Concepts
     - P2P network topology
       - No arbitrary communication → Communication via transfer paths
       - No fixed P2P topology
     - Classification of peers
       - Thin-peers
       - Super-peers
     - Constitution of a super-peer backbone
       - Hierarchical organization → Speaker-peer responsible for a certain subnet

  10. StreamGlobe Peer Architecture
      - Based upon the Open Grid Services Architecture (OGSA)
      - Integration similar to OGSA-DAI or OGSA-DQP
      - Layers as grid services
      - Availability according to peer capabilities
      - Message exchange via RPC and notifications
      - Data stream transfer via direct TCP connections
      - [Figure: layered peer architecture; labels: XQuery Subscriptions, XML Data Streams, register, StreamGlobe Interface, Management, Optimization, Metadata, Query Engine, Globus Toolkit]

  11. Optimization
      - Goals
        1. Registration of arbitrary subscriptions at any peer
        2. Achieve good distribution of data streams
        3. Optimize evaluation of many subscriptions
      - Achievement
        - Pushing query execution into the network → (1) and (3)
        - Multi-query optimization → (3)
        - Early filtering of data streams or evaluation of subscriptions → (2)
        - Data stream clustering → (2)

  12. Multi-Query Optimization
      - Performed by the speaker-peer
      - Analyze subscriptions and streams
        - Common subqueries
        - Re-usability of streams
      - Based on properties of subscriptions / streams
      - Computes
        - Filters and queries
        - Data stream clustering
        - Execution locations
      - [Figure: query plan; labels: Request a, Request ab, Request a, Query a, Filter a, Filter b, Query ab]
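
The stream re-use check on slide 12 can be sketched under a strong simplifying assumption: each stream and each subscription is described only by the set of element names it carries or reads. The slides do not say which properties StreamGlobe actually compares, so the function `plan_subscription` and the stream names below are hypothetical, and the rule "pick the narrowest stream that covers the subscription" is an illustrative choice rather than the system's algorithm.

```python
def plan_subscription(needed, available_streams):
    """Pick the narrowest available stream that covers `needed`.

    needed: frozenset of element names the subscription reads.
    available_streams: dict mapping stream name -> frozenset of elements carried.
    Returns (stream_name, elements_to_filter_out), or None if no stream covers it.
    """
    candidates = [
        (len(elements), name)
        for name, elements in available_streams.items()
        if needed <= elements  # the stream carries everything the subscription needs
    ]
    if not candidates:
        return None
    _, best = min(candidates)  # fewest elements first, ties broken by name
    return best, available_streams[best] - needed


# Hypothetical streams: the original source and an already clustered stream "ab".
streams = {
    "source": frozenset("abcd"),
    "ab": frozenset("ab"),
}

print(plan_subscription(frozenset("a"), streams))   # re-uses "ab", filters out "b"
print(plan_subscription(frozenset("ab"), streams))  # re-uses "ab" unchanged
print(plan_subscription(frozenset("ac"), streams))  # falls back to "source"
```

A request for `a` is served from the existing `ab` stream plus a filter, which mirrors the "Request a" / "Request ab" example used throughout the slides.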

  13. Query Execution
      - Basic concepts
        - Streaming evaluation and push-based techniques
        - Preclude unbounded buffering by requiring window constraints
        - Extensibility by means of mobile code
      - Evaluation of subscriptions with FluX
        - Designed for streaming processing of XQuery
        - Event-based extension to XQuery
        - Usage of schema information for buffer minimization
      - → Visit my talk at the VLDB: Tomorrow, Research Session 6: XML(II)
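
How a window constraint keeps buffering bounded in a push-based operator can be shown with a generic sketch. It is not FluX and does not process XQuery; the class `SlidingWindowAverage`, its count-based window, and the sample readings are assumptions made for illustration only.

```python
from collections import deque


class SlidingWindowAverage:
    """Push-based operator: buffers at most `size` items, emits one average per push."""

    def __init__(self, size, downstream):
        # deque(maxlen=...) discards the oldest item automatically,
        # so the buffer can never grow beyond the window size.
        self.window = deque(maxlen=size)
        self.downstream = downstream

    def push(self, value):
        """Called by the upstream producer for every arriving item."""
        self.window.append(value)
        self.downstream(sum(self.window) / len(self.window))


results = []
operator = SlidingWindowAverage(size=3, downstream=results.append)
for reading in [10, 20, 30, 40]:
    operator.push(reading)

print(results)               # [10.0, 15.0, 20.0, 30.0]
print(len(operator.window))  # 3
```

The producer drives evaluation by calling `push`, and the operator's memory use stays fixed at the window size no matter how long the stream runs.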

  14. Outline
      - Motivation
      - StreamGlobe
      - The StreamGlobe Approach
      - Architecture Overview
      - Current and Future Research
      - Conclusion

  15. Current and Future Research
      - Current research
        - Optimization techniques
        - Extension of FluX
      - Future research
        - Quality-of-Service management
        - Explicit load balancing
        - Load shedding techniques
        - Construction of overlay network
        - …

  16. Conclusion: StreamGlobe
      - Exploiting in-network query processing capabilities
      - In combination with data stream clustering
      - Minimization of network traffic
      - Query execution with FluX
      - Efficient and scalable execution of subscriptions
      - Multi-query optimization
      - Parallelization and load balancing in the network

  17. Related Work
      - Aberer, Cudré-Mauroux, Datta, Despotovic, Hauswirth, Punceva, Schmidt. “P-Grid: a self-organizing structured P2P system”. SIGMOD Record 32(3), 2003
      - Arasu, Babcock, Babu, Datar, Ito, Motwani, Nishizawa, Srivastava, Thomas, Varma, Widom. “STREAM: The Stanford Stream Data Manager”. Data Engineering Bulletin 26(1), 2003
      - Carney, Cetintemel, Cherniack, Convey, Lee, Seidman, Stonebraker, Tatbul, Zdonik. “Monitoring Streams – A New Class of Data Management Applications”. VLDB 2002
      - Chandrasekaran, Cooper, Deshpande, Franklin, Hellerstein, Hong, Krishnamurthy, Madden, Raman, Reiss, Shah. “TelegraphCQ: Continuous Dataflow Processing for an Uncertain World”. CIDR 2003
      - Cherniack, Balakrishnan, Balazinska, Carney, Cetintemel, Xing, Zdonik. “Scalable Distributed Stream Processing”. CIDR 2003
      - Krämer, Seeger. “PIPES – A Public Infrastructure for Processing and Exploring Streams”. SIGMOD 2004
      - Madden, Shah, Hellerstein, Raman. “Continuously Adaptive Continuous Queries over Streams”. SIGMOD 2002
      - Sellis. “Multiple-Query Optimization”. TODS 1988
      - Yang, Garcia-Molina. “Designing a Super-Peer Network”. ICDE 2003
      - Yao, Gehrke. “The Cougar Approach to In-Network Query Processing in Sensor Networks”. SIGMOD Record 31(3), 2002
