the mixed workload ch benchmark hybrid oltp olap database
play

The mixed workload CH-BenCHmark Hybrid OLTP&OLAP Database - PowerPoint PPT Presentation

Chair of Informatics III: Database Systems The mixed workload CH-BenCHmark Hybrid OLTP&OLAP Database Systems y y Real-Time Business Intelligence Analytical information at your fingertips Analytical information at your


  1. Chair of Informatics III: Database Systems The mixed workload CH-BenCHmark � Hybrid OLTP&OLAP Database Systems y y � Real-Time Business Intelligence � Analytical information at your fingertips � Analytical information at your fingertips Richard Cole (ParAccel), Florian Funke (TU München), ( ), ( ), Leo Giakoumakis (Microsoft), Wey Guy (Microsoft), Alfons Kemper (TU München), Stefan Krompass (TU p ( ), p ( München), Harumi Kuno (HP Labs), Raghunath Nambiar (Cisco), Thomas Neumann (TU München), Meikel Poess ( ), ( ), (Oracle), Kai-Uwe Sattler (TU Ilmenau), Eric Simon (SAP), Florian Waas (Greenplum)

  2. Chair of Informatics III: Database Systems O t Outcome of the Dagstuhl Seminar f th D t hl S i Fall 2010 � Robust Query Processing � Organized by Goetz Graefe et al. g y � Breakout Working Group � Workload Management � Workload Management � Headed by: Harumi Kuno

  3. Chair of Informatics III: Database Systems State of the Art Separate Transaction State of the Art: Separate Transaction (OLTP) and Query (OLAP) Systems

  4. Chair of Informatics III: Database Systems G Goal: Real Time Business Intelligence l R l Ti B i I t lli � Querying the Transactional Data

  5. Chair of Informatics III: Database Systems H Hasso Plattner (SAP): Keynote at SIGMOD 09 Pl tt (SAP) K t t SIGMOD 09

  6. Chair of Informatics III: Database Systems U Use cases for low latency analytics [Curt f l l t l ti [C t Monash‘s Blog (April 11, 2011), Teradata] � Operational reporting � BI dashboards � Claims processed p � 7 X 24 real time � 7 X 24 real time operations � Inventory instant status � Financial peak � Financial peak � Machine generated � Machine generated periods data � Month end, quarter end � Month end quarter end � Rapid response � Rapid response � Fast analytics � Cyber Security � Short and long term Sh t d l t threats Frankly, I think low-latency monitoring is going to be one of the hot areas over y y g g g the next few years. “Real-time” is cool, and big monitors with constantly changing graphics are cooler yet. [C.M.]

  7. Chair of Informatics III: Database Systems The Best of Both Worlds … Th B t f B th W ld …. one size fits all – again?? BestOfBothWorlds BestOfBothWorlds ++ OLAP ++ OLTP MonetDB / M tDB / V ltDB / VoltDB / Vectorwise/ TimesTen / TREX/ Vertica TREX/ Vertica P*Time P Time -- OLTP -- OLAP

  8. TPC-C and TPC-H d TPC H Missing in TPC-C Schemas TPC C Chair of Informatics III: Database Systems

  9. Chair of Informatics III: Database Systems C&H BenCHmark schema

  10. Mixed OLTP&OLAP Workload Mixed OLTP&OLAP Workload Chair of Informatics III: Database Systems

  11. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit

  12. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit •Multiple OLTP clients •No wait-time in between requests N it ti i b t t •Deviating from original TPC-C •High throughput for smaller DB •High throughput for smaller DB

  13. Chair of Informatics III: Database Systems No Keying/Think-Time � Clients generate one � Cli N K i /Thi k Ti t t request after another as fast as possible 0 sec 0 sec. 0 sec 0 sec. •10 clients=terminals per Warehouse •10 clients=terminals per Warehouse

  14. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit N ti Nation (25) S Supplier li 100k Region (5)

  15. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit •No updates because new data is generated by OLTP data is generated by OLTP •Modified TPC-H queries •Different schema N ti Nation (25) S Supplier li 100k Region (5)

  16. Chair of Informatics III: Database Systems All 5 TPC C T All 5 TPC-C Transactions (no waiting time) ti ( iti ti ) • New-Order • Order-Status • Payment P t • Stock-Level St k L l • Delivery All 22 TPC-H Queries e.g., Query 5 : Intra Country – Revenue by local Suppliers within a Region per Nation within a Region, per Nation

  17. Chair of Informatics III: Database Systems Complete Query Suite Complete Query Suite Q1: Generate orderline overview � Q2: Most important supplier/item-combinations (those � that have the lowest stock level for certain parts in a certain region) certain region) Q3: Unshipped orders with highest value for customers � within a certain state within a certain state Q4: Orders that were partially shipped late � Q5: Revenue volume achieved through local suppliers Q5: Revenue volume achieved through local suppliers � � Q6: Revenue generated by orderlines of a certain � quantity quantity Q7: Bi-directional trade volume between two nations � Q8: Market share of a given nation for customers of a Q8: Market share of a given nation for customers of a � � given region for a given part type

  18. Chair of Informatics III: Database Systems Complete Query Suite Complete Query Suite Q9: Profit made on a given line of parts,broken out by � supplier nation and year li ti d Q10: Customers who received their ordered products late � Q11: Most important (high order count compared to the Q11 M t i t t (hi h d t d t th � sum of all ordercounts) parts supplied by suppiers of a particular nation particular nation Q12: Determine whether selecting less expensive modes � of shipping is negatively affecting the critical-priority of shipping is negatively affecting the critical priority orders by causing more parts to be received late by customers Q13: Relationships between customers and the size of � their orders Q14: Market response to a promotion campaign �

  19. Chair of Informatics III: Database Systems Complete Query Suite Complete Query Suite Q15: Determines the top supplier � Q16: Number of suppliers that can supply particular parts � Q17: Average yearly revenue that would be lost if orders � were no longer filled for small quantities of certain parts l fill d f ll titi f t i t Q18: Rank customers based on their placement of a large � q antit order quantity order Q19: Machine generated data mining (revenue report for � disjunctive predicate) disjunctive predicate) Q20: Suppliers in a particular nation having selected parts � that may be candidates for a promotional offer that may be candidates for a promotional offer Q21: Suppliers who were not able to ship required parts in � a timely manner a timely manner Q22: Geographies with customers who may be likely to � make a purchase

  20. Chair of Informatics III: Database Systems Performance and Quality Metrics Performance Quality � OLTP Throughput g p � Isolation Level � NewOrder Tx per minute � Serializable for OLTP � Except Stock-Level � Query Response Times � Query isolation level � Geometric Mean One query stream � Read uncommitted (dirty) � Multiple query streams Multiple query streams � � � Read committed R d itt d � Query Throughput � Serializable � Multiple parallel streams � Multiple parallel streams � Snapshot � Snapshot Freshness of the Snapshot � #Queries per hour � In #missed transactions � � Response time guarantees R ti t � derived from TPC-C

  21. Chair of Informatics III: Database Systems Fi First Results from PostgreSQL to Demonstrate t R lt f P t SQL t D t t the Reporting (out of the box � no fine-tuning)

  22. Chair of Informatics III: Database Systems Fi First Results from PostgreSQL: „Powertest“ t R lt f P t SQL P t t“

  23. Chair of Informatics III: Database Systems Fi First Results from PostgreSQL: „OLTP centric“ t R lt f P t SQL OLTP t i “

  24. Chair of Informatics III: Database Systems Fi First Results: „balanced OLTP & OLAP“ t R lt b l d OLTP & OLAP“

  25. l “ First Results: „Queries only“ Chair of Informatics III: Database Systems i Q lt t R Fi

  26. Chair of Informatics III: Database Systems T Tuning Dimensions i Di i • challenge for workload management •Multi-objective control •Admission control •Resource allocation •Memory M •cores

  27. Chair of Informatics III: Database Systems H How to Gain Performance for Mixed t G i P f f Mi d Workload Processing: Snapshotting and Main Memory DBMS Snapshot Snapshot OLAP Workload OLTP Workload Most current Most current database state •Versioning: run OLAP on time versions of the data V i i OLAP ti i f th d t •Twin block: run OLAP on Tx-consistent snapshot •Shadowing •Tuple level •Tuple level •Page level � exploit hardware support for for Virtual Memory Snapshot (HyPer)

  28. Chair of Informatics III: Database Systems F t Future Work W k � Fine-tune (tighten) the benCHmark specification � Query Parameters � Performance metrics � Performance metrics � Account for dynamically growing database cardinality � Isolation levels � Freshness guarantees � Get TPC.org interested to follow up � Get TPC.org interested to follow up � Industry representatives

  29. in HyPer How it works … How it works Chair of Informatics III: Database Systems

  30. Chair of Informatics III: Database Systems H Hardware Supported Data Access and d S t d D t A d Copy on Update

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend