2 workloa d
play

2 Workloa d? 3 OLTP 4 OLAP OLTP 4 OLAP OLTP Streaming 4 - PowerPoint PPT Presentation

2 Workloa d? 3 OLTP 4 OLAP OLTP 4 OLAP OLTP Streaming 4 Scan- OLAP OLTP Streaming oriented 4 Scan- OLAP OLTP Streaming Archiving oriented 4 Scan- Log- OLAP OLTP Streaming Archiving processing oriented 4 Scan- Log-


  1. How does it work? • Exploit UDFs provided by existing systems Query Processor • Inject pieces of code Relations • Hack layouts into the UDFs UDF Storage Layer • UDF as mapping between logical and physical view of data Physical Representation File 1 File 2 File 3 .... File n 27

  2. How does it work? • Exploit UDFs provided by existing systems Query Processor • Inject pieces of code Relations • Hack layouts into the UDFs UDF Storage Layer • UDF as mapping between logical and physical view of data Physical Representation • Novel use of UDFs File 1 File 2 File 3 .... File n 27

  3. Use Case 1 : OLAP in Row-stores 28

  4. OLTP OLAP 29

  5. OLTP OLAP 29

  6. OLTP OLAP? • Can we push the limits of row stores? 29

  7. Trojan Columns Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household 30

  8. Trojan Columns Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household 30

  9. Trojan Columns Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household 30

  10. Trojan Columns Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household 30

  11. Trojan Columns Relation Customer name phone market_segment smith 2134 automobile john 3425 household kim 6756 furniture joe 9878 building mark 4312 building steve 2435 automobile jim 5766 household ian 8789 household Physical Table Customer_trojan segment_ID attribute_ID blob_data 1 name smith, john, kim, joe 1 phone 2134, 3425, 6756, 9878 1 market_segment automobile, household, furniture, building 2 name mark, steve, jim, ian 2 phone 4312, 2435, 5766, 8789 2 market_segment building, automobile, household, household 30

  12. Example: TPC-H Query 6 Result Result Result γ γ γ agg (extendedprice * discount) agg (extendedprice * discount) agg (extendedprice * discount) selectUDF σ σ σ shipdate BETWEEN shipdate BETWEEN shipdate BETWEEN ‘1994-01-01’ AND ‘1995-01-01’ ‘1994-01-01’ AND ‘1995-01-01’ ‘1994-01-01’ AND ‘1995-01-01’ AND discount BETWEEN AND discount BETWEEN AND discount BETWEEN 0.05 AND 0.07 0.05 AND 0.07 0.05 AND 0.07 AND quantity < 24 AND quantity < 24 AND quantity < 24 π π π quantity, discount quantity, discount quantity, discount extendedprice, shipdate extendedprice, shipdate extendedprice, shipdate scanUDF S CAN S CAN S CAN lineitem lineitem lineitem shipd scanUDF selectUDF σ σ 31 σ ate BETWEEN

  13. Example: TPC-H Query 6 Result Result Result γ γ γ agg (extendedprice * discount) agg (extendedprice * discount) agg (extendedprice * discount) selectUDF σ σ σ shipdate BETWEEN shipdate BETWEEN shipdate BETWEEN ‘1994-01-01’ AND ‘1995-01-01’ ‘1994-01-01’ AND ‘1995-01-01’ ‘1994-01-01’ AND ‘1995-01-01’ AND discount BETWEEN AND discount BETWEEN AND discount BETWEEN 0.05 AND 0.07 0.05 AND 0.07 0.05 AND 0.07 AND quantity < 24 AND quantity < 24 AND quantity < 24 π π π quantity, discount quantity, discount quantity, discount extendedprice, shipdate extendedprice, shipdate extendedprice, shipdate scanUDF S CAN S CAN S CAN lineitem lineitem lineitem shipd scanUDF selectUDF σ σ 31 σ ate BETWEEN

  14. Example: TPC-H Query 6 Result Result Result γ γ γ agg (extendedprice * discount) agg (extendedprice * discount) agg (extendedprice * discount) selectUDF σ σ σ shipdate BETWEEN shipdate BETWEEN shipdate BETWEEN ‘1994-01-01’ AND ‘1995-01-01’ ‘1994-01-01’ AND ‘1995-01-01’ ‘1994-01-01’ AND ‘1995-01-01’ AND discount BETWEEN AND discount BETWEEN AND discount BETWEEN 0.05 AND 0.07 0.05 AND 0.07 0.05 AND 0.07 AND quantity < 24 AND quantity < 24 AND quantity < 24 π π π quantity, discount quantity, discount quantity, discount extendedprice, shipdate extendedprice, shipdate extendedprice, shipdate scanUDF S CAN S CAN S CAN lineitem lineitem lineitem shipd scanUDF selectUDF σ σ 31 σ ate BETWEEN

  15. C-Store Benchmark * 30 Standard Row Trojan Columns Query Time (sec) 20 10 0 Q1 Q2 Q3 Q4 Q5 Q6 Q7 71.74058 72.41696 32 * Mike Stonebraker et. al. C-Store: A Column Oriented DBMS. VLDB 2005

  16. TPC-H Benchmark * 100 Standard Row Trojan Columns 75 Query Time (sec) 50 25 0 Q1 Q6 Q12 Q14 33 * tpch.org/tpch

  17. TPC-H Benchmark * 900 Standard Row Trojan Columns 675 Query Time (sec) 450 225 0 Q3 Q5 Q10 Q19 34 * tpch.org/tpch

  18. Micro-Benchmark 16 2.13 2.13 2.11 2.06 1.55 0.47 0.06 # referenced attributes (r) 15 4.64 4.62 4.55 4.27 2.57 0.55 0.06 13 5.00 5.00 4.94 4.61 2.70 0.57 0.06 11 5.79 5.82 5.75 5.24 2.87 0.56 0.06 9 6.39 6.38 6.25 5.79 3.11 0.54 0.06 7 7.00 6.96 6.80 6.23 3.17 0.56 0.06 5 10.96 10.94 10.55 9.27 3.75 0.57 0.06 3 12.86 13.57 13.22 11.03 4.16 0.56 0.06 1 17.43 17.61 16.61 13.57 4.39 0.57 0.06 1E-06 1E-05 1E-04 1E-03 1E-02 1E-01 1E+00 selectivity (fraction of tuples accessed) 35

  19. versus Column Stores 100 Standard Row Trojan Columns DBMS-Y 75 Query Time (sec) 50 25 0 Q1 Q6 Q12 Q14 36

  20. Trojan Columns Advantages • Column + Row storage • Much better performance • Closed source system 37

  21. Use Case 2: Big Data Analytics 38

  22. Web-log Processing • Scan Tasks User visits from different countries 39

  23. Web-log Processing • Scan Tasks • Selection Tasks User visits with duration greater than 10s 39

  24. Web-log Processing • Scan Tasks • Selection Tasks • Projection Tasks URL and duration of each user visit 39

  25. Web-log Processing • Scan Tasks • Selection Tasks • Projection Tasks • Join Tasks Average PageRank visited by each user IP 39

  26. Web-log Processing T Data Load Phase T 1 L PPartblock T 6 Replicate Replicate Replicate Replicate Replicate Replicate H1 Fetch Fetch Fetch H2 H3 Fetch Fetch Fetch H4 Store Store Store . . . Store Store Store . . . Scan Scan Scan PPartsplit PPartsplit PPartsplit T 1 T 5 T 2 T 4 T 3 T 6 M1 Union M2 M3 M4 RecReaditemize RecReaditemize MMapmap MMapmap PPartmem Map Phase LPartsh LPartsh LPartsh LPartsh Sortcmp Sortcmp Sortcmp Sortcmp SortGrpgrp SortGrpgrp SortGrpgrp SortGrpgrp MMapcombine MMapcombine MMapcombine . . . MMapcombine . . . Store Store Store Mergecmp SortGrpgrp • Scan Tasks MMapcombine Store Store • Selection Tasks PPartsh PPartsh Shu ffl e Phase Fetch Fetch Fetch Fetch • Projection Tasks Buffer Buffer Buffer Buffer Store Store Merge • Join Tasks Store Mergecmp . . . Reduce Phase SortGrpgrp MMapreduce 39 R1 Store R2

  27. Trojan Index DataSet ... ... SData T H F Trojan Index Indexed Split i • Each HDFS block sorted • Each block contains an index 40

  28. Trojan Index HDFS Blocks DataSet ... ... SData T Trojan Index H F Indexed Split i DataSet ... ... SData T Trojan Index H F Indexed Split i DataSet ... ... SData T Trojan Index H F Indexed Split i DataSet ... ... SData T Trojan Index H F Indexed Split i • Each HDFS block sorted • Each block contains an index • Index access in UDF 40

  29. Trojan Index Creation T Data Load Phase T 1 L PPartblock T 6 Replicate Replicate Replicate Replicate Replicate Replicate H1 Fetch Fetch Fetch H2 H3 Fetch Fetch Fetch H4 Store Store Store . . . Store Store Store . . . Scan Scan Scan PPartsplit PPartsplit PPartsplit T 1 T 5 T 2 T 4 T 3 T 6  map (key k , value v ) �� M1 Union M2 M3 M4    RecReaditemize RecReaditemize   [( getSplitID() ⌅ prj ai ( k ⌅ v ) , k ⌅ v )]   MMapmap MMapmap  PPartmem (key ik vset ivs ) ��  Map Phase   LPartsh LPartsh LPartsh LPartsh cmp (key k 1 , key k 2) �� compare ( k 1 . a , k 2 . a ) Sortcmp Sortcmp Sortcmp Sortcmp SortGrpgrp SortGrpgrp SortGrpgrp SortGrpgrp MMapcombine MMapcombine MMapcombine . . . grp (key k 1 , key k 2) �� compare ( k 1 . splitID , k 2 . splitID ) MMapcombine . . . Store Store Store Mergecmp SortGrpgrp MMapcombine sh (key k , value v , int numPartitions ) �� k . splitID % numPartitions Store Store PPartsh PPartsh Shu ffl e Phase Fetch Fetch Fetch Fetch Buffer Buffer Buffer Buffer   Store Store Merge    Store  ⌅ ⌅    Mergecmp . . . Reduce Phase reduce (key ik , vset ivs ) ��  SortGrpgrp     MMapreduce   [( ivs ⌅ indexBuilder ai ( ivs ))]  R1 Store R2 41 T � T � 1 2

  30. Trojan Index Access T Data Load Phase T 1 L PPartblock T 6 Replicate Replicate Replicate Replicate Replicate Replicate H1 Fetch Fetch Fetch H2 H3 Fetch Fetch Fetch H4 Store Store Store . . . Store Store Store . . . Scan Scan Scan PPartsplit PPartsplit PPartsplit T 1 T 5 T 2 T 4 T 3 T 6 Algorithm 1 : Trojan Index / Trojan Join split UDF M1 Union M2 M3 M4 Input : JobConf job, Int numSplits Output : logical data splits RecReaditemize RecReaditemize 1 FileSplit [] splits; MMapmap MMapmap 2 File [] files = GetFiles( job ) ; PPartmem 3 foreach file in files do Map Phase Path path = file. getPath() ; 4 LPartsh LPartsh LPartsh LPartsh 5 InputStream in = GetInputStream( path ) ; Sortcmp Sortcmp Sortcmp Sortcmp 6 Long o ff set = file. getLength() ; 7 while o ff set > 0 do SortGrpgrp SortGrpgrp SortGrpgrp SortGrpgrp in. seek( o ff set- FOOTER SIZE ) ; 8 MMapcombine MMapcombine MMapcombine . . . MMapcombine . . . 9 Footer footer = ReadFooter( in ) ; 10 Long splitSize = footer. getSplitSize() ; Store Store Store 11 o ff set - = (splitSize + FOOTER SIZE ); Mergecmp BlockLocations blocks = GetBlockLocations( path,o ff set ) ; 12 13 FileSplit newSplit = CreateSplit( path,o ff set,splitSize,blocks ) ; SortGrpgrp 14 splits. add( newSplit ) ; MMapcombine 15 end Store Store 16 end 17 return splits; PPartsh PPartsh Shu ffl e Phase Fetch Fetch Fetch Fetch Buffer Buffer Buffer Buffer Store Store Merge Store Mergecmp . . . Reduce Phase SortGrpgrp MMapreduce R1 Store R2 42 T � T � 1 2

  31. Trojan Index Access T Data Load Phase T 1 L PPartblock T 6 Replicate Replicate Replicate Replicate Replicate Replicate H1 Fetch Fetch Fetch H2 H3 Fetch Fetch Fetch H4 Store Store Store . . . Store Store Store . . . Scan Scan Scan PPartsplit PPartsplit PPartsplit T 1 T 5 T 2 T 4 T 3 T 6 Algorithm 2 : Trojan Index itemize .initialize UDF M1 Union M2 M3 M4 Input : FileSplit split, JobConf job RecReaditemize RecReaditemize 1 Global FileSplit split = split; 2 Key lowKey = job. getLowKey() ; MMapmap MMapmap 3 Global Key highKey = job. getHighKey() ; PPartmem 4 Int splitStart = split. getStart() ; Map Phase 5 Global Int splitEnd = split. getEnd() ; LPartsh LPartsh LPartsh LPartsh 6 Header h = ReadHeader( split ) ; Sortcmp Sortcmp Sortcmp Sortcmp 7 Overlap type = h. getOverlapType( lowKey,highKey ) ; Global Int o ff set; 8 SortGrpgrp SortGrpgrp SortGrpgrp SortGrpgrp 9 if type == LEFT CONTAINED or type == FULL CONTAINED or type == MMapcombine MMapcombine MMapcombine . . . MMapcombine . . . POINT CONTAINED then 10 Index i = ReadIndex( split ) ; Store Store Store 11 o ff set = splitStart + i. lookup( lowKey ) ; Mergecmp 12 else if type == RIGHT CONTAINED or type == SPAN then SortGrpgrp 13 o ff set = splitStart; 14 else MMapcombine 15 // NOT CONTAINED, skip the split; Store Store 16 o ff set = splitEnd; 17 end PPartsh PPartsh 18 Seek( o ff set ) ; Shu ffl e Phase Fetch Fetch Fetch Fetch Buffer Buffer Buffer Buffer Store Store Merge Store Mergecmp . . . Reduce Phase SortGrpgrp MMapreduce R1 Store R2 42 T � T � 1 2

  32. Trojan Index Access T Data Load Phase T 1 L PPartblock T 6 Replicate Replicate Replicate Replicate Replicate Replicate H1 Fetch Fetch Fetch H2 H3 Fetch Fetch Fetch H4 Store Store Store . . . Store Store Store . . . Scan Scan Scan PPartsplit PPartsplit PPartsplit T 1 T 5 T 2 T 4 T 3 T 6 Algorithm 3 : Trojan Index itemize .next UDF M1 Union M2 M3 M4 Input : KeyType key, ValueType value Output : has more records RecReaditemize RecReaditemize 1 if o ff set < splitEnd then MMapmap MMapmap 2 Record nextRecord = ReadNextRecord( split ) ; PPartmem 3 o ff set += nextRecord.size(); Map Phase 4 if nextRecord.key < highKey then LPartsh LPartsh LPartsh LPartsh 5 SetKeyValue( key, value, nextRecord ) ; Sortcmp Sortcmp Sortcmp Sortcmp 6 return true; SortGrpgrp SortGrpgrp SortGrpgrp SortGrpgrp 7 end 8 end MMapcombine MMapcombine MMapcombine . . . MMapcombine . . . 9 return false; Store Store Store Mergecmp SortGrpgrp MMapcombine Store Store PPartsh PPartsh Shu ffl e Phase Fetch Fetch Fetch Fetch Buffer Buffer Buffer Buffer Store Store Merge Store Mergecmp . . . Reduce Phase SortGrpgrp MMapreduce R1 Store R2 42 T � T � 1 2 size

  33. Selection Analytical Task * Hadoop Hadoop++(256MB) 140 HadoopDB Hadoop++(1GB) HadoopDB Chunks 120 runtime [seconds] 100 80 60 40 20 0 10 nodes 50 nodes 100 nodes 43 * Pavlo et. al. A Comparison of Approaches to large-Scale Data Analysis. SIGMOD 2009

  34. Trojan Index Advantages • Scan + Index data accesses • Parallel index lookups • Non-invasive system changes • Much better performance • Each HDFS block sorted • Each block contains an index • Index access in UDF 44

  35. Trojan Join DataSet ... ... Ht SData T k Hs SData S k Ht SData T k+1 Hs SData S k+1 F co-group j co-group j+1 Co-Partitioned Split i • Each HDFS block co- partitioning over two relations • Join relations are co-located 45

  36. Trojan Join HDFS Blocks DataSet ... ... Ht SData T k Hs SData S k Ht SData T k+1 Hs SData S k+1 F co-group j co-group j+1 Co-Partitioned Split i DataSet ... ... Ht SData T k Hs SData S k Ht SData T k+1 Hs SData S k+1 F co-group j co-group j+1 Co-Partitioned Split i DataSet ... ... Ht SData T k Hs SData S k Ht SData T k+1 Hs SData S k+1 F co-group j co-group j+1 Co-Partitioned Split i DataSet ... ... Ht SData T k Hs SData S k Ht SData T k+1 Hs SData S k+1 F co-group j co-group j+1 Co-Partitioned Split i • Each HDFS block co- partitioning over two relations • Join relations are co-located • Co-partitioned join in UDF physical 45 to co-partitioned co-partition

  37. Join Analytical Task * 2500 Hadoop Hadoop++(256MB) HadoopDB Hadoop++(1GB) 2000 runtime [seconds] 1500 1000 500 0 10 nodes 50 nodes 100 nodes 46 * Pavlo et. al. A Comparison of Approaches to large-Scale Data Analysis. SIGMOD 2009

  38. Trojan Join Advantages • Re- + Co- partitioned join • Parallel join processing • Non-invasive system changes • Much better performance • Each HDFS block co- partitioning over two relations • Join relations are co-located • Co-partitioned join in UDF 47

  39. Trojan Layouts Replica 1 Replica 2 Replica 3 • Each HDFS block in row or column • Each block replica in different layout • Pick right layout in UDF 48

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend