database system implementation
play

DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // - PowerPoint PPT Presentation

DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #17: QUERY EXECUTION & SCHEDULING 2 TODAYS AGENDA Process Models Query Parallelization Data Placement Scheduling 3 QUERY EXECUTION A query plan is


  1. DATABASE SYSTEM IMPLEMENTATION GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ LECTURE #17: QUERY EXECUTION & SCHEDULING

  2. 2 TODAY’S AGENDA Process Models Query Parallelization Data Placement Scheduling

  3. 3 QUERY EXECUTION A query plan is comprised of operators . An operator instance is an invocation of an operator on some segment of data. A task is the execution of a sequence of one or more operator instances.

  4. 4 PROCESS MODEL A DBMS’s process model defines how the system is architected to support concurrent requests from a multi-user application. A worker is the DBMS component that is responsible for executing tasks on behalf of the client and returning the results. ARCHITECTURE OF A DATABASE SYSTEM Foundations and Trends in Databases 2007

  5. 5 PROCESS MODELS Approach #1: Process per DBMS Worker Approach #2: Process Pool Approach #3: Thread per DBMS Worker

  6. 6 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → A process crash doesn’t take down entire system. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker

  7. 7 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → A process crash doesn’t take down entire system. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker

  8. 8 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → A process crash doesn’t take down entire system. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker

  9. 9 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → A process crash doesn’t take down entire system. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker

  10. 10 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → A process crash doesn’t take down entire system. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker

  11. 11 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → A process crash doesn’t take down entire system. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker

  12. 12 PROCESS POOL A worker uses any process that is free in a pool → Still relies on OS scheduler and shared memory. → Bad for CPU cache locality. → Examples: IBM DB2, Postgres (2015) Dispatcher Worker Pool

  13. 13 PROCESS POOL A worker uses any process that is free in a pool → Still relies on OS scheduler and shared memory. → Bad for CPU cache locality. → Examples: IBM DB2, Postgres (2015) Dispatcher Worker Pool

  14. 14 PROCESS POOL A worker uses any process that is free in a pool → Still relies on OS scheduler and shared memory. → Bad for CPU cache locality. → Examples: IBM DB2, Postgres (2015) Dispatcher Worker Pool

  15. 15 THREAD PER WORKER Single process with multiple worker threads. → DBMS has to manage its own scheduling. → May or may not use a dispatcher thread. → Thread crash (may) kill the entire system. → Examples: IBM DB2, MSSQL, MySQL, Oracle (2014) Worker Threads

  16. 16 THREAD PER WORKER Single process with multiple worker threads. → DBMS has to manage its own scheduling. → May or may not use a dispatcher thread. → Thread crash (may) kill the entire system. → Examples: IBM DB2, MSSQL, MySQL, Oracle (2014) Worker Threads

  17. 17 PROCESS MODELS Using a multi-threaded architecture has several advantages: → Less overhead per context switch. → Don’t have to manage shared memory. The thread per worker model does not mean that you have intra-query parallelism. I am not aware of any new DBMS built in the last 7-8 years that doesn’t use threads.

  18. 18 SCHEDULING For each query plan, the DBMS has to decide where, when, and how to execute it. → How many tasks should it use? → How many CPU cores should it use? → What CPU core should the tasks execute on? → Where should a task store its output? The DBMS always knows more than the OS.

  19. 19 INTER-QUERY PARALLELISM Improve overall performance by allowing multiple queries to execute simultaneously. → Provide the illusion of isolation through concurrency control scheme. The difficulty of implementing a concurrency control scheme is not significantly affected by the DBMS’s process model.

  20. 20 INTRA-QUERY PARALLELISM Improve the performance of a single query by executing its operators in parallel. Approach #1: Intra-Operator (Horizontal) → Operators are decomposed into independent instances that perform the same function on different subsets of data. Approach #2: Inter-Operator (Vertical) → Operations are overlapped in order to pipeline data from one stage to the next without materialization.

  21. 21 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s A B

  22. 22 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s A B

  23. 23 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s A 1 A 2 A 3 A B

  24. 24 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s A 1 A 2 A 3 A B 1 2 3

  25. 25 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s A 1 A 2 A 3 A B 1 2 3

  26. 26 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s s s s A 1 A 2 A 3 A B 1 2 3

  27. 27 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s s s s A 1 A 2 A 3 A B 1 2 3

  28. 28 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p p p p ⨝ s s s s s A 1 A 2 A 3 A B 1 2 3

  29. 29 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p Build HT Build HT Build HT p p p ⨝ s s s s s A 1 A 2 A 3 A B 1 2 3

  30. 30 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p Exchange Build HT Build HT Build HT p p p ⨝ s s s s s A 1 A 2 A 3 A B 1 2 3

  31. 31 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id ⨝ AND A.value < 99 AND B.value > 100 p Exchange Build HT Build HT Build HT p p p ⨝ s s s s s A 1 A 2 A 3 A B 1 2 3

  32. 32 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id ⨝ AND A.value < 99 AND B.value > 100 p Exchange Build HT Build HT Build HT p p p ⨝ s s s s s A 1 A 2 A 3 B 1 B 2 B 3 A B 1 2 3 1 2 3

  33. 33 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id ⨝ AND A.value < 99 AND B.value > 100 p Exchange Build HT Build HT Build HT p p p ⨝ s s s s s A 1 A 2 A 3 B 1 B 2 B 3 A B 1 2 3 1 2 3

  34. 34 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id ⨝ AND A.value < 99 AND B.value > 100 p Exchange Build HT Build HT Build HT p p p p p p ⨝ s s s s s s s s A 1 A 2 A 3 B 1 B 2 B 3 A B 1 2 3 1 2 3

  35. 35 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id ⨝ AND A.value < 99 AND B.value > 100 p Exchange Probe HT Probe HT Probe HT Build HT Build HT Build HT p p p p p p ⨝ s s s s s s s s A 1 A 2 A 3 B 1 B 2 B 3 A B 1 2 3 1 2 3

  36. 36 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B Exchange WHERE A.id = B.id ⨝ AND A.value < 99 AND B.value > 100 p Exchange Probe HT Probe HT Probe HT Build HT Build HT Build HT p p p p p p ⨝ s s s s s s s s A 1 A 2 A 3 B 1 B 2 B 3 A B 1 2 3 1 2 3

  37. 37 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s A B

  38. 38 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s A B

  39. 39 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 p ⨝ s s 1 ⨝ for r 1 ∊ outer: for r 2 ∊ inner: A B emit (r 1 ⨝ r 2 )

  40. 40 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id 2 p AND A.value < 99 AND B.value > 100 for r ∊ incoming: p emit ( p r) ⨝ s s 1 ⨝ for r 1 ∊ outer: for r 2 ∊ inner: A B emit (r 1 ⨝ r 2 )

  41. 41 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id 2 p AND A.value < 99 AND B.value > 100 for r ∊ incoming: p emit ( p r) ⨝ s s 1 ⨝ for r 1 ∊ outer: for r 2 ∊ inner: A B emit (r 1 ⨝ r 2 )

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend