an index advisor using deep reinforcement learning
play

An Index Advisor Using Deep Reinforcement Learning Hai Lan 1 , - PowerPoint PPT Presentation

An Index Advisor Using Deep Reinforcement Learning Hai Lan 1 , Zhifeng Bao 1 , Yuwei Peng 2 1 RMIT University 2 Wuhan University 1 Index Selection Problem (ISP) 2 Index Selection Problem (ISP) Choosing the right indexes to build is one of


  1. An Index Advisor Using Deep Reinforcement Learning Hai Lan 1 , Zhifeng Bao 1 , Yuwei Peng 2 1 RMIT University 2 Wuhan University 1

  2. Index Selection Problem (ISP) 2

  3. Index Selection Problem (ISP) • Choosing the right indexes to build is one of the central issues in database tuning. • Problem Definition: • Select a set of indexes (index configuration) to be built to maximize the performance of the given workload with some constraints. • Constraints: storage usage, index number, and so on. 2

  4. Index Selection Problem (ISP) • Choosing the right indexes to build is one of the central issues in database tuning. • Problem Definition: • Select a set of indexes (index configuration) to be built to maximize the performance of the given workload with some constraints. • Constraints: storage usage, index number, and so on. • Index interaction : an interaction exists between an index a and an index b if the benefit of a is affected by the existence of b and vice - versa . SELECT * FROM t WHERE a < 10 OR b < 10; (1) An index on a ✗ (2) An index on b ✗ (3) An index on a and an index on b ✓ 2

  5. Prior Work 3

  6. Prior Work Category Work Cost Index type Alog Cons IIA ✗ Non - Learning AutoAdmin [ VLD ’97] Estimated cost S / M Greedy index number method ✗ ILP [ ICDE ’07] Estimated cost S / M ILP storage ✓ ISRM [ ICDE 19] Estimated cost S / M Greedy storage Learning - based AI Meet AI [ SIGMOD ’19] Learning - model S / M Not sure Greedy index number method Estimated cost ✓ Welborn et al [ arxiv ’19] Not mention S / M DQN no ✓ DRL - Index [ ICDEW ’20] Estimated cost S DQN Not mention IIA means index interaction . Cons means constraints . Alog means search algorithm . S means single column index . M means multi - column index . Welborn ’ s work only focuses on single table . 3 DRL - index is not implemented yet .

  7. Prior Work Category Work Cost Index type Alog Cons IIA ✗ Non - Learning AutoAdmin [ VLD ’97] Estimated cost S / M Greedy index number method ✗ ILP [ ICDE ’07] Estimated cost S / M ILP storage ✓ ISRM [ ICDE 19] Estimated cost S / M Greedy storage Learning - based AI Meet AI [ SIGMOD ’19] Learning - model S / M Not sure Greedy index number method Estimated cost ✓ Welborn et al [ arxiv ’19] Not mention S / M DQN no ✓ DRL - Index [ ICDEW ’20] Estimated cost S DQN Not mention IIA means index interaction . Cons means constraints . Alog means search algorithm . S means single column index . M means multi - column index . Welborn ’ s work only focuses on single table . 3 DRL - index is not implemented yet .

  8. Prior Work Category Work Cost Index type Alog Cons IIA ✗ Non - Learning AutoAdmin [ VLD ’97] Estimated cost S / M Greedy index number method ✗ ILP [ ICDE ’07] Estimated cost S / M ILP storage ✓ ISRM [ ICDE 19] Estimated cost S / M Greedy storage Learning - based AI Meet AI [ SIGMOD ’19] Learning - model S / M Not sure Greedy index number method Estimated cost ✓ Welborn et al [ arxiv ’19] Not mention S / M DQN no ✓ DRL - Index [ ICDEW ’20] Estimated cost S DQN Not mention Our Goal : (1) Handle complex queries on multiple tables (2) Recommend multi - column indexes (3) Capture the index interaction IIA means index interaction . Cons means constraints . Alog means search algorithm . S means single column index . M means multi - column index . Welborn ’ s work only focuses on single table . 3 DRL - index is not implemented yet .

  9. Our Method - Overview 4

  10. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance 4

  11. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance 4

  12. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance workload 4

  13. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload 4

  14. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload The algorithm to select an index from candidates according to current workload and index configuration 4

  15. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration 4

  16. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . 4

  17. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . • Framework 4

  18. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . • Framework Index Candidates Rules Workload Sample 4

  19. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . • Framework Index Candidates Rules Workload Sample action DQN 4 Agent

  20. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . Create • Framework transform Index Candidates Rules Workload Sample action DQN 4 Agent

  21. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps Environment DB The algorithm to select an index from candidates according to What - If Caller current workload and index configuration • T is determined by the constraints . Create • Framework transform Index Candidates Rules Workload Sample action DQN 4 Agent

  22. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps Environment DB The algorithm to select an index from candidates according to What - If Caller current workload and index configuration • T is determined by the constraints . Create • Framework Reward Next state transform Index Candidates Rules Workload Sample action DQN 4 Agent

  23. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps Environment DB The algorithm to select an index from candidates according to What - If Caller current workload and index configuration • T is determined by the constraints . Create • Framework Reward Next state transform Index Candidates Rules Workload Sample action DQN 4 Agent

  24. Our Method - Rules 5

  25. Our Method - Rules 5

  26. Our Method - Rules 5

  27. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 5

  28. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 J : t 1. a 2, t 1. a 3, t 2. b 2, t 2. b 3 EQ : t 1. a 3 RANGE : t 2. b 3 O : t 1. a 5, t 1. a 6 USED : t 1.( a 1- a 7), t 2.( b 1- b 3) 5

  29. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 t 1. a 1, t 1. a 2, t 1. a 3, t 2. b 1, t 2. b 2, t 2. b 3 Rule 1 J : t 1. a 2, t 1. a 3, t 2. b 2, t 2. b 3 EQ : t 1. a 3 RANGE : t 2. b 3 O : t 1. a 5, t 1. a 6 USED : t 1.( a 1- a 7), t 2.( b 1- b 3) 5

  30. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 t 1. a 1, t 1. a 2, t 1. a 3, t 2. b 1, t 2. b 2, t 2. b 3 Rule 1 J : t 1. a 2, t 1. a 3, t 2. b 2, t 2. b 3 Rule 2 EQ : t 1. a 3 ( t 1. a 5, t 1. a 6) RANGE : t 2. b 3 O : t 1. a 5, t 1. a 6 USED : t 1.( a 1- a 7), t 2.( b 1- b 3) 5

Download Presentation
Download Policy: The content available on the website is offered to you 'AS IS' for your personal information and use only. It cannot be commercialized, licensed, or distributed on other websites without prior consent from the author. To download a presentation, simply click this link. If you encounter any difficulties during the download process, it's possible that the publisher has removed the file from their server.

Recommend


More recommend