Memory minimization vs. I/O volume minimization in sparse matrix factorization methods



  1. Memory minimization vs. I/O volume minimization in sparse matrix factorization methods. Abdou Guermouche, LaBRI Bordeaux, May 2010

  2. Context. Solving sparse linear systems Ax = b ⇒ direct methods: A = LU. Typical matrix: BRGM matrix
     • $3.7 \times 10^{6}$ variables
     • $156 \times 10^{6}$ non-zeros in A
     • $4.5 \times 10^{9}$ non-zeros in LU
     • $26.5 \times 10^{12}$ flops
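
To put the figures on this slide in perspective, here is a rough back-of-the-envelope estimate of the storage involved. It assumes 8 bytes per stored value (double-precision reals), which the slide does not state, and it ignores index arrays and any other overhead.

```python
# Figures quoted on the slide for the BRGM matrix.
nnz_A = 156 * 10**6   # non-zeros in A
nnz_LU = 45 * 10**8   # non-zeros in the factors L and U (4.5e9)

# Assumption (not stated on the slide): 8 bytes per stored value.
BYTES_PER_VALUE = 8

matrix_bytes = nnz_A * BYTES_PER_VALUE
factor_bytes = nnz_LU * BYTES_PER_VALUE

print(f"values of A : {matrix_bytes / 1e9:.2f} GB")
print(f"factors L, U: {factor_bytes / 1e9:.2f} GB")
```

Under that assumption the factors alone take tens of gigabytes, far more than the original matrix, which is the motivation for writing them to disk.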


  4. Context. Physical constraint: the memory required exceeds the core memory (memory crash). Software challenge:
     • Implementation of an out-of-core execution scheme within MUMPS

  5. Context. Out-of-core: the memory required exceeds the core memory, so disks are used. Software challenge:
     • Implementation of an out-of-core execution scheme within MUMPS

  6. Outline: Multifrontal method; Active memory minimization; Algorithm (Liu's Algorithm); Memory issues; Limitation of the approach; New multifrontal schedules and algorithms; Flexible allocation scheme; A new memory minimization algorithm; Results; Total memory minimization; How about Volume of I/O?; Computing Volume of I/O; Minimizing I/O volume; Towards an out-of-core flexible allocation; Conclusion and Future work


  8. The multifrontal method (Duff, Reid '83). [Figure: a 5 × 5 sparse matrix A and its factors L+U−I, with non-zero and fill-in entries marked, next to the corresponding elimination tree; a memory diagram shows the factors, the active frontal matrix, and the stack of contribution blocks.] Storage is divided into two parts:
     • Factors: systematically written to disk;
     • Active storage (active frontal matrix and stack of contribution blocks): kept in memory.


  15. Memory behaviour (serial postorder traversal). [Figure: step-by-step memory usage while traversing a three-node tree with nodes 1, 2 and 3.]


  23. Sequential case results. [Figure: two memory profiles, each marking its memory peak: worst case and best case.] → Algorithms to find the optimal tree traversal have been proposed.


  25. Sequential case: Memory behavior (2/2). Consider a parent node in the tree:
      • $n$ is the number of children.
      • $j$ denotes the $j$-th child of the node.
      • $cb_j$ is the size of the contribution block of child $j$.
      • $m$ is the memory size of the frontal matrix of the parent.
      • $A$ (resp. $A_j$) is the amount of active memory needed to process the parent (resp. child $j$).

      The assembly step requires a storage of:
      $$m + \sum_{j=1}^{n} cb_j$$

      The storage required to process child $j$ is:
      $$A_j + \sum_{k=1}^{j-1} cb_k$$

      $A$ is thus defined by:
      $$A = \max\left( \max_{j=1,\dots,n} \left( A_j + \sum_{k=1}^{j-1} cb_k \right),\; m + \sum_{j=1}^{n} cb_j \right)$$
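
The recurrence for A above can be evaluated bottom-up on a tree. The following is a minimal sketch, not code from the presentation: the `Node` class and the function name `active_memory` are hypothetical, and children are assumed to be processed in the order in which they are listed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    m: int                 # size of this node's frontal matrix
    cb: int                # size of the contribution block passed to the parent
    children: list = field(default_factory=list)  # processed in list order


def active_memory(node):
    """Return A for `node`: the peak active memory of its subtree."""
    stacked = 0  # contribution blocks of the children already processed
    peak = 0
    for child in node.children:
        # While child j runs, the blocks cb_1 .. cb_{j-1} stay on the stack.
        peak = max(peak, active_memory(child) + stacked)
        stacked += child.cb
    # Assembly: the parent front coexists with all the children's blocks.
    return max(peak, node.m + stacked)


# Two leaves under one parent; for a leaf, A is simply its own front size m.
good = Node(m=4, cb=0, children=[Node(m=10, cb=2), Node(m=6, cb=5)])
bad = Node(m=4, cb=0, children=[Node(m=6, cb=5), Node(m=10, cb=2)])
print(active_memory(good), active_memory(bad))
```

The two trees hold the same nodes in a different child order, and the peak differs, which is why the order of the traversal matters.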


  29. Liu's Algorithm. Liu's Theorem (tree pebbling theorem): the minimum of $\max_j \left( x_j + \sum_{i=1}^{j-1} y_i \right)$ is obtained when the sequence $(x_i, y_i)$ is sorted in decreasing order of $x_i - y_i$. Consequence: an optimal child sequence is obtained by rearranging the children nodes in decreasing order of $A_i - cb_i$. Algorithm:
      • Bottom-up greedy process.
      • Apply Liu's theorem at each level of the tree.
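
The bottom-up process described above can be sketched as follows. This is an illustrative sketch under the same memory model as the definition of A, not the implementation used in MUMPS; `Node` and `liu_order` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    m: int                 # size of this node's frontal matrix
    cb: int                # size of the contribution block passed to the parent
    children: list = field(default_factory=list)


def liu_order(node):
    """Reorder children in place, bottom-up, and return the resulting A."""
    # Solve each subtree first, so that every A_i is already minimal.
    solved = [(liu_order(child), child) for child in node.children]
    # Liu's theorem: process children in decreasing order of A_i - cb_i.
    solved.sort(key=lambda pair: pair[0] - pair[1].cb, reverse=True)
    node.children = [child for _, child in solved]

    stacked = 0  # contribution blocks already on the stack
    peak = 0
    for a_i, child in solved:
        peak = max(peak, a_i + stacked)
        stacked += child.cb
    return max(peak, node.m + stacked)


# Children given in an arbitrary order, as (m, cb) leaves.
root = Node(m=2, cb=0,
            children=[Node(m=5, cb=4), Node(m=9, cb=1), Node(m=12, cb=3)])
print(liu_order(root), [child.m for child in root.children])
```

Sorting uses only the key $A_i - cb_i$, so ties keep their original relative order.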


  31. Limitation of the classical scheme. [Figure: two memory profiles, each marking the allocation of the father and the memory peak: classical approach and flexible scheme.] → Decoupling the allocation and the computations can improve the memory behavior.
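
A small numerical illustration of why allocating the father earlier can help. The model below is a simplification assumed for this example and is not taken from the slides: the father's frontal matrix is allocated after the first `p` children, and every later child has its contribution block assembled into the father as soon as it finishes, so it is never stacked. The function names are hypothetical.

```python
def classical_peak(m, children):
    """Peak when the father is allocated after all children.

    `children` is a list of (A_j, cb_j) pairs in processing order.
    """
    stacked = 0
    peak = 0
    for a_j, cb_j in children:
        peak = max(peak, stacked + a_j)
        stacked += cb_j
    return max(peak, m + stacked)


def flexible_peak(m, children, p):
    """Peak when the father is allocated after the first p children.

    Assumed model: children after position p run while the father's front
    (size m) is already in memory, and their blocks are assembled at once.
    """
    stacked = 0
    peak = 0
    for a_j, cb_j in children[:p]:
        peak = max(peak, stacked + a_j)
        stacked += cb_j
    peak = max(peak, m + stacked)   # father allocated, stacked blocks assembled
    for a_j, _ in children[p:]:
        peak = max(peak, m + a_j)   # father resident while the child runs
    return peak


children = [(6, 5)] * 4             # four children, each with A_j = 6, cb_j = 5
print(classical_peak(10, children))
print(flexible_peak(10, children, p=1))
```

With `p` equal to the number of children, `flexible_peak` reduces to the classical formula; choosing `p` and the child order together is a separate optimization problem.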

