Optimizations for intensive signal processing applications on Systems-on-Chip
Calin Glitia
September 6, 2010
PhD defense

Intensive signal processing
Detection systems
Multimedia


  3. Uniform sub-array accesses (patterns)
     Section: Modeling intensive signal processing applications
     A tiler is defined by:
       o: origin of the reference pattern
       F: fitting matrix, the shape of the tiles in the array
       P: paving matrix, the uniform spacing of the tiles
     Formal specification: for a repetition index $r$ and a pattern index $i$, the accessed array element is
     $$\left( o + \begin{pmatrix} P & F \end{pmatrix} \cdot \begin{pmatrix} r \\ i \end{pmatrix} \right) \bmod s_{\text{array}}$$

  5. Common motif-based accesses
     Figure: pattern examples.
     Paving example, with one panel for each repetition index $r$ from $(0, 0)$ to $(2, 2)$:
     $$F = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad s_{\text{pattern}} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}, \quad o = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad s_{\text{array}} = \begin{pmatrix} 10 \\ 5 \end{pmatrix}, \quad P = \begin{pmatrix} 0 & 3 \\ 1 & 0 \end{pmatrix}, \quad s_{\text{repetition}} = \begin{pmatrix} 3 \\ 3 \end{pmatrix}$$
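
The tiler formula can be sketched in a few lines of Python. This is only an illustration, not code from the presentation: the function name `tile_elements` is hypothetical, and the numeric values are the ones read from the paving example slide, so treat them as assumptions.

```python
from itertools import product


def tile_elements(o, P, F, s_array, s_pattern, r):
    """Return the array coordinates of the tile at repetition index r."""
    dims = range(len(s_array))
    # Reference element of the tile: ref = (o + P . r) mod s_array
    ref = [
        (o[k] + sum(P[k][j] * r[j] for j in range(len(r)))) % s_array[k]
        for k in dims
    ]
    elements = []
    # Every pattern index i gives one element: (ref + F . i) mod s_array
    for i in product(*(range(n) for n in s_pattern)):
        elements.append(tuple(
            (ref[k] + sum(F[k][j] * i[j] for j in range(len(i)))) % s_array[k]
            for k in dims
        ))
    return elements


# Assumed values, read from the paving example slide.
o = (0, 0)
P = [[0, 3], [1, 0]]
F = [[1, 0], [0, 1]]
s_array = (10, 5)
s_pattern = (5, 3)

for r in [(0, 0), (2, 2)]:
    print(r, sorted(tile_elements(o, P, F, s_array, s_pattern, r)))
```

With these values, the reference element for r = (2, 2) is (6, 2), so the tile wraps around the edge of the first array dimension because of the modulo.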

  10. Summary
      Array-OL specification
        Data-flow oriented visual formalism
        Express the regularity of computations and data accesses
        Exploit the parallelism
        Rules that allow static analysis
      Limitations
        Numerical values for the multidimensional spaces and accesses
        Cycles not allowed in the dependence graph
      Extension: inter-repetition dependences

  14. Need for uniform dependences
      State construction: transfer data between different instances of the same repetition.
      Examples: Sum, Integrate
      Integrate, flattened: a chain of instances ... +[0], +[1], +[2] ..., each one feeding the next, starting from the initial value 0
      Uniform data dependences between instances of a repetition

  19. Uniform dependences
      Integrate: inter-repetition dependence
      Figure: the "+" task repeated over an infinite repetition space (∞), with input and output arrays of shape (∞), ports p_in and p_out, and a default link.
      1. Data dependence: p_out → p_in
      2. Dependence vector inside the repetition space: $d = (1)$
      3. Initial values: default link for dependences that exit the repetition space
      Reference: Calin Glitia, Philippe Dumont, and Pierre Boulet. Array-OL with delays, a domain specific specification language for multidimensional intensive signal processing. Multidimensional Systems and Signal Processing, 2009.
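
The Integrate construction can be sketched as follows. This is a minimal illustration under stated assumptions, not the Array-OL semantics in full: the function name `integrate` is hypothetical, the repetition space is truncated to a finite list, and the default link is modelled as a single constant.

```python
def integrate(inputs, default=0, d=1):
    """Running sum in which repetition r reads the output of repetition r - d."""
    outputs = []
    for r, x in enumerate(inputs):
        src = r - d
        # A dependence that exits the repetition space takes its value
        # from the default link instead of a previous repetition.
        p_in = outputs[src] if src >= 0 else default
        outputs.append(p_in + x)
    return outputs


print(integrate([1, 2, 3, 4]))
print(integrate([1, 2, 3, 4], d=2))
```

With d = 1 every instance depends on its immediate predecessor, so the instances form one sequential chain. With d = 2 the even and odd instances form two independent chains.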

  24. Initial values
      Initial values: default link, shown for the dependence vector $d = (2, 0)$
      Same initial value for all the dependences that exit the repetition space
      Different values: default link connected through a tiler (the figure gives its F, o and P)
      Different default links: exclusive tilers (the figure gives F, o and P for each tiler)

  27. Complex dependences
      Dependence constructions:
        Multiple default links
        Multiple dependences on a repetition space
        Dependences connected through the hierarchy
        Dependences on the complete repetition space

  31. Impact on the parallelism
      The repetition space is split into parallel hyperplanes
      Pipeline execution following the distance vector
      Scheduling uniform loops: Alain Darte and Yves Robert. Constructive methods for scheduling uniform loop nests. IEEE Trans. Parallel Distributed Systems, 5(8):814–822, 1994.
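
The split into hyperplanes can be illustrated with a linear schedule. The sketch below is an assumption-laden illustration rather than the scheduling method of the cited paper: the function name `wavefronts` is hypothetical, and the schedule vector `tau` is supplied by hand instead of being computed.

```python
from collections import defaultdict
from itertools import product


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def wavefronts(shape, tau, deps):
    """Group repetition indices into hyperplanes tau . r = t, in execution order."""
    # Repetition r reads from r - d, so the schedule is valid only if
    # every dependence vector advances time by at least one step.
    for d in deps:
        if dot(tau, d) < 1:
            raise ValueError(f"tau={tau} does not respect dependence {d}")
    fronts = defaultdict(list)
    for r in product(*(range(n) for n in shape)):
        fronts[dot(tau, r)].append(r)
    return [fronts[t] for t in sorted(fronts)]


for step, front in enumerate(wavefronts((3, 3), (1, 1), [(1, 0), (0, 1)])):
    print(step, front)
```

All the repetitions inside one hyperplane are independent and can run in parallel, while successive hyperplanes execute one after the other, which gives the pipeline behaviour.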

  33. Outline
      Modeling and Analysis of Real-Time Embedded Systems
      Array Oriented Language
        Application
        Architecture
        Inter-repetition dependence
      Mapping
      Code Generation

  34. Modeling and Analysis of Real-Time Embedded Systems (MARTE)
      UML profile, OMG standard
      Model Driven Engineering
      Co-design: application, architecture, mapping
      Repetitive Structure Modeling
      All the Array-OL concepts are included
      Proposed by the DaRT team

  36. Model repeated inter-connected architecture topologies
      Physical connections between architecture components
      Compact expression
      Cyclic uniform inter-connections
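
A repeated topology with uniform inter-connections can be enumerated from a shape and a dependence vector. This sketch assumes that "cyclic" means the indices wrap modulo the shape of the repetition space; the function name `uniform_links` and its `cyclic` flag are hypothetical.

```python
from itertools import product


def uniform_links(shape, d, cyclic=True):
    """List the (source, destination) pairs of a uniform inter-connection."""
    links = []
    for src in product(*(range(n) for n in shape)):
        dst = tuple(s + k for s, k in zip(src, d))
        if cyclic:
            # Wrap around the borders: a ring in 1D, a torus in 2D.
            dst = tuple(c % n for c, n in zip(dst, shape))
        elif not all(0 <= c < n for c, n in zip(dst, shape)):
            # Without wrapping, a link that leaves the grid does not exist.
            continue
        links.append((src, dst))
    return links


print(uniform_links((4,), (1,)))                # ring of 4 components
print(uniform_links((4,), (1,), cyclic=False))  # open chain of 4 components
```

A single vector therefore describes every connection of the grid, which is what makes the expression compact.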

  40. Summary: inter-repetition dependences
      1. Expression of state constructions
      2. Complex dependences through the hierarchy
      3. Parallelism: pipeline
      4. Repeated inter-connected architectures

  42. Outline
      Section: From a high-level specification to the execution
      Modeling and Analysis of Real-Time Embedded Systems
      Array Oriented Language
        Application
        Architecture
        Inter-repetition dependence
      Mapping
      High-level refactoring: data-parallel transformations, strategies
      Code Generation

  45. Execution
      Logical space and time as mixed dimensions of multidimensional structures
      Specification: expresses the data dependences between all the data elements that transit the system, and a partial execution order between all the executions of the tasks of the system
      Efficient execution
        Optimized code generation
        Projection of the specification into physical space and time
      Adapt a specification to the execution
        High-level refactoring
        Execution that reflects the specification

  49. Projection into space and time
      Multi-dimensional structures (repetition spaces, data structures) ⇓ projection in space and in time, linked by a trade-off
      Take into account the execution constraints
        Data dependences
        Available resources

  53. Projection example
      Model: Horizontal Filter followed by Vertical Filter
        Arrays: (1920, 1080, ∞) → (720, 1080, ∞) → (720, 480, ∞)
        Horizontal Filter: repetition space (240, 1080, ∞), patterns (13) → (3)
        Vertical Filter: repetition space (720, 120, ∞), patterns (14) → (4)
      Maximal parallelism
      Memory size
      Infinite data structures: blocking points
      Pipeline execution order

  55. Projection example, after fusion
      Model: common repetition space (240, 120, ∞) containing Horizontal Filter, repeated (14), and Vertical Filter, repeated (3)
        Arrays: (1920, 1080, ∞) → (720, 480, ∞)
        Macro-patterns: (14, 13) → (3, 14) → (3, 4)
        Elementary patterns: (13) → (3) and (14) → (4)
      Fusion of successive repetitions
        Minimize the arrays: macro-patterns
      Distribution of the common repetition
        Each processor keeps its macro-patterns in memory
      Re-computations
        Occur when intermediate values are consumed by multiple repetitions
        Trade-off: recompute the values, or keep them in memory (increase of memory size)
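
The memory versus re-computation trade-off of the projection example can be checked with a little arithmetic. The shapes below are read from the slides; the sketch assumes that each inner repetition corresponds to one call of the elementary filter, and it drops the infinite time dimension so that all counts are per frame. The variable names are illustrative.

```python
from math import prod

# Repetition spaces read from the projection example (time dimension dropped).
h_rep = (240, 1080)        # Horizontal Filter before fusion
v_rep = (720, 120)         # Vertical Filter before fusion
fused_rep = (240, 120)     # common repetition after fusion
h_inner, v_inner = 14, 3   # inner repetitions of each filter after fusion

# Intermediate storage: the whole array before fusion, one macro-pattern after.
intermediate_before = prod((720, 1080))
intermediate_after = prod((3, 14))

# Number of elementary filter calls per frame.
h_calls_before = prod(h_rep)
h_calls_after = prod(fused_rep) * h_inner
v_calls_before = prod(v_rep)
v_calls_after = prod(fused_rep) * v_inner

print("intermediate elements:", intermediate_before, "->", intermediate_after)
print("horizontal filter calls:", h_calls_before, "->", h_calls_after)
print("vertical filter calls:", v_calls_before, "->", v_calls_after)
print("horizontal re-computation factor:", h_calls_after / h_calls_before)
```

Under these assumptions the intermediate storage shrinks from a full 720 × 1080 array to a 3 × 14 macro-pattern, while the horizontal filter runs 14/9 times as often because neighbouring vertical patterns overlap.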

  57. High-level transformations
      Adapt a specification to execution
        Change the granularity of the repetitions
        Array size reductions
      "High-level" loop transformations
        Repetition = visual representation of a data-parallel loop nest
        Fusion, change paving, tiling, collapse, ...
      Reference: Calin Glitia and Pierre Boulet. High level loop transformations for multidimensional signal processing embedded applications. In International Symposium on Systems, Architectures, Modeling, and Simulation (SAMOS VIII), Samos, Greece, July 2008.

  63. Optimization strategies: memory size reduction
      Figure: graph of the repetitions r1 to r7; in the refactored model the fused repetitions are labeled r12, r14 and r67.
      Goal: maximal reduction of the intermediate arrays
      Fusion of multiple repetitions
        Minimizes only the last intermediate array
        Re-computations!
      Complete fusion?
        Too many re-computations
        Limited array reduction
      Strategy that limits the re-computations, using the results of the complete fusion and of the two-by-two fusions to find where re-computations are introduced and the minimal achievable array reduction
      Results table, columns: repetitions before fusion, repetitions after fusion, re-computations (product), reduction factor of the output arrays
      Table values (row layout not recoverable): 8 × 128 × 96, 10 × 8, 9.29, 1228.8, 119 × 119 × 96, 1, 1, 96, 96 × 80 × 80 × 96, 1, 96, 80 × 80, 96, 1, 1, 1, 128 × 96 × 80, 128 × 96 × 80, 1, 1, 119 × 128 × 96, 1, 12288, 119, 128 × 96 × 128 × 96, 1, 1, 1
      Reference: Calin Glitia, Pierre Boulet, Éric Lenormand, and Michel Barreteau. Repetitive model refactoring strategy for the design space exploration of intensive signal processing applications. Journal of Systems Architecture, Special Issue: Hardware/Software CoDesign.

  66. And the inter-repetition dependences?
      Why? To allow the use of the refactoring tools on models with uniform dependences
      Typically: figure showing a model before and after a transformation (⇒)
      Algorithm
        The global accesses and dependences must remain unchanged
        Automatically compute the new dependences after a transformation
