
Dynamic Compilation for Reducing Energy Consumption of I/O-Intensive Applications



  1. Dynamic Compilation for Reducing Energy Consumption of I/O-Intensive Applications
     Seung Woo Son, Guangyu Chen, Mahmut Kandemir: Dept. of CSE, Pennsylvania State University, {sson,gchen,kandemir}@cse.psu.edu
     Alok Choudhary: Dept. of ECE, Northwestern University, choudhar@ece.northwestern.edu
     The 18th International Workshop on Languages and Compilers for Parallel Computing (LCPC 05), October 20~22, 2005

  2. Outline
     Motivation
     Dynamic Compilation
     Our Dynamic Compilation Framework
       – Dynamic compiler/linker
       – Metadata manager
       – Layout manager
       – High-level I/O library
     Experimental Results
     Conclusion
     10/22/2005, LCPC 2005

  3. Motivation
     Tera-scale high-performance computing has enabled scientists to tackle very large and computationally challenging problems
       – Data-intensive, I/O-intensive, and energy consuming
     To cope with larger problems and data sizes, models and applications need to be dynamic in nature

  4. I/O is bottleneck
     *Source: Terascale Data Management, LLNL.

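  5. Energy Consumption?
     [Pie chart of energy consumption by component. Labels in extraction order: Others, Processor, Cooling, Memory, I/O. Values in extraction order: 10%, 17%, 40%, 7%, 26%. The pairing of labels to values is not recoverable from the text.]
     *Source: Mike Rosenfield, ACEED, February 2003.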

  6. Related Work
     Academic/industry-based dynamic compilers
       – Dynamo, DAISY, PIN, DyC, …
       – All efforts focused on enhancing performance, i.e., their goal is to reduce execution cycles
     Recently, a dynamic voltage/frequency scaling technique was proposed using dynamic compilation
       -> focused on reducing the processor's energy consumption [MICRO-38]

  7. Our Goal
     To capture high-level dynamic behaviors in I/O-intensive applications using dynamic compilers
     Propose a dynamic compilation framework for I/O-intensive applications
       – Dynamic compiler/linker, metadata manager, high-level I/O library, and layout manager

  8. Why Dynamic Compilation?
     Dynamic compilation exploits run-time state to generate code that is specific to run-time behavior
     Large-scale scientific applications exhibit changes in data access patterns
       – Simulation runs, post-processing, and analysis
       – Large quantities of data are generated and frequent data layout changes occur

  9. Application Codes

     Application Name   Description                     Data Size   Number of Phase Changes   Energy consumed (J)
     AST                Astrophysics                    153.3GB     38                        57,322
     FFT                Fast Fourier Transform          96.6GB      19                        39,451
     Cholesky           Sparse Cholesky Factorization   87.4GB      27                        36,076
     Visuo              3D Visualization                95.5GB      31                        42,905
     SCF 3.0            Quantum Chemistry               106.1GB     11                        49,518
     RSense 2.0         Remote Sensing Database         104.0GB     46                        51,114

  10. Framework overview
      [Block diagram. Components: Application; Dynamic Compiler/Linker; Mini Database (Metadata Manager); Layout Manager; HLL; Parallel, Hierarchical Storage System]

  11. Dynamic Compiler/linker
      [Block diagram. Components: Steering Unit; Dynamic Compiler; Dynamic Linker; Performance Tracer. Edge labels: Compilation Request; Linking Request; Performance Statistics; Data Access Pattern; Suggestions to Layout Manager]

  12. Optimization Rules

      Opt rule   Optimization
      CIO        Collective I/O
      MCIO       Multi-collective I/O
      SP         Sequential Prefetching
      STD        Strided Prefetching
      POL        Replacement Policy Selection
      SSU        Setting Striping Unit
      DM         Data Migration
      DP         Data Purging
      PRE        Prestaging
      SUB        Subfiling

  13. Optimization Rules
      Collective I/O (CIO)
        – Invoked if the access pattern of the data is different from its storage pattern, and multiple processors are used to access the data
      Subfiling (SUB)
        – Invoked if a small subregion of a file is accessed with high temporal locality
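
The two trigger conditions on this slide can be read as simple predicates over traced access statistics. The sketch below is only an illustration of that reading: the `AccessProfile` fields, the `MATCHING_LAYOUT` mapping, and the `small_region` and `high_reuse` thresholds are all hypothetical and do not come from the slides, which give no concrete thresholds or data structures.

```python
from dataclasses import dataclass


@dataclass
class AccessProfile:
    """Hypothetical per-file summary that a performance tracer might report."""
    access_pattern: str     # e.g. "row-wise" or "column-wise"
    storage_pattern: str    # e.g. "row-major" or "column-major"
    num_processors: int     # processors accessing the file
    region_fraction: float  # fraction of the file that is touched (0..1)
    reuse_count: int        # how many times that region was re-accessed


# Illustrative mapping from an access pattern to the layout that matches it.
MATCHING_LAYOUT = {"row-wise": "row-major", "column-wise": "column-major"}


def select_rules(p, small_region=0.1, high_reuse=8):
    """Return the rule names (CIO, SUB) whose conditions hold for profile p.

    The thresholds are assumptions chosen for the example only.
    """
    rules = []
    # CIO: access pattern differs from storage pattern AND several processors.
    mismatch = MATCHING_LAYOUT.get(p.access_pattern) != p.storage_pattern
    if mismatch and p.num_processors > 1:
        rules.append("CIO")
    # SUB: a small subregion is accessed with high temporal locality.
    if p.region_fraction <= small_region and p.reuse_count >= high_reuse:
        rules.append("SUB")
    return rules
```

For instance, a column-wise access by 16 processors to a row-major file would select CIO under these assumptions, while a row-wise access that keeps revisiting 5% of a row-major file would select SUB.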

  14. Example: CIO
      [Diagram with two cases]
        – Column-wise access pattern, column-major storage layout: Parallel and Independent I/O
        – Column-wise access pattern, row-major storage layout: Collective I/O
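
A rough way to see why the second case benefits from collective I/O is to count contiguous file requests. The sketch below assumes an n x n array of unit-size elements split into equal column blocks, one per process, and models collective I/O in a two-phase style (each process reads one contiguous chunk, then data is exchanged over the network). The slides do not say how CIO is implemented, so the two-phase model and all function names here are assumptions.

```python
def contiguous_runs(offsets):
    """Count maximal runs of consecutive offsets; one run models one request."""
    runs = 0
    prev = None
    for off in sorted(offsets):
        if prev is None or off != prev + 1:
            runs += 1
        prev = off
    return runs


def column_block_offsets(n, cols, layout):
    """File offsets (in elements) of the given columns of an n x n array."""
    if layout == "row-major":
        return [r * n + c for c in cols for r in range(n)]
    if layout == "column-major":
        return [c * n + r for c in cols for r in range(n)]
    raise ValueError(f"unknown layout: {layout}")


def independent_requests(n, nprocs, layout):
    """Total requests when each process reads its own column block directly."""
    width = n // nprocs
    return sum(
        contiguous_runs(
            column_block_offsets(n, range(p * width, (p + 1) * width), layout)
        )
        for p in range(nprocs)
    )


def collective_requests(nprocs):
    """Two-phase model: one contiguous read per process.

    The redistribution step over the network is not modelled here.
    """
    return nprocs


if __name__ == "__main__":
    n, nprocs = 8, 4
    print(independent_requests(n, nprocs, "column-major"))
    print(independent_requests(n, nprocs, "row-major"))
    print(collective_requests(nprocs))
```

With a column-major layout each column block is already contiguous, so independent I/O needs only one request per process. With a row-major layout every row contributes a separate small request per process, which is the situation the two-phase model collapses back to one request per process.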

  15. Experiment – application codes

      Application Name   Description                     Data Size   Number of Phase Changes   Energy consumed (J)
      AST                Astrophysics                    153.3GB     38                        57,322
      FFT                Fast Fourier Transform          96.6GB      19                        39,451
      Cholesky           Sparse Cholesky Factorization   87.4GB      27                        36,076
      Visuo              3D Visualization                95.5GB      31                        42,905
      SCF 3.0            Quantum Chemistry               106.1GB     11                        49,518
      RSense 2.0         Remote Sensing Database         104.0GB     46                        51,114

  16. Simulation parameters
      Parallel processors: total 16
        – 1.8 GHz with 2MB 4-way set-associative cache, 1GB main memory
        – Energy consumption measured using Wattch [ISCA'00]
      Parallel disks
        – 8*18GB disks with low-power mode (spin-down)
        – TPM disk power model [ISCA'03]
      Interconnect
        – 2D mesh
        – Infiniband switch/link power model [ISLPED'03]

  17. Architecture Considered
      [Block diagram. Components: four P/M nodes; Interprocessor Communication Network; I/O Network; Tape Subsystem; Disk Subsystem]

  18. Normalized Energy Consumption
      [Bar chart for AST. Y-axis: Normalized Energy (%), 0 to 110. X-axis: optimization rules and rule combinations (labels not recoverable). Annotation, <AVERAGE>: Hand-Optimized: 19.8%, Our Approach: 16.1%]

  19. Breakdown of Dynamic Compilation Energy
      [Chart. Legend: Dynamic Compiler; Dynamic Linker; Performance Tracer; Steering Unit]
