AVALON
Algorithms and Software Architectures for Distributed & High Performance Computing Platforms
Christian Perez LIP, ENS Lyon
2014, September 18
Agenda
- Team Members
- Avalon Research Activities
- Overview of Some Research Activities
Team Members
- Faculty Members (8): 4 INRIA, 1 CNRS, 2 UCBL, 1 ENSL
- PhD students (7), incl. Gaston Berger (Sénégal)
- Engineers (3+4+1)
- Postdoc / Temporary Researcher
- Temporary Teacher-Researcher
- Assistant
Objectives: map CPU/data-intensive scientific applications onto computing platforms (IaaS, PaaS, spot instances), taking reliability, QoS, etc. into account.
Applications target a range of computing platforms, each with a defining challenge:
- Supercomputers (Exascale): large scale
- Desktop Grids: volatility
- Clouds (IaaS, PaaS): on demand
- Grids (EGI): heterogeneity
"Estimate Energy Consumption of Fault Tolerance Protocols during HPC Executions", CCGrid 2013, the 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Delft, the Netherlands, May 13-16, 2013.
Avalon Team Presentation @ INRIA Seminar, 19/09/2014
PaaSage architecture:
- Design time: legacy applications are described with CAMEL; new applications are built in the PaaSage Integrated Development Environment; a speculative profiler and a reasoner drive extra-functional adaptation.
- Execution: platform-specific mapping, execution monitoring, and execution control.
- Metadata: community expertise feeds metadata sharing and metadata collection between design time and execution.
DIET (http://graal.ens-lyon.fr/DIET) is organized as a hierarchy built over CORBA:
- Client layer: the DIET client, talking to a server front end
- Scheduling layer: DIET agents, i.e. a Master Agent (MA) and Local Agents (LA)
- Service layer: DIET servers, i.e. Server Daemons (SeD)
Projects: ACI MD GDS, ANR LEGO, ANR GWENDIA, Grid'5000
Avalon Team, Inria, LIP ENS Lyon
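The hierarchical scheduling idea behind DIET can be sketched as follows. This is an illustrative stand-in, not the real DIET API: a client submits a service request to the Master Agent, the request travels down through Local Agents to the Server Daemons, each SeD that offers the service returns a cost estimate, and the best-ranked server solves the request. All class and service names here are hypothetical.

```python
# Illustrative sketch (not the real DIET/CORBA API) of hierarchical
# scheduling: MA -> LA -> SeD request forwarding with cost estimates.

class ServerDaemon:
    """A SeD offers services and estimates its cost to run them."""
    def __init__(self, name, services):
        self.name = name
        self.services = services          # service name -> cost estimate

    def estimate(self, service):
        return self.services.get(service)  # None if the service is absent

    def solve(self, service, data):
        return f"{service}({data}) solved on {self.name}"

class LocalAgent:
    """A LA forwards requests to the agents or SeDs below it."""
    def __init__(self, children):
        self.children = children

    def find(self, service):
        candidates = []
        for child in self.children:
            if isinstance(child, ServerDaemon):
                cost = child.estimate(service)
                if cost is not None:
                    candidates.append((cost, child))
            else:                          # a nested agent: recurse
                candidates.extend(child.find(service))
        return candidates

class MasterAgent(LocalAgent):
    """The MA is the entry point: it picks the cheapest server."""
    def submit(self, service, data):
        candidates = self.find(service)
        if not candidates:
            raise LookupError(f"no server offers {service}")
        _, best = min(candidates, key=lambda c: c[0])
        return best.solve(service, data)   # client then calls the server

# Two LAs, three SeDs, one MA on top (hypothetical services).
sed1 = ServerDaemon("sed1", {"dgemm": 5.0})
sed2 = ServerDaemon("sed2", {"dgemm": 2.0, "fft": 1.0})
sed3 = ServerDaemon("sed3", {"fft": 3.0})
ma = MasterAgent([LocalAgent([sed1, sed2]), LocalAgent([sed3])])
print(ma.submit("dgemm", "A,B"))  # picks sed2 (lowest estimate)
```

In the real middleware the layers communicate over CORBA and the client contacts the selected SeD directly; the sketch only shows the ranking logic.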
Derived from a Verizon study stating that 80% of problems come from configuration problems.
Figure: component assembly of the WordCount application. A Master component drives Splitter (FileSelect, FileSliceSelect), Mapper (Map<ISSI> WordCountMap), and Reducer (Reduce<SIPSI> WordCountReduce) components, together with WordReader, Merging Buffer, Runner, Demux, and Writer components, wired through typed Push/Pull data ports (Push<PIS>, Push<PSI>, Push<PSLI>, Pull<PSLI>) and Go control ports between User and Provider sides.
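The dataflow that this component assembly implements is the classic WordCount pipeline: a splitter cuts the input, mappers emit (word, 1) pairs, a shuffle step groups pairs by key, and reducers sum the counts. A minimal plain-Python sketch of that logic (functions stand in for the Push/Pull component ports; the names loosely follow the components on the slide):

```python
# Minimal WordCount dataflow: split -> map -> shuffle -> reduce.
from collections import defaultdict

def split(text, n):
    """FileSliceSelect stand-in: cut the input into n roughly equal slices."""
    words = text.split()
    k = max(1, len(words) // n)
    return [words[i:i + k] for i in range(0, len(words), k)]

def word_count_map(slice_):
    """WordCountMap stand-in: emit one (word, 1) pair per word."""
    return [(w, 1) for w in slice_]

def shuffle(mapped_slices):
    """Merging-buffer stand-in: group intermediate pairs by key."""
    groups = defaultdict(list)
    for pairs in mapped_slices:
        for word, count in pairs:
            groups[word].append(count)
    return groups

def word_count_reduce(groups):
    """WordCountReduce stand-in: sum the counts of each word."""
    return {word: sum(counts) for word, counts in groups.items()}

text = "to be or not to be"
result = word_count_reduce(shuffle(map(word_count_map, split(text, 2))))
print(result)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

In the component version each of these steps is a separate component instance, so mappers and reducers can be replicated and rewired without touching the others.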
Figure: the same WordCount assembly with the file-based input replaced by BlobSeer-backed components (ImporterBS, BSInputSplitter, BSImport, BSSliceSelect, WordReaderBS), configured through the bs_input_block_size, bs_cfg, bs_id, bs_page_size, and bs_replica_count parameters; the Mapper (Map<ISSI> WordCountMap) and Reducer (Reduce<SIPSI> WordCountReduce) components and their Push/Pull wiring are unchanged.
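The point of this substitution is that the pipeline only depends on the slice-provider port, so the data-access component can be swapped without touching the mappers and reducers. A hedged sketch of that idea, where the class names follow the slide labels but the BlobSeer access is simulated with a plain dictionary (not the real BlobSeer client API):

```python
# Two interchangeable slice providers behind the same "pull" port.

class FileSliceSelect:
    """Reads fixed-size slices from in-memory file contents (stand-in
    for the file-based splitter)."""
    def __init__(self, contents, block_size):
        self.data = contents
        self.block_size = block_size

    def pull(self):
        for i in range(0, len(self.data), self.block_size):
            yield self.data[i:i + self.block_size]

class BSSliceSelect:
    """Reads slices from a simulated BlobSeer blob, configured by the
    bs_* parameters shown on the slide (the blob store is a dict here,
    not the real BlobSeer client)."""
    def __init__(self, blob_store, bs_id, bs_page_size, bs_replica_count=1):
        self.blob_store = blob_store
        self.bs_id = bs_id
        self.bs_page_size = bs_page_size
        self.bs_replica_count = bs_replica_count

    def pull(self):
        data = self.blob_store[self.bs_id]   # placeholder blob read
        for i in range(0, len(data), self.bs_page_size):
            yield data[i:i + self.bs_page_size]

def count_words(slice_provider):
    """The rest of the pipeline: identical whichever provider is wired in."""
    counts = {}
    for chunk in slice_provider.pull():
        for word in chunk.split():
            counts[word] = counts.get(word, 0) + 1
    return counts

# Same pipeline, two interchangeable data-access components.
file_src = FileSliceSelect("a b a ", block_size=6)
bs_src = BSSliceSelect({"blob0": "a b a "}, "blob0", bs_page_size=6)
assert count_words(file_src) == count_words(bs_src) == {"a": 2, "b": 1}
```

A real implementation would also have to handle words split across slice boundaries; the block sizes here are chosen so each slice ends on a word boundary.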
Throughput of the WordCount application on Grid'5000 (512 nodes), up to 2 TB of input
Execution time reduced by up to 47%!
Time of the map and shuffle phases w.r.t. the number of mappers and reducers