PRESENTATION OUTLINE: Adaptation of a parallel Random Jungle (RJ) algorithm empowered by Coarse Grain Parallel computing for Cloud environment for Genome-Wide Association Studies (GWAS).

Maria Pospelova
School of Computer Science
Carleton University
Ottawa, Canada K1S 5B6
maria.pospelova@carleton.ca
March 17, 2014

1 GWAS

  • What is GWAS
  • Why GWAS is important: applications
  • Personalized medicine
  • How GWAS is conducted

2 SNP

  • What are SNPs
  • How are they collected
  • What is the meaning of SNP correlations
  • fourth thing

3 Challenges of GWAS

  • Data set size
  • Dependence between variables (epistasis): amplification and masking effects
  • “Genetic hitchhiking”
  • Rare variations
  • Sample size
  • Missing variables: incomplete data sets


4 Current GWAS approaches

  • Deterministic
  • Nondeterministic

5 Deterministic GWAS

  • Example
  • Pros/cons

6 Nondeterministic GWAS

  • Example
  • Pros/cons

7 Random Forest as an example of a nondeterministic approach

  • Decision Tree methods overview
  • Ensemble learning approach
  • Random Forest: historical background

8 Random Forest

  • Main principles
  • Algorithm
  • Pros: strong points, popularity, and practical applications
  • Cons: challenges, especially when facing GWAS problems
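
A minimal sketch of the Random Forest idea listed above (bootstrap sampling, a random feature subset per tree, majority vote), written in plain Python for illustration. It assumes one-split trees (stumps) and SNP genotypes coded as 0/1/2; the names fit_stump, fit_forest and predict are hypothetical and are not taken from Random Jungle or any library.

```python
import random
from collections import Counter


def fit_stump(rows, labels, n_candidate_features, rng):
    """Fit a one-split tree using a random subset of the features."""
    features = rng.sample(range(len(rows[0])), n_candidate_features)
    best = None
    for f in features:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue  # not a real split
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No usable split in this sample: always predict the majority class.
        majority = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, n_candidate_features=1, seed=0):
    """Grow n_trees stumps, each on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx],
                                n_candidate_features, rng))
    return forest


def predict(forest, row):
    """Classify one row by majority vote over all trees."""
    votes = Counter(left if row[f] <= t else right
                    for f, t, left, right in forest)
    return votes.most_common(1)[0][0]
```

Each tree depends only on its own bootstrap sample, so trees can be grown independently of one another, which is what makes the method a natural candidate for parallel execution.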

9 Random Jungle

  • RJ as extension of RF tailored for GWAS
  • Its main features and strengths
  • Note: the MPI implementation passes multiple single messages around

10 Coarse Grained Model

  • Refresh on the concept


11 Project Concept

  • Suggest modifying RJ to use collective communications
  • Illustrate complexity of the source code
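
The difference between many single messages and one collective step can be illustrated with a small simulation in plain Python, with no MPI involved. It assumes, purely for illustration, that each worker holds one variable-importance vector per tree and that the root needs their element-wise sum; the function names are hypothetical and the message pattern is not taken from the RJ source code.

```python
def merge_point_to_point(per_worker_tree_scores):
    """Each worker sends every tree's score vector to the root separately."""
    total = None
    messages = 0
    for worker_trees in per_worker_tree_scores:
        for scores in worker_trees:
            messages += 1  # one message per tree
            if total is None:
                total = list(scores)
            else:
                total = [a + b for a, b in zip(total, scores)]
    return total, messages


def merge_coarse_grained(per_worker_tree_scores):
    """Each worker sums its trees locally, then contributes once to a reduction."""
    # Local computation phase: no communication.
    partials = [[sum(col) for col in zip(*worker_trees)]
                for worker_trees in per_worker_tree_scores]
    # Single communication phase: one contribution per worker.
    total = [sum(col) for col in zip(*partials)]
    return total, len(partials)
```

Both functions return the same totals; only the number of contributions differs (one per tree versus one per worker). In an MPI program the final step would correspond to a single reduction such as MPI_Reduce.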

12 Cluster run

  • About the cluster used
  • Results
  • Analysis

13 Cloud

  • What is Cloud
  • Currently available clouds
  • Programming concepts: MapReduce (Hadoop) and MPI

14 StarCluster

  • History/origin
  • Quick review of the tool
  • How to create an MPI cluster with the tool

15 Hadoop

  • Alternative RF implementations
  • Example: RF on Hadoop

16 Cloud run

  • About the cloud used
  • Results
  • Analysis

17 Conclusion

  • Discuss results
  • Suggest further steps for research and improvement
