
Parallel and Hybrid Evolutionary Algorithm in Python, E. Kieffer (UL)



  1. Parallel Computing & Optimization Group, University of Luxembourg. Parallel and Hybrid Evolutionary Algorithm in Python. E. Kieffer. UL HPC Users' session, UL HPC School 2017

  2. Contents
     • Context and motivation
        • Clustering of the Parkinson Disease Map
        • Bi-level Clustering approach
     • Python tools on the UL HPC Platform
        • CPLEX solver
        • SCOOP library
        • DEAP library
     • Experiments & Validation
        • Experiments on the Parkinson Disease Map
        • Comparison with Hierarchical Clustering

  3. CONTEXT & MOTIVATION

  4. Parkinson Disease Map
     • Large (hyper-)graph
     • Extract knowledge
     • First experiments with a standard clustering approach
     • Hierarchical Clustering
     • Several metrics (e.g. GO, NET, EU)
     • Hard to combine

  5. Bi-level Clustering
     • Clustering is often based on a two-phase algorithm:
        • Find cluster representatives
        • Assign data to clusters
     • Generally the same metric is used for both steps
     • Consider these two steps as two nested optimization problems with different metrics
     • Metrics:
        • Euclidean distance
        • Network distance
        • Distance based on Gene/Disease Ontology
     • Use an Evolutionary Algorithm (EA) to solve the Bi-level Clustering problem
     • Use a multi-objective EA (MOEA) to detect the number of clusters
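The two nested steps on this slide can be illustrated with a small sketch. It is not the author's implementation: the function names, the toy points, and the use of the Manhattan distance as a stand-in for a network or ontology distance are all assumptions made for illustration. The lower level assigns data to representatives with one metric, and the upper level scores the representatives with a different one.

```python
import math

def assign_to_clusters(points, representatives, lower_metric):
    """Lower level: assign each point to its closest representative."""
    return [
        min(range(len(representatives)),
            key=lambda c: lower_metric(p, representatives[c]))
        for p in points
    ]

def upper_cost(points, representatives, labels, upper_metric):
    """Upper level: score the representatives with a different metric."""
    return sum(upper_metric(p, representatives[l])
               for p, l in zip(points, labels))

def euclidean(a, b):
    return math.dist(a, b)

def manhattan(a, b):
    # Hypothetical stand-in for a second metric (e.g. network or ontology distance).
    return sum(abs(x - y) for x, y in zip(a, b))

# Toy data, invented for this sketch.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
representatives = [(0.0, 0.0), (5.0, 5.0)]

labels = assign_to_clusters(points, representatives, euclidean)
cost = upper_cost(points, representatives, labels, manhattan)
print(labels, cost)
```

In a bi-level setting, an outer search (for example an EA) would propose `representatives`, and each proposal would be evaluated by first solving the assignment step and then computing the upper-level cost.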

  6. Bi-level Optimization
     • Bi-level ⇔ nested problems
     • A problem constraining another one → NP-hard even for convex levels
     • (Diagram labels: Upper-level, Lower-level)
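For reference, a generic bi-level program is commonly written as below. The symbols $F$, $G$ (upper-level objective and constraints), $f$, $g$ (lower-level objective and constraints), $x$ and $y$ are the usual textbook notation, assumed here and not taken from the slides.

```latex
\begin{aligned}
\min_{x \in X} \quad & F(x, y^{*}) \\
\text{s.t.} \quad & G(x, y^{*}) \le 0, \\
& y^{*} \in \arg\min_{y \in Y} \left\{ f(x, y) \;:\; g(x, y) \le 0 \right\}
\end{aligned}
```

The upper-level decision $x$ is feasible only if $y^{*}$ is optimal for the lower-level problem parameterized by $x$, which is the sense in which one problem constrains the other.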

  7. Bi-level Clustering

  8. Parallel and hybrid EA HPC

  9. PYTHON TOOLS ON THE UL HPC PLATFORM

  10. Using CPLEX on the UL HPC
      • IBM ILOG CPLEX Optimizer's mathematical programming technology
      • One of the most efficient solvers on the market
      • CPLEX is available to HPC users with an IBM Academic Initiative membership
      • First register to the IBM Academic Initiative: https://developer.ibm.com/academic/
      • Forward the membership confirmation mail to the HPC admins
      • To use CPLEX on the cluster:

          $ module use $PROJECTWORK/cplex/soft/modules
          $ module load CPLEX

  11. Parallel Evaluations with SCOOP
      • Scalable COncurrent Operations in Python
        • is a distributed task module
        • for concurrent parallel programming
        • on various environments, from heterogeneous grids to supercomputers
      • Command to execute a Python script using SCOOP:

          python -m scoop --hostfile $OAR_NODEFILE -n 16 --ssh-executable "oarsh" hello.py

      • Parameters:
        • --hostfile: path to the file containing all hostnames
        • --ssh-executable: the command used to access nodes (here oarsh)
        • -n: the number of workers
      • hello.py:

          from __future__ import print_function
          from scoop import futures
          import socket

          def helloWorld(value):
              # Each worker reports the host it runs on.
              return "Hello World from {0}".format(socket.gethostname())

          if __name__ == "__main__":
              # Distribute 16 calls over the SCOOP workers and collect the results.
              returnValues = list(futures.map(helloWorld, range(16)))
              print("\n".join(returnValues))

  12. Example

  13. DEAP library for Evolutionary Computation in Python
      • https://github.com/DEAP/deap
      • Rapid prototyping and testing of ideas
      • Parallelization mechanism based on SCOOP
      • CMA-ES algorithm
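The idea behind a map-based parallelization mechanism is that fitness evaluation is the only step that goes through a replaceable `map` function. The sketch below illustrates that idea with the standard library only: it does not use DEAP or SCOOP, the algorithm is a simple elitist (mu + lambda) strategy with Gaussian mutation and not CMA-ES, and the `sphere` fitness, the parameter values, and the function names are all assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sphere(individual):
    """Toy fitness to minimise: sum of squares."""
    return sum(x * x for x in individual)

def evolve(map_fn, generations=30, pop_size=20, dim=3, seed=0):
    """Elitist (mu + lambda) loop; map_fn decides how evaluations are run."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Gaussian mutation produces one offspring per parent.
        offspring = [[x + rng.gauss(0, 0.3) for x in ind] for ind in population]
        candidates = population + offspring
        # The only parallelizable step: evaluations go through map_fn.
        fitnesses = list(map_fn(sphere, candidates))
        ranked = sorted(zip(fitnesses, candidates), key=lambda fc: fc[0])
        population = [ind for _, ind in ranked[:pop_size]]
    return population[0], sphere(population[0])

# Serial run with the built-in map.
serial_best = evolve(map)

# Same algorithm, evaluations dispatched through an executor's map.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_best = evolve(pool.map)

print(serial_best == parallel_best)
```

Because every random draw happens in the main loop and both `map` variants preserve input order, the serial and pooled runs follow the same trajectory. A thread pool is used here only to show the interface; for CPU-bound fitness functions a process-based or distributed `map` (such as the one SCOOP provides) is the kind of replacement this pattern is meant for.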

  14. EXPERIMENTS & VALIDATION

  15. Clustering results

  16. Bi-level Clustering. Enrichment analysis: hypergeometric test

      $$P(X = k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$$

      • N genes altogether (background)
      • m genes in a GO term
      • n genes in a cluster
      • k genes in a cluster and in a GO term

      A cluster represents a sample of n genes from a total population of N genes. It is known that the considered GO term contains m genes. What is the probability that exactly k genes of our cluster also belong to the considered GO term?

      Adapted from: Florian Markowetz, Network Biology, Lent 2010
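The hypergeometric probability can be computed directly from binomial coefficients. The sketch below is a minimal illustration using the slide's symbols N, m, n and k. The upper-tail sum P(X >= k) is the usual form of an over-representation p-value and is an assumption about how the test is applied; the numeric values are hypothetical.

```python
from math import comb

def hypergeom_pmf(k, N, m, n):
    """P(X = k): exactly k of the n cluster genes are among the m GO-term genes."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

def enrichment_p_value(k, N, m, n):
    """Upper-tail p-value P(X >= k) for over-representation."""
    return sum(hypergeom_pmf(i, N, m, n) for i in range(k, min(m, n) + 1))

# Hypothetical numbers, chosen only to exercise the functions.
N, m, n, k = 1000, 50, 20, 5
p_exact = hypergeom_pmf(k, N, m, n)
p_tail = enrichment_p_value(k, N, m, n)
print(p_exact, p_tail)
```

A term would then be reported as enriched in a cluster when the p-value falls below a chosen cutoff.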

  17. Bi-level Clustering. [Figure: Enrichment of Disease Ontology terms, p-value cutoff 0.001. x-axis: clusters (2 to 90); y-axis: unique_terms (100 to 350). Legend (distance): 01_net_go_ward, 02_eu_go_ward, 03_eu_net_ward, 04_clusteringNETEU, 05_clusteringEUNET, 06_clusteringGOEU, 07_clusteringEUGO, 08_clusteringGONET, 09_clusteringNETGO, 10_expert]

  18. Conclusions
      • Knowledge extraction on the Parkinson Disease Map
      • Bi-level clustering model
      • Solve the model with a Hybrid and Parallel EA
      • Experiments required a lot of resources → UL HPC Platform
      • Hybrid → CPLEX solver
      • Parallel → SCOOP library for parallel evaluations
      • Evolutionary Computation → DEAP library

  19. Questions? Thank you for your attention. PS9 (13h30 – 15h30): Advanced Prototyping with Python, presented by Clement Parisot
