1 4/23/2015
From construction to deployment of LifeWatchGreece: The potential role of EGI - LW Competence Centre
by Emmanouela Panteri
Contributors: Christos Arvanitidis, Nicolas Bailly, Sarah Faulwetter, Jacques Lagnel, George Perantinos, Anastasis
LifeWatchGreece e-Infrastructure
LWG e-infrastructure:
- Multi-server e-infrastructure currently deployed at HCMR, Crete
- Hosts biodiversity data and applications
Applications:
- e-services: searching datasets/data or one-shot analyses
- vLabs: interfaces for advanced selection of datasets/data, and more elaborate suites of analyses
A series of web tools (vLabs or e-services) for the public
Application development in 2 steps:
- Independent development of a web application (by the team)
- Integration into the infrastructure / portal
Access Control
- Landing page: list of applications
- One-time sign-up for accessing all apps
- A few applications require more credentials: the compute-intensive ones
- User rights management
Graphical Interface
- A common graphical interface frame/wrapper introducing all applications
Accessed through the LifeWatchGreece Portal: portal.lifewatchgreece.eu
LWG e-Infrastructure: advantages
- Applications can be developed in any programming language (PHP, Java EE, .NET, ...)
- Design, development and maintenance of applications are independent from each other: a common standard is needed only for data exchange (DwC, …)
- Each application runs in an independent execution environment: the number of VMs can be scaled up if more apps are added
- Compartmentalized security: an affected application does not compromise the others
- Core developers are involved only at the integration stage
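The "common standard only for data exchange" point can be illustrated with a minimal sketch, assuming applications exchange occurrence records as CSV whose column names are Darwin Core (DwC) terms. The term names are real DwC terms, but treating them as mandatory, the helper names and the sample record are assumptions made for illustration and are not taken from the LWG code base.

```python
import csv
import io

# Real Darwin Core term names; requiring exactly these five is an
# assumption of this sketch, not an LWG rule.
REQUIRED_TERMS = ("occurrenceID", "scientificName", "eventDate",
                  "decimalLatitude", "decimalLongitude")


def to_dwc_csv(records):
    """Serialize occurrence records (dicts keyed by DwC terms) to CSV text."""
    for rec in records:
        missing = [term for term in REQUIRED_TERMS if term not in rec]
        if missing:
            raise ValueError(f"record lacks DwC terms: {missing}")
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=REQUIRED_TERMS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def from_dwc_csv(text):
    """Parse CSV text written by any other application back into dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


# Hypothetical record: one application writes it, another reads it back.
example = [{
    "occurrenceID": "example:occ:1",
    "scientificName": "Posidonia oceanica",
    "eventDate": "2015-04-23",
    "decimalLatitude": "35.33",
    "decimalLongitude": "25.28",
}]
exchanged = from_dwc_csv(to_dwc_csv(example))
```

Because both sides agree only on the column names, the writer could be a PHP application and the reader a Java one; nothing else about their internals needs to match.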
LWG e-Infrastructure: advantages
- Other integration methods: iframes, graphical integration of commercial apps
- Open source applications can be integrated with few adaptations, at both the access level and the graphical level, especially when built on an MVC architecture
- Moreover, most CMSes can be easily integrated, at least at the access control level
- Certain JavaScript and CSS frameworks are provided by default through libraries to enforce a consistent user interface throughout the portal
LWG Portal diagram
LWG: Application Layer, Data Layer, Cluster, Communication
LWG e-Infrastructure: What is missing
- No user workspace
- Currently, files cannot be retrieved from one session to the next, or from one tool to another.
- Could the EGI Competence Center provide such functionality? Workspace development will significantly increase the storage requirements.
- Would require some work between the LWG infrastructure and EGI-CC (e.g., space allocation after sign-up)
HPC bioinformatics platform
Mainly focused on OMICs NGS data analysis:
- Transcriptomics (RNA-Seq)
- Genomics (eukaryote and bacterial)
- Metagenomics (microbial community)
- Metabarcoding
- RAD-Sequencing
More than 170 bioinformatics packages covering:
- Genome and transcriptome de novo assembly
- Functional and structural gene annotation
- Sequence similarity (parallel BLAST) and mapping
- Population genetics
- Phylogeny reconstruction
- Statistics (250 R packages installed)
- Genetic marker mining/analysis
43 users from 11 institutes in 5 countries (Greece, Italy, France, Norway, Portugal)
More than 8000 jobs submitted during the last month
HPC bioinformatics platform upgrade
Hardware
Current platform:
- 9 worker nodes
- 108 cores
- 784 GB RAM
- 30 TB storage
- 10 Gbps Ethernet network
- Gentoo Linux
- Resource manager: Torque/Maui
- Storage: XFS/NFS
- Storage user quotas
Upgraded platform (~3x performance):
- 13 worker nodes
- 300 CPU cores
- 2.5 TB RAM
- 120 TB storage
- 40 Gbps InfiniBand network
- CentOS Linux/Debian
- Resource manager: SLURM
- Storage: Lustre and ZFS/NFS
- Storage group/user quotas
- LXC virtualization
- User management via LDAP
Software (open source)
- Languages: GCC, ICC/IFC, R, BioPerl, Biopython, Ruby, BioJava, ...
- Parallelization: OpenMPI, OpenMP and pthreads
- Database servers: MySQL, PostgreSQL, ...
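As a rough cross-check of the "~3x" figure, the sketch below computes the per-resource ratios between the two hardware lists. It assumes the first list describes the platform before the upgrade and the second the platform after it, and that 2.5 TB equals 2500 GB. These ratios are plain capacity ratios, not performance measurements.

```python
# Figures copied from the two hardware lists on the slide.
# Assumption: first list = before the upgrade, second list = after.
before = {"worker nodes": 9, "CPU cores": 108, "RAM (GB)": 784,
          "storage (TB)": 30, "network (Gbps)": 10}
after = {"worker nodes": 13, "CPU cores": 300, "RAM (GB)": 2500,  # 2.5 TB taken as 2500 GB
         "storage (TB)": 120, "network (Gbps)": 40}

# Capacity ratio for every resource that appears in both lists.
ratios = {name: after[name] / before[name] for name in before}

for name, ratio in ratios.items():
    print(f"{name}: x{ratio:.1f}")
```

Core count and RAM both grow by a factor close to 3, which is consistent with the slide's overall estimate; storage and network bandwidth grow by a factor of 4.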
Bioinformatic challenges
RNA-Seq data analysis => 360 Mreads
Optimised and parallelised pipeline

Step                                      Runtime on the biocluster (h)   Runtime on a PC (1 CPU)
Assembly (requires ~350 GB shared RAM)    12 (10 threads)                 >120 h
Annotation: BLAST                         96 (94 jobs)                    >>3 months
Annotation: InterPro                      32 (48 jobs)                    1.5 months
Mapping                                   4 (12*10 threads)               >10 days
Total                                     ~6 days                         >5 months

Sequence similarity search: parallel BLAST => 10,000 queries

Nb CPUs                           1                 94
blastn (nt): speedup / runtime    1.0 / 6.1 days    105.4 / 1.4 h
blastx (nr): speedup / runtime    1.0 / 11.6 days   88.8 / 3.1 h
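The speedup figures can be recomputed from the reported runtimes. This is a minimal sketch, using the usual definitions speedup = serial runtime / parallel runtime and efficiency = speedup / number of CPUs. The function and variable names are illustrative. The small differences from the slide's 105.4 and 88.8 presumably come from the runtimes being rounded on the slide.

```python
def speedup(serial_hours, parallel_hours):
    """Ratio of single-CPU runtime to parallel runtime."""
    return serial_hours / parallel_hours


def efficiency(speedup_value, n_cpus):
    """Fraction of ideal linear speedup actually obtained."""
    return speedup_value / n_cpus


N_CPUS = 94
# (runtime on 1 CPU in hours, runtime on 94 CPUs in hours), from the slide.
runs = {"blastn (nt)": (6.1 * 24, 1.4), "blastx (nr)": (11.6 * 24, 3.1)}

results = {}
for name, (t_serial, t_parallel) in runs.items():
    s = speedup(t_serial, t_parallel)
    results[name] = (s, efficiency(s, N_CPUS))
    print(f"{name}: speedup {s:.1f}, efficiency {results[name][1]:.2f}")

# Biocluster runtimes of the RNA-Seq pipeline steps, in hours.
pipeline_hours = {"Assembly": 12, "BLAST": 96, "InterPro": 32, "Mapping": 4}
total_days = sum(pipeline_hours.values()) / 24
```

The four pipeline steps add up to 144 h, which matches the "~6 days" total reported for the biocluster.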
eServices and vLabs: the R-vLab
How can the EGI Competence Center help the LWG e-infrastructure increase its computational power?
- Uses the "R" programming language
- Supports an integrated and optimized online R environment (data manipulation and computational speed-up)
- Helps overcome a severe computational power deficit, e.g. calculating several biodiversity indices and multivariate analyses on large matrices
eServices and vLabs: the R-vLab
~20-fold speed-up of the parallel Mantel test compared to the conventional Mantel test
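A permutation-based Mantel test parallelizes naturally because each batch of permutations is independent of the others. The sketch below illustrates that idea in plain Python. It is not the R-vLab implementation, which runs in R; all function names, the chunking scheme and the toy data are assumptions, and it makes no attempt to reproduce the ~20-fold figure.

```python
import random
from statistics import mean


def upper_triangle(matrix, order):
    """Flatten the upper triangle of a square matrix after reordering objects."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def mantel_chunk(job):
    """Run one independent batch of permutations; count those reaching r_obs."""
    d1, d2, r_obs, n_perm, seed = job
    rng = random.Random(seed)          # own generator, so chunks do not interact
    order = list(range(len(d1)))
    x = upper_triangle(d1, order)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)             # permute rows and columns of d2 together
        if pearson(x, upper_triangle(d2, order)) >= r_obs:
            hits += 1
    return hits


def mantel_test(d1, d2, n_perm=999, n_chunks=4, mapper=map, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    identity = list(range(len(d1)))
    r_obs = pearson(upper_triangle(d1, identity), upper_triangle(d2, identity))
    sizes = [n_perm // n_chunks + (1 if i < n_perm % n_chunks else 0)
             for i in range(n_chunks)]
    jobs = [(d1, d2, r_obs, size, seed + i) for i, size in enumerate(sizes)]
    hits = sum(mapper(mantel_chunk, jobs))
    return r_obs, (hits + 1) / (n_perm + 1)


# Toy data: distances between points on a line, and the same distances doubled.
points = [0.0, 1.0, 2.5, 4.0, 7.0, 11.0]
d1 = [[abs(a - b) for b in points] for a in points]
d2 = [[2 * v for v in row] for row in d1]
r, p = mantel_test(d1, d2, n_perm=199)
```

With the default `mapper=map` the chunks run one after another. Passing the `map` method of a `concurrent.futures.ProcessPoolExecutor` instead would spread them across CPU cores, provided `mantel_chunk` is defined at module level so that it can be pickled.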
eServices and vLabs: MicroCT
- Micro-computed tomography
- Non-destructive method of 3D X-ray microscopy
- Creation of 3D models of objects from a series of X-ray projection images
MicroCT offers:
- Collection of virtual galleries of taxa displayed and disseminated
- Manipulation of the 3D models through a series of online tools
- Download of datasets for local manipulations
How can the EGI Competence Center help the LWG e-infrastructure with storage and image manipulation, including 3D models?
MicroCT: current issues
In general:
- Potentially large increase in the number of image galleries, especially from museum specimen collections (several orders of magnitude)
- Need for 3D metadata standards: dissemination and searching
- Need for 3D data annotation protocols and tools
- Need for tools to search across the dispersed catalogues of galleries (centralized or distributed)
In LWG:
- MicroCT generates many image files: storage issue
- Processing and manipulating images is CPU intensive: computing issue
LWG and EGI Competence Center
Processing power and storage requirements
Harvesting various other repositories such as:
- Taxonomic: CoL and PESI (and components: FADA, EMRS, E+MP), WoRMS, EEA/EUNIS, ...
- Occurrences: GBIF, OBIS, ...
- Species traits: PolyTraits, FishBase, SeaLifeBase, eModNet, ...
- Bibliography: RefBank, BHL, AnimalBase, ...
- Citizen Science: iNaturalist, ...
- Workflows: BioVel, ...
Install mirror websites: FishBase, RefBank, GNI
Develop Web Services for disseminating LWG data:
- Concerns about performance due to Web services use
Linked Data / Linking Open Data
LifeWatchGreece principle: make data available to everybody
A number of datasets are ready as RDF in triplestores
Diagram from http://lod-cloud.net/
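To make "datasets as RDF" concrete, here is a minimal sketch that serializes one record as N-Triples, using the Darwin Core terms namespace for predicates. The subject URI and the property values are hypothetical, and the helper covers only basic literal escaping. The slides do not describe how LWG actually models or publishes its RDF.

```python
# Namespace of the Darwin Core terms vocabulary.
DWC = "http://rs.tdwg.org/dwc/terms/"


def literal(value):
    """Quote a string as an N-Triples literal (basic escaping only)."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def to_ntriples(subject_uri, properties):
    """Emit one 'subject predicate object .' line per property."""
    return "\n".join(
        f"<{subject_uri}> <{DWC}{term}> {literal(value)} ."
        for term, value in properties.items()
    )


triples = to_ntriples(
    "http://example.org/occurrence/1",  # hypothetical identifier
    {"scientificName": "Posidonia oceanica", "country": "Greece"},
)
print(triples)
```

Each line is a self-contained statement, so triples from different datasets can be loaded into the same triplestore and linked wherever they share URIs.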
The LifeWatchGreece Research Infrastructure (LWG RI), funded by the GSRT (Greek government: structural funds), is the national effort to address the above requirement and to support relevant studies. To achieve its aim, LWG RI adheres to the central lifewatch.eu guidelines and seeks to bring together all Greek scientists working on biodiversity data and data observatories.
Coordinated by the Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC, www.imbbc.hcmr.gr) of the Hellenic Center for Marine Research (HCMR, www.hcmr.gr), LWG includes 49 partner institutions covering a wide range of scientific disciplines (terrestrial, marine and freshwater biology, zoology, botany, geography, forestry, agriculture, genetics, biotechnology, pharmacy, aquaculture, education and law).