Exploring Architecture Options for a Federated, Cloud-based Systems - PowerPoint PPT Presentation

Exploring Architecture Options for a Federated, Cloud-based Systems Biology Knowledgebase
Ian Gorton, Jenny Liu, Jian Yin

Systems Biology
- Integrated study of organisms as a whole
- Obtain, integrate, and analyze complex data from multiple experimental sources using interdisciplinary tools
- Requirements
  - Large amounts of data
  - Different types of tools
  - Large amounts of computing resources

Systems Biology Knowledgebase
- Drawbacks of the current approach
  - The barrier to entry can be high
  - Little reuse and sharing of data and tools; wasteful, repetitive effort to develop similar software tools
  - Results are hard to replicate
- Seamless sharing and integration of data and software tools across multiple institutions is attractive
- The goal of the systems biology knowledgebase is to exploit cloud computing technologies to enable sharing of data and software tools

Why Cloud Computing
- Enables sharing of data and software tools
- Dynamic allocation of computing resources
- Many software tools can be converted to run on top of cloud computing services such as Hadoop

Outline
- Introduction
- System architecture
- Prototype of selected components
- Case study: Hadoop-based systems biology tools
- Conclusion

Centralized versus Federated
- Advantages of the centralized approach
  - Ease of integration
  - More efficient allocation of computing resources
- However, many institutions may want to retain control of their data and tools
- Federated approach
  - Leverages specialized computing resources across organizations

Architecture Overview
[Figure: layered architecture diagram. Labels recovered from the slide, top to bottom:]
- User Access Layer: workflow tools, web portals, desktop apps; cURL, PHP, Java, and Python scripts
- RESTful API Layer
- Infrastructure Middleware and Data and Resource Layer: database adaptors, workflow utilities, directories
- Kbase Interface Layer (for flexible federation of Kbase data and compute resources); sub-layer labels as extracted: Federation, Semantic Access, Interface
- Example federated cloud resources: cloud-based Kbase data storage (e.g. S3), cloud computations (e.g. EC2), HPC-based computations (e.g. clusters), core APIs

Components
- Location-independent components
- Uniform interfaces
- Easy composition
- Execution can be monitored with jBPM
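The idea of uniform interfaces and easy composition can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `Component` interface, the `of` and `compose` helpers, and the step names are hypothetical and do not come from the presentation. The `log` list is a simple stand-in for execution monitoring; it is not the jBPM API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class UniformComponents {
    /** Hypothetical uniform interface: callers never see where a component runs. */
    public interface Component {
        String name();
        String run(String input);
    }

    /** Wraps a function as a component; a remote call could sit behind the same interface. */
    public static Component of(String name, UnaryOperator<String> body) {
        return new Component() {
            @Override public String name() { return name; }
            @Override public String run(String input) { return body.apply(input); }
        };
    }

    /** Chains components in order and records each completed step in the log. */
    public static Component compose(List<Component> steps, List<String> log) {
        return of("pipeline", input -> {
            String value = input;
            for (Component step : steps) {
                value = step.run(value);
                log.add(step.name()); // stand-in for a monitoring hook
            }
            return value;
        });
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Component upper = of("uppercase", s -> s.toUpperCase());
        Component tag = of("tag", s -> ">" + s);
        Component pipeline = compose(List.of(upper, tag), log);
        System.out.println(pipeline.run("acgt") + " via " + log);
    }
}
```

Because every step exposes the same two methods, a composed pipeline is itself a `Component` and can be nested inside another pipeline without changes to the caller.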

Secure Communication
- Security must be ensured for communication across institutions
- Only SSL traffic is allowed through the firewall
- Requiring all components to use SSL could be difficult
- Use SOCKS to minimize code changes to components

Example
Original code:
    URL url = new URL(urlname);
Modified code:
    SocketAddress addr = new InetSocketAddress("localhost", 8182);
    Proxy proxy = new Proxy(Proxy.Type.SOCKS, addr);
    URL url = new URL(urlname); // Create the URL
    URLConnection uc = url.openConnection(proxy);
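For readers who want to try the modified code, a self-contained version might look like the sketch below. It uses only standard `java.net` classes. The class and method names and the target URL `http://example.org/data` are placeholders introduced here, and the sketch assumes a SOCKS proxy is listening on `localhost:8182` as in the slide. `openConnection(proxy)` only creates the connection object; no traffic is sent until the connection is actually used.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.SocketAddress;
import java.net.URL;
import java.net.URLConnection;

public class SocksProxyExample {
    /** Builds a SOCKS proxy descriptor for the given host and port. */
    public static Proxy socksProxy(String host, int port) {
        SocketAddress addr = new InetSocketAddress(host, port);
        return new Proxy(Proxy.Type.SOCKS, addr);
    }

    /** Creates (but does not yet open) a connection that will be routed through the proxy. */
    public static URLConnection openViaProxy(String urlName, Proxy proxy) throws IOException {
        URL url = new URL(urlName);
        return url.openConnection(proxy);
    }

    public static void main(String[] args) throws IOException {
        Proxy proxy = socksProxy("localhost", 8182);      // port taken from the slide
        URLConnection uc = openViaProxy("http://example.org/data", proxy); // placeholder URL
        System.out.println("Using " + proxy.type() + " proxy for " + uc.getURL());
    }
}
```

The only change a component needs is passing the `Proxy` to `openConnection`, which is why the slides describe SOCKS as a way to minimize code changes.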

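Prototype
[Figure: prototype workflow diagram. Labels recovered from the slide:]
- GenBank: .gbk file
- Script: translate DNA (.fna file) in six frames, producing a protein FASTA file (.faa file)
- Polygraph: proteomics data (.dta files), parameter file, peptide file
- Post-process script
- Visualization tool: VESPA (advanced visualizations), visualization at a user's local workstation
- Transfer steps as labeled: query and copy the .fna file; query and copy the parameter file and the .faa file; query and copy the .dta files; query and copy the post-process script; query and copy the .fna file, the .gbk file, and the peptide file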
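The six-frame translation step in the prototype can be illustrated with a short Java sketch. This is not the authors' script. It assumes the standard genetic code (NCBI translation table 1), ignores FASTA parsing, writes stop codons as `*`, and writes codons containing unknown bases as `X`. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SixFrameTranslation {
    // Standard genetic code, indexed by 16*b1 + 4*b2 + b3 with bases ordered T, C, A, G.
    private static final String BASES = "TCAG";
    private static final String AMINO_ACIDS =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    /** Returns the reverse complement of an upper-case DNA string. */
    public static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (dna.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('N'); // unknown base
            }
        }
        return sb.toString();
    }

    /** Translates complete codons starting at the given offset (0, 1, or 2). */
    public static String translate(String dna, int offset) {
        StringBuilder protein = new StringBuilder();
        for (int i = offset; i + 3 <= dna.length(); i += 3) {
            int a = BASES.indexOf(dna.charAt(i));
            int b = BASES.indexOf(dna.charAt(i + 1));
            int c = BASES.indexOf(dna.charAt(i + 2));
            protein.append(a < 0 || b < 0 || c < 0 ? 'X' : AMINO_ACIDS.charAt(16 * a + 4 * b + c));
        }
        return protein.toString();
    }

    /** Returns three forward frames followed by three reverse-complement frames. */
    public static List<String> sixFrames(String dna) {
        String forward = dna.toUpperCase();
        String reverse = reverseComplement(forward);
        List<String> frames = new ArrayList<>();
        for (int offset = 0; offset < 3; offset++) frames.add(translate(forward, offset));
        for (int offset = 0; offset < 3; offset++) frames.add(translate(reverse, offset));
        return frames;
    }

    public static void main(String[] args) {
        System.out.println(sixFrames("ATGGCC"));
    }
}
```

Each frame is computed independently of the others, so the step parallelizes trivially across sequences.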

Hadoop-Based Polygraph
- Polygraph is a proteomics application that identifies peptides from MS data
- Initially implemented with MPI
- Loosely coupled and suitable for Hadoop
- Only a small amount of effort is needed to adapt it to run on top of Hadoop
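To make "loosely coupled and suitable for Hadoop" concrete, the sketch below mimics the map and reduce phases in plain Java, in memory. It does not use the Hadoop API and is not Polygraph's algorithm. The `Candidate` type, the spectrum identifiers, and the scoring rule (negative absolute mass error) are hypothetical stand-ins chosen only to show the shape of the computation: each candidate is scored independently (map), and the best score per spectrum is kept (reduce).

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MapReduceSketch {
    /** Hypothetical input record: one candidate peptide mass for one spectrum. */
    public static final class Candidate {
        final String spectrumId;
        final double observedMass;
        final double candidateMass;

        public Candidate(String spectrumId, double observedMass, double candidateMass) {
            this.spectrumId = spectrumId;
            this.observedMass = observedMass;
            this.candidateMass = candidateMass;
        }
    }

    /** Map phase: score one candidate without looking at any other record. */
    public static Map.Entry<String, Double> map(Candidate c) {
        // Hypothetical score: negative absolute mass error, so higher is better.
        return new SimpleEntry<>(c.spectrumId, -Math.abs(c.observedMass - c.candidateMass));
    }

    /** Reduce phase: keep the best score seen for each spectrum id. */
    public static Map<String, Double> reduce(List<Map.Entry<String, Double>> mapped) {
        return mapped.stream().collect(Collectors.toMap(
                Map.Entry::getKey, Map.Entry::getValue, Double::max, TreeMap::new));
    }

    /** Runs both phases; the map phase uses a parallel stream because records are independent. */
    public static Map<String, Double> run(List<Candidate> candidates) {
        List<Map.Entry<String, Double>> mapped = candidates.parallelStream()
                .map(MapReduceSketch::map)
                .collect(Collectors.toList());
        return reduce(mapped);
    }

    public static void main(String[] args) {
        List<Candidate> input = List.of(
                new Candidate("s1", 100.0, 101.5),
                new Candidate("s1", 100.0, 100.25),
                new Candidate("s2", 200.0, 199.0));
        System.out.println(run(input));
    }
}
```

Since no map call depends on another, the same code yields the same result whether the records are processed on one worker or many, which is the property that makes this kind of workload a natural fit for MapReduce.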

Running Polygraph

Experimental Results

Comparison
- The MPI-based implementation is highly tuned and thus more efficient
- The Hadoop-based approach is more flexible
  - Most cloud computing providers offer a Hadoop service
  - Varying amounts of computing resources can be used without changing code
    - Can produce results even with one machine
    - More machines can speed up the computation
- Many systems biology applications can be adapted to the MapReduce paradigm

Conclusion
- Sharing data, software tools, and computing resources is essential for systems biology
- Cloud computing can provide an ideal platform
- Many applications are loosely coupled and can be adapted to run in cloud computing environments
- The federated approach provides more flexibility
- Uniform interfaces enable easy integration
