The e-Science Initiative in the UK and the Need for International Collaboration
Tony Hey
tony.hey@epsrc.ac.uk
A Definition of e-Science
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
John Taylor, Director General of Research Councils, Office of Science and Technology
UK e-Science Funding
Second Phase: 2003–2006
- Application Projects
– £96M – All areas of science and engineering
- Core Programme
– £16M – Core Grid Middleware – DTI follow-on?
First Phase: 2001–2004
- Application Projects
– £74M – All areas of science and engineering
- Core Programme
– £15M + £20M (DTI) – Collaborative industrial projects
e-Science Core Programme
Overall Rationale:
– Assist development of essential, well-engineered, generic Grid middleware usable by both e-scientists and industry
– Provide necessary infrastructure support for UK e-Science Research Council projects
– Collaborate with the international e-Science and Grid communities
– Work with UK industry to develop industrial-strength Grid middleware
UK e-Science Grid
Sites: Cambridge, Newcastle, Edinburgh, Oxford, Glasgow, Manchester, Cardiff, Southampton, London, Belfast, DL, RAL, Hinxton
e-Science Centres of Excellence
- Birmingham/Warwick – Modelling
- Bristol – Media
- UCL – Networking
- White Rose Grid – Leeds, York, Sheffield
- Lancaster – Social Science
- Leicester – Astronomy
- Reading – Environment
UK e-Science Grid – Next Steps
Two activities in parallel
- Deploy ‘production GT2 Grid’ based on four dedicated nodes plus the two UK Supercomputer Facilities
– Use same middleware base as EGEE if possible
– Set up Grid Operational Centre with operational security team
– Gain experience from a genuine user community
- Develop ‘OGSA Grid’
– Funded two evaluation OGSA Grid projects
– Extend to e-Science Centres
– Work with EGEE project
Motivations
- Scientific community developed the Web as a collaboration technology
– Transformed modern business world!
- John Taylor brought the HP vision of the information utility to the scientific context
– Global infrastructure for scientific R&D
- Scientific community is now developing the Grid as a collaboration technology
– Will this be relevant to business …?
DAME Project
Diagram labels: In-flight data, Ground Station, Airline, Global Network (e.g. SITA), Data centre, DS&S Engine Health Center, Maintenance Centre, Internet, e-mail, pager
Discovery Net Project
Nucleotide Annotation Workflows
- Download sequence from Reference Server
- Save to Distributed Annotation Server
- Interactive Editor & Visualisation
- Execute distributed annotation workflow
Data sources: NCBI, EMBL, TIGR, SNP, InterPro, SMART, SWISS-PROT, GO, KEGG
1800 clicks, 500 Web accesses and 200 copy/paste operations: 3 weeks of work reduced to 1 workflow and a few seconds of execution
eDiaMoND Project
- Mammograms have different appearances, depending on image settings and acquisition systems
- Temporal mammography
- Standard Mammo Format
- Computer Aided Detection
- 3D View
Powering the Virtual Universe
http://www.astrogrid.ac.uk
(Edinburgh, Belfast, Cambridge, Leicester, London, Manchester, RAL)
Multi-wavelength images showing the jet in M87: from top to bottom – Chandra X-ray, HST optical, Gemini mid-IR, VLA radio.
AstroGrid will provide advanced, Grid-based federation and data mining tools to facilitate better and faster scientific output.
Picture credits: “NASA / Chandra X-ray Observatory / Herman Marshall (MIT)”, “NASA / HST / Eric Perlman (UMBC)”, “Gemini Observatory / OSCIR”, “VLA / NSF / Eric Perlman (UMBC) / Fang Zhou, Biretta (STScI) / F Owen (NRA)”
Printed: 24/11/2003
Image from ESO Image + IRIS data
Gamma Ray Bursts
D. Ducros, ESA
- Collate data from multiple telescopes over months – meta data issues
- Localise GRB alert in minutes – as they fade rapidly
- SWIFT satellite observes gamma ray burst
- Compare against SN light curves – bump shows evidence for a SN in the GRB (Price et al, 2002)
- Interaction with observatory pipelines
- Cross reference multi-λ data – ID precursor and/or environment
- Large computational photometric redshift calcs on multi-λ data (gives distance)
- Reprocessing of ionospheric STP data: change coords from earth to celestial
myGrid: An in silico experiment = a web of interconnected information and components
- Provenance record of workflow runs
- Provenance of the workflow template
- Related workflows
- People
- Ontologies describing workflows
- Services used
- Notes
- Data in and out
- Literature
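To make the idea of a provenance record concrete, here is a minimal Python sketch of a workflow runner that logs, for each step, the service used and fingerprints of the data in and out. It is only an illustration and not the myGrid implementation: the names `run_workflow`, `fingerprint`, `fetch_sequence` and `gc_content`, the gene identifier and the sequence are all hypothetical stand-ins for real services and data.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(value):
    """Return a short, stable hash of any JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_workflow(steps, data, note=""):
    """Run the steps in order; return the final result and a provenance record."""
    record = {
        "started": datetime.now(timezone.utc).isoformat(),
        "note": note,
        "steps": [],
    }
    for name, func in steps:
        result = func(data)
        # Log which service ran and what went in and out of it.
        record["steps"].append({
            "service": name,
            "data_in": fingerprint(data),
            "data_out": fingerprint(result),
        })
        data = result
    return data, record


# Hypothetical stand-ins for remote services (assumptions, not real APIs).
def fetch_sequence(gene_id):
    return {"gene": gene_id, "sequence": "ATGGAATTCTAG"}


def gc_content(entry):
    seq = entry["sequence"]
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    return {**entry, "gc": round(gc, 3)}


steps = [("fetch_sequence", fetch_sequence), ("gc_content", gc_content)]
result, provenance = run_workflow(steps, "GENE-001", note="demo run")
print(json.dumps(provenance, indent=2))
```

Because each step's `data_out` fingerprint matches the next step's `data_in`, the record can be read back as a chain linking the final result to its original input.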
myGrid
Candidate gene pool Genotype Assay Design System
Select a SNP from candidate gene. Is this SNP associated with Disease?
Primer Design (Gene ID, SNP)
– Use primers designed by myGrid to amplify region flanking SNP on the gene
– Emboss Eprimer application in SoapLab
Restriction Fragment Length Polymorphism experiment
– Selection of restriction enzyme
– Emboss Restrict in SoapLab
Talisman
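The restriction-enzyme selection step can be illustrated with a small Python sketch: an enzyme is useful for an RFLP assay when the SNP creates or destroys its recognition site, so the two alleles yield different fragment patterns. This is a simplified stand-in, not the Emboss Restrict tool used in the project; the function names and the two example sequences are made up for illustration, while the three recognition sites are the standard ones for these enzymes.

```python
# Recognition sites of three widely used restriction enzymes.
# All three sites are palindromic, so scanning one strand is sufficient.
ENZYME_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_sites(sequence, site):
    """Count (possibly overlapping) occurrences of a recognition site."""
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def informative_enzymes(allele_a, allele_b, enzymes=ENZYME_SITES):
    """Return enzymes whose number of cut sites differs between two alleles."""
    hits = {}
    for name, site in enzymes.items():
        a = count_sites(allele_a, site)
        b = count_sites(allele_b, site)
        if a != b:
            hits[name] = (a, b)
    return hits


# Hypothetical amplified region; the SNP (A -> G) sits inside an EcoRI site.
reference = "TTACGGAATTCATGC"
variant = "TTACGGAGTTCATGC"
print(informative_enzymes(reference, variant))  # prints {'EcoRI': (1, 0)}
```

Here EcoRI cuts the reference allele once and the variant allele not at all, so a digest of the amplified region would distinguish the two genotypes.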
3D Protein Structure
What is the structure of the protein product encoded by my candidate gene?
- Query PDB & display protein structure using Rasmol
- Obtain information about protein & extract information about active site (Swiss-Prot, AMBIT, Interpro)
- Determine whether coding SNPs affect the active site of the protein
Annotation Pipeline
What is known about my candidate gene?
Diagram labels: Medline, OMIM, GO, BLAST, EMBL, DQP, Query
Data-Centric Grids
Figure axes: Data Complexity, Computational Complexity
Workflow in eScience and eBusiness
- Open versus closed worlds
– Design tools
– Semantics and metadata
- Verification and publication
– Visualisation
– Publication
- Static versus dynamic workflows
– Provenance
- Volume and Type of Data
– Large and structured data
Computer Science for e-Science
- EPSRC funding £9M CS research programme
- 18 Projects funded to date including:
– Ontologies
– Incomplete data sets
– Autonomic architectures
– Data publishing & curation
– Provenance
– QoS and SLAs
- Links to applications in Bioinformatics, particle physics, materials modelling, maths etc
- Most leading CS groups engaged (> 50% in 5* rated departments)
Open Grid Services Architecture
- Development of Web Services
- OGSA will provide Naming / Authorization / Security / Privacy / …
- Projects looking at higher level services: Workflow, Transactions, Data Mining, Knowledge Discovery…
- Exploit Synergy: Commercial Internet with Grid Services
OGSA-DAI Project
- Initial £2M project with IBM and Oracle and Edinburgh, Manchester and Newcastle Centres
- Production versions released July 2003 of:
– XML Database Interface (Xindice)
– Relational Database Interface (DB2, Oracle, MySQL)
- Prototype version released of:
– Distributed Query Service
- Second phase of project now approved (£1.5M)
– Continued development and more functionality
- OGSA-DAI Team in Edinburgh now part of the Globus Alliance
The UK e-Science Experience
- UK e-Science Core Programme
– £20M for collaborative industrial R&D
– Over 60 UK companies participating
– Over £30M industrial contributions
- Engineering, Pharmaceutical, Petrochemical
- IT companies, Commerce, Media
Core Programme: Phase 2
- 1. UK e-Science Grid/Centres and e-Science Institute
- 2. Grid Support Centre and Network Monitoring
- 3. Core Middleware engineering
- 4. National Data Curation Centre
- 5. e-Science Exemplars/New Opportunities
- 6. Outreach and International involvement
Research Prototype Middleware to Production Quality
- Research projects are not funded to do the regression testing, configuration and QA required to produce production quality middleware
- Common rule of thumb (Brooks) is that it requires at least 10 times more effort to take ‘proof of concept’ research software to production quality
- Key issue for UK e-Science projects is to ensure that there is some documented, maintainable, robust grid middleware by the end of the 5 year £250M initiative
A UK Open Middleware Infrastructure Institute
- Repository for UK-developed Open Source ‘e-Science/Cyber-infrastructure’ Middleware
- Compliance testing for GGF/WS standards
- Documentation, specification and QA
- Fund work to bring ‘research project’ software up to ‘production strength’
- Fund Middleware projects for identified ‘gaps’
- Work with US NSF, EU Projects and others
- Supported by major IT companies
Security Technology Roadmap
- Identified areas requiring further funding classified in terms of ‘Short’, ‘Medium’ and ‘Long’ time frames
- JISC/JCSR will fund Short/Medium Term security projects from the Roadmap
- Preparing £3M call for ‘Authorization Models and Virtual Organisations’
- Exploring explicit link with Internet2 NSF NMI project based on Shibboleth and PERMIS
- OMII/CP and EPSRC will consider Long Term security R&D projects
UK Data Curation Centre
- In the next 5 years e-Science projects will produce more scientific data than has been collected in the whole of human history
- After 10 years we can guarantee that the operating system, the spreadsheet program and the hardware used to store data will not exist
- Establishing Centre to research and develop technologies and best practice for curating digital data
- Need to liaise closely with individual research communities, data archive centres and digital library community
e-Science Timeframes
Years: 2001 2002 2003 2004 2005 2006 2007
SR2000: * * *
SR2002: * * *
SR2004: * * *
AAA Service: * *
LHC/LCG: *
International Collaboration
- Support of Global Grid Forum
- NSF Cyberinfrastructure activities
– NSF and Internet2 joint security project
– Edinburgh part of Globus Alliance
– Collaboration with other major groups – Condor, SRB, …
- EU Activities
– Participation in EGEE
– UK participation in 2nd call Grid proposals
– ‘GridCoord’ SSA
- Bilateral collaborations