The e-Science Initiative in the UK and the Need for International - - PowerPoint PPT Presentation



SLIDE 1

The e-Science Initiative in the UK and the Need for International Collaboration

Tony Hey tony.hey@epsrc.ac.uk

SLIDE 2

A Definition of e-Science

‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’

John Taylor, Director General of Research Councils, Office of Science and Technology

SLIDE 3

UK e-Science Funding

Second Phase: 2003–2006

  • Application Projects
    – £96M
    – All areas of science and engineering
  • Core Programme
    – £16M
    – Core Grid Middleware
    – DTI follow-on?

First Phase: 2001–2004

  • Application Projects
    – £74M
    – All areas of science and engineering
  • Core Programme
    – £15M + £20M (DTI)
    – Collaborative industrial projects

SLIDE 4

e-Science Core Programme

Overall Rationale:

  – Assist development of essential, well-engineered, generic Grid middleware usable by both e-scientists and industry
  – Provide necessary infrastructure support for UK e-Science Research Council projects
  – Collaborate with the international e-Science and Grid communities
  – Work with UK industry to develop industrial-strength Grid middleware

SLIDE 5

UK e-Science Grid

Sites: Cambridge, Newcastle, Edinburgh, Oxford, Glasgow, Manchester, Cardiff, Southampton, London, Belfast, DL, RAL, Hinxton

SLIDE 6

e-Science Centres of Excellence

  • Birmingham/Warwick – Modelling
  • Bristol – Media
  • UCL – Networking
  • White Rose Grid – Leeds, York, Sheffield
  • Lancaster – Social Science
  • Leicester – Astronomy
  • Reading – Environment

SLIDE 7

Sites: Cambridge, Newcastle, Edinburgh, Oxford, Glasgow, Manchester, Cardiff, Southampton, London, Belfast, DL, RAL, Hinxton

SLIDE 8

UK e-Science Grid – Next Steps

Two activities in parallel

  • Deploy ‘production GT2 Grid’ based on four dedicated nodes plus the two UK Supercomputer Facilities
    – Use same middleware base as EGEE if possible
    – Set up Grid Operational Centre with operational security team
    – Gain experience from a genuine user community
  • Develop ‘OGSA Grid’
    – Funded two evaluation OGSA Grid projects
    – Extend to e-Science Centres
    – Work with EGEE project

SLIDE 9

Motivations

  • Scientific community developed the Web as a collaboration technology
    – Transformed modern business world!
  • John Taylor brought the HP vision of the information utility to the scientific context
    – Global infrastructure for scientific R&D
  • Scientific community is now developing the Grid as a collaboration technology
    – Will this be relevant to business …?

SLIDE 10

DAME Project

Diagram labels: In-flight data; Ground Station; Airline; Global Network (e.g. SITA); DS&S Engine Health Center; Data centre; Maintenance Centre; Internet, e-mail, pager

SLIDE 11

Discovery Net Project

Nucleotide Annotation Workflows

  • Download sequence from Reference Server
  • Save to Distributed Annotation Server
  • Interactive Editor & Visualisation
  • Execute distributed annotation workflow

Data sources: NCBI, EMBL, TIGR, SNP, InterPro, SMART, SWISS-PROT, GO, KEGG

1800 clicks, 500 Web accesses and 200 copy/paste operations (3 weeks of work) replaced by 1 workflow that executes in a few seconds
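The idea of replacing many manual clicks and copy/paste operations with a single executable workflow can be sketched as a chain of steps applied to a shared context. This is only an illustrative sketch, not the Discovery Net API: the function names, the toy sequence, the `demo-001` identifier and the in-memory `store` are all hypothetical stand-ins for the reference server, the annotation services and the Distributed Annotation Server.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Step = Callable[[Context], Context]


def download_sequence(ctx: Context) -> Context:
    # Hypothetical stand-in for a reference-server download: a fixed toy sequence.
    return {**ctx, "sequence": "ATGGCGTACGCTTGA"}


def annotate(ctx: Context) -> Context:
    # Toy "annotation": sequence length and GC fraction.
    seq = ctx["sequence"]
    gc_fraction = sum(1 for base in seq if base in "GC") / len(seq)
    annotations = {"length": len(seq), "gc_content": round(gc_fraction, 3)}
    return {**ctx, "annotations": annotations}


def save_annotations(ctx: Context) -> Context:
    # Hypothetical stand-in for saving to an annotation server: an in-memory dict.
    store = dict(ctx.get("store", {}))
    store[ctx["sequence_id"]] = ctx["annotations"]
    return {**ctx, "store": store}


def run_workflow(steps: List[Step], ctx: Context) -> Context:
    # Execute each step in order, passing the updated context along.
    for step in steps:
        ctx = step(ctx)
    return ctx


result = run_workflow(
    [download_sequence, annotate, save_annotations],
    {"sequence_id": "demo-001"},
)
print(result["store"])
```

Because the workflow is just an ordered list of steps, the same list can be re-run on a different `sequence_id` without repeating any manual work.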

SLIDE 12

eDiaMoND Project

Mammograms have different appearances, depending on image settings and acquisition systems

Diagram labels: Temporal mammography; Standard Mammo Format; Computer Aided Detection; 3D View

SLIDE 13

Powering the Virtual Universe

http://www.astrogrid.ac.uk

(Edinburgh, Belfast, Cambridge, Leicester, London, Manchester, RAL)

Multi-wavelength images showing the jet in M87, from top to bottom: Chandra X-ray, HST optical, Gemini mid-IR, VLA radio. AstroGrid will provide advanced, Grid-based federation and data mining tools to facilitate better and faster scientific output.

Picture credits: “NASA / Chandra X-ray Observatory / Herman Marshall (MIT)”, “NASA/HST/Eric Perlman (UMBC)”, “Gemini Observatory/OSCIR”, “VLA/NSF/Eric Perlman (UMBC)/Fang Zhou, Biretta (STScI)/F Owen (NRA)”

p13 Printed: 24/11/2003

SLIDE 14

Gamma Ray Bursts

Figures: Image from ESO; Image + IRIS data; D. Ducros, ESA

  • Collate data from multiple telescopes over months: metadata issues
  • Localise GRB alert in minutes, as they fade rapidly
  • SWIFT satellite observes gamma ray burst
  • Compare against SN light curves: bump shows evidence for a SN in the GRB (Price et al, 2002)
  • Interaction with observatory pipelines
  • Cross-reference multi-λ data: ID precursor and/or environment
  • Large computational photometric redshift calcs on multi-λ data, which give distance
  • Reprocessing of ionospheric STP data: change coords from Earth to celestial

SLIDE 15

myGrid: An in silico experiment = a web of interconnected information and components

  • Provenance record of workflow runs
  • Provenance of the workflow template
  • Related workflows
  • People
  • Ontologies describing workflows
  • Services used
  • Notes
  • Data in and out
  • Literature
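A provenance record of a workflow run, linking the template, the person, the services used and the data in and out, might look roughly like the sketch below. It is not the myGrid data model: the `ProvenanceRecord` fields, the `run_with_provenance` helper, the two toy services and the template and user names are hypothetical, chosen only to mirror the items listed above.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Tuple

Data = Dict[str, Any]
Service = Callable[[Data], Data]


@dataclass
class ProvenanceRecord:
    workflow_template: str                      # which template was run
    run_by: str                                 # person responsible for the run
    services_used: List[str] = field(default_factory=list)
    input_digests: Dict[str, str] = field(default_factory=dict)
    output_digests: Dict[str, str] = field(default_factory=dict)
    notes: List[str] = field(default_factory=list)
    started_at: str = ""
    finished_at: str = ""


def digest(value: Any) -> str:
    # Stable fingerprint of a JSON-serialisable value.
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_with_provenance(
    template: str, run_by: str, services: List[Tuple[str, Service]], inputs: Data
) -> Tuple[Data, ProvenanceRecord]:
    # Run the services in order and record what was used and produced.
    record = ProvenanceRecord(workflow_template=template, run_by=run_by)
    record.started_at = datetime.now(timezone.utc).isoformat()
    record.input_digests = {key: digest(value) for key, value in inputs.items()}
    data = dict(inputs)
    for name, service in services:
        data = service(data)
        record.services_used.append(name)
    record.output_digests = {key: digest(value) for key, value in data.items()}
    record.finished_at = datetime.now(timezone.utc).isoformat()
    return data, record


def reverse_complement(data: Data) -> Data:
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    seq = "".join(pairs[base] for base in reversed(data["sequence"]))
    return {**data, "sequence": seq}


def measure_length(data: Data) -> Data:
    return {**data, "length": len(data["sequence"])}


data, record = run_with_provenance(
    "toy-annotation-template",
    "example-user",
    [("reverse_complement", reverse_complement), ("measure_length", measure_length)],
    {"sequence": "ATGC"},
)
print(record.services_used, data)
```

Storing digests rather than the data itself keeps the record small while still letting a later reader check whether a given input or output matches what the run actually saw.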

SLIDE 16

myGrid

Candidate gene pool

Genotype Assay Design System

Select a SNP from candidate gene. Is this SNP associated with disease?

  • Gene ID, SNP
  • Primer Design: Emboss Eprimer application in SoapLab
  • Restriction Fragment Length Polymorphism experiment: use primers designed by myGrid to amplify region flanking SNP on the gene
  • Selection of restriction enzyme: Emboss Restrict in SoapLab
  • Talisman

3D Protein Structure

What is the structure of the protein product encoded by my candidate gene?

  • Query PDB & display protein structure using Rasmol
  • Obtain information about protein & extract information about active site (Swiss-Prot, AMBIT, Interpro)
  • Determine whether coding SNPs affect the active site of the protein (AMBIT)

Annotation Pipeline

What is known about my candidate gene?

  • Query: Medline, OMIM, GO, BLAST, EMBL, DQP

SLIDE 17

Data-Centric Grids

Chart axes: Data Complexity; Computational Complexity

SLIDE 18

Workflow in eScience and eBusiness

  • Open versus closed worlds
    – Design tools
    – Semantics and metadata
  • Verification and publication
    – Visualisation
    – Publication
  • Static versus dynamic workflows
    – Provenance
  • Volume and Type of Data
    – Large and structured data

SLIDE 19

Computer Science for e-Science

  • EPSRC funding £9M CS research programme
  • 18 Projects funded to date including:
    – Ontologies
    – Incomplete data sets
    – Autonomic architectures
    – Data publishing & curation
    – Provenance
    – QoS and SLAs
  • Links to applications in Bioinformatics, particle physics, materials modelling, maths etc.
  • Most leading CS groups engaged (> 50% in 5* rated departments)

SLIDE 20

Open Grid Services Architecture

  • Development of Web Services
  • OGSA will provide Naming / Authorization / Security / Privacy / …
  • Projects looking at higher level services: Workflow, Transactions, Data Mining, Knowledge Discovery, …
  • Exploit Synergy: Commercial Internet with Grid Services

SLIDE 21

OGSA – DAI Project

  • Initial £2M project with IBM and Oracle and Edinburgh, Manchester and Newcastle Centres
  • Production versions released July 2003 of:
    – XML Database Interface (Xindice)
    – Relational Database Interface (DB2, Oracle, MySQL)
  • Prototype version released of:
    – Distributed Query Service
  • Second phase of project now approved (£1.5M)
    – Continued development and more functionality
  • OGSA-DAI Team in Edinburgh now part of the Globus Alliance

SLIDE 22

The UK e-Science Experience

  • UK e-Science Core Programme
    – £20M for collaborative industrial R&D
    – Over 60 UK companies participating
    – Over £30M industrial contributions
  • Engineering, Pharmaceutical, Petrochemical
  • IT companies, Commerce, Media

SLIDE 23

Core Programme: Phase 2

  1. UK e-Science Grid/Centres and e-Science Institute
  2. Grid Support Centre and Network Monitoring
  3. Core Middleware engineering
  4. National Data Curation Centre
  5. e-Science Exemplars/New Opportunities
  6. Outreach and International involvement

SLIDE 24

Research Prototype Middleware to Production Quality

  • Research projects are not funded to do the regression testing, configuration and QA required to produce production quality middleware
  • Common rule of thumb (Brooks) is that it requires at least 10 times more effort to take ‘proof of concept’ research software to production quality
  • Key issue for UK e-Science projects is to ensure that there is some documented, maintainable, robust grid middleware by the end of the 5 year £250M initiative

SLIDE 25

A UK Open Middleware Infrastructure Institute

  • Repository for UK-developed Open Source ‘e-Science/Cyber-infrastructure’ Middleware
  • Compliance testing for GGF/WS standards
  • Documentation, specification and QA
  • Fund work to bring ‘research project’ software up to ‘production strength’
  • Fund Middleware projects for identified ‘gaps’
  • Work with US NSF, EU Projects and others
  • Support from major IT companies

SLIDE 26

Security Technology Roadmap

  • Identified areas requiring further funding classified in terms of ‘Short’, ‘Medium’ and ‘Long’ time frames
  • JISC/JCSR will fund Short/Medium Term security projects from the Roadmap
  • Preparing £3M call for ‘Authorization Models and Virtual Organisations’
  • Exploring explicit link with Internet2 NSF NMI project based on Shibboleth and PERMIS
  • OMII/CP and EPSRC will consider Long Term security R&D projects

SLIDE 27

UK Data Curation Centre

  • In the next 5 years e-Science projects will produce more scientific data than has been collected in the whole of human history
  • After 10 years we can guarantee that the operating system, the spreadsheet program and the hardware used to store data will not exist
  • Establishing Centre to research and develop technologies and best practice for curating digital data
  • Need to liaise closely with individual research communities, data archive centres and digital library community

SLIDE 28

e-Science Timeframes

Years: 2001 2002 2003 2004 2005 2006 2007

SR2000: * * *
SR2002: * * *
SR2004: * * *
AAA Service: * *
LHC/LCG: *

SLIDE 29

International Collaboration

  • Support of Global Grid Forum
  • NSF Cyberinfrastructure activities
    – NSF and Internet2 joint security project
    – Edinburgh part of Globus Alliance
    – Collaboration with other major groups: Condor, SRB, …
  • EU Activities
    – Participation in EGEE
    – UK participation in 2nd call Grid proposals
    – ‘GridCoord’ SSA
  • Bilateral collaborations
    – Japan, China and Australia N+N Meetings
    – Dialogues with other EU countries

UK-France Meeting!

SLIDE 30

Licklider’s Vision of the Grid

“Lick had this concept of the intergalactic network which he believed was everybody could use computers anywhere and get at data anywhere in the world. He didn’t envision the number of computers we have today by any means, but he had the same concept – all of the stuff linked together throughout the world, that you can use a remote computer, get data from a remote computer, or use lots of computers in your job. The vision was really Lick’s originally.”

Larry Roberts – Principal Architect of the ARPANET

SLIDE 31

Acknowledgements

With thanks to Vijay Dialani, David Gavanagh, Carole Goble, Mark Greenwood, Tony Linde, Tom Oinn and Nick Walton