DILIGENT: Digital libraries powered by the Grid (Bhaskar Mehta, FhG)



SLIDE 1

DILIGENT

Digital libraries powered by the Grid

Bhaskar Mehta, FhG IPSI, Germany

Work Package Leader, Content and Metadata Management
Bhaskar.Mehta@ipsi.fraunhofer.de

SLIDE 2

International Symposium on Grid Computing, Taipei, 3rd May 2006

Overview

  • Introduction to DILIGENT
  • Grid: Opportunity and Challenge
  • Challenges in Information Management
  • Data Management in DILIGENT
  • Open Issues and Next Steps

SLIDE 3

Introduction to DILIGENT

DILIGENT: A Digital Library Infrastructure on Grid-Enabled Technology
  • Duration: 3 years
  • Commencement Date: September 2004
  • Effort: 1024 person-months
  • Cost: 9.8 M Euro
  • European Union funding: 6.3 M Euro

SLIDE 4

Partners

  • Consiglio Nazionale delle Ricerche – ISTI (Italy, Scientific Co-ordinator)
  • European Research Consortium for Informatics and Mathematics (France, Administrative Co-ordinator)
  • University of Athens (Greece)
  • Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. – IPSI (Germany)
  • University for Health Informatics and Technology Tyrol (Austria) / ETH Zürich / UNI Basel
  • University of Strathclyde (United Kingdom)
  • Engineering Ingegneria Informatica SpA (Italy)
  • Fast Search & Transfer ASA (Norway)
  • 4D SOFT Software Development Ltd. (Hungary)
  • European Organization for Nuclear Research (Switzerland)
  • European Space Agency – ESA (Italy)
  • Scuola Normale Superiore (Italy)
  • RAI Radio Televisione Italiana (Italy)

SLIDE 5

Grid Jobs

DILIGENT Objectives

To create an advanced test-bed that will allow members of dynamic virtual e-Science organizations to access shared knowledge and to collaborate in a secure, coordinated, dynamic and cost-effective way.

Expected Outcome
  • A Digital library infrastructure which is Grid based
  • A testbed based on this infrastructure
  • Two implemented scenarios: Earth Science and Cultural Heritage

SLIDE 6

Motivation for NGDLs: Digital library challenges

Cost & Time
  • Construction and management of a DL requires high investments and specialized personnel
  • Years are spent in designing and setting up a DL
  • Shared Infrastructure with Authoring Capabilities

New functionality is computationally expensive and evolving
  • Multimedia indexing, clustering: e.g. LSI, pLSA
  • Multimedia querying: e.g. image retrieval by feature vectors
  • Multimedia processing: e.g. satellite images, partial encryption for video
  • Service Based Digital Libraries, with process management/distribution support

Heterogeneity & Distribution
  • DLs (and underlying components) use different models, APIs, data formats, etc.
  • DLs are distributed/replicated
  • Basing DLs on standards
  • Providing support for federated/distributed search, data brokering

SLIDE 7

Grid as an Opportunity... and a Challenge

[Figure: diagram labelled Digital Library, Challenge, Potential, Grid, DL, OS]

SLIDE 8

Some Methodological Challenges

Service Oriented, Distributed Architecture
  • Requires open systems for indexing, searching, feature extraction, metadata management

Distributed Search
  • Query Optimization
  • Semantic Data Integration

On Demand Service Activation
  • Satellite Images
  • Extraction

Virtual Organizations
  • Content Security
  • Resource Security

SLIDE 9

Some Technological Challenges

Grid technology is file centric: DLs are collection centric

Metadata management with the Grid: based on key-value pairs
  • Lack of support for structured data (e.g. XML)
  • Retrieval support is limited

Availability and replication support: file based

Real time processing vs batch processing
  • DL users require instantaneous response (à la Google)
  • Grid processes usually can't provide real time response
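The gap between key-value metadata and structured data can be illustrated with a small sketch. The record, its element names and the `flatten` helper are hypothetical and only serve the illustration; they are not taken from DILIGENT or gLite. Flattening a nested XML record into key-value pairs drops repeated fields and the grouping between them, which is exactly what a structured query depends on.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record; element names are illustrative only.
RECORD = """
<record>
  <title>Sea surface temperature map</title>
  <creator><name>A. Author</name><role>editor</role></creator>
  <creator><name>B. Author</name><role>analyst</role></creator>
</record>
""".strip()


def flatten(element):
    """Flatten an XML tree into key-value pairs keyed by leaf tag name.

    This mimics a flat key-value catalog: repeated keys overwrite each
    other and the parent/child grouping is lost.
    """
    flat = {}
    for node in element.iter():
        is_leaf = len(node) == 0
        if is_leaf and node.text and node.text.strip():
            flat[node.tag] = node.text.strip()
    return flat


root = ET.fromstring(RECORD)
flat = flatten(root)

# Structured query: names of creators whose role is "editor".
# This needs the name/role grouping that the flat view no longer has.
editors = [
    creator.findtext("name")
    for creator in root.findall("creator")
    if creator.findtext("role") == "editor"
]

print(flat)     # only one creator survives in the flat view
print(editors)  # the structured view still answers the query
```

In the flat view the second creator overwrites the first, so a question such as "who is the editor?" can no longer be answered, while the XML view answers it directly.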

SLIDE 10

DILIGENT Architecture

SLIDE 11

Data Management in DILIGENT (1)

Common functionality for Content and Metadata management
  • Effort duplication
  • Storage, Replication, Change Notification, Association Consistency

gLite functionality
  • Separate pipelines for Content and Metadata
  • Incomplete functionality (e.g. replication)
  • Insufficient for DILIGENT
  • FileSystem vs Data Model
  • Flat records vs XML

[Figure: three diagrams. First: gLite, CM, MM. Second: gLite, CM, MM, Common Layer. Third: gLite, CM, MM, Common Layer, XMLDB Emulation.]

SLIDE 12

Data Management in DILIGENT (2)

Identifying 3 basic layers
  • Base layer: gLite functionality (SE, Catalog, FTS)
  • Storage Layer: replication, change notification, transactional support
  • Service Layer: service specific functionality, API/WS view

[Figure: layer diagram labelled Content Management, Metadata Management, Storage Management Layer, Base Layer]
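The three layers named on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the DILIGENT implementation: `BaseLayer` is an in-memory stand-in for file-level grid storage and does not use the gLite API, all class and method names are hypothetical, and transactional support is left out to keep the sketch short.

```python
class BaseLayer:
    """In-memory stand-in for file-level grid storage (NOT the gLite API)."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def put(self, path, data):
        self.files[path] = data

    def get(self, path):
        return self.files[path]


class StorageLayer:
    """Adds replication and change notification on top of base stores."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.listeners = []

    def subscribe(self, callback):
        self.listeners.append(callback)

    def store(self, path, data):
        for replica in self.replicas:  # replicate to every base store
            replica.put(path, data)
        for notify in self.listeners:  # change notification
            notify(path)

    def load(self, path):
        for replica in self.replicas:  # first replica holding the file wins
            if path in replica.files:
                return replica.get(path)
        raise KeyError(path)


class ContentManager:
    """Service layer: exposes documents in collections, not file paths."""

    def __init__(self, storage):
        self.storage = storage

    def add_document(self, collection, doc_id, content):
        self.storage.store(f"{collection}/{doc_id}", content)

    def get_document(self, collection, doc_id):
        return self.storage.load(f"{collection}/{doc_id}")


sites = [BaseLayer("site-a"), BaseLayer("site-b")]
storage = StorageLayer(sites)
changes = []
storage.subscribe(changes.append)

cm = ContentManager(storage)
cm.add_document("earth-science", "doc1", b"satellite image bytes")
```

The service layer only talks to the storage layer, so a metadata manager could reuse the same replication and notification code instead of duplicating it.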

SLIDE 13

Data Management in DILIGENT (3)

[Figure: component diagram labelled API / WSDL, Content Manager, Content Security, Metadata Management, Metadata Catalog, Metadata Broker, Query Processor, Annotation Manager, Storage Layer, Base Layer]

SLIDE 14

Current Status and Future Steps

  • Detailed design has been completed
  • APIs under implementation
  • 1st experimental prototype based on OpenDLib
  • Extensive testing and deployment of gLite 1.1 -> 3.0

Next Steps
  • Integrate finished components
  • Deploy Diligent on the Grid Infrastructure
  • Develop prototypes based on Diligent
  • Testing & User Feedback

SLIDE 15

Types of Involvement for Observers

  • Information about project activities (www.diligentproject.org)
  • Involvement in workshops
  • Possible involvement in validation
  • Feedback for DILIGENT development
  • Candidates for adoption of DILIGENT infrastructure

SLIDE 16

Contact us

Co-operation with other projects/communities is welcome
www.diligentproject.org

Contact people:

  • Donatella Castelli, Pasquale Pagano, ISTI-CNR
    donatella.castelli/pasquale.pagano@isti.cnr.it
  • Jessica Michael, ERCIM
    jessica.michel@ercim.org
  • Bhaskar Mehta, Fraunhofer IPSI
    bhaskar.mehta@ipsi.fraunhofer.de

SLIDE 17

Questions?

SLIDE 18

Research today

  • Research is carried out by groups of individuals, belonging to different institutions, that dynamically aggregate to carry out projects together
  • By sharing their resources these individuals create better conditions for their research
  • Digital libraries that maintain the produced knowledge and make it accessible worldwide are becoming key instruments for scientific collaboration in many research areas

SLIDE 19

Complementary User Scenarios

Earth Science Domain:
  • Well-established tradition in exploiting new technologies
  • Wide variety of content types (maps, satellite images, measurements, text)
  • Very large, dynamic data sets
  • Support for community events, report generation, disaster management

Cultural Heritage Domain:
  • IT exploitation still in its infancy
  • Multidisciplinary collaborative research
  • Image based retrieval/semantic analysis of images
  • Support for research and teaching

SLIDE 20

DILIGENT Goals & Approach

Service Oriented Digital Library Infrastructure Based on EGEE
  • High computing and storage capabilities for handling a wide variety of information objects
  • Controlled Sharing of Resources

Basic Services for
  • DL Creation & Management
  • Indexing, Search, Data Fusion
  • Content & Metadata Management
  • Process Management

Testbed for User Scenarios
  • Earth Sciences
  • Cultural Heritage