Investigation as a member of research discourse, Vasily Bunakov


Investigation as a member of research discourse. Vasily Bunakov, Science and Technology Facilities Council, United Kingdom. Digital Libraries: Advanced Methods and Technologies, Digital Collections. Dubna, Russia, October 16, 2014.


slide-1
SLIDE 1

Investigation as a member of research discourse

Vasily Bunakov
Science and Technology Facilities Council
United Kingdom

Digital Libraries: Advanced Methods and Technologies, Digital Collections. Dubna, Russia, October 16, 2014

slide-2
SLIDE 2

Scientific Computing develops and operates computing infrastructure:

  • High Performance Computing
  • Petabyte data store
  • CERN LHC Tier 1 hub

It also conducts applied research and does software development.

STFC funds and operates large-scale instruments for UK and visiting researchers in:

  • physics, astronomy
  • chemistry, materials
  • biology, medicine

slide-3
SLIDE 3

Facilities Support

  • Diamond Light Source
  • ISIS neutron and muon source
  • Central Laser Facility

Big Facilities for Small Science

slide-4
SLIDE 4

Facilities science in Europe

PaNdata Europe, 2010 – 2011
Preparation: common policies and standards
http://pan-data.eu/pandata/?q=PaNdataEurope

PaNdata ODI, 2011 – 2014
Implementation: delivering new infrastructure
http://pan-data.eu/pandata/?q=ODIWP

slide-5
SLIDE 5

Computing support throughout the scientific lifecycle

STFC Scientific Computing supports each stage in the work of researchers from background research through conducting simulations and experiments, to analysing and archiving data.

Diagram labels: JISCMail, Grid Computing, e-pubs

slide-6
SLIDE 6

Facilities Research Lifecycle

  • Proposal: scientist submits application for beamtime
  • Approval: facility committee approves application
  • Scheduling: facility registers, trains, and schedules scientist's visit
  • Experiment: scientist visits, facility runs experiment
  • Data storage: raw data filtered and stored
  • Data analysis: tools for processing made available
  • Record Publication: subsequent publication registered with facility

ICAT data catalogue software: http://code.google.com/p/icatproject/

A corresponding intellectual entity: Investigation

slide-7
SLIDE 7

DOIs for experimental data

www.DataCite.org

DOIs obtained through DataCite are much cheaper than DOIs obtained directly from the DOI Foundation.

slide-8
SLIDE 8

LCDP 2013

Our DOI landing pages are in fact for Investigations (series of Experiments)

slide-9
SLIDE 9

ICAT data catalogue called from DOI landing page

https://data.isis.stfc.ac.uk/TOPCATWeb.jsp#view///&&tab=Search//&Model=INVESTIGATION&ServerName=ISIS&InvestigationId=24071239

slide-10
SLIDE 10

What can cite what

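Citations run from the entity in the “from” column (row label) to the entity in the “to” row (column header). V = can cite, X = cannot cite.

From \ To      | Publication | Investigation | Dataset                              | Software
Publication    | V           | V             | V                                    | V
Investigation  | V           | V             | V                                    | V
Dataset        | X           | V             | V (derived or aggregated datasets)   | V (simulation)
Software       | V           | X             | V (testing)                          | V (software libraries, service calls)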
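The citation matrix on this slide can be held as a small lookup structure. The sketch below is only an illustration: it assumes the reading in which the row entity cites the column entity, and the names `CAN_CITE`, `can_cite` and `cites_everything` are hypothetical, not part of ICAT or any STFC software.

```python
# Entities of the research discourse named on the slide.
ENTITIES = ("Publication", "Investigation", "Dataset", "Software")

# Assumed reading of the slide: each key is the citing entity,
# each value is the set of entity types it can cite.
CAN_CITE = {
    "Publication": {"Publication", "Investigation", "Dataset", "Software"},
    "Investigation": {"Publication", "Investigation", "Dataset", "Software"},
    "Dataset": {"Investigation", "Dataset", "Software"},   # X for Publication
    "Software": {"Publication", "Dataset", "Software"},    # X for Investigation
}


def can_cite(source: str, target: str) -> bool:
    """Return True if the matrix allows `source` to cite `target`."""
    return target in CAN_CITE[source]


def cites_everything(source: str) -> bool:
    """Return True if `source` can cite every entity type in the matrix."""
    return CAN_CITE[source] == set(ENTITIES)
```

Under this reading, only Publication and Investigation pass `cites_everything`.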

slide-11
SLIDE 11

Publication and Investigation similarity

No | Feature / aspect                                                         | Publication | Investigation
1  | Is an intellectual entity                                                | V           | V
2  | Is a subject of peer review                                              | V           | V (via proposal approval)
3  | Can cite all significant intellectual entities of a research discourse  | V           | V
4  | Citation chains (steps of discourse) observed                            | V           | V
5  | Universal identifiers “mints” available                                  | V           | V

This gives Investigation a potential for a “full membership” in the research discourse along with Publication.

Datasets and software are likely to remain “associated members” because of weaker features 2, 3 and a de-facto weaker feature 5.

slide-12
SLIDE 12

Publications and investigations network

slide-13
SLIDE 13

Bibliographic Reference in ICAT Data Catalogue (a few thousand records) compared with ePubs Publications Repository Reference:

  • ICAT: Pratt et al, Phys. Rev. Lett. 96, 247203 (2006)
    ePubs: Phys Rev Lett 96 247203 (2006)
  • ICAT: Lancaster et al, Phys. Rev B73, 020410(R) (2005)
    ePubs: Phys Rev B 73 020410 (2006)
  • ICAT: Blundell and Pratt, J. Phys.: Condens. Matter 16, R771 (2004)
    ePubs: J Phys Condens Matter 16 R771-R828 (2004)
  • ICAT: M.T.F.Telling and S.H.Kilcoyne, Electron transfer in dextran, J. Phys.: Condens. Matter 19 No 2 (17 January 2007)
    ePubs: J Phys Condens Matter 19 2 026221 (2007)
  • ICAT: J Tomkinson and M.T.F Telling, Ammonium ions in alkali metal halide crystals: Tunnelling and spin relaxation, PCCP 2006 8 38 4434
    ePubs: Phys Chem Chem Phys 8 4434-4440 (2006)

How to link Publications and Investigations?
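One simple way to approach this question is to normalise both reference strings and compare their tokens. The sketch below is an assumption about how such matching could be done, not the method used at STFC; the function names `tokens`, `similarity` and `best_match` are hypothetical, and token-set (Jaccard) overlap stands in for whatever matching was actually applied.

```python
import re


def tokens(reference: str) -> set:
    """Lowercase a reference, replace punctuation with spaces, return its token set."""
    cleaned = re.sub(r"[^0-9a-z]+", " ", reference.lower())
    return set(cleaned.split())


def similarity(ref_a: str, ref_b: str) -> float:
    """Jaccard overlap of the two token sets (0.0 if either reference is empty)."""
    ta, tb = tokens(ref_a), tokens(ref_b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def best_match(icat_reference: str, repository_references: list) -> str:
    """Return the repository reference most similar to the ICAT reference."""
    return max(repository_references, key=lambda r: similarity(icat_reference, r))
```

Such a measure tolerates differences in punctuation and author lists, but not abbreviation differences such as "PCCP" against "Phys Chem Chem Phys", which is one reason plain record matching covers only part of the catalogue.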

slide-14
SLIDE 14

Beyond bibliographic records matching

The few thousand publications matched against the publications repository at the previous stage could be used for tuning and testing machine learning techniques.

Publications repository records and their DOI landing pages | Data catalogue records (investigation descriptions)
Instruments / departments tags                               | Instruments to investigations mapping
Authors                                                      | Researchers’ names
Publication title                                            | Investigation title
Authors’ organizations                                       | Researchers’ organizations
Publication date                                             | Investigation period
Abstract                                                     | Investigation description
Keywords, e.g. PACS indices                                  | ICAT keywords
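The paired fields above could be turned into a feature vector for a classifier that decides whether a publication and an investigation belong together. The following is a minimal sketch under stated assumptions: the dictionary keys (`authors`, `researchers`, `period_start`, and so on) and the functions `overlap` and `pair_features` are hypothetical, and token overlap is just one possible similarity measure.

```python
from datetime import date

# (publication field, investigation field) pairs, following the slide.
# The key names themselves are illustrative assumptions.
FIELD_PAIRS = [
    ("authors", "researchers"),
    ("title", "title"),
    ("organizations", "organizations"),
    ("abstract", "description"),
    ("keywords", "keywords"),
]


def overlap(text_a: str, text_b: str) -> float:
    """Share of distinct lowercase words common to both texts."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def pair_features(publication: dict, investigation: dict) -> dict:
    """Build one similarity feature per field pair, plus a date check."""
    features = {
        f"{p}~{i}": overlap(publication.get(p, ""), investigation.get(i, ""))
        for p, i in FIELD_PAIRS
    }
    # A publication is expected to appear after the investigation started.
    features["published_after_start"] = float(
        publication["date"] >= investigation["period_start"]
    )
    return features
```

The already matched pairs would serve as positive training examples for whichever learning method consumes these features.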

slide-15
SLIDE 15

What represents facilities research?

Yesterday: publications

  • Publications catalogue

Today: publications and data (in fact, Investigations)

  • Publications catalogue + Experiment descriptions catalogue

Tomorrow: “facility-centric” Linked Open Data cloud

  • Publications
  • Experiments
  • Research Awards
  • Selected ontologies
  • Selected external Linked Data

slide-16
SLIDE 16

Linked Data technology stack

  • Triple store (Jena TDB)
  • Fuseki Web application
  • Command line tools (ARQ, loaders, optimizers, …)
  • SPARQL (can be from a remote client)
  • Linked Data API
  • Bespoke (Jena) Web application
  • Harvesters & Mappers
  • RDF extractors and loaders
  • Data cleansers and mappers with vocabularies, ontologies, geolocation services and other Linked Data sources
  • OAI-PMH sources
  • OAI-PMH Linked Data wrappers
  • Databases
  • Database Linked Data wrappers
  • Data converters
  • Other triple stores, SPARQL endpoints and Linked Data APIs
  • Semantically enriched data
  • Bespoke / customized software applications
  • Facilities user community

Legend: the diagram distinguishes prospective components from implemented or evaluated components.
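As an illustration of the "SPARQL (can be from a remote client)" path in this stack, the sketch below prepares a query for a Fuseki endpoint over the standard SPARQL protocol. The endpoint URL, the dataset name `facility`, and the choice of `cito:cites` and `dcterms:title` to model publication-to-investigation links are all assumptions made for the example; the slides do not state which vocabularies or endpoints are used.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical local Fuseki dataset; replace with a real endpoint.
ENDPOINT = "http://localhost:3030/facility/query"

# Assumed data model: publications cite investigations via CiTO.
QUERY = """
PREFIX cito: <http://purl.org/spar/cito/>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?publication ?investigation ?title WHERE {
  ?publication cito:cites ?investigation .
  ?investigation dcterms:title ?title .
} LIMIT 10
"""


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Prepare a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint: str, query: str) -> list:
    """Send the query and return the result bindings (needs a live endpoint)."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(ENDPOINT, QUERY)` requires a running server; `build_request` alone can be inspected offline.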

slide-17
SLIDE 17

Information entities circulating in your research domain

  • What are they? (beyond publications)
  • Do they have a clear identity?
  • Do they circulate in your organization only or universally across organizations?
  • Can they be linked with publications and other information entities?
  • Can they be linked with the world-wide data cloud?

slide-18
SLIDE 18

Scientific Computing Department

Thank you!