Frontier of Data Discovery: MD in Astronomy
“A user's perspective”
- Background story
- Data formats & Metadata
- VO standards & Provenance
- Conclusions & LOFAR EOSC
[Image: Stellar nursery in Cygnus]
DI4R @ Lisbon 11/10/2018 J.B.R. Oonk (SURFsara)
An introduction to (radio) astronomy
Astronomy: the scientific study of objects that exist naturally in space → moon, sun, planets, stars, galaxies, black holes, ...
[Figure: Orion; flux versus wavelength]
To understand celestial sources we want sensitive measurements with the highest possible resolution across the EM spectrum.
LOFAR: a pan-European astronomy project
[Map labels: Amsterdam, Poznan, Medicina (Italy)]
LOFAR data → Archive
* 3 LTA (Long Term Archive) sites
- SURFsara (NL)
- FZ Juelich (GER)
- PSNC (POL)
Data is findable via DB metadata (web portal)
* Aeneas @ Bologna 08/10/2018 J.B.R. Oonk (SURFsara)
LOFAR: a pan-European astronomy project
[Map labels: Amsterdam, Poznan, Medicina (Italy)]
LOFAR Science (KSP)
* Birth of the first stars
* Evolution of black holes
* Neutron stars / pulsars
* Exo-planets
* Space weather
World map of Radio Astronomy
Radio telescopes: spread all over the world and run by different organizations
World map of Radio+Optical Astronomy
Optical telescopes: spread all over the world and run by different organizations
World/Space map of Astronomy
Progress in astronomy is driven by combining data. This requires:
* Data sharing, common data formats & metadata standards!
Common data formats – FITS & MS
FITS = Flexible Image Transport System (1981; v4.0 in 2016; IAU) (courtesy: Allegrazza 2012)
Radio: Measurement Set (MS) = a set of tables in a directory structure
Metadata – headers & keywords
Metadata make data findable, queryable and quantitative. (courtesy: Allegrazza 2012)
* Header = machine-readable metadata → 80 characters per line (number of lines not limited)
* The header can also be used for provenance, but this is not automatically enforced (it is left to users)!
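
To make the "80 characters per line" point concrete, here is a minimal sketch of how one FITS header line (a "card") can be written and read back. It is a simplified illustration, not a full implementation of the FITS standard: the helper names `make_card` and `parse_card` are hypothetical, and only the basic fixed-format layout (keyword in columns 1-8, `= ` in columns 9-10, optional comment after ` / `) is handled.

```python
CARD_LENGTH = 80  # every FITS header line ("card") is exactly 80 characters


def make_card(keyword, value, comment=""):
    """Format one simplified fixed-format header card (hypothetical helper)."""
    if len(keyword) > 8:
        raise ValueError("FITS keywords are at most 8 characters")
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        field = ("T" if value else "F").rjust(20)
    elif isinstance(value, (int, float)):
        field = repr(value).upper().rjust(20)  # numbers are right-justified
    else:
        # Strings are quoted; embedded single quotes are doubled.
        text = str(value).replace("'", "''")
        field = ("'" + text.ljust(8) + "'").ljust(20)
    card = f"{keyword.upper():<8}= {field}"
    if comment:
        card += f" / {comment}"
    if len(card) > CARD_LENGTH:
        raise ValueError("card exceeds 80 characters")
    return card.ljust(CARD_LENGTH)  # pad with spaces to the fixed width


def parse_card(card):
    """Split a card written by make_card into (keyword, value, comment)."""
    if len(card) != CARD_LENGTH or card[8:10] != "= ":
        raise ValueError("not an 80-character keyword/value card")
    keyword = card[:8].strip()
    body = card[10:]
    if body.lstrip().startswith("'"):
        i = body.index("'") + 1
        chars = []
        while i < len(body):
            if body[i] == "'":
                if body[i + 1:i + 2] == "'":  # doubled quote = literal quote
                    chars.append("'")
                    i += 2
                    continue
                break  # closing quote found
            chars.append(body[i])
            i += 1
        value = "".join(chars).rstrip()
        comment = body[i + 1:].partition("/")[2].strip()
        return keyword, value, comment
    raw, _, tail = body.partition("/")
    raw = raw.strip()
    if raw in ("T", "F"):
        value = raw == "T"
    else:
        try:
            value = int(raw)
        except ValueError:
            value = float(raw)
    return keyword, value, tail.strip()


if __name__ == "__main__":
    card = make_card("TELESCOP", "LOFAR", "telescope name")
    print(repr(card))
    print(parse_card(card))
```

Because every card has the same fixed width, a reader can scan a header without any delimiter parsing, which is part of what makes the metadata easy to index and query.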
Virtual Observatory & Standards
“The Virtual Observatory (VO) is the vision that astronomical datasets and other resources should work as a seamless whole.”
* Governed by the International Virtual Observatory Alliance (IVOA)
* VOs → access & interoperability between data collections
* >20 VO projects world-wide (e.g. large data centres: ESA, NASA)
Note: finding sustained funding for the maintenance of VO tools is difficult
* IVOA has agreed on an improved (W3C) provenance model (2018)
* Some IVOA standards have been adopted by other sciences. (Arvisset+2018)
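
As an illustration of what a W3C-style provenance record captures, the sketch below uses the three core W3C PROV concepts (entity, activity, agent) and the relations "used", "wasGeneratedBy" and "wasAssociatedWith". It is not the IVOA provenance data model itself: the class layout, the `lineage` helper, and the file names and parameters are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    """Who was responsible (a person, pipeline, or organization)."""
    name: str


@dataclass
class Activity:
    """A processing step: what was done, with which inputs and settings."""
    name: str
    parameters: dict
    was_associated_with: Agent
    used: List["Entity"] = field(default_factory=list)


@dataclass
class Entity:
    """A data product; raw data has no generating activity recorded here."""
    identifier: str
    was_generated_by: Optional[Activity] = None


def lineage(entity):
    """Walk back through wasGeneratedBy/used links and describe each step."""
    steps = []
    stack = [entity]
    while stack:
        current = stack.pop()
        activity = current.was_generated_by
        if activity is None:
            steps.append(f"{current.identifier}: no recorded origin")
            continue
        inputs = ", ".join(e.identifier for e in activity.used)
        agent = activity.was_associated_with.name
        steps.append(f"{current.identifier} <- {activity.name} ({agent}) <- {inputs}")
        stack.extend(activity.used)
    return steps


if __name__ == "__main__":
    user = Agent("example user")  # hypothetical agent
    raw = Entity("raw.MS")  # hypothetical raw Measurement Set
    calibrated = Entity(
        "calibrated.MS",
        Activity("calibrate", {"interval_s": 10}, user, [raw]),
    )
    image = Entity(
        "image.fits",
        Activity("image", {"pixel_arcsec": 1.5}, user, [calibrated]),
    )
    for step in lineage(image):
        print(step)
```

With links like these stored next to a high-level product, anyone can trace an image back to the raw data and the processing parameters that produced it.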
Data Discovery
VO standards enable (incomplete) data discovery across collections
→ missing are, e.g., (i) collections from some telescopes, (ii) high-level user-processed data
ESA SKY (sky.esa.int)
NASA MAST (archive.stsci.edu)
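
One example of how a VO standard enables discovery across collections is the IVOA Simple Cone Search protocol, where a catalogue is queried with a sky position and radius passed as `RA`, `DEC` and `SR` parameters in decimal degrees. The sketch below only builds such a query URL; the service address `https://example.org/scs` is a placeholder, the function name is hypothetical, and the coordinates are arbitrary illustrative values.

```python
from urllib.parse import urlencode


def build_cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search query URL (positions in decimal degrees).

    Assumes base_url has no query string of its own.
    """
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must be between 0 and 360 degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be between -90 and +90 degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Placeholder service address and arbitrary example position.
    print(build_cone_search_url("https://example.org/scs", 308.0, 41.0, 0.5))
```

Because the same three parameters are understood by every compliant service, one client can send the same query to many archives; what it cannot find are the collections and user products that were never published through such a service.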
High-level (HL) user products I: Journals
HL user products are only described in journals → provenance??
Journals & papers provide “metadata” that is far removed from the data
Access to and discovery of papers in astronomy across journals:
[open access] Search on keywords (e.g. sky objects)
HL user products II: CDS/VizieR
Access to some HL products from journals (digital repositories for tables, images) → does not solve provenance!
* Alternatives: e.g. NASA/IPAC Extragalactic Database, Canadian Astronomy Data Centre.
Conclusions – MD & Astronomy
Astronomy → FITS & MS are the standard formats for (meta)data
- High-level (user) products often lack provenance!
- Data discovery: VOs are many, but incomplete?
LOFAR is part of modern multi-wavelength astronomy
LOFAR metadata & EOSC (Pilot/Hub):
* LOFAR archived/products data are MS/FITS with metadata incl.
* The LTA has a metadata model that includes provenance
- Metadata will always be kept, even for deleted raw data
- High-level user product archive: PID, B2 services – EOSC
* LOFAR sky surveys aim for publication following VO standards
Extra: FAIR data in astronomy
F – findable: Yes, there are many archives
A – accessible: Yes, but no single sign-on
I – interoperable: Yes, VO standards (+ common data format)
R – reusable: Yes, driven by F, A, I and the scientific need to combine the data in time, space and frequency
* Caveat: present-day high-level products generated by individual users lack provenance, which limits reproducibility