Frontier of Data Discovery: MD in Astronomy, a user’s perspective



SLIDE 1

DI4R @ Lisbon 11/10/2018 J.B.R. Oonk (SURFsara)

Frontier of Data Discovery: MD in Astronomy “A user’s perspective”

  • Background story
  • Data formats & Metadata
  • VO standards & Provenance
  • Conclusions & LOFAR EOSC

(Image: Stellar nursery in Cygnus)

SLIDE 2

An introduction to (radio) astronomy

Astronomy: scientific study of objects that exist naturally in space → moon, sun, planets, stars, galaxies, black holes, ...


To understand celestial sources we want sensitive measurements with the highest possible resolution across the EM spectrum.

(Figure: flux versus wavelength, Orion)

SLIDE 3

LOFAR: a pan-European astronomy project

(Map labels: Medicina, Italy; Poznan; Amsterdam)

Aeneas @ Bologna 08/10/2018 J.B.R. Oonk (SURFsara)

LOFAR data → Archive: 3 LTA sites

  • SURFsara (NL)
  • FZ Juelich (GER)
  • PSNC (POL)

Data is findable via database metadata (web portal)

SLIDE 4

LOFAR: a pan-European astronomy project

(Map labels: Medicina, Italy; Poznan; Amsterdam)

LOFAR Science (KSP)

  • Birth of the first stars
  • Evolution of black holes
  • Neutron stars / pulsars
  • Exoplanets
  • Space weather

SLIDE 5

World map of Radio Astronomy

Radio telescopes: spread all over the world, operated by different organizations

SLIDE 6

World map of Radio+Optical Astronomy

Optical telescopes: spread all over the world, operated by different organizations

SLIDE 7

World/Space map of Astronomy

Progress in astronomy is driven by combining data. This requires data sharing, common data formats & metadata standards!

SLIDE 8

Common Data formats – FITS & MS

FITS = Flexible Image Transport System (1981; v4.0: 2016, IAU)


Radio: Measurement Set (MS) = set of tables in a directory structure

(courtesy: Allegrazza 2012)

SLIDE 9

Metadata – headers & keywords

  • Header = machine-readable metadata → 80 characters per line (number of lines not limited)
  • Header can also be used for provenance, but this is not automatically enforced (left to users)!

(courtesy: Allegrazza 2012)

Metadata make data findable, queryable and quantitative.
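The 80-characters-per-line header layout can be illustrated with a minimal sketch in plain Python. It assumes the simplest fixed-format FITS cards (keyword in columns 1-8, "= " in columns 9-10, an optional "/ comment"), builds a small header in memory and parses it back. The helper names `make_card` and `parse_header` and all keyword values are made up for illustration; real work would normally go through a dedicated FITS library.

```python
CARD = 80      # each header line ("card") is exactly 80 ASCII characters
BLOCK = 2880   # headers are padded to a multiple of 2880 bytes (36 cards)


def make_card(keyword, value, comment=""):
    """Format one fixed-format header card (hypothetical helper)."""
    if isinstance(value, bool):            # check bool first: bool is an int subclass
        text = ("T" if value else "F").rjust(20)
    elif isinstance(value, (int, float)):
        text = str(value).rjust(20)
    else:
        text = ("'" + str(value).ljust(8) + "'").ljust(20)
    card = f"{keyword:<8}= {text}"
    if comment:
        card += f" / {comment}"
    return card[:CARD].ljust(CARD)


def parse_header(raw):
    """Return {keyword: value} for all cards before the END card."""
    header = {}
    for i in range(0, len(raw), CARD):
        card = raw[i:i + CARD].decode("ascii")
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if keyword == "HISTORY":           # free-text cards, often used for provenance
            header.setdefault("HISTORY", []).append(card[8:].strip())
            continue
        if card[8:10] != "= ":
            continue                       # COMMENT and blank cards carry no value
        field = card[10:]
        if field.lstrip().startswith("'"):
            # String value: take the text between the quotes
            # (simplification: escaped '' quotes are not handled).
            start = field.index("'") + 1
            end = field.index("'", start)
            header[keyword] = field[start:end].rstrip()
            continue
        value = field.split("/", 1)[0].strip()
        if value in ("T", "F"):
            header[keyword] = value == "T"
        else:
            try:
                header[keyword] = int(value)
            except ValueError:
                header[keyword] = float(value)
    return header


# Illustrative header; the OBJECT/TELESCOP/HISTORY contents are made up.
cards = [
    make_card("SIMPLE", True, "conforms to FITS standard"),
    make_card("BITPIX", 8),
    make_card("NAXIS", 0),
    make_card("OBJECT", "Orion", "illustrative value"),
    make_card("TELESCOP", "LOFAR", "illustrative value"),
    "HISTORY calibrated by user script (made-up provenance note)".ljust(CARD),
    "END".ljust(CARD),
]
raw_header = "".join(cards).encode("ascii").ljust(BLOCK, b" ")
header = parse_header(raw_header)
print(header["OBJECT"], header["TELESCOP"], header["HISTORY"])
```

Because every card has the same width, a catalogue or database can index keywords without reading the data that follow the header. Note that the HISTORY line is only there if the user chose to write it.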

SLIDE 10

Virtual Observatory & Standards

“The Virtual Observatory (VO) is the vision that astronomical datasets and other resources should work as a seamless whole.”

  • Governed by the International Virtual Observatory Alliance (IVOA)
  • VOs → access & interoperability between data collections
  • >20 VO projects world-wide (e.g. large data centres: ESA, NASA)
    Note: however, finding sustained funding for maintenance of VO tools is difficult
  • IVOA has agreed on an improved (W3C) provenance model (2018)
  • Some IVOA standards have been adopted by other sciences.


(Arvisset+2018)
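To make the idea of a W3C-style provenance model concrete, here is a minimal sketch using the three core W3C PROV concepts (entity, activity, agent). It is not the IVOA data model itself. The class layout, the `lineage` helper and all file and agent names are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Activity:
    """A processing step (W3C PROV: Activity)."""
    name: str
    agent: str                                           # who ran it (wasAssociatedWith)
    used: List["Entity"] = field(default_factory=list)   # input datasets (used)


@dataclass
class Entity:
    """A dataset or product (W3C PROV: Entity)."""
    name: str
    generated_by: Optional[Activity] = None              # wasGeneratedBy


def lineage(entity):
    """Walk back from a product through every step that led to it."""
    steps = []
    activity = entity.generated_by
    if activity is not None:
        steps.append(f"{entity.name} <- {activity.name} ({activity.agent})")
        for source in activity.used:
            steps.extend(lineage(source))
    return steps


# Hypothetical chain: raw data -> calibrated data -> user-made image.
raw = Entity("raw_visibilities.MS")
calibrated = Entity(
    "calibrated.MS",
    Activity("calibration", "observatory pipeline", [raw]),
)
image = Entity("image.fits", Activity("imaging", "user", [calibrated]))

for step in lineage(image):
    print(step)
```

The chain stays traceable only while every product records the activity that generated it. A product stored without that link, as `raw` is here, ends the trail.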

SLIDE 11

Data Discovery

VO standards enable (incomplete) data discovery across collections

→ e.g. missing: (i) collections from some telescopes, (ii) high-level user-processed data

Examples: ESASky (sky.esa.int), NASA MAST (archive.stsci.edu)
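As an illustration of how a shared standard enables discovery across collections, the sketch below builds query URLs following the IVOA Simple Cone Search convention (parameters RA, DEC and SR, all in decimal degrees). The two service base URLs are hypothetical placeholders, the function name `cone_search_url` is an assumption, and the coordinates are only approximately those of the Orion Nebula. No request is sent.

```python
from urllib.parse import urlencode


def cone_search_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build a Simple Cone Search style query URL (positions in decimal degrees)."""
    if not 0.0 <= ra_deg <= 360.0:
        raise ValueError("RA must be within 0..360 degrees")
    if not -90.0 <= dec_deg <= 90.0:
        raise ValueError("DEC must be within -90..90 degrees")
    if radius_deg < 0.0:
        raise ValueError("search radius must be non-negative")
    query = urlencode({"RA": ra_deg, "DEC": dec_deg, "SR": radius_deg})
    if base_url.endswith(("?", "&")):
        separator = ""
    elif "?" in base_url:
        separator = "&"
    else:
        separator = "?"
    return f"{base_url}{separator}{query}"


# Hypothetical archives; real services would be looked up in a VO registry.
services = [
    "https://archive-a.example/scs",
    "https://archive-b.example/cone?catalog=survey1",
]
urls = [cone_search_url(s, 83.82, -5.39, 0.5) for s in services]
for url in urls:
    print(url)
```

The same three parameters work against every compliant service, which is what makes a cross-archive search possible. Collections that expose no such interface are invisible to it, hence "incomplete".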

SLIDE 12

HL user products I: Journals

High-level (HL) user products are only described in journals → provenance??
Journals & papers provide “metadata” that is far from the data

Access to and discovery of papers in astronomy across journals:

Search on keywords (e.g. sky objects) [open access]

SLIDE 13

HL user products II: CDS/VizieR

Access to some HL products from journals → does not solve provenance!

  • Alternatives: e.g. NASA/IPAC Extragalactic Database, Canadian Astronomy Data Centre.

(digital repositories for tables, images)

SLIDE 14

Conclusions – MD & Astronomy

Astronomy → FITS & MS are the standard formats for (meta)data

  • High-level (user) products often lack provenance!
  • Data discovery – VOs: there are many, but incomplete?

LOFAR is part of modern multi-wavelength astronomy

LOFAR metadata & EOSC (Pilot/Hub):

  • LOFAR archived/product data are MS/FITS, with metadata included
  • LTA model has a metadata model that includes provenance
  • Metadata will always be kept, even for deleted raw data
  • High-level user product archive: PID, B2 services – EOSC
  • LOFAR sky surveys aim for publication following VO standards

SLIDE 15

Extra

FAIR data in astronomy

  • F – findable: Yes, there are many archives
  • A – accessible: Yes, but no single sign-on
  • I – interoperable: Yes, VO standards (+ common data format)
  • R – reusable: Yes, driven by F, A, I and the scientific need to combine the data in time, space and frequency

Caveat: current-day high-level products generated by individual users lack provenance → this affects reproducibility