Cosmic Variability and the IT Challenges: Mass Data Management of Astronomical Databases (PowerPoint presentation)


SLIDE 1

Cosmic Variability and the IT Challenges: Mass Data Management of Astronomical Databases

Wen-Ping Chen, Institute of Astronomy, National Central University, Taiwan

2006 May 04 @ ISGC2006

SLIDE 2

Outline

  • TAOS: Taiwan-America Occultation Survey (fully operational; in Taiwan)
  • Pan-STARRS: Panoramic Survey Telescope and Rapid Response System (being constructed; in Hawaii, USA)

SLIDE 3

SLIDE 4

LULIN OBSERVATORY (鹿林天文台)

Elevation 2862 m; above the inversion layer … seen from Yusan (Jade Mt.)

SLIDE 5

[Figure labels: LULIN OBSERVATORY; LELIS, TAOS, LOT, SLT]

SLIDE 6

The TAOS (Taiwan-America Occultation Survey) project, a novel telescope array set up by groups from Taiwan, the US, and Korea, began routine observations in early 2005 and has the potential to make a unique contribution to our knowledge of the Solar System.

Comet nuclei too faint to be detected by direct imaging may be “seen” when they move in front of a background star: a stellar occultation event.

中美掩星計畫 (Taiwan-America Occultation Survey)

SLIDE 7

Project Overview

  • Census of the solar-system family
  • An array of wide-field telescopes (D=50 cm, f/1.9, FOV=3 sq. deg) to monitor the brightness changes of ~1,000 stars at a 5 Hz rate
  • Looking for a ‘blink’ of starlight (occultation) when an object (> 2 km) moves in front of a distant star
  • Frequency of events gives the population of “interveners”
  • Data rate of a few hundred GB per night; only “interesting” data downloaded via the dedicated E1 connection
  • Real-time data analysis (light curves, statistics)
  • Requiring coincidence detection of the same event by all the telescopes (to guard against false positives)

SLIDE 8

TAOS Telescopes (A, B, C, D)

Lulin Observatory, altitude = 2862 m

TAOS is the only project of its kind in the world to conduct a census of small (1-2 km size) icy bodies at the outer reach of the solar system.

  • 100 GB/night
  • With a special data acquisition scheme and a non-parametric statistical analysis scheme

SLIDE 9

Adaptive Aperture Photometry Pipeline

  • Fast
  • Moderately accurate
  • Compensates for image motion
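The slide gives no detail on the pipeline, so the following is only a minimal sketch of one plausible reading of "compensate for image motion": the aperture is re-centred on the star's flux-weighted centroid in every frame before the counts are summed. The function names, the circular aperture, the zero sky level, and the toy frames are all illustrative assumptions, not the TAOS implementation.

```python
import math

def centroid(image, x0, y0, box):
    """Flux-weighted centroid inside a square of half-width `box` around (x0, y0)."""
    total = sx = sy = 0.0
    for y in range(max(0, y0 - box), min(len(image), y0 + box + 1)):
        for x in range(max(0, x0 - box), min(len(image[0]), x0 + box + 1)):
            v = image[y][x]
            total += v
            sx += v * x
            sy += v * y
    return sx / total, sy / total

def aperture_sum(image, cx, cy, radius, sky_per_pixel=0.0):
    """Sum counts within `radius` pixels of (cx, cy), minus a constant sky level."""
    total = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            if math.hypot(x - cx, y - cy) <= radius:
                total += v - sky_per_pixel
    return total

def track_and_measure(frames, x0, y0, radius=2.0, box=3):
    """Follow a drifting star frame by frame; return (x, y, counts) per frame."""
    x, y = x0, y0
    light_curve = []
    for frame in frames:
        cx, cy = centroid(frame, round(x), round(y), box)
        light_curve.append((cx, cy, aperture_sum(frame, cx, cy, radius)))
        x, y = cx, cy  # start the next search at the latest position
    return light_curve

def star_frame(cx, cy, size=15):
    """Toy frame (hypothetical data): a 100-count core with four 10-count neighbours."""
    frame = [[0.0] * size for _ in range(size)]
    frame[cy][cx] = 100.0
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        frame[cy + dy][cx + dx] = 10.0
    return frame

frames = [star_frame(5, 5), star_frame(6, 5), star_frame(7, 6)]
track = track_and_measure(frames, 5, 5)
```

A real pipeline would also estimate the local sky level and could adapt the aperture radius to the seeing; both are left out here to keep the sketch short.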

SLIDE 10

Sample Output of the TAOS Photometric Light Curves

Columns: beg. time, end time, count, ct err, x, y, rb

SLIDE 11

[Pipeline diagram components: SCHEDULER, STATISTICS, AGGREGATOR, ARCHIVER, CARTOUCHE, DB, CAMERA, PHOTOMETRY, FITS]

  • Pipeline flow from image taking to archival
  • Arrows represent the flow of messages between components

SLIDE 12

TEST DRIVE 1

CY Aqr, a known Delta Scuti star with P~88 min, was observed by TAOS on 2003 September 16 with 0.3 s sampling, here binned to 150 s for illustration.

Time-domain astrophysics
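The binning mentioned for the CY Aqr light curve (0.3 s samples averaged into 150 s bins) amounts to taking the mean of every 500 consecutive samples. A minimal sketch, assuming evenly spaced samples and a plain unweighted mean; the function name is illustrative and the slide does not say how the binning was actually done.

```python
def bin_light_curve(flux, samples_per_bin):
    """Average consecutive groups of samples; a trailing partial bin is dropped."""
    binned = []
    for start in range(0, len(flux) - samples_per_bin + 1, samples_per_bin):
        chunk = flux[start:start + samples_per_bin]
        binned.append(sum(chunk) / samples_per_bin)
    return binned

# 150 s bins built from 0.3 s samples
samples_per_bin = round(150 / 0.3)
```

For uncorrelated noise, averaging 500 samples reduces the scatter per point by roughly a factor of sqrt(500), at the cost of time resolution.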

SLIDE 13

TEST DRIVE 2

On 2004 February 21 TAOS detected the occultation of HIP 079407 (mV=8.8 mag) by (51) Nemausa (mV=11.9 mag).

Prediction by Isao Sato (左藤勳)

D~150 km

Δt ~ 6.25 +/- 0.50 s

SLIDE 14

TEST DRIVE 3

On 2004 June 05 TAOS detected the occultation of HIP 050525 (mV~8.46 mag) by (1723) Klemola (mV~15.7 mag; D~31 km) with two telescopes (TAOS/A and TAOS/B).

The enclosure was opened by a resident assistant and the observations were carried out remotely from Taipei.

SLIDE 15

TEST DRIVE 4

On 2006 Feb 06 three TAOS telescopes detected the occultation of TYC 076200961 (mV ~ 11.83) by (286) Iclea (mV ~ 14.0 mag, D ~ 97 km).

SLIDE 16

Data Acquisition

Typical CCD imaging
  • Every star, together with the surrounding sky, is exposed at the same time

TAOS data
  • Integrate for 200 ms and then read out 32 rows of pixels, with the shutter remaining open
  • The sequence continues, so each star appears as a series of dots (the ‘zipper’)
  • ‘Fake’ neighboring stars and skies!

SLIDE 17

Event Detection: Rank Statistics

  • Use the rank, instead of the flux, to quantify the light curve
  • A true occultation event should have the lowest rank in all telescopes
  • No need for highly accurate flux
  • Speed
  • Conditional probability
  • Low false rates

$$Z = \log_{10}(S^{4}) - \log_{10}(W), \qquad W = \prod_{i=1}^{4} w_i$$

[Figure: simulated light curves from each of the four telescopes and the corresponding rank statistics, with and without an occultation]
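The rank idea can be sketched in a few lines. This is a toy illustration, not the TAOS code. It assumes the statistic is the negative log10 of the product, over the telescopes, of each sample's rank divided by the number of samples, so that a point ranked lowest in all four telescopes at once scores highest. The light curves are deterministic toy data, and all function names are hypothetical.

```python
import math

def ranks(flux):
    """Rank of each sample within its own light curve (1 = faintest; ties share a rank)."""
    return [1 + sum(other < f for other in flux) for f in flux]

def rank_product_statistic(light_curves):
    """Z_t = -log10( prod_i rank_i(t) / S ), with S samples per telescope."""
    n_samples = len(light_curves[0])
    per_telescope = [ranks(lc) for lc in light_curves]
    z = []
    for t in range(n_samples):
        log_product = sum(math.log10(r[t] / n_samples) for r in per_telescope)
        z.append(-log_product)
    return z

def make_curve(telescope, n_samples=200, dip_at=None):
    """Toy light curve: flux near 100 with small deterministic pseudo-noise."""
    flux = [100 + ((7 * t + 13 * telescope) % 11) - 5 for t in range(n_samples)]
    if dip_at is not None:
        flux[dip_at] = 80  # simulated occultation dip seen by this telescope
    return flux

curves = [make_curve(i, dip_at=50) for i in range(4)]
z = rank_product_statistic(curves)
best = max(range(len(z)), key=z.__getitem__)
print(best, round(z[best], 2))
```

Because only ranks enter the statistic, the absolute flux calibration of each telescope does not matter, which matches the slide's point that highly accurate fluxes are not needed.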

SLIDE 18

[Figure axis label: Higher flux ranking]

An event can be detected even if it is not obvious in the data

SLIDE 19

Panoramic Survey Telescope and Rapid Response System

SLIDE 20

Project Overview

  • All-sky survey (3π)
  • Frequent revisit (cadence 4-7 days)
  • Wide-field imaging
  • Short duty cycle
  • Efficient operations
  • An array of 4 telescopes, located in Hawaii, each of D=1.8 m, equipped with a 1.4 gigapixel camera with an Orthogonal Transfer Array CCD detector (40 cm square focal plane)
  • 7 square-degree FOV with 0.26” pixels
  • Detection of moving, transient, and variable celestial objects down to very faint limits
  • Accumulate very deep sky images
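The quoted numbers can be cross-checked with simple arithmetic: 1.4 gigapixels at 0.26 arcsec per pixel cover about 7 square degrees, and 3π steradians correspond to roughly 30,000 square degrees. The sketch below assumes square pixels that tile the field with no gaps or overlaps, which a real focal plane does not quite do. The function names are illustrative.

```python
import math

ARCSEC_PER_DEG = 3600.0

def field_of_view_sq_deg(n_pixels, pixel_scale_arcsec):
    """Sky area covered by n_pixels square pixels of the given angular size."""
    return n_pixels * pixel_scale_arcsec ** 2 / ARCSEC_PER_DEG ** 2

def steradians_to_sq_deg(sr):
    """Convert a solid angle from steradians to square degrees."""
    return sr * (180.0 / math.pi) ** 2

fov = field_of_view_sq_deg(1.4e9, 0.26)
survey_area = steradians_to_sq_deg(3 * math.pi)
print(f"FOV ~ {fov:.1f} sq deg; 3 pi sr ~ {survey_area:.0f} sq deg; "
      f"~{survey_area / fov:.0f} pointings per full pass")
```

The pointing count is a lower bound under these idealized assumptions, since real surveys overlap adjacent fields.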

SLIDE 21

The Sciences

  • First large-field survey program to open the time domain in astronomical observing: transient sources in time & space
  • Multiple survey modes, both wide field, covering the ecliptic (& “sweet spots”) and all sky (3π), and selected deep fields:
    – Solar System (PHA emphasis)
    – Cosmology (weak gravitational lensing, supernovae, GRBs)
    – Galactic Structure
  • Ultra-deep static images (R < 23.5 mag)

SLIDE 22

Pan-STARRS Minor Planet Summary

[Chart: number of objects on a logarithmic axis (1 to 10,000,000) for three series (Known, PS 1 Year, PS 10 Years); categories: NEO / PHO, Main Belt, Jovian Trojans, Other Trojans, Centaurs, Comets, TNOs, Wide TNO Binaries, Companions, Interstellar Visitors]

SLIDE 23

The Telescopes

SLIDE 24

The Detector

Independently addressable orthogonal transfer CCDs (cells)

  • Reducing cost by increasing yield
  • Fast readout: gigapixels in 2 s
  • On-chip guiding
  • Minimizing effects of bright stars
  • Compensating for image motion

SLIDE 25

OTCCDs Work

M13, I band, 300 sec

  • Telescope guiding only: 0.59" FWHM psf
  • With OT tracking: 0.45" FWHM psf
  • 7 Hz frame rate

Note: the arrows point to two examples in the images where the improvements in image quality due to OT tracking are clearly evident

SLIDE 26

The Site(s)

Prototype telescope site: Haleakala High Altitude Observatory (Maui)

SLIDE 27

SLIDE 28

Eventual Mauna Kea site for Pan-STARRS

SLIDE 29

The Budget

  • $60M funded by Congress through the US Air Force (2002)
  • Funding for construction only: telescopes, detectors, and “working” control software and analysis pipelines
  • Annual operations cost amounts to ~$2M/year
  • Operations partners sought to share cost and make use of data for breakthrough science
  • In-kind contributions encouraged: technical human power, e.g., professional software engineers, observers
  • Scholar and student exchanges

SLIDE 30

IT Challenges

Each raw image from a single Pan-STARRS camera will contain 2 Gbytes (2 bytes per pixel). In the full survey mode, typical exposures last 30 seconds, so the raw data rate is several terabytes per night for the full telescope system.

The amount of data produced by Pan-STARRS is so large that it will not be practical to archive every image. Software techniques are therefore being developed to extract the important information from the images, while allowing less crucial information to be discarded.
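The "several terabytes per night" figure follows directly from the image size and exposure time. A minimal worked example, taking the 2 Gbytes per image and 30 s exposures from the slide and the four cameras from the project overview. The usable night lengths are assumptions, and readout, slewing, and weather losses are ignored.

```python
BYTES_PER_IMAGE = 2_000_000_000   # "2 Gbytes" per raw image, from the slide
EXPOSURE_S = 30                   # typical exposure time, from the slide
N_CAMERAS = 4                     # four telescopes in the full system

def nightly_raw_bytes(night_hours, cameras=N_CAMERAS):
    """Raw bytes per night, ignoring readout, slewing, and weather losses."""
    exposures = int(night_hours * 3600) // EXPOSURE_S
    return exposures * BYTES_PER_IMAGE * cameras

for hours in (6, 8, 10):          # assumed usable night lengths
    print(f"{hours} h: {nightly_raw_bytes(hours) / 1e12:.1f} TB")
```

Under these assumptions a single camera produces about 2 TB in an 8-hour night, and the four-camera system several times that.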

SLIDE 31

Pan-STARRS (泛星計畫)

Panoramic Survey Telescope and Rapid Response System

  • U of Hawaii
  • 4 x 1.8 m telescopes
  • Orthogonal Transfer Array CCD technology (Gpix)
  • 7 sq deg
  • Whole sky patrol every 4-7 days down to 24th mag … a 10-year movie of the cosmos
  • Moving, transient, and variable objects
  • Several TB/night

SLIDE 32

The Data Flow

Subsystems
  • TEL – Telescopes
  • CAM – Cameras
  • OTIS – Observatory, Telescope & Instrument Software
  • IPP – Image Processing Pipeline
  • MOPS – Moving Object Processing Software
  • PSPS – Published Science Product Subsystem

SLIDE 33

Summit Process Flow Diagram

SLIDE 34

PS1 Data Storage Requirements

  • Capable of saving all raw data from year 1 and one stacked image with about 0.5 Petabytes of storage
  • With this approach it will be possible to re-reduce the AP survey during year 2 with the global astrometric and photometric solutions.

SLIDE 35

Published Science Product Subsystem (PSPS)

  – provide the archival storage for the project’s science products
  – protect the integrity of the scientific data products
  – provide data access to these products for the IfA and collaborating scientists
  – provide data access to the products to users from outside the project (subject to funding)

SLIDE 36

PSPS Overview

SLIDE 37

Database Sizing Justification

Data Product       Size (PB)
Static Sky Img     1.51
Object Data        1.43
Cum. Sky Cat.      0.19
Metadata           0.04
Postage Stamps     0.01
MOPS               0.0021
Filtered Trans.    0.00001
Total (PB)         3.19

Object DB      P2         P4Σ        P4Δ
obj/deg        2.7E+04    1.1E+05    2.0E+05
deg/fpa        7.00       7.00       7.00
FPA/night      3000.00    750.00     750.00
nights/year    250.00     250.00     250.00
bytes/obj      64.00      64.00      64.00
DB OH          4.00       4.00       4.00
Years          10.00      10.00      10.00
PB             0.36       0.36       0.67

Cum. Sky Catalog
deg            3.0E+04
obj/deg        4.3E+05
filters        6.00
bytes/obj      300.00
Compress       1.00
DB OH          4.00
Copies         2.00
PB             0.19

Static Sky Images
deg            3.0E+04
pix/deg        3.2E+08
filters        6.00
bytes / pix    7.20
Compress       0.40
DB OH          1.00
Copies         9.00
PB             1.51
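Each PB entry in the sizing tables appears to be the product of the factors listed above it. The sketch below checks that reading, assuming decimal petabytes (1e15 bytes) and the rounded inputs shown on the slide. With those inputs the P2, P4Δ, and cumulative sky catalog totals reproduce the quoted values. P4Σ comes out near 0.37 and the static sky images near 1.49, slightly off the quoted 0.36 and 1.51, presumably because the slide's inputs are rounded. The dictionary layout and function name are illustrative.

```python
from math import prod

PETABYTE = 1e15  # assumption: decimal petabytes

def size_pb(factors):
    """Multiply all sizing factors together and convert bytes to petabytes."""
    return prod(factors.values()) / PETABYTE

def object_db(obj_per_deg, fpa_per_night):
    """Sizing factors for one Object DB column (values taken from the slide)."""
    return {"obj/deg": obj_per_deg, "deg/fpa": 7, "FPA/night": fpa_per_night,
            "nights/year": 250, "bytes/obj": 64, "DB OH": 4, "years": 10}

p2 = size_pb(object_db(2.7e4, 3000))
p4_sum = size_pb(object_db(1.1e5, 750))
p4_diff = size_pb(object_db(2.0e5, 750))

cum_sky_catalog = size_pb({"deg": 3.0e4, "obj/deg": 4.3e5, "filters": 6,
                           "bytes/obj": 300, "compress": 1.0, "DB OH": 4, "copies": 2})
static_sky_images = size_pb({"deg": 3.0e4, "pix/deg": 3.2e8, "filters": 6,
                             "bytes/pix": 7.2, "compress": 0.4, "DB OH": 1, "copies": 9})

for name, value in [("P2", p2), ("P4Σ", p4_sum), ("P4Δ", p4_diff),
                    ("Cum. Sky Catalog", cum_sky_catalog),
                    ("Static Sky Images", static_sky_images)]:
    print(f"{name}: {value:.2f} PB")
```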

SLIDE 38

Prototype Data Sources

  • 1.8 billion objects from three star catalogs
    – US Naval Observatory catalog B (USNOB)
    – 2 Micron All Sky Survey (2MASS)
    – (YB6)
  • 5.4 billion individual detections recreated
    – 10 different filters
    – 50 year time span
  • Recreating merged objects
    – Approx 1.2 million per week on existing hardware

SLIDE 39

PSPS Conceptual Design Components

SLIDE 40

Database Development Strategy

  • The data stores required for the Pan-STARRS project will be much larger than any previously developed.
  • Multi-phase development strategy:
    – Conceptual prototype using existing catalogs
    – PS-1 functional prototype
    – PS-4 operational system

Prototype Object Data Store

  • Oracle 9.2 Enterprise with Spatial & Partitioning
  • Red Hat Enterprise Linux 3.0
  • 2 CPU Dell 2650, 9GB RAM
  • 4 TB StorageTek Fibre Channel Disk

SLIDE 41

The PSPS for PS-1

  • Data volume generated by the PS-1 astrometric/photometric survey will be ~10 times larger than that of the SDSS.
  • The PSPS for PS-1 will implement a subset of the clients envisioned for the full PSPS:
    – Stationary sky catalogs (time dependent & cumulative sky)
    – Moving object products generated by MOPS

SLIDE 42

The PSPS Components

[Diagram labels: Firewall; PSPS WBI; PSPS DRL; PSPS Monitor; PSPS SSDM; PSPS ODM; PS1 PSPS; PS1 IPP; PS1 MOPS; External IfA and transient clients]

SLIDE 43

Data Retrieval Layer Query Processing Flow

[Sequence diagram; participants: Web-Based Interface (WBI), Data Management Component; time increases along the diagram]

DRL Validation
  • Query Time Estimate Request
  • Time Estimate
  • Validation Error (possible)

DRL Query Processing
  • Query plus Priority
  • Result Set
  • Query Complete Notification

DRL Resource Management
  • Result Retrieval Request
  • Result Set
  • Failure Notification (possible)
  • Query Failure Notification

DRL Result Set Cleared
  • Time Threshold Exceeded

Validation State Result Set Query Error (possible)
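The diagram labels suggest a flow of validation, query execution, result retrieval, and clearing of result sets once a time threshold is exceeded. The following is a toy model of that flow under those assumptions. The class, its methods, the placeholder time estimate, and the in-memory result store are all hypothetical and are not taken from the PSPS design.

```python
class DataRetrievalLayer:
    """Toy model: validate, run, hold results, clear them after a time threshold."""

    def __init__(self, run_query, retention_s=3600.0):
        self.run_query = run_query        # callable standing in for the data manager
        self.retention_s = retention_s    # "time threshold" before results are cleared
        self.results = {}                 # query id -> (finish time, rows)
        self.next_id = 1

    def estimate(self, query):
        """Validation step: return a crude time estimate or raise a validation error."""
        if not query.strip():
            raise ValueError("validation error: empty query")
        return 0.01 * len(query)          # placeholder estimate, in seconds

    def submit(self, query, priority, now):
        """Run a validated query; return (query id, notification text)."""
        self.estimate(query)
        query_id, self.next_id = self.next_id, self.next_id + 1
        try:
            rows = self.run_query(query, priority)
        except Exception as exc:
            return query_id, f"query failure: {exc}"
        self.results[query_id] = (now, rows)
        return query_id, "query complete"

    def retrieve(self, query_id, now):
        """Return the stored result set, or raise if it is missing or was cleared."""
        self.clear_expired(now)
        if query_id not in self.results:
            raise KeyError("result set not available")
        return self.results[query_id][1]

    def clear_expired(self, now):
        """Drop result sets older than the retention threshold."""
        expired = [q for q, (done, _) in self.results.items()
                   if now - done > self.retention_s]
        for q in expired:
            del self.results[q]


drl = DataRetrievalLayer(lambda query, priority: [("obj", 1)], retention_s=10.0)
qid, note = drl.submit("SELECT ...", priority=1, now=0.0)
rows = drl.retrieve(qid, now=5.0)
```

Passing the clock in as `now` keeps the sketch deterministic; a real service would use its own clock and a persistent result store.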

SLIDE 44

Project Operations

  • Participating Institutions:
    – UH Institute for Astronomy (Lead): Cameras, Science Software
    – MIT Lincoln Laboratories: Pixel Devices
    – Maui High Performance Computing Center: Pipeline Software
    – Science Applications International Corporation: Database Software
  • Sub-Contractors:
    – EOS Technologies: Telescope Structure and Enclosure
    – VertexRSI: Telescope Structure and Enclosure
    – Rayleigh Optical Corporation: Telescope Mirrors
    – Corning Incorporated: Mirror Glass
    – Oceanit: Optical Corrector Elements
    – Goodrich Corporation: Optical Corrector Elements
    – SAGEM: Optical Filters
    – University of Bonn: Camera Shutter Assembly
    – KC Environmental: Environmental Studies and Permitting for PS-1
  • Potential Operations Partners:
    – Taiwan, Princeton University, Harvard-Smithsonian, UC Berkeley

SLIDE 45

Pan-STARRS Major Milestones

  • First Light on Prototype Telescope (June, 2006)
  • Primary Site Selection (January, 2005)
  • Environmental Impact Statement (January, 2005)
  • Ground Breaking (January, 2006?, Depending on Completion of Permitting Process)
  • First Light on Full Pan-STARRS System (2008-9?, Depending on Available Funding)

SLIDE 46

2006/02

SLIDE 47

Conclusions

  • Astronomers have long been pushing the forefronts of IT
  • Telescope/detector technology: larger, finer observations
  • Rapid cadence: huge data volume; processing, analysis, storage, archival, distribution ... ($1 hardware, $1 software, $10 DB)
  • Need to involve software engineers, IT managers, statisticians … from the very beginning of a project to design the experiment

SLIDE 48

SLIDE 49