1
Cosmic Variability and the IT Challenges
Mass Data Management of Astronomical Databases
Wen-Ping Chen
Institute of Astronomy, National Central University, Taiwan
2006 May 04 @ ISGC2006
2
Outline
- TAOS: Taiwan-America Occultation Survey (fully operational; in Taiwan)
- Pan-STARRS: Panoramic Survey Telescope and Rapid Response System (being constructed; in Hawaii, USA)
3
4
LULIN OBSERVATORY (鹿林天文台)
Elevation 2862 m; above the inversion layer … seen from Yusan (Jade Mt)
5
LULIN OBSERVATORY: LELIS, TAOS, LOT, SLT
6
The TAOS (Taiwan-America Occultation Survey, 中美掩星計畫) project, a novel telescope array set up by groups from Taiwan, the US and Korea, began routine observations in early 2005 and has the potential to make a unique contribution to the knowledge of our Solar System.
Comet nuclei too faint to be detected by direct imaging may be “seen” when they move in front of a background star: a stellar occultation event.
7
Project Overview
- Census of the solar-system family
- An array of wide-field telescopes (D=50 cm, f/1.9, FOV=3 sq. deg) to monitor the brightness changes of ~1,000 stars at a 5 Hz rate
- Looking for a ‘blink’ of starlight (occultation) when an object (> 2 km) moves in front of a distant star
- Frequency of events → population of “interveners”
- Data rate of a few 100 GB per night; only “interesting” data downloaded via the dedicated E1 connection
- Real-time data analysis (light curves, statistics)
- Requiring coincidence detection of the same event by all the telescopes (to guard against false positives)
8
TAOS Telescopes (A, B, C, D)
Lulin Observatory, altitude = 2862 m
TAOS is the only project of its kind in the world to conduct a census of small (1-2 km size) icy bodies at the outer reaches of the solar system, using a special data acquisition scheme and a non-parametric statistical analysis scheme.
Data rate: 100 GB/night
9
Adaptive Aperture Photometry Pipeline
- Fast
- Moderately accurate
- Compensates for image motion
10
Sample Output of the TAOS Photometric Light Curves
Columns: beg. time, end time, count, ct err, x, y, rb
11
Pipeline flow from image taking to archival
[Diagram: components SCHEDULER, CAMERA, PHOTOMETRY, STATISTICS, AGGREGATOR, ARCHIVER, CARTOUCHE, and DB, with FITS files as the image products. Arrows represent the flow of messages between components.]
12
TEST DRIVE 1
CY Aqr, a known Delta Scuti star with P~88 min, was observed by TAOS on 2003 September 16 with 0.3 s sampling, here binned to 150 s for illustration.
Time-domain astrophysics
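The binning mentioned on this slide can be illustrated with a minimal sketch. At 0.3 s sampling, a 150 s bin holds 500 samples. The function name `bin_light_curve` and the synthetic counts are illustrative assumptions, not part of the TAOS pipeline.

```python
def bin_light_curve(counts, sample_s=0.3, bin_s=150.0):
    """Average consecutive samples into bins of bin_s seconds."""
    per_bin = round(bin_s / sample_s)  # 500 samples for 0.3 s -> 150 s
    n_bins = len(counts) // per_bin    # drop any incomplete final bin
    return [
        sum(counts[b * per_bin:(b + 1) * per_bin]) / per_bin
        for b in range(n_bins)
    ]

# Synthetic example (assumed data): one hour of 0.3 s samples.
samples = [1000.0] * 12000
binned = bin_light_curve(samples)
```

One hour of data collapses to 24 binned points, which is coarse but still resolves a pulsation period of about 88 minutes.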
13
TEST DRIVE 2
On 2004 February 21, TAOS detected the occultation event of HIP 079407 (mV=8.8 mag) by (51) Nemausa (mV=11.9; D~150 km).
Prediction by Isao Sato (左藤勳)
Δt ~ 6.25 +/- 0.50 s
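As a rough consistency check on these numbers, the diameter and duration imply a speed for the asteroid's shadow across the observer. This is only a sketch: it assumes a central chord across a spherical body, which the slide does not state. Since any real chord is no longer than the diameter, the result is best read as an upper limit. The function name is hypothetical.

```python
def shadow_speed_km_s(diameter_km, duration_s):
    """Upper limit on shadow speed, assuming the chord equals the diameter."""
    return diameter_km / duration_s

# Values quoted on the slide: D ~ 150 km, duration 6.25 +/- 0.50 s.
v = shadow_speed_km_s(150.0, 6.25)
v_slow = shadow_speed_km_s(150.0, 6.25 + 0.50)  # longest duration
v_fast = shadow_speed_km_s(150.0, 6.25 - 0.50)  # shortest duration
```

Under that assumption the shadow speed is about 24 km/s, with the quoted timing uncertainty spanning roughly 22 to 26 km/s.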
14
TEST DRIVE 3
TAOS/A, TAOS/B
On 2004 June 05, TAOS detected the occultation of HIP 050525 (mV~8.46 mag) by (1723) Klemola (mV~15.7 mag; D~31 km) with two telescopes.
The enclosure was opened by a resident assistant and observations were carried out remotely from Taipei.
15
TEST DRIVE 4
On 2006 Feb 06, three TAOS telescopes detected the occultation of TYC 076200961 (mV ~ 11.83) by (286) Iclea (mV ~ 14.0 mag, D ~ 97 km).
16
Data Acquisition
Typical CCD imaging
- Every star, together with the surrounding sky, is exposed at the same time
TAOS data
- Integrate for 200 ms and then read out 32 rows of pixels, while the shutter remains open
- The sequence continues, so each star appears as a series of dots (a ‘zipper’)
- ‘Fake’ neighboring stars and skies!
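The shutter-open readout described on this slide can be sketched with a toy one-dimensional model. Only the 32-row readout block and the integrate-then-shift cycle come from the slide; the chip height, star position, and flux values are assumptions chosen for illustration, and each loop iteration stands for one 200 ms integration.

```python
def zipper_readout(n_rows=128, block=32, star_row=70,
                   star_flux=100.0, sky=1.0, n_cycles=8):
    """Toy model of shutter-open readout along one CCD column.

    Each cycle integrates light on every row, then reads out the
    bottom `block` rows and shifts the remaining charge down.
    """
    ccd = [0.0] * n_rows
    blocks = []
    for _ in range(n_cycles):
        # Integration: sky falls on every row, the star on one fixed row.
        for r in range(n_rows):
            ccd[r] += sky
        ccd[star_row] += star_flux
        # Readout: take the bottom block, shift the rest down by `block`.
        blocks.append(ccd[:block])
        ccd = ccd[block:] + [0.0] * block
    return blocks

blocks = zipper_readout()
```

Once the chip has filled, every block read out contains one image of the star, so the star shows up as a regular series of dots. In this model each block has also collected sky from four different positions on the chip (128 rows / 32 rows), which is one way to picture the ‘fake’ sky background.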
17
Event Detection --- Rank Statistics
- Use the rank, instead of the flux, to quantify the light curve
- A true occultation event should have the lowest rank in all telescopes
- No need for highly accurate flux
- Speed
- Conditional probability
- Low false-positive rates
[Equation: test statistic Z, written with log10 terms, a product (Π) over the telescopes i = 1 to 4, and the symbols S, W, and w_i]
[Figure: simulated light curves from each of the four telescopes and the resulting rank statistics, with and without an occultation]
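The idea of ranking each light curve and combining the ranks across telescopes can be sketched as follows. This is a minimal illustration, not the TAOS implementation: it assumes the statistic is the negative log10 of the product of normalized ranks over the four telescopes, and the noise model and injected dip are invented for the example.

```python
import math
import random

def ranks(flux):
    """Return 1-based ranks for each sample (1 = lowest flux)."""
    order = sorted(range(len(flux)), key=lambda k: flux[k])
    r = [0] * len(flux)
    for rank, k in enumerate(order, start=1):
        r[k] = rank
    return r

def rank_statistic(light_curves):
    """Assumed form: z(t) = -sum_i log10(rank_i(t) / N) over telescopes."""
    n = len(light_curves[0])
    per_tel = [ranks(lc) for lc in light_curves]
    return [-sum(math.log10(r[t] / n) for r in per_tel) for t in range(n)]

random.seed(0)
n_samples, n_tel, event = 1000, 4, 500
curves = []
for _ in range(n_tel):
    lc = [random.gauss(100.0, 5.0) for _ in range(n_samples)]
    # Inject a dip just below the lowest noise sample in this telescope
    # (chosen so that the example is deterministic).
    lc[event] = min(lc) - 1.0
    curves.append(lc)

z = rank_statistic(curves)
best = max(range(n_samples), key=lambda t: z[t])
```

Because only ranks enter the statistic, the photometry does not need an accurate flux scale. A sample that ranks lowest in all four telescopes reaches the maximum possible value, 4 log10(N), even when the dip in any single light curve is barely below the noise.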
18
Higher flux ranking
An event can be detected even if it is not obvious in the data
19
Panoramic Survey Telescope and Rapid Response System
20
Project Overview
- All-sky survey (3π)
- Frequent revisit (cadence 4-7 days)
- Wide-field imaging
- Short duty cycle
- Efficient operations
- An array of 4 telescopes, located in Hawaii, each of D=1.8 m, equipped with a 1.4 gigapixel camera with an Orthogonal Transfer Array CCD detector (=40 cm square focal plane)
- 7 square-degree FOV with 0.26” pixels
- Detection of moving, transient, and variable celestial objects down to very faint limits
- Accumulate very deep sky images
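The quoted pixel count, pixel scale, and field of view can be cross-checked with simple arithmetic. The sketch below assumes square pixels that tile the field without gaps, which is an idealization; the function name is illustrative.

```python
def fov_sq_deg(n_pixels, pixel_arcsec):
    """Sky area covered by n_pixels square pixels of the given scale."""
    pixel_deg = pixel_arcsec / 3600.0
    return n_pixels * pixel_deg ** 2

# Values from the slide: 1.4 gigapixels at 0.26 arcsec per pixel.
fov = fov_sq_deg(1.4e9, 0.26)
```

The result is about 7.3 square degrees, consistent with the 7 square-degree field quoted above.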
21
The Sciences
- First large-field survey program to open the time domain in astronomical observing: transient sources in time & space
- Multiple survey modes, both wide field (ecliptic & “sweet spots”, all sky 3π) and selected deep fields:
  – Solar System (PHA emphasis)
  – Cosmology (weak gravitational lensing, supernovae, GRBs)
  – Galactic Structure
- Ultra-deep static images (R < 23.5 mag)
22
Pan-STARRS Minor Planet Summary
[Chart: numbers of objects on a logarithmic axis from 1 to 10,000,000, for three series: Known, PS 1 Year, PS 10 Years. Categories: NEO / PHO, Main Belt, Jovian Trojans, Other Trojans, Centaurs, Comets, TNOs, Wide TNO Binaries, Companions, Interstellar Visitors]
23
The Telescopes
24
The Detector
- Independently addressable orthogonal transfer CCDs (cells)
- Reducing cost by increasing yield
- Fast readout: gigapixels in 2 s
- On-chip guiding
- Minimizing effects of bright stars
- Compensating for image motion
25
OTCCDs Work
M13, I band, 300 sec
- Telescope guiding only: 0.59" FWHM psf
- With OT tracking: 0.45" FWHM psf
- 7 Hz frame rate
Note: the arrows point to two examples in the images where the improvement in image quality due to OT tracking is clearly evident.
26
The Site(s)
Prototype telescope site --- Haleakala High Altitude Observatory (Maui)
27
28
Eventual Mauna Kea site for Pan-STARRS
29
The Budget
- $60M funded by Congress through the US Air Force (2002)
- Funding for construction only: telescopes, detectors, and “working” control software and analysis pipelines
- Annual operations cost amounts to ~$2M/year
- Operations partners sought to share cost and make use of data for breakthrough science
- In-kind contributions encouraged: technical human power, e.g., professional software engineers, observers
- Scholar and student exchanges
30
IT Challenges
Each raw image from a single Pan-STARRS camera will contain 2 Gbytes (2 bytes per pixel). In the full survey mode, typical exposures last 30 seconds, so the raw data rate is several terabytes per night for the full telescope system. The amount of data produced by Pan-STARRS is so large that it will not be practical to archive every image. Software techniques are therefore being developed to extract the important information from the images, while allowing less crucial information to be discarded.
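The "several terabytes per night" figure can be reproduced with a back-of-the-envelope sketch. The image size and exposure time are taken from the text, and four cameras follow from the four-telescope design. The 8-hour night, the absence of readout overhead, and the convention 1 TB = 1000 GB are assumptions made here for illustration.

```python
def nightly_raw_tb(n_cameras=4, gb_per_image=2.0,
                   exposure_s=30.0, night_hours=8.0):
    """Raw data volume per night in TB, ignoring readout overhead."""
    exposures = int(night_hours * 3600 / exposure_s)  # per camera
    return n_cameras * gb_per_image * exposures / 1000.0

full_system = nightly_raw_tb()
single_camera = nightly_raw_tb(n_cameras=1)
```

Under these assumptions a single camera produces just under 2 TB per night and the full system close to 8 TB, which is in line with the figure quoted above.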
31
Pan-STARRS (泛星計畫)
Panoramic Survey Telescope and Rapid Response System
- U of Hawaii
- 4 x 1.8 m telescopes
- Orthogonal Transfer Array CCD technology (G pix)
- 7 sq deg
- Whole-sky patrol every 4-7 days down to 24th mag … a 10-year movie of the cosmos
- Moving, transient, and variable objects
- Several TB/night
32
The Data Flow
Subsystems
- TEL – Telescopes
- CAM – Cameras
- OTIS – Observatory, Telescope & Instrument Software
- IPP – Image Processing Pipeline
- MOPS – Moving Object Processing Software
- PSPS – Published Science Product Subsystem
33
[Figure: Summit Process Flow Diagram]
34
PS1 Data Storage Requirements
- Capable of saving all raw data from year 1 and one stacked image with about 0.5 Petabytes of storage
- With this approach it will be possible to re-reduce the AP (astrometric/photometric) survey during year 2 with the global astrometric and photometric solutions.
35
Published Science Product Subsystem (PSPS)
– Provide the archival storage for the project’s science products
– Protect the integrity of the scientific data products
– Provide data access to these products for the IfA and collaborating scientists
– Provide data access to the products to users from outside the project (subject to funding)
36
PSPS Overview
37
Database Sizing Justification

Object DB      P2        P4Σ       P4Δ
obj/deg        2.7E+04   1.1E+05   2.0E+05
deg/fpa        7.00      7.00      7.00
FPA/night      3000.00   750.00    750.00
nights/year    250.00    250.00    250.00
bytes/obj      64.00     64.00     64.00
DB OH          4.00      4.00      4.00
Years          10.00     10.00     10.00
PB             0.36      0.36      0.67

Cum. Sky Catalog
deg            3.0E+04
obj/deg        4.3E+05
filters        6.00
bytes/obj      300.00
Compress       1.00
DB OH          4.00
Copies         2.00
PB             0.19

Static Sky Images
deg            3.0E+04
pix/deg        3.2E+08
filters        6.00
bytes / pix    7.20
Compress       0.40
DB OH          1.00
Copies         9.00
PB             1.51

Data Product Size (PB)
Static Sky Img     1.51
Object Data        1.43
Cum. Sky Cat.      0.19
Metadata           0.04
Postage Stamps     0.01
MOPS               0.0021
Filtered Trans.    0.00001
Total (PB)         3.19
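The sizing figures on this slide appear to be simple products of the listed factors. The sketch below multiplies them out under that reading, assuming 1 PB = 10^15 bytes; the helper name `size_pb` and the dictionary layout are illustrative only.

```python
from math import prod

PB = 1e15  # assumption: 1 PB = 10**15 bytes

def size_pb(*factors):
    """Multiply the sizing factors and convert bytes to petabytes."""
    return prod(factors) / PB

# Object DB: obj/deg, deg/fpa, FPA/night, nights/year, bytes/obj, DB OH, years
object_db = {
    "P2": size_pb(2.7e4, 7, 3000, 250, 64, 4, 10),
    "P4Σ": size_pb(1.1e5, 7, 750, 250, 64, 4, 10),
    "P4Δ": size_pb(2.0e5, 7, 750, 250, 64, 4, 10),
}
# Cumulative sky catalog: deg, obj/deg, filters, bytes/obj, compress, DB OH, copies
cum_sky_catalog = size_pb(3.0e4, 4.3e5, 6, 300, 1.0, 4, 2)
# Static sky images: deg, pix/deg, filters, bytes/pix, compress, DB OH, copies
static_sky_images = size_pb(3.0e4, 3.2e8, 6, 7.2, 0.4, 1, 9)
```

The products come out at roughly 0.36, 0.37, and 0.67 PB for the three object databases, 0.19 PB for the cumulative sky catalog, and 1.49 PB for the static sky images. These match the slide's values to within rounding, which suggests the inputs shown on the slide were themselves rounded.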
38
Prototype Data Sources
- 1.8 billion objects from three star catalogs
  – US Naval Observatory catalog B (USNOB)
  – 2 Micron All Sky Survey (2MASS)
  – (YB6)
- 5.4 billion individual detections recreated
  – 10 different filters
  – 50 year time span
- Recreating merged objects
  – Approx. 1.2 million per week on existing hardware
39
PSPS Conceptual Design Components
40
Database Development Strategy
- The data stores required for the Pan-STARRS project will be much larger than any previously developed.
- Multi-phase development strategy:
  – Conceptual prototype using existing catalogs
  – PS-1 functional prototype
  – PS-4 operational system
Prototype Object Data Store
- Oracle 9.2 Enterprise with Spatial & Partitioning
- Red Hat Enterprise Linux 3.0
- 2 CPU Dell 2650 9GB RAM
- 4 TB StorageTek Fibre Channel Disk
41
The PSPS for PS-1
- Data volume generated by the PS-1 astrometric/photometric survey will be ~10 times larger than that of the SDSS.
- The PSPS for PS-1 will implement a subset of the clients envisioned for the full PSPS:
  – Stationary sky catalogs (time dependent & cumulative sky)
  – Moving object products generated by MOPS
42
The PSPS Components
[Diagram labels: PS1 PSPS, containing PSPS WBI, PSPS DRL, PSPS Monitor, PSPS SSDM, and PSPS ODM; three firewalls; PS1 IPP; PS1 MOPS; external, IfA, and transient clients]
43
Data Retrieval Layer Query Processing Flow
[Sequence diagram with an axis labeled "Increasing Time". Participants: Web-Based Interface (WBI), DRL Validation, DRL Query Processing, DRL Resource Management, Data Management Component. Labels shown: Query Time Estimate Request; Time Estimate; Validation State; Validation Error (possible); Query plus Priority; Query Complete Notification; Query Failure Notification; Query Error (possible); Result Retrieval Request; Result Set; Failure Notification (possible); Time Threshold Exceeded; Result Set Cleared]
44
Project Operations
- Participating Institutions:
  – UH Institute for Astronomy (Lead): Cameras, Science Software
  – MIT Lincoln Laboratory: Pixel Devices
  – Maui High Performance Computing Center: Pipeline Software
  – Science Applications International Corporation: Database Software
- Sub-Contractors:
  – EOS Technologies: Telescope Structure and Enclosure
  – VertexRSI: Telescope Structure and Enclosure
  – Rayleigh Optical Corporation: Telescope Mirrors
  – Corning Incorporated: Mirror Glass
  – Oceanit: Optical Corrector Elements
  – Goodrich Corporation: Optical Corrector Elements
  – SAGEM: Optical Filters
  – University of Bonn: Camera Shutter Assembly
  – KC Environmental: Environmental Studies and Permitting for PS-1
- Potential Operations Partners:
  – Taiwan, Princeton University, Harvard-Smithsonian, UC Berkeley
45
Pan-STARRS Major Milestones
- First Light on Prototype Telescope (June, 2006)
- Primary Site Selection (January, 2005)
- Environmental Impact Statement (January, 2005)
- Ground Breaking (January, 2006?, depending on completion of permitting process)
- First Light on Full Pan-STARRS System (2008-9?, depending on available funding)
46
2006/02
47
Conclusions
- Astronomers' demands have been pushing the forefront of IT
- Telescope/detector technology → larger, finer observations
- Rapid cadence → huge data volume → processing, analysis, storage, archival, distribution ... ($1 hardware, $1 software, $10 DB)
- Need to involve software engineers, IT managers, statisticians … from the very beginning of a project to design the experiment
48
49