SPATIAL SEARCHES IN ASTRONOMY DATABASES: MULTI-DIMENSIONAL INDEXING



SLIDE 1

Tamás Budavári (Johns Hopkins University)

7/16/2012

SLIDE 2

SPATIAL SEARCHES IN ASTRONOMY DATABASES

MULTI-DIMENSIONAL INDEXING FOR SIMULATIONS AND OBSERVATIONS

Tamás Budavári (Johns Hopkins University)

SLIDE 3

Storing Simulations

• Millennium Run (MPA)
    • 10 billion particles, 64 snapshots
    • FoF (friends-of-friends) groups and merger trees
• Millennium XXL
    • 300 billion particles
• MultiDark – Bolshoi
• Turbulence simulations (JHU)
    • 1024^4 grid, 27 TB

7/16/2012 ISSAC at HiPACC

SLIDE 4

Storing Simulations

Kai Bürger (TUM, JHU)

SLIDE 5

Observing Simulations

• Comparison to real observations
• Lots of spatial searches
• In the database?

SLIDE 6

Sky Coverage

• For precise window function
• Virtual surveys

SLIDE 7

Outline

• Query shapes in SQL
• Indexing with space-filling curves
• Combine for spatial searches
    • Periodic boxes
    • Celestial sphere

SLIDE 8

Databases

• Which one to use depends on the task
    • SQLite, MySQL, PostgreSQL, DB2, Oracle, SQL Server
    • Free “express versions” of the big ones, too
• Customization is a must
    • There is always something missing
    • Extend by loading your libraries

SLIDE 9

Query Shapes

• IShape interface
    TopoPoint Contains(Point p);
    TopoShape GetTopo(Box b);
    Box GetBoundingBox();
• Geometric primitives
    • Sphere, Box, Cone…


SLIDE 11

Query Shapes

• IShape interface
    TopoPoint Contains(Point p);
    TopoShape GetTopo(Box b);
    Box GetBoundingBox();
• Composites
    • Intersect, Union, Difference…
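The three IShape methods can be illustrated with a minimal Python sketch. The talk's library is C#, so everything here is an assumption made for illustration: the `Topo` enum stands in for TopoShape, `contains` returns a plain boolean instead of a TopoPoint, and the class names `Box`, `Sphere` and `Intersect` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

Point = Tuple[float, float, float]


class Topo(Enum):
    """Hypothetical stand-in for the slide's TopoShape result."""
    INSIDE = "inside"    # the box lies entirely within the shape
    OUTSIDE = "outside"  # the box and the shape do not overlap
    PARTIAL = "partial"  # the box straddles the boundary (or we cannot tell)


@dataclass(frozen=True)
class Box:
    lo: Point
    hi: Point

    def contains(self, p: Point) -> bool:
        return all(l <= c <= h for l, c, h in zip(self.lo, p, self.hi))

    def get_topo(self, b: "Box") -> Topo:
        axes = list(zip(self.lo, self.hi, b.lo, b.hi))
        if all(sl <= bl and bh <= sh for sl, sh, bl, bh in axes):
            return Topo.INSIDE
        if any(bh < sl or bl > sh for sl, sh, bl, bh in axes):
            return Topo.OUTSIDE
        return Topo.PARTIAL

    def get_bounding_box(self) -> "Box":
        return self


@dataclass(frozen=True)
class Sphere:
    center: Point
    radius: float

    def contains(self, p: Point) -> bool:
        return sum((c - x) ** 2 for c, x in zip(self.center, p)) <= self.radius ** 2

    def get_topo(self, b: Box) -> Topo:
        axes = list(zip(self.center, b.lo, b.hi))
        # Squared distance from the centre to the nearest / farthest point of the box.
        near2 = sum((c - min(max(c, l), h)) ** 2 for c, l, h in axes)
        far2 = sum(max(abs(c - l), abs(c - h)) ** 2 for c, l, h in axes)
        if far2 <= self.radius ** 2:
            return Topo.INSIDE
        if near2 > self.radius ** 2:
            return Topo.OUTSIDE
        return Topo.PARTIAL

    def get_bounding_box(self) -> Box:
        r = self.radius
        return Box(tuple(c - r for c in self.center), tuple(c + r for c in self.center))


@dataclass(frozen=True)
class Intersect:
    a: object
    b: object

    def contains(self, p: Point) -> bool:
        return self.a.contains(p) and self.b.contains(p)

    def get_topo(self, box: Box) -> Topo:
        ta, tb = self.a.get_topo(box), self.b.get_topo(box)
        if Topo.OUTSIDE in (ta, tb):
            return Topo.OUTSIDE
        if ta is Topo.INSIDE and tb is Topo.INSIDE:
            return Topo.INSIDE
        return Topo.PARTIAL  # conservative: the exact answer may be OUTSIDE

    def get_bounding_box(self) -> Box:
        # Overlap of the two bounding boxes (lo > hi on some axis means empty).
        ba, bb = self.a.get_bounding_box(), self.b.get_bounding_box()
        return Box(tuple(map(max, ba.lo, bb.lo)), tuple(map(min, ba.hi, bb.hi)))
```

A composite only needs the same three methods as a primitive, which is what lets shapes nest: `Intersect(Sphere(...), Box(...))` can itself be passed to another `Intersect`.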

SLIDE 12

Query Shapes

• In SQL
    • UDT (user-defined type)

SLIDE 13

Query Shapes

• Generic
• UDT
• Boolean
• Methods


SLIDE 15

Indexing Tables

• Better performance of queries
    • Instantaneous range searches
    • Fast JOINs
• Syntax
    CREATE INDEX ix_Name ON Table (X ASC, …) INCLUDE (V, …)
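A minimal sketch of an indexed range search, using Python's built-in sqlite3 module rather than the database used in the talk. The table `Particle` and its columns are made up for illustration. SQLite has no INCLUDE clause, so the extra column is appended to the index key instead, which also lets the query be answered from the index alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Particle (id INTEGER PRIMARY KEY, x REAL, v REAL)")
conn.executemany(
    "INSERT INTO Particle (x, v) VALUES (?, ?)",
    [(float(i), float(i % 7)) for i in range(10000)],
)

# Assumption: v is added as a trailing key column to mimic INCLUDE (v).
conn.execute("CREATE INDEX ix_Particle_x ON Particle (x ASC, v)")

query = "SELECT x, v FROM Particle WHERE x BETWEEN 100 AND 109"

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
plan_text = " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
rows = conn.execute(query).fetchall()
```

Inspecting `plan_text` shows whether the planner chose the index for the range predicate instead of scanning the whole table.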

SLIDE 16

Multi-Dimensional

• Map the space to a simple index
• Different kinds of Space-Filling Curves
    • Morton’s Z-curve
    • Peano-Hilbert Curve

SLIDE 17

Peano-Hilbert Curve

• Hierarchical space filling
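To make the hierarchical construction concrete, here is a sketch of the widely used iterative mapping from a 2D grid cell to its position along a Hilbert curve. It is a 2D illustration only; the function name `hilbert_key` and the `order` parameter are assumptions, and the talk does not say which variant its databases use.

```python
def hilbert_key(order: int, x: int, y: int) -> int:
    """Position of cell (x, y) along a Hilbert curve on a 2**order x 2**order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        # Which quadrant of the current square does the cell fall in?
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the sub-square is traversed in the base orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# Visit an 8 x 8 grid in curve order.
path = sorted((hilbert_key(3, x, y), x, y) for x in range(8) for y in range(8))
```

Each level of the loop refines the key by one quadrant, so a coarse cell corresponds to one contiguous block of keys, and consecutive keys are always neighbouring cells.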


SLIDE 21

Also others…

• Morton Z-order
    • Simple bit interleave
• Etc…
• Which one to use?
    • Statistical analyses
    • Correlation function
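The bit interleave behind the Morton Z-order fits in a few lines. This is a sketch for 3D integer grid coordinates; the function names and the default of 10 bits per axis are illustrative choices, not taken from the talk.

```python
def morton_key(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z (x in the lowest position)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def morton_decode(key: int, bits: int = 10):
    """Inverse of morton_key: recover (x, y, z) from the interleaved key."""
    x = y = z = 0
    for i in range(bits):
        x |= ((key >> (3 * i)) & 1) << i
        y |= ((key >> (3 * i + 1)) & 1) << i
        z |= ((key >> (3 * i + 2)) & 1) << i
    return x, y, z
```

Because the top bits of the key come from the top bits of each coordinate, every octree cell maps to one contiguous key range, which is the property the covers rely on.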

SLIDE 22

Divide and Conquer

SLIDE 27

Covers for Shapes

• Inside approximation
• Outside overshoot
• They are Key ranges

Figure labels:
    Key between 0 and 3
    Key between 0 and 7
    Key between 0 and 3
    Key between 8 and 11
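How a shape turns into key ranges can be sketched with a quadtree over the unit square and Z-order keys. This is a 2D illustration under assumptions: the query is a circle, the names `cover_circle` and `merge_ranges` are hypothetical, and the recursion depth is fixed by the caller.

```python
def merge_ranges(ranges):
    """Merge touching or overlapping inclusive (lo, hi) ranges."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def cover_circle(cx, cy, r, depth, size=1.0):
    """Return (inner, outer) lists of inclusive Z-order key ranges at `depth`."""
    inner, outer = [], []

    def visit(x0, y0, s, level, prefix):
        # Squared distance from the centre to the nearest / farthest point of the cell.
        near2 = (cx - min(max(cx, x0), x0 + s)) ** 2 + (cy - min(max(cy, y0), y0 + s)) ** 2
        far2 = max(abs(cx - x0), abs(cx - x0 - s)) ** 2 + max(abs(cy - y0), abs(cy - y0 - s)) ** 2
        if near2 > r * r:
            return                                  # cell entirely outside
        shift = 2 * (depth - level)
        lo, hi = prefix << shift, ((prefix + 1) << shift) - 1
        if far2 <= r * r:                           # cell entirely inside
            inner.append((lo, hi))
            outer.append((lo, hi))
        elif level == depth:                        # boundary cell at the finest level
            outer.append((lo, hi))
        else:
            h = s / 2
            for q in range(4):                      # Z-order: bit 0 = x half, bit 1 = y half
                visit(x0 + (q & 1) * h, y0 + (q >> 1) * h, h, level + 1, (prefix << 2) | q)

    visit(0.0, 0.0, size, 0, 0)
    return merge_ranges(inner), merge_ranges(outer)


inner, outer = cover_circle(0.25, 0.25, 0.2, depth=2)
```

For this small circle the inner cover is empty and the outer cover collapses to the single range of keys 0 to 3. Points in inner ranges can be accepted without further work; points that fall only in outer ranges still need the exact shape test.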


SLIDE 29

Periodic Boundaries

• Infinite with periodicity
• Have to search all boxes
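In practice only the replicas that the query shape actually touches need to be searched. A sketch for a spherical query in a cubic box follows; the function name `periodic_shifts` and its arguments are hypothetical.

```python
import math
from itertools import product


def periodic_shifts(center, radius, box):
    """Integer shift vectors of the box replicas that a query sphere overlaps."""
    # Candidate replicas along each axis, from the sphere's bounding box.
    axes = [
        range(math.floor((c - radius) / box), math.floor((c + radius) / box) + 1)
        for c in center
    ]
    shifts = []
    for n in product(*axes):
        # Squared distance from the centre to the nearest point of replica n.
        d2 = 0.0
        for c, k in zip(center, n):
            lo, hi = k * box, (k + 1) * box
            d2 += (c - min(max(c, lo), hi)) ** 2
        if d2 <= radius ** 2:
            shifts.append(n)
    return shifts


# A sphere near the x = 0 face of a box of side 100 spills into one neighbour.
shifts = periodic_shifts((1.0, 50.0, 50.0), 5.0, 100.0)
```

For each returned shift `n`, the search in the stored box uses the query centre moved by `-n * box` along each axis.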


SLIDE 31

Searching in SQL

• Key filter
    • By Cover
• ShiftX,-Y,-Z
• Where?
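The key filter plus exact test can be sketched as a single SQL join, here run through Python's sqlite3 module on a toy 2D data set. Everything in it is an assumption for illustration: the tables `Particle` and `Cover`, the column names, and the hand-built cover of a circle on a 4 x 4 Z-order grid. Periodic shifts are left out to keep the example short.

```python
import random
import sqlite3


def z_key(ix: int, iy: int, depth: int) -> int:
    """Interleave the bits of (ix, iy): x in the even bits, y in the odd bits."""
    key = 0
    for i in range(depth):
        key |= ((ix >> i) & 1) << (2 * i)
        key |= ((iy >> i) & 1) << (2 * i + 1)
    return key


DEPTH = 2                    # 4 x 4 grid on the unit square
CX, CY, R = 0.25, 0.25, 0.4  # query circle

random.seed(1)
points = [(random.random(), random.random()) for _ in range(2000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Particle (id INTEGER PRIMARY KEY, zkey INTEGER, x REAL, y REAL)")
conn.executemany(
    "INSERT INTO Particle (id, zkey, x, y) VALUES (?, ?, ?, ?)",
    [(i, z_key(min(int(x * 4), 3), min(int(y * 4), 3), DEPTH), x, y)
     for i, (x, y) in enumerate(points)],
)
conn.execute("CREATE INDEX ix_Particle_zkey ON Particle (zkey)")

# Hand-built cover of the circle: keys 0..3 lie fully inside it, the other
# ranges only touch it and therefore need the exact distance test.
conn.execute("CREATE TABLE Cover (lo INTEGER, hi INTEGER, full_inside INTEGER)")
conn.executemany(
    "INSERT INTO Cover VALUES (?, ?, ?)",
    [(0, 3, 1), (4, 4, 0), (6, 6, 0), (8, 9, 0), (12, 12, 0)],
)

sql = """
SELECT p.id
FROM Cover AS c
JOIN Particle AS p ON p.zkey BETWEEN c.lo AND c.hi
WHERE c.full_inside = 1
   OR (p.x - :cx) * (p.x - :cx) + (p.y - :cy) * (p.y - :cy) <= :r * :r
"""
found = {row[0] for row in conn.execute(sql, {"cx": CX, "cy": CY, "r": R})}

# Brute-force reference without any index.
brute = {i for i, (x, y) in enumerate(points)
         if (x - CX) * (x - CX) + (y - CY) * (y - CY) <= R * R}
```

The join on key ranges discards most rows through the index; the distance expression only runs on rows from partially covered cells.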

SLIDE 32

Real!

• E.g.,

SLIDE 33

Online Interfaces

• Largest simulations
    • Search and visualize
    • 10 billion+ objects and growing…
• Indra 512 simulations
    • Coming soon at JHU

SLIDE 34

Millennium XXL

SLIDE 35

Web Services

• Programming interfaces
    • Execute SQL queries
    • Most flexible
• Inject probes in simulations
    • Turbulence
    • Cosmology

SLIDE 36

Sky Coverage

SLIDE 37

No Sky Coverage?

SLIDE 38

Spherical Geometry

• Green area: A ∩ (B − ε) should find B if it contains an A and is not masked
• Yellow area: A ∩ (B ± ε) is an edge case; may find B if it contains an A

SLIDE 39

Approaches to Consider

• Pixel maps
    • Sensitivity, etc…
• Equations of shapes
    • Spherical “vector graphics”
• And beyond…

SLIDE 40

An Observation

• FITS header with WCS
    • Image dimensions map to the geometry
• More exposures?
    • No common pixel coordinate system
    • Overlapping areas

SLIDE 41

Common Pixels

• Pre-defined pages of an atlas
    • Standard in cartography
• Image pyramids of hierarchical pixels
    • Including HTM, Igloo, HEALPix, SDSSPix, etc…
• Always approximate!

SLIDE 42

Practical Implementation

• Looking at Terapixels
    • We know how to work with images
    • Now have commodity Internet
    • We have cheap hard drives
    • WorldWideTelescope.org, Sky in Google Earth
• Integrated catalogs for efficiency
    • How about more surveys?

SLIDE 43

Drawing with Equations

• Working with 3D normal vectors
• Benefits include
    • No wraparound
    • No projections
    • No singularities
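A short sketch of why unit vectors avoid the wraparound and pole problems of angular coordinates. The function names are illustrative, and angles are assumed to be right ascension and declination in degrees.

```python
import math


def radec_to_unit(ra_deg: float, dec_deg: float):
    """Convert (RA, Dec) in degrees to a 3D unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))


def separation_deg(u, v) -> float:
    """Angle between two unit vectors, in degrees."""
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    # atan2 of |u x v| and u . v stays accurate for both tiny and large angles.
    return math.degrees(math.atan2(math.sqrt(cx * cx + cy * cy + cz * cz), dot))


# Two points on either side of RA = 0: no special case for the 360 -> 0 wrap.
across_wrap = separation_deg(radec_to_unit(359.5, 0.0), radec_to_unit(0.5, 0.0))

# Two very different RA values at the pole describe the same direction.
at_pole = separation_deg(radec_to_unit(10.0, 90.0), radec_to_unit(200.0, 90.0))
```

Both cases fall out of the same vector arithmetic, with no branch for the RA wrap or the coordinate singularity at the pole.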

SLIDE 44

Drawing with Equations

• Direct 3D approach
    • Halfspace
    • Circle/Cap
    • Convex
    • Simple shapes
• Region
    • Unions of convexes
    • Patches on the sphere

SLIDE 45

Point in Region Test

• Halfspace: one side of a plane
    • Inside, when $\vec{n} \cdot \vec{x} > c$ for the halfspace $(\vec{n}, c)$
• Convex: a collection of halfspaces
    • Inside, when inside all halfspaces
• Region: a collection of convexes
    • Inside, when inside any convex
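The three-level containment test maps directly onto `all` and `any`. This sketch assumes a halfspace is stored as a pair (n, c) with n a unit vector, and that "inside" means n · x > c with a strict inequality; the type aliases and function names are illustrative.

```python
from typing import List, Tuple

Vec = Tuple[float, float, float]
Halfspace = Tuple[Vec, float]   # (normal n, offset c)
Convex = List[Halfspace]        # intersection of halfspaces
Region = List[Convex]           # union of convexes


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def in_halfspace(x: Vec, h: Halfspace) -> bool:
    n, c = h
    return dot(n, x) > c


def in_convex(x: Vec, convex: Convex) -> bool:
    return all(in_halfspace(x, h) for h in convex)


def in_region(x: Vec, region: Region) -> bool:
    return any(in_convex(x, convex) for convex in region)


# A cap within 60 degrees of the north pole (cos 60 deg = 0.5) ...
north_cap: Convex = [((0.0, 0.0, 1.0), 0.5)]
# ... and the quarter of the sphere with z < 0 and x > 0.
south_east: Convex = [((0.0, 0.0, -1.0), 0.0), ((1.0, 0.0, 0.0), 0.0)]
region: Region = [north_cap, south_east]
```

A cap of angular radius θ around direction n is simply the halfspace (n, cos θ), so circles on the sky need no trigonometry at query time.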

SLIDE 46

Shape Operations

• Intersection
    • Concat halfspace lists
• Union
    • Concat convex lists
    • Unique coverage
    • Analytic area
• Boolean algebra
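With convexes as lists of halfspaces and regions as lists of convexes, the two operations reduce to list concatenation. The sketch below uses that representation as an assumption, with hypothetical function names and a strict n · x > c test; it does not attempt the simplification needed for unique coverage or area.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def in_region(x, region):
    """A region is a list of convexes; a convex is a list of (n, c) halfspaces."""
    return any(all(dot(n, x) > c for n, c in convex) for convex in region)


def intersect_convexes(a, b):
    """A point is in both convexes when it satisfies every halfspace of each."""
    return a + b


def union_regions(r, s):
    """A point is in the union when it is in any convex of either region."""
    return r + s


def intersect_regions(r, s):
    """Distribute the intersection over the unions: one convex per pair."""
    return [intersect_convexes(a, b) for a in r for b in s]


north = [[((0.0, 0.0, 1.0), 0.0)]]   # region: z > 0
east = [[((1.0, 0.0, 0.0), 0.0)]]    # region: x > 0
both = intersect_regions(north, east)
either = union_regions(north, east)
```

The results are valid but not minimal: concatenation can leave redundant halfspaces and overlapping convexes, which is why a separate simplification step is needed before computing areas.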

SLIDE 47

Difference of Convexes is a Region

• The set of Regions is closed for the Boolean ops
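One way to see the closure is to build the difference explicitly. In this sketch, which may differ from the library's actual construction, the complement of a halfspace (n, c) is taken to be (-n, -c), and A minus B becomes one convex per halfspace of B. Function names are hypothetical, and points exactly on a boundary are excluded by the strict inequality.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def in_region(x, region):
    """A region is a list of convexes; a convex is a list of (n, c) halfspaces."""
    return any(all(dot(n, x) > c for n, c in convex) for convex in region)


def complement(h):
    """The other side of the plane: n . x > c becomes (-n) . x > -c."""
    n, c = h
    return (tuple(-v for v in n), -c)


def difference(a, b):
    """Region covering convex a minus convex b, as pairwise disjoint convexes."""
    pieces = []
    kept = []  # halfspaces of b that every later piece must still satisfy
    for h in b:
        pieces.append(a + kept + [complement(h)])
        kept.append(h)
    return pieces


northern = [((0.0, 0.0, 1.0), 0.0)]   # z > 0
polar_cap = [((0.0, 0.0, 1.0), 0.5)]  # z > 0.5
ring = difference(northern, polar_cap)
```

Here `ring` describes the band 0 < z < 0.5. Requiring the earlier halfspaces of B in each later piece is what keeps the pieces from overlapping.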

SLIDE 48

Simplification

• Eliminate redundant halfspaces
    • First handle trivial combinations of constraints
    • Then solve geometry on the surface
    • Derive Roots, Arcs, Patches
• Eliminate redundant convexes
    • Some trivial cases, but…
• Make convexes disjoint
    • Unique coverage, area, etc.
• Stitch together convexes
    • When possible

SLIDE 49

SphericalLib .NET

• C# code
    • 10k lines
• OS independent (Windows, Un*x w/ Mono)
• Documentation via Sandcastle
• Great performance!
    • Sloan Digital Sky Survey in 10 s (13× larger than the USA in area)

SLIDE 50

Numerical Imprecision

• Double precision calculations
    • IEEE 754 standard
• Degeneracy
    • When are two vectors the same?
• Spatial resolution limit
    • Roughly 30 cm on Earth
• Lots of tricks from Graphics Gems

SLIDE 51

Sky coverage of the Sloan Digital Sky Survey’s 5th Data Release and the Galaxy Evolution Explorer’s 2nd Public Release

SLIDE 52

Region in SQL

SLIDE 53

Footprint Services

• All about coverage
    • Editor and calculator
    • Online public repository
    • On-the-fly visualization
    • STC translator, etc…
• Web services
    • Simple programming

http://voservices.net/footprint


SLIDE 55

Hybrid Solutions

SLIDE 56

Heuristic Simplification

• Before and After

SLIDE 57

Indexing the Sky

• Hierarchical Triangular Mesh
• Region approximation
• Fast filtering using HTM ID ranges

SLIDE 58

Anatomy of an SDSS Region

SLIDE 59

HTM Filtering

SLIDE 60

Summary

• Store simulations, e.g., the reference Millennium
    • Simulations take 10x longer than analysis
• Databases enable fast searches
    • Custom routines
    • Space-filling curves
• Direct comparison of observed universe to sims
