SLIDE 1

Markus Nullmeier

Zentrum für Astronomie der Universität Heidelberg Astronomisches Rechen-Institut

mnullmei@ari.uni.heidelberg.de

Accelerating access to data archives with the new version of pgSphere

SLIDE 2

  • About pgSphere
  • New pgSphere features since 2014
  • Extending pgSphere with sky coverage data types

Accelerating access to data archives with the new version of pgSphere

SLIDE 3
  • pgSphere?

About pgSphere

SLIDE 4
  • PostgreSQL extension: new SQL data types, functions, indexes
  • PostgreSQL: “The world's most advanced open source database”
  • SQL data types: spherical points (RA, DEC), spherical lines, polygons, ellipses, paths, spherical transformations (rotations)

About pgSphere

SLIDE 5

VO Usage of pgSphere

[Figure: VO services built on pgSphere: cone search (RA, DEC, SR) and cross-match (X-match)]

SLIDE 6

Database indexes of spherical coordinates for, e.g.:

  • Cone search
  • Cross-match
  • Images (e.g., digitised astronomical plates)

pgSphere internals
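
To make the two query types concrete, here is a minimal Python sketch of what a cone search and a cross-match compute. It is a brute-force illustration, not pgSphere code: the function names and the (name, RA, DEC) tuple layout are assumptions made for this example, and a real spherical index exists precisely to avoid scanning every row like this.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two (RA, DEC) points in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(catalog, ra, dec, sr):
    """Names of all rows within `sr` degrees of (ra, dec).

    `catalog` is a list of (name, ra, dec) tuples (hypothetical layout).
    """
    return [name for name, r, d in catalog
            if angular_separation_deg(ra, dec, r, d) <= sr]

def cross_match(cat_a, cat_b, radius):
    """All (name_a, name_b) pairs closer than `radius` degrees (full scan)."""
    return [(na, nb)
            for na, ra, da in cat_a
            for nb, rb, db in cat_b
            if angular_separation_deg(ra, da, rb, db) <= radius]
```

The cross-match above compares every pair of rows, so its cost grows with the product of the two catalog sizes. That quadratic scan is the work a spatial index is meant to cut down.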

SLIDE 7

pgSphere internals: R-tree

[Figure: example R-tree, showing nested bounding rectangles R1 to R19 and the corresponding tree hierarchy]

SLIDE 8

pgSphere development history

Janko Richter, Teodor Sigaev, Oleg Bartunov, Igor Chilingarian

SLIDE 9

pgSphere development nowadays

Dmitry Ivanov, Alexander Korotkov, Markus Nullmeier

Contributors: Pat Dowler, Serge Monkewitz

SLIDE 10
  • Greatly improved R-tree indexing, one to two orders of magnitude faster:
  • A. Korotkov, “A new double sorting-based node splitting algorithm for R-tree”, Programming and Computer Software 38(3), 2012, DOI: 10.1134/S0361768812030024
  • All known open bugs fixed
  • Addition of new-style SQL “contains” operators
  • Improved numerical stability
  • Custom PostgreSQL optimisation for spatial joins (= crossmatch)

New pgSphere features since 2014

SLIDE 11

[publication of benchmarks planned for ADASS XXVI, Trieste 2016]

New R-tree indexing

SLIDE 12

MOC = Multi-order coverage (HEALPix Multi-Order Coverage map)

  • Concise mapping of a catalog's coverage of the sphere
  • Coverage made up from discrete elements
  • Making MOC and sky maps a first-class SQL data type...

Extending pgSphere with sky coverage data types

→ go to the MOC tutorial tomorrow!

SLIDE 13

MOC as indexable SQL data type

  • I/O to / from files
  • Create one MOC from table column or query
  • Specify your own MOC and search over all catalogs of a data center:

SELECT name FROM catalogs WHERE my_moc <@ catalogs.moc;

  • Sky map data type: analogous to MOC

WIP: sky coverage data types for pgSphere
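
The containment test behind a query like `my_moc <@ catalogs.moc` can be illustrated with a small Python sketch. It assumes, purely for illustration, that a MOC is stored as half-open ranges of HEALPix pixel numbers at one fixed order. The function names are hypothetical and do not describe the actual pgSphere implementation.

```python
def normalize(ranges):
    """Sort half-open [start, stop) ranges and merge overlapping or adjacent ones."""
    merged = []
    for start, stop in sorted(ranges):
        if start >= stop:
            continue  # ignore empty ranges
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged

def moc_contained_in(inner, outer):
    """True if every pixel of `inner` is also covered by `outer`."""
    inner, outer = normalize(inner), normalize(outer)
    j = 0
    for start, stop in inner:
        # Skip outer ranges that end before this inner range does.
        while j < len(outer) and outer[j][1] < stop:
            j += 1
        # The remaining candidate must begin at or before the inner range.
        if j == len(outer) or outer[j][0] > start:
            return False
    return True

# Hypothetical data: one user-defined MOC and three catalog coverages.
my_moc = [(10, 20), (30, 35)]
catalogs = {
    "A": [(0, 25), (28, 40)],
    "B": [(0, 15), (30, 40)],
    "C": [(5, 12), (12, 36)],
}
matching = [name for name, moc in catalogs.items()
            if moc_contained_in(my_moc, moc)]
```

Because both range lists are sorted and merged first, a single linear pass over each list decides the test.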

SLIDE 14
  • R-trees will not work for MOC representing catalogs
  • PostgreSQL custom indexing will be in Release 9.6: https://github.com/postgrespro/rum
  • Core of new index structure:

Ranges of numbers of HEALPix elements    Sets of MOC IDs
range0                                   { id7, id11 }
range1                                   { id2, id108, id109 }
range2                                   { id108, id732, id11030 }
...                                      ...

MOC: indexing
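
The mapping from ranges to sets of MOC IDs reads like an inverted index. The following Python sketch shows one way such a structure could answer "which stored MOCs contain my MOC?". The splitting into elementary ranges, the function names, and the sample data are all assumptions for illustration; the slide does not specify how the planned index is built.

```python
from bisect import bisect_right

def build_index(mocs):
    """Build an inverted index from pixel ranges to sets of MOC IDs.

    `mocs` maps an ID to a list of half-open (start, stop) ranges.
    Returns (bounds, id_sets, all_ids), where id_sets[i] holds the IDs
    of every MOC covering the elementary range [bounds[i], bounds[i+1]).
    """
    bounds = sorted({b for ranges in mocs.values() for r in ranges for b in r})
    id_sets = [set() for _ in range(max(len(bounds) - 1, 0))]
    for moc_id, ranges in mocs.items():
        for start, stop in ranges:
            i = bisect_right(bounds, start) - 1
            while i < len(id_sets) and bounds[i] < stop:
                id_sets[i].add(moc_id)
                i += 1
    return bounds, id_sets, set(mocs)

def containing_mocs(index, query):
    """IDs of all indexed MOCs that fully cover the `query` ranges."""
    bounds, id_sets, all_ids = index
    result = set(all_ids)  # an empty query is contained in every MOC
    for start, stop in query:
        if start >= stop:
            continue
        if not bounds or start < bounds[0] or stop > bounds[-1]:
            return set()  # part of the query lies outside all stored MOCs
        i = bisect_right(bounds, start) - 1
        while i < len(id_sets) and bounds[i] < stop:
            # A MOC qualifies only if it covers every elementary range touched.
            result &= id_sets[i]
            if not result:
                return set()
            i += 1
    return result

# Hypothetical sample data, reusing ID names from the slide.
index = build_index({
    "id7": [(0, 10)],
    "id11": [(0, 10), (40, 50)],
    "id108": [(10, 30)],
    "id2": [(10, 20)],
})
```

Each elementary range is either fully inside or fully outside any stored MOC, so intersecting the ID sets of the ranges a query touches is enough to decide containment.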

SLIDE 15
  • Download, use, test, and join the community at the pgSphere home page: http://pgsphere.github.io

  • Send in bug reports
  • Send in test cases
  • Send in patches
  • Send in feature requests :-)

Your involvement