Accelerating access to data archives with the new version of pgSphere

  1. Accelerating access to data archives with the new version of pgSphere Markus Nullmeier Zentrum für Astronomie der Universität Heidelberg Astronomisches Rechen-Institut mnullmei@ari.uni.heidelberg.de

  2. Accelerating access to data archives with the new version of pgSphere Markus Nullmeier mnullmei@ari.uni.heidelberg.de ● About pgSphere ● New pgSphere features since 2014 ● Extending pgSphere with sky coverage data types

  3. About pgSphere ● pgSphere?

  4. About pgSphere ● PostgreSQL extension: new SQL data types, functions, indexes ● PostgreSQL: “The world's most advanced open source database” ● SQL data types: spherical points (RA, DEC), spherical lines, polygons, ellipses, paths, spherical transformations (rotations)

  5. VO usage of pgSphere [figure labels: RA, DEC, SR; X-match]

  6. pgSphere internals: database indexes of spherical coordinates for, e.g.: ● Cone search ● Cross-match ● Images (e.g., digitised astronomical plates)
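The two query types named on this slide can be illustrated with a brute-force sketch. This is not pgSphere code: the function names, the dictionary-based catalog layout, and the sample coordinates are all hypothetical, and every row is scanned, which is exactly the work a spherical index is meant to avoid.

```python
import math

def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two (RA, DEC) points given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(catalog, ra, dec, sr):
    """Rows of `catalog` within search radius `sr` (degrees) of (ra, dec)."""
    return [row for row in catalog
            if angular_separation(ra, dec, row["ra"], row["dec"]) <= sr]

def cross_match(cat_a, cat_b, sr):
    """Naive O(n*m) cross-match: name pairs separated by at most `sr` degrees."""
    return [(a["name"], b["name"])
            for a in cat_a for b in cat_b
            if angular_separation(a["ra"], a["dec"], b["ra"], b["dec"]) <= sr]

# Hypothetical sample data, for illustration only.
catalog = [
    {"name": "A", "ra": 10.0, "dec": 20.0},
    {"name": "B", "ra": 10.05, "dec": 20.0},
    {"name": "C", "ra": 200.0, "dec": -45.0},
]
hits = cone_search(catalog, 10.0, 20.0, 0.1)
pairs = cross_match(catalog[:1], catalog, 0.1)
```

Here `hits` holds the rows near (10, 20) and `pairs` the matches for the first row. With an index on the coordinates, the database can discard most candidates without computing any separation.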

  7. pgSphere internals: R-tree [figure: R-tree with bounding rectangles R1 to R19]

  8. pgSphere development history: Igor Chilingarian, Janko Richter, Teodor Sigaev, Oleg Bartunov

  9. pgSphere development nowadays: Dmitry Ivanov, Markus Nullmeier, Alexander Korotkov; contributors: Pat Dowler, Serge Monkewitz

  10. New pgSphere features since 2014 ● Greatly improved R-tree indexing, one to two orders of magnitude faster: A. Korotkov, “A new double sorting-based node splitting algorithm for R-tree”, Programming and Computer Software 38(3), 2012, DOI: 10.1134/S0361768812030024 ● All known open bugs fixed ● Addition of new-style SQL “contains” operators ● Improved numerical stability ● Custom PostgreSQL optimisation for spatial joins (= cross-match)

  11. New R-tree indexing [publication of benchmarks planned for ADASS XXVI, Trieste 2016]

  12. Extending pgSphere with sky coverage data types MOC = Multi-order coverage (HEALPix Multi-Order Coverage map) ● Concise mapping of a catalog's coverage of the sphere → go to the MOC F tutorial tomorrow! ● Coverage made up from discrete elements ● Making MOC and sky maps a first-class SQL data type...

  13. WIP: sky coverage data types for pgSphere MOC as indexable SQL data type ● I/O to / from files ● Create one MOC from table column or query ● Specify your own MOC and search over all catalogs of a data center: SELECT name FROM catalogs WHERE my_moc <@ catalogs.moc ; Sky map data type: analogous to MOC
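The containment test behind a query such as `my_moc <@ catalogs.moc` can be sketched outside the database. The sketch assumes, purely for illustration, that a MOC is stored as half-open ranges of HEALPix element numbers at one fixed order. The function names and sample ranges are hypothetical and say nothing about the actual pgSphere data type.

```python
def normalize(ranges):
    """Sort half-open [start, stop) ranges and merge overlapping or adjacent ones."""
    merged = []
    for start, stop in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged

def moc_contained_in(inner, outer):
    """True if every element of `inner` is covered by `outer` (analogue of inner <@ outer)."""
    outer = normalize(outer)
    i = 0
    for start, stop in normalize(inner):
        # Skip outer ranges that end at or before the start of this inner range.
        while i < len(outer) and outer[i][1] <= start:
            i += 1
        # After merging, an inner range must fit inside a single outer range.
        if i == len(outer) or not (outer[i][0] <= start and stop <= outer[i][1]):
            return False
    return True

# Hypothetical catalogs of a data center, each with its coverage as ranges.
catalogs = {
    "cat1": [(0, 100)],
    "cat2": [(50, 60), (200, 300)],
}
my_moc = [(52, 58), (210, 220)]
matches = [name for name, moc in catalogs.items() if moc_contained_in(my_moc, moc)]
```

Only `cat2` covers both parts of `my_moc`, so it is the only name in `matches`. Testing every catalog in turn like this is a linear scan, which is what an indexable MOC type would avoid.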

  14. MOC: indexing ● R-trees will not work for MOCs representing catalogs ● Custom indexing for PostgreSQL will be available in release 9.6: https://github.com/postgrespro/rum ● Core of new index structure: ranges of numbers of HEALPix elements, each mapped to a set of MOC IDs: range0 → { id7, id11 }; range1 → { id2, id108, id109 }; range2 → { id108, id732, id11030 }; ...
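The mapping from ranges to sets of MOC IDs can be read as an inverted index. The following is a conceptual sketch only, not the RUM-based implementation: `build_index`, `mocs_covering`, and the sample ranges are hypothetical, and only the ID labels are borrowed from the slide.

```python
from bisect import bisect_right

def build_index(mocs):
    """Map elementary [start, stop) ranges to the set of MOC IDs covering them.

    `mocs` maps a MOC ID to a list of half-open ranges of HEALPix element numbers.
    """
    bounds = sorted({b for ranges in mocs.values() for r in ranges for b in r})
    index = []
    for start, stop in zip(bounds, bounds[1:]):
        ids = {moc_id for moc_id, ranges in mocs.items()
               if any(lo <= start and stop <= hi for lo, hi in ranges)}
        if ids:  # ranges covered by no MOC are left out of the index
            index.append((start, stop, ids))
    return index

def mocs_covering(index, query):
    """IDs of the MOCs that cover every element of all `query` ranges."""
    starts = [start for start, _, _ in index]
    result = None
    for q_lo, q_hi in query:
        pos = q_lo
        i = bisect_right(starts, q_lo) - 1  # last entry starting at or before q_lo
        while pos < q_hi:
            if i < 0 or i >= len(index):
                return set()
            start, stop, ids = index[i]
            if not (start <= pos < stop):
                return set()  # gap: no MOC covers element `pos`
            result = set(ids) if result is None else result & ids
            pos = stop
            i += 1
    return result if result is not None else set()

# Hypothetical coverage data.
mocs = {
    "id7": [(0, 10)],
    "id11": [(0, 10), (20, 30)],
    "id2": [(10, 20)],
}
index = build_index(mocs)
found = mocs_covering(index, [(2, 5), (22, 25)])
```

Each lookup intersects the ID sets of only those index entries that the query touches, so the cost depends on the query size rather than on the number of catalogs.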

  15. Your involvement ● Download, use, test, and join the community at the pgSphere home page: http://pgsphere.github.io ● Send in bug reports ● Send in test cases ● Send in patches ● Send in feature requests :-)
