L1dbproto Design and Results Andy Salnikov SLAC LSST2018 Project - - PowerPoint PPT Presentation

l1dbproto design and results
SMART_READER_LITE
LIVE PREVIEW

L1dbproto Design and Results Andy Salnikov SLAC LSST2018 Project - - PowerPoint PPT Presentation

L1dbproto Design and Results Andy Salnikov SLAC LSST2018 Project & Community Workshop August 16, 2018 #lsst2018 #lsst2018 LSST Project and Community Workshop 2018 Tucson August 13 - 17 LSST Project and Community Workshop 2018


slide-1
SLIDE 1

1

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

L1dbproto Design and Results

Andy Salnikov SLAC

LSST2018 Project & Community Workshop August 16, 2018

slide-2
SLIDE 2

2

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

What is l1dbproto

  • L1DB (a.k.a. PPDB) has some serious challenges

Data size / number of objects stored

Time constraints for alert generatjon

Concurrency

  • Not clear if single database server can handle that sort of load

Need to try some ~realistjc use cases on ~realistjc hardware to understand limits

  • f existjng tech
  • L1dbproto

Trivial simulatjon of AP pipeline to generate DiaObjects/DiaSources

(Evolving) implementatjon of L1DB API on top of SQLAlchemy

slide-3
SLIDE 3

3

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Some numbers

  • 10k variable sources per visit
  • ~5k noise sources
  • New DiaObject object version is constructed from ~12 months

history of DiaSource (and DiaForcedSource)

slide-4
SLIDE 4

4

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Database Tables

  • Schema in DPDD and cat package:

DiaObject – wide table with all measured parameters

DiaSource – medium-width table

DiaForcedSource – narrow table

slide-5
SLIDE 5

5

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

AP Pipeline

  • Simplifjed workfmow (CCD-parallel):

Extract Sources from difgerence images

Read DiaObjects from database and Dia(Forced)Source history (using CCD region and/or diaObjectId)

Associate new Sources with existjng DiaObjects

For non-associated DiaObjects

  • Do forced photometry on current visit

For non-associated DiaSources

  • possibly do some forced photometry?

Build new (version of) DiaObject from all Dia(Forced)Sources

Save all new DiaObjects, Dia(Forced)Sources

slide-6
SLIDE 6

6

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Parallelism in AP Pipeline

  • Actual pipeline will use CCD-level parallelism (?)

~200 concurrent workers

  • Does database access need to be done in parallel

Or will it work betuer if we ran single query for whole visit?

  • May depend on many factors, needs to be measured
  • Prototype can run in:

Single-process mode, one process reads/writes whole visit data

Multj-process mode, forking N children on each visit, each processing its

  • wn patch of FOV
slide-7
SLIDE 7

7

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

L1DB database API

  • Implementjng basic read/store operatjons

Read last version of DiaObjects in a region

Read Dia(Forced)Sources history (based on region or objectId)

Store set of new objects/sources

  • Based on SQLAlchemy, with all necessary engine-specifjc
  • ptjmizatjons

Feels more or less optjmal already, some minor improvements may be possible going to C++ (if needed)

  • Many confjgurable aspects (to study implementatjon optjons)
  • Should be usable outside l1dbproto
slide-8
SLIDE 8

8

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

First Tests at IN2P3

  • Single server machine with plenty of RAM and some SSD:

CPU Intel(R) Xeon(R) CPU E5-2680 v4, 2.4GHz

2x14 physical cores, 56 total threads

512GB RAM

2TB NVMe and 3TB SATA SSD

7TB of spinning disks in RAID6 array

  • Bunch of tests were done with MySQL and PostgreSQL using

difgerent storage types and indexing approaches.

slide-9
SLIDE 9

9

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Baseline schema, HDDs

  • Three tables: DiaObject, DiaSource,

DiaForcedSource

  • Need betuer indexes for AP

performance

DiaObject select is “last version of object in a region”

Adding index on HTM+IOV

  • Single-process mode
  • Conclusion: performance on spinning

disks is terrible

Not unexpected

slide-10
SLIDE 10

10

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Baseline schema on SSD

  • Testjng SSD/SATA vs NVMe,

PostgreSQL vs MySQL

Stjll single-process

  • Much betuer performance

compared to HDDs

  • DiaObject read-write takes

signifjcant fractjon

Need difgerent indexing approach

slide-11
SLIDE 11

11

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Betuer DiaObject Indexing

  • For AP we only need last

DiaObject version, but we have to index all versions

  • Make separate table with latest

DiaObject versions and index it

  • ptjmally for AP (DiaObjectLast)

De-normalizatjon, have to insert into both tables

But can do UPSERT on the Last table which should be more effjcient

slide-12
SLIDE 12

12

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Parallel processing

  • CCD-level paralellism, ~200

independent process per visit

  • Every process doing its database

actjvity completely independently

  • Lot’s of limitjng factors:

28 cores / 56 threads

Clients and server on one host

Server and I/O

  • Postgres behaves betuer than

MySQL, NVMe somewhat betuer

slide-13
SLIDE 13

13

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Oracle RAC @ NCSA

  • Impressive hardware

3 Dual Intel Xeon E5-2650, 2x12 cores, 48 threads @ 2.2 GHz

  • 128GB memory
  • 10Gbps connectjvity to both public and private networks
  • 16Gbps fjbre channel card for storage connectjvity

Netapp storage array

  • 16Gbps fjbre channel connectjvity
  • 50 8TB HDDs
  • 10 1.5TB SSDs
  • RAID 10
  • Several backend-specifjc things in L1dbproto (e.g. UPSERT, limits) had to be

re-implemented for Oracle

slide-14
SLIDE 14

14

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Initjal testjng

  • Observed some strange performance

hit for ~1k visits

Afuer that signifjcant improvement

  • Spent a lot of tjme with DBAs to

understand the reason

Optjmizer chooses poor query plan (FULL INDEX SCAN vs betuer INDEX RANGE SCAN)

  • Tried to fjnd a way to force optjmizer

into a betuer plan, nothing worked yet

  • Workaround is to either wait or pre-

load statjstjcs (that has to be generated by previous run)

slide-15
SLIDE 15

15

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Intermediate result with Oracle

  • Tried several indexing optjons for

DiaObjectLast

Decided on Index-Organized Table

Only a subset of DiaObject columns

  • Ignoring initjal bump - performance

about the same as Postgres

slide-16
SLIDE 16

16

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Further plans

  • At 1-month scale we see linear growth, dominated by fetching of

DiaSource data

Does not look good for extrapolatjng to 12 months of history

Try fetch by region instead of by objectId

  • Optjons to study:

Partjtjon DiaSource at ~1-month shards, try to load concurrently

  • We may already be at concurrency limit

Pre-loading of DiaSources in advance

  • Potentjal issue with data dependency if there are visit overlaps

In-memory cache for DiaSources, e.g. in-memory table

  • Full sky 12 months DiaSources: ~1.5TB, one night: ~500GB
  • Comparable hardware setup for Postgres @ NCSA?
slide-17
SLIDE 17

17

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Summary

  • L1dbproto is a useful tool to study scalability and

implementatjon optjons

Can also be used as interim PPDB API

  • Main issue we want to understand now is whether single-

server database can be used for PPDB

Extrapolatjng to 12 months history is not encouraging yet

There are few optjons to try

  • Some interestjng issues observed with Oracle

We don’t think this will be critjcal, but takes efgort to understand

  • Comparable PostgreSQL setup would be nice to have for testjng
slide-18
SLIDE 18

18

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

  • DPDD (schema for PPDB): htup://ls.st/LSE-163*
  • Sizing Model Spreadsheet: htup://ls.st/LDM-141*
  • DMTN-018: htups://dmtn-018.lsst.io/
  • Notebooks: htups://github.com/lsst-dm/l1dbproto-notebooks
  • Code: htups://github.com/lsst-dm/l1dbproto

Links