[PPT] - L1dbproto Design and Results Andy Salnikov SLAC LSST2018 Project PowerPoint Presentation

SLIDE 1

1

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

L1dbproto Design and Results

Andy Salnikov SLAC

LSST2018 Project & Community Workshop August 16, 2018

SLIDE 2

2

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

What is l1dbproto

L1DB (a.k.a. PPDB) has some serious challenges

–

Data size / number of objects stored

–

Time constraints for alert generatjon

–

Concurrency

Not clear if single database server can handle that sort of load

–

Need to try some ~realistjc use cases on ~realistjc hardware to understand limits

f existjng tech
L1dbproto

–

Trivial simulatjon of AP pipeline to generate DiaObjects/DiaSources

–

(Evolving) implementatjon of L1DB API on top of SQLAlchemy

SLIDE 3

3

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Some numbers

10k variable sources per visit
~5k noise sources
New DiaObject object version is constructed from ~12 months

history of DiaSource (and DiaForcedSource)

SLIDE 4

4

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Database Tables

Schema in DPDD and cat package:

–

DiaObject – wide table with all measured parameters

–

DiaSource – medium-width table

–

DiaForcedSource – narrow table

SLIDE 5

5

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

AP Pipeline

Simplifjed workfmow (CCD-parallel):

–

Extract Sources from difgerence images

–

Read DiaObjects from database and Dia(Forced)Source history (using CCD region and/or diaObjectId)

–

Associate new Sources with existjng DiaObjects

–

For non-associated DiaObjects

Do forced photometry on current visit

–

For non-associated DiaSources

possibly do some forced photometry?

–

Build new (version of) DiaObject from all Dia(Forced)Sources

–

Save all new DiaObjects, Dia(Forced)Sources

SLIDE 6

6

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Parallelism in AP Pipeline

Actual pipeline will use CCD-level parallelism (?)

–

~200 concurrent workers

Does database access need to be done in parallel

–

Or will it work betuer if we ran single query for whole visit?

May depend on many factors, needs to be measured
Prototype can run in:

–

Single-process mode, one process reads/writes whole visit data

–

Multj-process mode, forking N children on each visit, each processing its

wn patch of FOV

SLIDE 7

7

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

L1DB database API

Implementjng basic read/store operatjons

–

Read last version of DiaObjects in a region

–

Read Dia(Forced)Sources history (based on region or objectId)

–

Store set of new objects/sources

Based on SQLAlchemy, with all necessary engine-specifjc
ptjmizatjons

–

Feels more or less optjmal already, some minor improvements may be possible going to C++ (if needed)

Many confjgurable aspects (to study implementatjon optjons)
Should be usable outside l1dbproto

SLIDE 8

8

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

First Tests at IN2P3

Single server machine with plenty of RAM and some SSD:

–

CPU Intel(R) Xeon(R) CPU E5-2680 v4, 2.4GHz

–

2x14 physical cores, 56 total threads

–

512GB RAM

–

2TB NVMe and 3TB SATA SSD

–

7TB of spinning disks in RAID6 array

Bunch of tests were done with MySQL and PostgreSQL using

difgerent storage types and indexing approaches.

SLIDE 9

9

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Baseline schema, HDDs

Three tables: DiaObject, DiaSource,

DiaForcedSource

Need betuer indexes for AP

performance

–

DiaObject select is “last version of object in a region”

–

Adding index on HTM+IOV

Single-process mode
Conclusion: performance on spinning

disks is terrible

–

Not unexpected

SLIDE 10

10

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Baseline schema on SSD

Testjng SSD/SATA vs NVMe,

PostgreSQL vs MySQL

–

Stjll single-process

Much betuer performance

compared to HDDs

DiaObject read-write takes

signifjcant fractjon

–

Need difgerent indexing approach

SLIDE 11

11

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Betuer DiaObject Indexing

For AP we only need last

DiaObject version, but we have to index all versions

Make separate table with latest

DiaObject versions and index it

ptjmally for AP (DiaObjectLast)

–

De-normalizatjon, have to insert into both tables

–

But can do UPSERT on the Last table which should be more effjcient

SLIDE 12

12

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Parallel processing

CCD-level paralellism, ~200

independent process per visit

Every process doing its database

actjvity completely independently

Lot’s of limitjng factors:

–

28 cores / 56 threads

–

Clients and server on one host

–

Server and I/O

Postgres behaves betuer than

MySQL, NVMe somewhat betuer

SLIDE 13

13

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Oracle RAC @ NCSA

Impressive hardware

–

3 Dual Intel Xeon E5-2650, 2x12 cores, 48 threads @ 2.2 GHz

128GB memory
10Gbps connectjvity to both public and private networks
16Gbps fjbre channel card for storage connectjvity

–

Netapp storage array

16Gbps fjbre channel connectjvity
50 8TB HDDs
10 1.5TB SSDs
RAID 10
Several backend-specifjc things in L1dbproto (e.g. UPSERT, limits) had to be

re-implemented for Oracle

SLIDE 14

14

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Initjal testjng

Observed some strange performance

hit for ~1k visits

–

Afuer that signifjcant improvement

Spent a lot of tjme with DBAs to

understand the reason

–

Optjmizer chooses poor query plan (FULL INDEX SCAN vs betuer INDEX RANGE SCAN)

Tried to fjnd a way to force optjmizer

into a betuer plan, nothing worked yet

Workaround is to either wait or pre-

load statjstjcs (that has to be generated by previous run)

SLIDE 15

15

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Intermediate result with Oracle

Tried several indexing optjons for

DiaObjectLast

–

Decided on Index-Organized Table

–

Only a subset of DiaObject columns

Ignoring initjal bump - performance

about the same as Postgres

SLIDE 16

16

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Further plans

At 1-month scale we see linear growth, dominated by fetching of

DiaSource data

–

Does not look good for extrapolatjng to 12 months of history

–

Try fetch by region instead of by objectId

Optjons to study:

–

Partjtjon DiaSource at ~1-month shards, try to load concurrently

We may already be at concurrency limit

–

Pre-loading of DiaSources in advance

Potentjal issue with data dependency if there are visit overlaps

–

In-memory cache for DiaSources, e.g. in-memory table

Full sky 12 months DiaSources: ~1.5TB, one night: ~500GB
Comparable hardware setup for Postgres @ NCSA?

SLIDE 17

17

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

Summary

L1dbproto is a useful tool to study scalability and

implementatjon optjons

–

Can also be used as interim PPDB API

Main issue we want to understand now is whether single-

server database can be used for PPDB

–

Extrapolatjng to 12 months history is not encouraging yet

–

There are few optjons to try

Some interestjng issues observed with Oracle

–

We don’t think this will be critjcal, but takes efgort to understand

Comparable PostgreSQL setup would be nice to have for testjng

SLIDE 18

18

LSST Project and Community Workshop 2018 • Tucson • August 13 - 17

#lsst2018

DPDD (schema for PPDB): htup://ls.st/LSE-163*
Sizing Model Spreadsheet: htup://ls.st/LDM-141*
DMTN-018: htups://dmtn-018.lsst.io/
Notebooks: htups://github.com/lsst-dm/l1dbproto-notebooks
Code: htups://github.com/lsst-dm/l1dbproto