Thoughts on alternate DUNE DAQ design Georgia Karagiorgi DUNE DAQ - - PowerPoint PPT Presentation


SLIDE 1

Thoughts on alternate DUNE DAQ design

Georgia Karagiorgi

DUNE DAQ Meeting, Oct. 16, 2017
SLIDE 2

Introduction

  • What is presented in this talk is a conceptual DAQ design & architecture for the DUNE FD; I will advocate that it be explored further and evaluated against DUNE DAQ requirements, as something that would be advantageous to move toward.
  • Concept: Online (real-time) image processing and data selection.
    – Event ID “on the fly”, minimizing offline processing and reconstruction needs
    – Could potentially require far less processing in the front-end DAQ
  • The design being explored leverages advancements in Deep Neural Networks and their applications, and is informed and motivated by
    – Columbia Nevis Labs experience with the MicroBooNE readout/DAQ system
    – Deep Learning development and results by the MicroBooNE & DUNE collaborations
    – Recent work done in collaboration with L. Carloni’s group (Columbia Comp. Sci.)
  • For this talk: explore the design from the perspective of the single-phase detector; the expectation is that the same design is equally applicable to dual-phase.

SLIDE 3

DUNE DAQ System Parameters: Data Rates

  • Consider a single 10 kton module, with no data reduction:
    – Continuous readout rate: 150 APA x 2,560 ch x 2 MHz x 1.5 B ≈ 1.15 TB/s
    – Single, localized event size (~size of an APA x 1 drift): 2,560 ch x 2 MHz x 1.5 B x 2.25 ms = 17.3 MB
    – Single, extended event size (all APAs x 2.4 drifts): 150 APA x 2,560 ch x 2 MHz x 1.5 B x 5.4 ms = 6.22 GB

[Figure: APA 1 ... APA N layout shown twice, once with a localized event and once with an extended event]
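
The products quoted on this slide can be checked with a few lines of Python. This is a minimal sketch: the constant and function names are illustrative (not from the talk), and decimal prefixes (1 TB = 10^12 B) are assumed.

```python
# Detector parameters as quoted on the slide (single 10 kton module).
N_APA = 150                # number of APAs
CHANNELS_PER_APA = 2560    # channels per APA
SAMPLING_HZ = 2.0e6        # 2 MHz sampling
BYTES_PER_SAMPLE = 1.5     # 1.5 B = 12 bits per sample
DRIFT_S = 2.25e-3          # one drift window, in seconds
EXTENDED_DRIFTS = 2.4      # extended events span 2.4 drifts (5.4 ms)


def apa_rate():
    """Raw data rate of a single APA, in bytes per second."""
    return CHANNELS_PER_APA * SAMPLING_HZ * BYTES_PER_SAMPLE


def continuous_rate():
    """Raw data rate of the full module, in bytes per second."""
    return N_APA * apa_rate()


def localized_event_size():
    """Bytes for one APA read out for one drift window."""
    return apa_rate() * DRIFT_S


def extended_event_size():
    """Bytes for all APAs read out for 2.4 drift windows."""
    return continuous_rate() * EXTENDED_DRIFTS * DRIFT_S


print(f"continuous rate : {continuous_rate() / 1e12:.3f} TB/s")
print(f"localized event : {localized_event_size() / 1e6:.2f} MB")
print(f"extended event  : {extended_event_size() / 1e9:.2f} GB")
```

The unreduced numbers follow from the channel count, sampling rate and sample size alone, with no dependence on noise assumptions.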

SLIDE 4

DUNE DAQ System Parameters: Data Rates

  • Consider a single 10 kton module, with data reduction (e.g. a factor of 500-1800, depending on noise and radiological backgrounds; see docdb-4481):
    – Continuous readout rate: 150 APA x 2,560 ch x 2 MHz x 1.5 B / (500-1800) ≈ 0.64-2.3 GB/s
    – Single, localized event size (~size of an APA x 1 drift): 2,560 ch x 2 MHz x 1.5 B x 2.25 ms / (500-1800) = 10-35 kB
    – Single, extended event size (all APAs x 2.4 drifts): 150 APA x 2,560 ch x 2 MHz x 1.5 B x 5.4 ms / (500-1800) = 3.5-12 MB

[Figure: APA 1 ... APA N layout shown twice, once with a localized event and once with an extended event]
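
To see how strongly the reduced numbers depend on the assumed reduction factor, the ranges can be recomputed directly. This is a sketch under the slide's stated inputs; the dictionary keys and the helper name `reduced_range` are illustrative, and decimal prefixes are assumed.

```python
# Unreduced values, evaluated from the products written on the slide.
RAW = {
    "continuous rate [B/s]": 150 * 2560 * 2.0e6 * 1.5,
    "localized event [B]": 2560 * 2.0e6 * 1.5 * 2.25e-3,
    "extended event [B]": 150 * 2560 * 2.0e6 * 1.5 * 5.4e-3,
}

# Range of data reduction factors quoted on the slide.
REDUCTION_MIN, REDUCTION_MAX = 500, 1800


def reduced_range(raw, fmin=REDUCTION_MIN, fmax=REDUCTION_MAX):
    """Return (smallest, largest) value after applying the reduction range."""
    return raw / fmax, raw / fmin


for name, raw in RAW.items():
    lo, hi = reduced_range(raw)
    print(f"{name}: {lo:.3g} to {hi:.3g}")
```

The factor of 3.6 between the two ends of each range comes entirely from the assumed reduction factor.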

SLIDE 5

DUNE DAQ System Parameters: Data Rates


Note: A system which rests on noise assumptions and assumed data reduction factors is risk-prone

SLIDE 6

DUNE DAQ: Rethinking our challenge

  • DUNE is a 3D imaging device
  • Raw data format is ideally suited for deep-learning-based image processing techniques
  • Promising performance for powerful image processing and classification
  • E.g. performance with offline (GPU) training and inference:
    – DUNE SP FD simulations [J. Hewes]: VGG16 CNN trained to isolate n-nbar oscillation events from atmospheric neutrino backgrounds (more in Jeremy’s thesis). Excellent separation between atm. nu and n-nbar signal images.
    – MicroBooNE: CNNs successful in identification and differentiation among different particle types. [JINST 12, P03011 (2017)]

[Figure: table with columns “Detection Accuracy (%)” and “Most Frequent MisID (%)”]

SLIDE 7

New DAQ philosophy

  • Real-time image processing utilizing Deep Neural Networks (ideally on FPGA)
  • Minimize disk buffering needs (more on back-end)
  • Minimize reconstruction needs
  • Minimize reliance on noise rates (trainability)
  • Concerns:
    1. Speed of inference (per “image”): can we keep up with rates if we want to process every drift window (necessary for SN)?
    2. Reliability of inference: we already know MC-only training is deficient; how can we train reliably? Rare event searches often have no “control” data samples.
    3. Changing detector conditions and need for retraining: what features are DNNs most sensitive to? What retraining frequency? What resources does this require?
    4. How do we practically (re)train on data in real time?
    5. Cost, technology lifecycle, power consumption, lifetime, …
  • Studies are needed to address the above concerns and demonstrate the feasibility of a DNN-based readout & DAQ scheme early on.

SLIDE 8

A “stab” at a conceptual DAQ design:

[Figure: block diagram of the conceptual DAQ chain. Blocks as labeled:]

  • Cold electronics “slice” (e.g. 1 APA)
  • Noise filtering (e.g. coherent noise removal)
  • Signal processing: channel data regrouping (e.g. by wire plane, by APA volume); signal processing 1; signal processing 2
  • Active “frame” selection (e.g. min. integrated charge AND/OR external/beam trigger); block shown three times
  • ROI finding (cropping in channel and time space); block shown three times
  • DNN classification:
    – ROI image classification (A/B/C/D/...); per plane, or all 3 planes
    – Full frame classification (SN); per plane, or all 3 planes
  • Reconstruction per class: A (high-E nu), B (e.g. CRM), C (e.g. p->Knu), D (e.g. n-nbar), ..., X (SN)

Other labels: “traditional” FPGA; DNN; external/beam/photodet. triggers; localized events; extended events.

Stage labels: data pre-processing; data pre-selection/selection; disk writing.

SLIDE 9

A “stab” at a conceptual DAQ design:


Raw data (channel, ADC, TDC) flow from left to right, organized serially in “frames”. A frame is O(1) drift and, e.g., 1 APA wire plane (a well-defined boundary). Every single frame is processed down to this stage (at least); frames can then optionally be dropped.
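
The two pre-selection steps named in the diagram, active frame selection on a minimum integrated charge and ROI finding by cropping in channel and time, can be sketched in plain Python. This is only an illustration under assumptions not in the talk: a frame is modeled as a pedestal-subtracted `frame[channel][tick]` array, the thresholds are hypothetical, and the "AND/OR" trigger condition is implemented as a simple OR.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # frame[channel][time tick], pedestal-subtracted ADC


def integrated_charge(frame: Frame) -> int:
    """Sum of all positive ADC values in the frame."""
    return sum(max(adc, 0) for waveform in frame for adc in waveform)


def keep_frame(frame: Frame, min_charge: int, external_trigger: bool = False) -> bool:
    """Keep a frame on an external trigger OR enough integrated charge."""
    return external_trigger or integrated_charge(frame) >= min_charge


def find_roi(frame: Frame, adc_threshold: int) -> Optional[Tuple[int, int, int, int]]:
    """Bounding box (ch_lo, ch_hi, t_lo, t_hi) of samples at or above threshold."""
    hits = [(ch, t)
            for ch, waveform in enumerate(frame)
            for t, adc in enumerate(waveform)
            if adc >= adc_threshold]
    if not hits:
        return None
    channels = [ch for ch, _ in hits]
    ticks = [t for _, t in hits]
    return min(channels), max(channels), min(ticks), max(ticks)


def crop(frame: Frame, roi: Tuple[int, int, int, int]) -> Frame:
    """Crop the frame to the ROI in channel and time."""
    ch_lo, ch_hi, t_lo, t_hi = roi
    return [waveform[t_lo:t_hi + 1] for waveform in frame[ch_lo:ch_hi + 1]]


# Toy frame: 4 channels x 5 ticks with two hits.
example = [[0] * 5 for _ in range(4)]
example[1][2] = 30
example[2][3] = 40

if keep_frame(example, min_charge=50):
    roi = find_roi(example, adc_threshold=20)
    print(roi, crop(example, roi))
```

In this picture a dropped frame never reaches the DNN, which is what reduces the inference rate the classifier has to sustain.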

SLIDE 10

What is this layer?

  • Layer developed and optimized for application of DNNs for both image selection and classification
  • Can be a combination of FPGA and GPU devices:
    – GPU: acceleration of training
    – FPGA: acceleration of inference
  • During normal operations, DNNs implemented in FPGA select and classify frames/ROIs of interest.
  • GPU allows semi-offline (re)training and adjusting to changing detector conditions.
  • After this layer, images are already classified; specialized, topology-targeted reconstruction can be applied separately on each event class.

SLIDE 11

A “stab” at a conceptual DAQ design:


ü Modularity ü Flexibility ü Accessibility ü Built-in redundancies ü Minimization of offline processing and reconstruction ü No explicit assumption on noise levels and data reduction Additional considerations (per docdb-8841) to be addressed:

  • Cost per channel
  • Technology lifecycle
  • Power consumption
  • Lifetime & reliability

SLIDE 12

A “stab” at a conceptual DAQ design:


ü Built-in redundancies

  • Coherent noise filtering may be redundant (though unlikely)
  • Channel data regrouping may be unnecessary, as training may work on mixed-wire arrangements
  • Signal processing (deconvolution, further noise filtering) may not be necessary; e.g. DNN performance
  • n raw waveforms may be equivalently good (ongoing study by Y. Zhou)
  • Active frame selection will be necessary if DNN inference is “too slow”
  • ROI cropping will be necessary if DNN implementation (e.g. in FPGA) is limited in terms of image size
  • If frame-by-frame processing is fast enough, then localized and extended events paths can be merged
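
One way to picture these redundancies is as a set of on/off switches in front of the DNN. The sketch below is purely illustrative: the class `StageToggles`, the stage names and the stage ordering are assumptions made for this example, not part of the proposed design.

```python
from dataclasses import dataclass
from typing import List

# Assumed stage order, following the order in which the list above names them.
STAGE_ORDER = [
    "coherent_noise_filter",
    "channel_regrouping",
    "signal_processing",
    "active_frame_selection",
    "roi_cropping",
]


@dataclass
class StageToggles:
    """Hypothetical switches for the optional pre-DNN stages."""
    coherent_noise_filter: bool = True
    channel_regrouping: bool = True
    signal_processing: bool = True
    active_frame_selection: bool = True
    roi_cropping: bool = True


def build_chain(toggles: StageToggles) -> List[str]:
    """Return the enabled stages in order, always ending with the DNN."""
    chain = [name for name in STAGE_ORDER if getattr(toggles, name)]
    chain.append("dnn_classification")
    return chain


# Example: a DNN that works on raw, mixed-wire waveforms and is fast enough
# to process every frame needs none of the three stages switched off here.
fast_raw = StageToggles(channel_regrouping=False,
                        signal_processing=False,
                        active_frame_selection=False)
print(build_chain(StageToggles()))
print(build_chain(fast_raw))
```

Each switch corresponds to one of the open questions above, so the chain can shrink as the corresponding studies conclude.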

SLIDE 13

Concern 1: Speed of inference: How fast is fast?

  • Standalone study performed on GPU by Simone Rossi (Columbia Comp. Sci.)
  • VGG16 network used in the n-nbar analysis study
  • Collection plane only; after deconvolution; ROI images preselected and down-sampled to fit a 600x600 image
  • Timing calculated on a GPU server with
    – 1 NVIDIA GeForce GTX 1080 GPU with 8 GB
    – Intel Core i7-6700 CPU (8 logical cores) @ 3.4 GHz (up to 4.00 GHz)
    – 64 GB of RAM
    – Ubuntu 16.04.2 LTS
  • Inference speed on GPU:
    – Average: 18.4 ms per ROI image
    – Min: 17.6 ms
    – Max: 71.38 ms
  • Still about 1 order of magnitude too slow for frame-by-frame processing; requires “active frame” pre-selection, or further optimization.
  • Frame-by-frame processing is a critical requirement for real-time SN selection.
  • In parallel, CPU vs FPGA speed comparisons (ongoing):
    – Preliminary: FPGA offers a x4 speed-up in matrix multiplication
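
The "order of magnitude" statement can be made concrete by comparing the measured latencies with a per-frame time budget. The sketch below assumes the budget is one 2.25 ms drift window per image (the drift time used on the data-rate slides); that choice, and the helper names, are assumptions made for illustration.

```python
import math

DRIFT_MS = 2.25  # assumed per-image budget: one drift window

# GPU inference times per ROI image, as quoted above (ms).
MEASURED_MS = {"average": 18.4, "min": 17.6, "max": 71.38}


def slowdown(latency_ms: float, budget_ms: float = DRIFT_MS) -> float:
    """How many times longer inference takes than the budget allows."""
    return latency_ms / budget_ms


def parallel_units_needed(latency_ms: float, budget_ms: float = DRIFT_MS) -> int:
    """Inference units needed, running in parallel, to keep up with one
    image stream arriving every budget_ms."""
    return math.ceil(latency_ms / budget_ms)


for label, latency in MEASURED_MS.items():
    print(f"{label:8s}: x{slowdown(latency):.1f} over budget, "
          f"{parallel_units_needed(latency)} parallel units per stream")
```

Under this assumption the average latency is roughly 8 times the budget, consistent with the order-of-magnitude gap stated above.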

SLIDE 14

Concerns 2-4: Letting the data itself inform image selection

  • CNN studies so far seem extremely sensitive to (1) modeling/simulation deficiencies, and possibly (2) noise characteristics. How do we mitigate this?
  • Furthermore, regardless of architecture and data selection strategy, one of our biggest challenges will be how to adapt to changing detector conditions.
  • We need a system that is capable of “learning” from the data itself, *almost* in real time, and of adapting to the new conditions; what latency is acceptable is TBD.
  • With that in mind, next steps: identifying the specific architecture of this block, and how it can be implemented in practice.
  • Hardware and cost considerations will follow this step.

[Figure: the DNN block of the conceptual design: ROI image classification (A/B/C/D/...) and full frame classification (SN), each per plane or for all 3 planes]

SLIDE 15

Summary

A specialized, possibly mixed FPGA/GPU architecture for online (real-time) DNN application offers certain advantages:

1. Efficient and fast ROI/frame processing and selection
2. Potential for event classification/ID on the fly
3. Potential for semi-online (re)training and “auto-tuning” (to deal with inherent data/MC differences and changing detector conditions)
4. Modular and scalable design, with little sensitivity to “event boundary” effects (based on studies so far; see backup)

A first stab at a flexible design, with redundancies and built-in knobs to mitigate noise risks, has been presented. It could be extremely advantageous to adopt such a design for the DUNE FD. Even if such a design is not realistic for implementation on Day 1, we should consider eventually moving to this solution.

  • R&D for such an implementation will continue
  • Feedback and collaboration are welcome!

SLIDE 16

Backup slides

SLIDE 17

ZS legacy Noise, and Noise+Radiologicals

SLIDE 18

Insensitivity to APA boundary

It seems possible to train CNNs irrespective of event APA containment.
