Some ideas for DUNE DAQ Architecture, Triggering, Reduction (focus on single-phase)



SLIDE 1

Some ideas for DUNE DAQ Architecture, Triggering, Reduction (focus on single-phase) Brett Viren

Physics Department DUNE FD DAQ Workshop Columbia, 30-31 OCT 2017

SLIDE 2

Outline

  • Some Numbers
  • Conceptual Design
  • Collection Plane Triggering
  • To Do

Brett Viren (BNL) DUNE DAQ October 27, 2017 2 / 22

SLIDE 3

Some Numbers

Some Relevant Numbers

  • 39Ar: 1 Bq/kg × 10 kt / 200 TPC [1] = 50 kHz/TPC = 50/ms/TPC
  • SNB: O(0) event/ms/TPC, ∼1000 events/burst/10 kt
  • Beam: ∼1 Hz rate, trigger latency = MINOS: 0.6-5 s, NOνA: 20 s
  • APA: 10 GByte/sec (full-stream @ 2 Byte samples)
  • 10 kt: 1.5 TByte/sec = 12 Tbps
  • PCIe: 15 GByte/sec (v3 x16 today, v4 will be × faster)
  • RAM: 25 GByte/sec (DDR4-3200)
  • NIC: 10 Gbps “common”, 40 or 100 Gbps available
  • FFT: 5 ms / GPU card, 500 ms / CPU-FPU core (2D, round trip, per plane, single-prec. FP)
  • NF+SP: Wire-Cell noise filter + sig. proc., 2 min/event (µB)
  • reco: protoDUNE LArSoft reco full chain, 30-45 min/event
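The rate figures above can be cross-checked with a few lines of arithmetic. This is only a sketch: the 2 MHz sampling rate and the 150 APAs per 10 kt are assumptions that do not appear on the slide, and the function names are made up for illustration.

```python
# Back-of-envelope check of the quoted rates.
# Values marked ASSUMED are not stated on the slide.

AR39_ACTIVITY_BQ_PER_KG = 1.0   # from the slide
DETECTOR_MASS_KG = 10e6         # 10 kt
N_TPC = 200                     # from the slide

CHANNELS_PER_APA = 2560         # from the ring-buffer slide
BYTES_PER_SAMPLE = 2            # from the slide
SAMPLE_RATE_HZ = 2.0e6          # ASSUMED: 2 MHz sampling
N_APA = 150                     # ASSUMED: APAs per 10 kt


def ar39_rate_per_tpc_hz():
    """39Ar decay rate seen by one TPC, in Hz."""
    return AR39_ACTIVITY_BQ_PER_KG * DETECTOR_MASS_KG / N_TPC


def apa_rate_bytes_per_s():
    """Full-stream data rate from one APA, in bytes per second."""
    return CHANNELS_PER_APA * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE


def total_rate_bytes_per_s():
    """Full-stream data rate for all APAs, in bytes per second."""
    return N_APA * apa_rate_bytes_per_s()
```

Under these assumptions the per-APA rate comes out near 10 GByte/sec (about 82 Gbps) and the total near 1.5 TByte/sec (about 12 Tbps), consistent with the numbers listed.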

[1] Aka “drift cells” aka “LAr volumes”

SLIDE 4

Conceptual Design

SLIDE 5

Conceptual Design

Concept In a Nutshell

  • Flow all data through RAM in commodity computers.
  • Process data quickly while still in RAM.
  • Trigger locally (per-TPC), process triggers globally, return trigger commands.
  • Interpret trigger commands locally (per-APA) to dispatch the requested subset of data.
  • Scale commodity computers up and out as needed to accommodate the above.

SLIDE 6

Conceptual Design

Minimal Concept Requirements: ADC to RAM

  • Get data across cold → warm boundary.
  • Introduce no excess noise in the process!
  • Aggregate data through whatever stages are needed (ADC, FEMB, WIB, FELIX), finally to RAM via just a few, fast links.
  • Reliable, constant data transfer of 82 Gbps/APA.
  • Perform minimal processing on the way:
      • NO compression/decompression needed/wanted.
      • Reformat data from hardware packing to “software-friendly” format for a RAM ring buffer (channel × tick block and 12 bit → 2 Byte).
  • A single interface board per host computer is preferred, for a simpler firmware and software interface to the RAM buffer. WIB+FELIX provides a working example.
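The “12 bit → 2 Byte” reformatting step might look roughly like the sketch below. The packing layout (two samples per 3 bytes, first sample in the low 12 bits) is purely hypothetical and is not the real hardware format; the function names are also made up for illustration.

```python
from array import array


def unpack_12bit(packed):
    """Expand packed 12-bit samples into one integer per sample.

    ASSUMED layout (hypothetical): each group of 3 bytes carries two
    samples, with the first sample in the low 12 bits of the 24-bit word.
    """
    if len(packed) % 3:
        raise ValueError("packed length must be a multiple of 3 bytes")
    samples = []
    for i in range(0, len(packed), 3):
        word = packed[i] | (packed[i + 1] << 8) | (packed[i + 2] << 16)
        samples.append(word & 0xFFF)          # first sample: low 12 bits
        samples.append((word >> 12) & 0xFFF)  # second sample: high 12 bits
    return samples


def to_2byte_block(samples):
    """Store samples as unsigned 16-bit values, ready for a RAM buffer."""
    return array("H", samples)
```

Padding every sample to 2 bytes costs about a third more memory than the packed form, but lets readers index the buffer directly by channel and tick.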

SLIDE 7

Conceptual Design

High Level, Minimal Concept

[Diagram: APA → WIB crate → PCIe board(s) (eg, FELIX) in a per-APA host computer → 10 GB/s full-stream data → shared-memory ring buffer (TPC: Nticks × 2560 ch × 2 B, not to forget PDS!) → reader1, reader2, reader3.]

  • Full-stream, constant data flow from ADCs to host system RAM.
  • Buffer Nch TPC waveforms in Ntick-deep shared-memory ring buffer.
  • Operate directly on data with “reader” processes running on localhost.

Call this a “Tier 1 Host”.
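A minimal, single-process sketch of the Ntick-deep ring buffer idea follows. It is not the proposed implementation: a real Tier 1 host would use shared memory filled by the interface board's driver, and the class name and methods here are made up for illustration.

```python
from array import array


class TickRingBuffer:
    """Fixed-depth ring of ticks, each holding one 2-byte sample per channel."""

    def __init__(self, nticks, nch):
        self.nticks = nticks
        self.nch = nch
        self.data = array("H", [0]) * (nticks * nch)
        self.written = 0  # total number of ticks ever written

    def write(self, tick_samples):
        """Append one tick, overwriting the oldest tick once the ring is full."""
        if len(tick_samples) != self.nch:
            raise ValueError("need exactly one sample per channel")
        start = (self.written % self.nticks) * self.nch
        self.data[start:start + self.nch] = array("H", tick_samples)
        self.written += 1

    def read(self, tick):
        """Return the samples of an absolute tick number.

        Returns None if the tick is not yet written or already overwritten.
        """
        oldest = max(0, self.written - self.nticks)
        if tick < oldest or tick >= self.written:
            return None
        start = (tick % self.nticks) * self.nch
        return list(self.data[start:start + self.nch])
```

Each “reader” would keep its own tick cursor and call `read` independently, so a slow reader loses data instead of blocking the writer.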

SLIDE 8

Conceptual Design

Generalize and Scale

Likely one host is not enough to process all data from one APA. Build a tiered hierarchy of hosts. Each generic host node has:

  1. Data stream input source (FELIX, NIC).
  2. RAM ring buffer.
  3. Local processing.
  4. Result stream output sink (NIC, disk).

Data flows generally “outward” toward higher tiers, but some upstream messaging is needed (in particular global triggering). Tiers may specialize: RAM-heavy, CPU-heavy, have GPUs, perform meta processing (eg, trigger, DQM), etc.

SLIDE 9

Conceptual Design

Example: Scalable Ring Buffer

[Diagram: two Tier 1 nodes (APA1, APA2), each FELIX → ring buffer → NIC, connected through a switch to Tier 2 buffer nodes 1, 2 and 3, each NIC → ring buffer → NIC.]

  • Can make a deeper buffer by flowing data through a 2nd (or 3rd, etc) tier.
  • Tier 1 “reader” processes would flow data between tiers at a fixed rate.
  • A Tier 2 “writer” process takes the place that FELIX has in Tier 1.
  • A switch can allow dynamic routing, but the 12 Tbps/10 kt total throughput may require some segmentation.

SLIDE 10

Conceptual Design

Note on RAM Buffer Size and Cost

How big must it be?

  • Beam trigger packet latency: MINOS: 0.6-5 s, NOνA: 20 s.
  ? Latency of software trigger processors needs study.
  ? Latency for SNB trigger formation? (10 s?)
      ? Once raised, can we stream SNB data to sink?
      ? Or do we need a dedicated long-term buffer for the whole SNB period?

Costs?

  • Let’s not speculate on Moore’s law? [2]
  • Today: RAM costs ∼$10/GB (but it’s currently rising!)
      → $100/APA·sec, $6k/APA·min
      → 1 minute RAM buffer is not so crazy.
      • Per-APA, costs less than RCE and comparable to FELIX.
      • Can avoid bespoke concentrated high-RAM systems by scaling out, as above.
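The cost figures follow directly from the per-APA data rate and the quoted RAM price. A small sketch of that arithmetic, using only numbers from the slides (the function name is made up for illustration):

```python
RAM_COST_USD_PER_GB = 10.0  # from the slide ("Today")
APA_RATE_GB_PER_S = 10.0    # full-stream rate per APA, from the slides


def buffer_cost_usd(seconds, n_apa=1):
    """RAM cost to hold `seconds` of full-stream data for `n_apa` APAs."""
    gigabytes = APA_RATE_GB_PER_S * seconds * n_apa
    return RAM_COST_USD_PER_GB * gigabytes
```

For example, `buffer_cost_usd(1)` gives the $100/APA·sec figure and `buffer_cost_usd(60)` the $6k/APA·min figure. A buffer sized for a 20 s NOνA-like latency would be 200 GB per APA.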

[2] Okay, I know you are all dividing by factors of 2 in your head!

SLIDE 11

Conceptual Design

Readers

Q: What do these “readers” do?
A: Whatever you want! (if you have enough CPU)

Some likely readers:

  • Form and emit local trigger primitives.
  • Accept and interpret global trigger commands.
  • Perform data reduction or selection processing.
  • Transfer data to 2nd tier hosts for deeper buffer or more distributed CPU.
  • Data quality monitoring, “express lane” processes.
  • Save data to file.

SLIDE 12

Collection Plane Triggering

SLIDE 13

Collection Plane Triggering

Induction vs Collection Waveforms

  • Raw induction waveforms
      • Bipolar, no direct charge measure, often in the noise.
      • Any threshold will be inefficient or noise-dominated (or both).
      • Need relatively expensive signal processing to use properly.
      • Each channel is sensitive to activity on both sides of APA.
  • Raw collection waveforms
      • Good measure of ionization energy as 2D profile (drift × transverse).
      • 1 view, so can only reduce data by time slices, no spatial reduction.
      • Immediately useful signals, no expensive processing required.
      • Activity from either side of APA can be distinguished.

⇒ Try to use collection planes to form basic trigger primitives independently for each TPC on either side of an APA.

SLIDE 14

Collection Plane Triggering

Collection Waveform Segment Categories

Consider some mutually exclusive categories for collection waveform segments:

noise: contiguous waveform chunks consistent with noise, eg, containing no samples above some nσ RMS level.

blips: a set of waveform samples that are “connected”, “compact” and “isolated” (by some metrics).

  • Ie, “small, compact islands” in channel vs tick space surrounded by “enough” noise samples.
  • Intention is to efficiently select a rich sample of 39Ar decays, SNB interactions and similar low-energy activity.

signals: “everything else”.

  • Waveform chunks consistent with neither noise nor blips.
  • With more thought, additional categories may present themselves.
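To make the three categories concrete, here is a deliberately simplified sketch that classifies a single-channel, baseline-subtracted waveform segment. The slides leave the metrics open, so every choice here is an assumption: the 1D (tick-only) treatment, the `nsigma` and `max_blip_ticks` values, and the definitions of “connected” (one contiguous run), “compact” (short run) and “isolated” (run does not touch the segment edges).

```python
def classify_segment(samples, noise_rms, nsigma=5.0, max_blip_ticks=20):
    """Classify a baseline-subtracted segment as "noise", "blip" or "signal".

    ASSUMED metrics (hypothetical): a blip is a single contiguous run of
    above-threshold samples, at most max_blip_ticks long, with at least
    one below-threshold sample on each side within the segment.
    """
    threshold = nsigma * noise_rms
    above = [i for i, s in enumerate(samples) if s > threshold]
    if not above:
        return "noise"
    connected = above[-1] - above[0] + 1 == len(above)
    compact = len(above) <= max_blip_ticks
    isolated = above[0] > 0 and above[-1] < len(samples) - 1
    if connected and compact and isolated:
        return "blip"
    return "signal"
```

A channel × tick version would replace the contiguous-run test with a connected-component search in two dimensions.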

SLIDE 15

Collection Plane Triggering

Local Collection Plane Trigger Primitives

Raise a local collection plane trigger primitive whenever any non-noise waveform is present. Local trigger packet holds:

  • type: follows waveform category (“blip” or “signal”).
  • ident: TPC number (ie, APA+face).
  • extent: rectangular extent in time and channel.
  • charge: baseline-subtracted ADC sum over extent.
  • stats: any other stats which can be quickly calculated.
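One way to picture the packet is as a small record. The sketch below is illustrative only: the field layout (half-open tick and channel ranges), the `peak` statistic and the helper `make_local_trigger` are assumptions, not a defined format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LocalTrigger:
    type: str        # waveform category: "blip" or "signal"
    ident: int       # TPC number (ie, APA + face)
    tick_begin: int  # extent in time, half-open [begin, end)
    tick_end: int
    chan_begin: int  # extent in channel, half-open [begin, end)
    chan_end: int
    charge: int      # baseline-subtracted ADC sum over the extent
    stats: dict = field(default_factory=dict)


def make_local_trigger(kind, ident, block, baselines, tick0, chan0):
    """Build a packet from a channel-major block: block[c][t] is a raw ADC value."""
    diffs = [adc - baselines[c] for c, row in enumerate(block) for adc in row]
    nticks = len(block[0]) if block else 0
    return LocalTrigger(
        type=kind,
        ident=ident,
        tick_begin=tick0,
        tick_end=tick0 + nticks,
        chan_begin=chan0,
        chan_end=chan0 + len(block),
        charge=sum(diffs),
        stats={"peak": max(diffs, default=0)},  # example of a cheap statistic
    )
```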

SLIDE 16

Collection Plane Triggering

Example Global Trigger Logic

  • “SNB” trigger: accept stream of local “blip” triggers.
      • Select blips above some charge threshold,
      • ignore blips inside “signal” trigger, veto known “extra blippy” TPCs.
      → trigger when remaining blip rate is above some threshold.
  • “High Energy” trigger: watch local “signal” triggers.
      • Merge overlapping local extents.
      → trigger on all.
  • Trigger command sent back to multiple, specific APA host computers with readout instructions. Eg:
      cmd: Save select time region in all “signal” triggered APAs and their nearest neighbors.
      cmd: Save all data in all APAs for next N seconds as a SNB is happening.

Depending on trigger type, triggered data may then undergo further processing (per-trigger data reduction, compression, etc).
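The “SNB” logic above can be sketched as a sliding-window rate counter. All names and parameters here (`SnbTrigger`, `min_charge`, `window_ticks`, `max_blips`) are hypothetical, and the sketch assumes blips arrive in time order, which a real distributed system would have to arrange.

```python
from collections import deque


class SnbTrigger:
    """Fire when the rate of accepted blips in a time window is too high."""

    def __init__(self, min_charge, window_ticks, max_blips, vetoed_tpcs=()):
        self.min_charge = min_charge
        self.window_ticks = window_ticks
        self.max_blips = max_blips
        self.vetoed = set(vetoed_tpcs)  # known "extra blippy" TPCs
        self.recent = deque()           # ticks of accepted blips in the window

    def accept(self, tick, ident, charge, inside_signal=False):
        """Feed one local blip trigger; return True if the SNB trigger fires."""
        if charge < self.min_charge or inside_signal or ident in self.vetoed:
            return False
        self.recent.append(tick)
        # Drop blips that have fallen out of the sliding window.
        while self.recent[0] <= tick - self.window_ticks:
            self.recent.popleft()
        return len(self.recent) > self.max_blips
```

When `accept` returns True, the global trigger service would issue a command such as the “save all data for the next N seconds” example above.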

SLIDE 17

Collection Plane Triggering

One Possible Triggering and Data Flow

[Diagram: APA1 and APA2 Tier 1 nodes (each with FELIX, ring, coltrig1, coltrig2, selector), a Global Trigger Service (LTs in, Trigger Logic, GT out), Tier 2 nodes 1 and 2 (each with reducer1, reducer2, reducer3, ...) and an I/O Tier node (saver).]

Just two APAs shown for simplicity.

  • One collection trigger processor per APA face (coltrig).
  • Global trigger command interpreter (selector).
  • Farm selected data to some worker (reducer).
  • Save reduced data to concentrated sink.

? Lots of possible options.

  • Trigger reduction is enough, no “reducers” needed?
  • Single “reducer” stage is NOT enough?
  • Add “stream branches” (DQM, “express lane”)?
  • Sink I/O is still too high for just one node, add more?

SLIDE 18

To Do

To Do

  • triggering: much more thought is needed about local trigger categories and algorithms, as well as global trigger logic and commands.
  • simulation: use Wire-Cell signal and noise sim for accurate event “size” determination and for estimating trigger alg CPU (GPU, etc) requirements.
  • scenarios: develop 2 or 3 conceptual designs (eg, the one shown here, maybe another exploiting GPU accel, etc).
  • scaling: use sim to estimate the requirements for number and sizes of tiers. Estimate throughput between tiers and final data rate to disk/tape. If “too much”, iterate scenario definition.
  • costing: use commodity costs from today. If “too expensive”, consider conservative Moore scaling and/or iterate the scenario(s).

SLIDE 19

To Do

Simulation Samples

  • noise + 39Ar: understand signal/noise energy threshold and the rate of local “blip” triggers.
      • MicroBooNE (M. Mooney + co) now looking at 39Ar in data.
      • Can join forces.
  • noise + SNB: understand SNB trigger logic (ie, how to separate 39Ar and SNB “blip” triggers).
  • high energy: ν and cosmic-µ sim to understand local trigger alg’s CPU requirements.
      • For trigger-based reduction, their data size can be estimated with just flux, cross-section and dE/dx.
      • More sophistication, eg signal-processing based reduction, needs understanding of processing time variances.

SLIDE 20

Extras

SLIDE 21

Infrastructure Implementation

[Diagram: same triggering and data flow figure as on slide 17.]

Try to reuse as much as possible.

  • Ring buffer implies async, distributed, real-time, multi-processing.
  • Does/can this model map to artDAQ? (hope “yes”!)
  • Many software roles:
      • RAM filled by FELIX Linux driver(?)
      • long-lived stream processors (coltrig, selector)
      • triggering is client/server model
      • chunked transfer / processing clearly artDAQ’ish (selector→reducer→saver)

SLIDE 22

What Might Go Where?

[Diagram: same triggering and data flow figure as on slide 17.]

Possible locations:

under: Tier 1 and Global Trigger Service
above: Tier 2+

  • Gives some underground autonomy if network fibers are cut.
  • Need bandwidth to surface for just triggered readout and not the full 12 Tbps.
  • Allows scaling up (possible majority) of nodes (Tier 2+) in the cheaper, above-ground environment.
