SLIDE 1

SVT DAQ 2019 Physics Run

Cameron Bravo (SLAC)

SLIDE 2

Introduction

  • SVT DAQ system underwent a major overhaul before the run
    – FEB bootloader image
    – Rogue framework
    – TI interface on PCIE cards in SVT DAQ blades (clonfarm 2 and 3)
  • Upgrades not commissioned until we started receiving beam
  • Fixed several major issues during recovery after power outage
    – Rogue hosts channel access server to interface with EPICS
    – Archiving of variables to aid in investigations
    – Cooling FEBs to increase lifetime of LV regulation circuitry
    – Slow copy times in SVT event building
    – Improper handling of DAQ state transitions

  • Little usable beam delivered before the outage
  • Ungrounded target was crashing the DAQ during production running


SLIDE 3

SVT DAQ Overview

Raw ADC data rate (Gbps):
  Per hybrid                 3.33
  Per L1-3 front end board   10
  Per L4-6 front end board   13

[Diagram: hybrids 0–35 connect over copper to front end boards 0–9; signals cross the vacuum/air flange and travel over 25 m fiber to the RCE crate, which connects to the JLab DAQ and JLab slow control over Ethernet; power supplies connect to the front end boards over 25 m copper]

  • 40 hybrids
    – 16 in layers 0 – 3 (2 per module)
    – 24 in layers 4 – 6 (4 per module)
  • 10 front end boards
    – 4 servicing layers 0 – 3 with 4 hybrids per board
    – 6 servicing layers 4 – 6 with 4 hybrids per board
  • RCE crate: ATCA, data reduction, event building, and JLab DAQ interface
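The per-hybrid figure sets the scale for the per-board and total rates. A minimal back-of-the-envelope sketch in Python, using only numbers from this slide (illustrative, not part of the DAQ software):

```python
# Back-of-the-envelope check of the raw ADC data rates quoted above.
# All inputs are taken from this slide; nothing here is measured.

GBPS_PER_HYBRID = 3.33   # raw ADC data rate per hybrid
N_HYBRIDS = 40           # 16 in layers 0-3, 24 in layers 4-6

four_hybrid_feb = 4 * GBPS_PER_HYBRID         # ~13.3 Gbps, consistent with the 13 Gbps L4-6 figure
total_raw_rate = N_HYBRIDS * GBPS_PER_HYBRID  # ~133 Gbps of raw ADC data across the full SVT

print(f"4-hybrid FEB: {four_hybrid_feb:.1f} Gbps")
print(f"Full SVT ({N_HYBRIDS} hybrids): {total_raw_rate:.0f} Gbps")
```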

SLIDE 4

SLAC Gen3 COB (Cluster on Board)

  • Supports 4 data processing FPGA mezzanine cards (DPM)
    – 2 RCE nodes per DPM
    – 12 bi-directional high speed links to/from RTM (GTP)
  • Data transport module (DTM)
    – 1 RCE node
    – Interface to backplane clock & trigger lines & external trigger/clock source
    – 1 bi-directional high speed link to/from RTM (GTP)
    – 6 general purpose low speed pairs (12 single ended) to/from RTM, connected to general purpose pins on FPGA

[Block diagram: DPM boards 0–3 (2 RCEs each) and the DTM (1 RCE) connect to the RTM; an on-board Fulcrum Ethernet switch provides 1 Gbps and 10 Gbps links; switch control & timing / distribution board, ATCA backplane (IPMB, power & reset), and clock/trigger distribution]

SLIDE 5

SVT RCE Allocation

  • Two COBs utilized in the SVT readout system
    – 16 RCEs on DPMs (2 per DPM, 4 DPMs per COB)
    – 2 RCEs on DTMs (1 per DTM, 1 DTM per COB)
  • 7 RCEs on each COB process data from ½ SVT
  • 2019 system required COBs to be unbalanced
    – Dead channels on RTMs and dying FEBs
  • 8th RCE on COB 0 manages all 10 FE boards
    – Configuration and status messages
    – Clock and trigger distribution to FE boards & hybrids
  • 8th RCE on COB 1 is not used
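For bookkeeping, the allocation above can be written down as a small data structure. A minimal sketch in Python — the RCE naming scheme is a made-up convention for illustration; only the counts and the COB 0 / COB 1 roles come from these slides:

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Illustrative model of the SVT RCE allocation described above.
# Counts (4 DPMs x 2 RCEs per COB, 7 data RCEs per COB, one control
# RCE on COB 0, unused 8th RCE on COB 1) are from the slides;
# everything else, including names, is hypothetical.

@dataclass
class Cob:
    name: str
    data_rces: list = field(default_factory=list)   # RCEs processing hybrid data
    control_rce: str | None = None                  # manages FEB config/clock/trigger
    unused_rces: list = field(default_factory=list)

def build_svt_allocation() -> list[Cob]:
    cobs = []
    for i in range(2):
        # 8 DPM RCEs per COB; "cobI:dpmD:rceR" is a made-up naming scheme.
        rces = [f"cob{i}:dpm{d}:rce{r}" for d in range(4) for r in range(2)]
        cob = Cob(name=f"COB{i}", data_rces=rces[:7])  # 7 RCEs read out half the SVT
        if i == 0:
            cob.control_rce = rces[7]                  # 8th RCE on COB 0 manages all 10 FEBs
        else:
            cob.unused_rces.append(rces[7])            # 8th RCE on COB 1 is not used
        cobs.append(cob)
    return cobs

if __name__ == "__main__":
    for cob in build_svt_allocation():
        print(cob.name, len(cob.data_rces), "data RCEs,",
              "1 control RCE" if cob.control_rce else "no control RCE")
```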

SLIDE 6

CODA ROC Instances On SVT

[Diagram: ROC layout — a TI timing ROC on the DTM (TI firmware) receives JLab triggers and distributes clock/trigger/busy; a control ROC on DPM 7 (control firmware) handles FEB control; data ROCs for COB0 and COB1 collect hybrid data from the readout-firmware DPMs (0–5) and send events to the JLab DAQ at 10 Gbps over the local Ethernet network]

  • Unbalanced load on two COBs motivated changing to have two ROCs which were not exclusive to either COB
  • Balancing load on servers toward end of run greatly improved overall stability of the system!

SLIDE 7

Rogue EPICS Bridge

  • Slow control software hosts an EPICS channel access server
    – Development of GUIs continued into the run
    – Rogue is required for the GUIs and can take several minutes to fully populate them
    – Archiving of variables took time to coordinate
  • FEBs now have SEU monitoring
    – Module implemented which can recover from SEUs
    – Observed on the order of 10 SEUs per day
    – Never observed an irrecoverable SEU
  • This became a strong tool for monitoring health of the FEBs
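Because the Rogue tree is exposed through an EPICS channel access server, any CA client can read the monitored variables (e.g. SEU counters). A minimal polling sketch using pyepics — the PV names are hypothetical placeholders, not the actual SVT naming:

```python
import time
import epics  # pyepics channel access client

# Poll a few slow-control variables exposed by the Rogue CA server.
# The PV names below are invented placeholders for illustration; the
# real SVT naming scheme is not reproduced here.
PVS = [
    "SVT:FEB0:SeuCount",
    "SVT:FEB0:FpgaTemp",
]

def poll(pvs, period_s=10.0):
    """Read each PV over channel access and print its value, forever."""
    while True:
        for name in pvs:
            value = epics.caget(name, timeout=2.0)  # returns None on timeout
            print(f"{name} = {value}")
        time.sleep(period_s)

if __name__ == "__main__":
    poll(PVS)
```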

SLIDE 8

TI PCIE Card

  • Interface to central trigger system at JLab achieved via a PCIE card in each of the two DAQ servers in the SVT system
  • Observed stability issues in FW of this PCIE card
    – Locked up the Linux kernel multiple times
  • Low jitter clock not available out-of-the-box
  • One server required loading the Linux driver after reboot; the other server would crash immediately if the Linux driver was loaded after reboot
  • Minimal support provided
  • Multiple crashes required accessing the hall to power cycle machines
    – Reboot would not recover because the PCIE card FW could only be loaded via a full power cycle
  • Needed ability to remotely power cycle machines

SLIDE 9

Server Load Balancing

  • Livetime was observed to be unstable, becoming more unstable as trigger rate increased
  • We observed all reserved memory blocks for the DAQ on the server being held, but only on clonfarm2
    – Clonfarm2 had a higher data rate than clonfarm3
  • A few iterations of shuffling around the RCE-to-server map proved to bring more stability to the system
  • Lowered operational point of trigger thresholds
    – Slightly lowered trigger rate
    – hps_v11 → hps_v12 trigger configuration change
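The rebalancing amounts to choosing an RCE-to-server map whose aggregate data rate is roughly even across clonfarm2 and clonfarm3. A minimal sketch in Python — the RCE labels, rates, and assignment below are hypothetical; only the balancing idea comes from this slide:

```python
from collections import defaultdict

# Illustrative check of how an RCE-to-server map distributes data rate.
# The labels, rates, and assignments are made up for illustration.

# Hypothetical per-RCE event data rates in Gbps after data reduction.
rce_rates = {
    "cob0:dpm0": 1.2, "cob0:dpm1": 1.1, "cob0:dpm2": 0.9, "cob0:dpm3": 1.0,
    "cob1:dpm0": 0.8, "cob1:dpm1": 1.3, "cob1:dpm2": 1.0, "cob1:dpm3": 0.7,
}

# Candidate RCE-to-server map (hypothetical assignment).
rce_to_server = {
    "cob0:dpm0": "clonfarm2", "cob0:dpm1": "clonfarm3",
    "cob0:dpm2": "clonfarm2", "cob0:dpm3": "clonfarm3",
    "cob1:dpm0": "clonfarm2", "cob1:dpm1": "clonfarm3",
    "cob1:dpm2": "clonfarm2", "cob1:dpm3": "clonfarm3",
}

def server_loads(rates, mapping):
    """Aggregate Gbps per server for a given RCE-to-server map."""
    loads = defaultdict(float)
    for rce, rate in rates.items():
        loads[mapping[rce]] += rate
    return dict(loads)

print(server_loads(rce_rates, rce_to_server))
```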

SLIDE 10

Summary

  • Overall, we had a successful run in summer 2019
    – We had a rough start
    – Got on our feet
    – Ran! (Now to run some analysis…)
  • The major issues on the SVT DAQ side have been resolved
    – Still a few minor things to iron out for slow control
    – Ignoring all the fried hardware for now
  • Interested in discussing what development is foreseen wrt the TI PCIE card
    – Is it happening at all?
    – Will the interface change?
  • Thanks for your attention!