  1. SVT DAQ 2019 Physics Run
     Cameron Bravo (SLAC)

  2. Introduction
     • SVT DAQ system underwent a major overhaul before the run
       ● FEB bootloader image
       ● Rogue framework
       ● TI interface on PCIE cards in SVT DAQ blades (clonfarm 2 and 3)
     • Upgrades not commissioned until we started receiving beam
       ● Fixed several major issues during recovery after power outage
         – Rogue hosts channel access server to interface with EPICS
         – Archiving of variables to aid in investigations
         – Cooling FEBs to increase lifetime of LV regulation circuitry
         – Slow copy times in SVT event building
         – Improper handling of DAQ state transitions
       ● Little usable beam delivered before the outage
       ● Ungrounded target was crashing the DAQ during production running

  3. SVT DAQ Overview
     [Diagram: hybrids (in vacuum) connect over copper to 10 front end boards at the flange; data flows over 25m fiber to the RCE crate and on to the JLab DAQ over Ethernet, with slow control and power supplies (in air) connected over 25m copper.]
     • 40 hybrids
       ● 16 in layers 0-3 (2 per module)
       ● 24 in layers 4-6 (4 per module)
     • 10 front end boards
       ● 4 servicing layers 0-3 with 4 hybrids per board
       ● 6 servicing layers 4-6 with 4 hybrids per board
     • RCE crate: ATCA, data reduction, event building and JLab DAQ interface

     Raw ADC data rate (Gbps):
       Per hybrid                3.33
       Per L1-3 front end board  10
       Per L4-6 front end board  13
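
A back-of-envelope aggregation of the rates in the table above, using only the slide's own numbers; the total is illustrative, not a quoted figure from the run.

```python
# Aggregate the raw ADC rates quoted in the table; counts come from the
# bullets above. All figures are the slide's own numbers.

PER_HYBRID_GBPS = 3.33
L1_3_FEB_GBPS, N_L1_3_FEBS = 10, 4    # boards servicing layers 0-3
L4_6_FEB_GBPS, N_L4_6_FEBS = 13, 6    # boards servicing layers 4-6

total_gbps = N_L1_3_FEBS * L1_3_FEB_GBPS + N_L4_6_FEBS * L4_6_FEB_GBPS
print(f"Aggregate raw ADC rate into the RCE crate: ~{total_gbps} Gbps")
```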

  4. SLAC Gen3 COB (Cluster on Board)
     [Block diagram: four DPM boards (2 x RCE each) and one DTM (1 x RCE) connected through a Fulcrum 10Gbps Ethernet switch to the RTM, with ATCA IPMB, 1Gbps backplane, power & reset control, and clock & trigger distribution.]
     • Supports 4 data processing FPGA mezzanine cards (DPM)
       - 2 RCE nodes per DPM
       - 12 bi-directional high speed links to/from RTM (GTP)
     • Data transport module (DTM)
       - 1 RCE node
       - Interface to backplane clock & trigger lines & external trigger/clock source
       - 1 bi-directional high speed link to/from RTM (GTP)
       - 6 general purpose low speed pairs (12 single ended) to/from RTM, connected to general purpose pins on FPGA

  5. SVT RCE Allocation
     • Two COBs utilized in the SVT readout system
       - 16 RCEs on DPMs (2 per DPM, 4 DPMs per COB)
       - 2 RCEs on DTMs (1 per DTM, 1 DTM per COB)
     • 7 RCEs on each COB process data from ½ SVT
       - 2019 system required COBs to be unbalanced
       - Dead channels on RTMs and dying FEBs
     • 8th RCE on COB 0 manages all 10 FE boards
       - Configuration and status messages
       - Clock and trigger distribution to FE boards & hybrids
     • 8th RCE on COB 1 is not used
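
The allocation above can be summarized as a small map. This is an illustrative sketch only; the RCE names and dict layout are hypothetical, not the actual DAQ configuration format.

```python
# Illustrative model of the RCE allocation described on this slide.
# Names ("DPM0/RCE0", etc.) are hypothetical identifiers.

svt_rce_allocation = {
    "COB0": {
        # 7 DPM RCEs process data from half of the SVT
        "data_rces": [f"DPM{i // 2}/RCE{i % 2}" for i in range(7)],
        # 8th DPM RCE: config/status for all 10 FE boards, plus clock
        # and trigger distribution to the FE boards and hybrids
        "control_rce": "DPM3/RCE1",
        "dtm_rce": "DTM/RCE0",
    },
    "COB1": {
        "data_rces": [f"DPM{i // 2}/RCE{i % 2}" for i in range(7)],
        "control_rce": None,  # 8th RCE on COB 1 is not used
        "dtm_rce": "DTM/RCE0",
    },
}
```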

  6. CODA ROC Instances on SVT Local Ethernet Network
     [Diagram: ROC instances on the SVT local Ethernet network; TI firmware on a DPM receives JLab triggers and passes clock/trigger/busy via the DTMs, control ROCs handle TI and FEB control, and readout ROCs collect hybrid data from the DPMs, sending it to the JLab DAQ over 10Gbps links.]
     • Unbalanced load on two COBs motivated changing to have two ROCs which were not exclusive to either COB
     • Balancing load on servers toward end of run greatly improved overall stability of the system!
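
A sketch of the idea on this slide: each readout ROC spans both COBs instead of owning one, so neither ROC inherits a COB's full, unbalanced data load. The DPM groupings below are hypothetical, not the 2019 run configuration.

```python
# Hypothetical ROC-to-DPM map in which each ROC reads out from both COBs.

roc_readout_map = {
    "svt_roc_1": ["COB0/DPM0", "COB0/DPM1", "COB1/DPM0", "COB1/DPM1"],
    "svt_roc_2": ["COB0/DPM2", "COB0/DPM3", "COB1/DPM2", "COB1/DPM3"],
}

for roc, dpms in roc_readout_map.items():
    cobs = {d.split("/")[0] for d in dpms}
    print(f"{roc}: {len(dpms)} DPMs spanning {sorted(cobs)}")
```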

  7. Rogue EPICS Bridge
     • Slow control software hosts an EPICS channel access server
     • Development of GUIs went into the run
       ● Rogue required for GUIs and can take several minutes to fully populate GUIs
     • Archiving of variables took time to coordinate
     • FEBs now have SEU monitoring
       ● Module implemented which can recover from SEUs
       ● Observed on the order of 10 SEUs per day
       ● Never observed an irrecoverable SEU
     • This became a strong tool for monitoring health of FEBs
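
A minimal sketch of the bridge described above, modeled on the pyrogue EpicsCaServer interface available around this time; exact module paths can differ between Rogue versions, and the device tree here is a placeholder rather than the real SVT tree.

```python
# Sketch: a Rogue slow-control tree exporting its variables as EPICS PVs.
# The SeuCount variable is a stand-in for the real FEB status/SEU counters.

import pyrogue
import pyrogue.protocols.epics

root = pyrogue.Root(name='SvtSlowControl', description='SVT DAQ slow control')
root.add(pyrogue.LocalVariable(name='SeuCount', value=0, mode='RO'))
root.start()

# Every variable in the tree becomes a channel access PV under the given
# prefix, which is what the archiver and operator GUIs attach to.
epics = pyrogue.protocols.epics.EpicsCaServer(base='svt', root=root)
epics.start()

pyrogue.waitCntrlC()   # serve PVs until interrupted
epics.stop()
root.stop()
```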

  8. TI PCIE Card
     • Interface to central trigger system at JLab achieved via a PCIE card in each of the two DAQ servers in the SVT system
     • Observed stability issues in FW of this PCIE card
       ● Locked up linux kernel multiple times
       ● Low jitter clock not available out-of-the-box
       ● One server required loading the linux driver after reboot; the other server would crash immediately if the linux driver was loaded after reboot
       ● Minimal support provided
     • Multiple crashes required accessing the hall to power cycle machines
       ● Reboot would not recover because PCIE card FW could only be loaded via full power cycle
       ● Needed ability to remotely power cycle machines
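
One way to get the remote power-cycle capability called for above is through the servers' BMCs with ipmitool, since a full power cycle (unlike a reboot) reloads the PCIE card firmware. This is a hedged sketch; the BMC hostname and credentials are hypothetical, and it assumes the BMCs are network-reachable.

```python
# Sketch: remotely power cycle a DAQ server via its BMC using ipmitool.

import subprocess

def power_cycle(bmc_host: str, user: str, password: str) -> None:
    """Issue a chassis power cycle through the server's BMC."""
    subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", bmc_host,
         "-U", user, "-P", password, "chassis", "power", "cycle"],
        check=True,
    )

# e.g. power_cycle("clonfarm2-bmc", "admin", "********")  # hypothetical host
```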

  9. Server Load Balancing
     • Livetime was observed to be unstable, becoming more unstable as trigger rate increased
     • We observed all reserved memory blocks for the DAQ on the server being held only on clonfarm2
       ● clonfarm2 had a higher data rate than clonfarm3
       ● A few iterations of shuffling around the RCE to server map proved to bring more stability to the system
     • Lowered operational point of trigger thresholds
       ● Slightly lowered trigger rate
       ● hps_v11 → hps_v12 trigger configuration change
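
A sketch of the rebalancing idea: distribute RCE data streams across the two servers so estimated rates stay even. The greedy heuristic and the example rates are illustrative, not the procedure or values used in the run.

```python
# Greedy balancing of RCE data streams across the two DAQ servers.

def balance_rces(rce_rates_gbps, servers=("clonfarm2", "clonfarm3")):
    load = {s: 0.0 for s in servers}
    assignment = {}
    # Place the heaviest streams first, always on the least-loaded server.
    for rce, rate in sorted(rce_rates_gbps.items(), key=lambda kv: -kv[1]):
        target = min(load, key=load.get)
        assignment[rce] = target
        load[target] += rate
    return assignment, load

# Hypothetical per-RCE rates (Gbps) for the 14 data-processing RCEs:
assignment, load = balance_rces({f"rce{i:02d}": 1.0 + 0.1 * i for i in range(14)})
print(load)   # per-server totals should come out nearly equal
```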

  10. Summary
     • Overall, we had a successful run summer 2019
       ● We had a rough start
       ● Got on our feet
       ● Ran! (Now to run some analysis…)
     • The major issues on the SVT DAQ side have been resolved
       ● Still a few minor things to iron out for slow control
       ● Ignoring all the fried hardware for now
     • Interested in discussing what development is foreseen wrt the TI PCIE card
       ● Happening at all?
       ● Will the interface change?
     • Thanks for your attention!
