  1. Tools (e.g. for streaming DAQ, fast ML, automation/self-running DAQ, …) — Mia Liu, Nhan Tran (Fermilab), with input from many in the Fast ML and broader community. DOE Basic Research Needs Study, Community Meeting for TDAQ, December 3rd, 2019. In partnership with: [partner logos]

  2. The dream TDAQ
  • Powerful, intelligent algorithms
  • Sophisticated algorithms
  • Training/updating on the fly
  • Autonomous, self-calibrating
  • Safe, with minimal down-time
  • Analyze everything, no data loss
  • Modular, multiple processing layers

  3. Generic system
  [diagram: Detector / Accelerator → TDAQ-1 … TDAQ-N (reconstruct) → data tier 1 … data tier N → offline system (analysis, alerts); self-calibration (re-training) feeds back to each TDAQ layer]

  4. Specific systems

  5. Specific systems
  • Real-time controls, triggers, alerts
  • Fixed latency/clock through to transient/streaming events
  • Wide range of detector scales and timelines (1 ns to 1 s)

  6. Latency landscape
  [figure: latency axis from 1 ns to 1 s — CMS example: L1 trigger (40 MHz input → 100 kHz), high-level trigger (→ ~1 kHz at ~1 MB/evt), offline; also DUNE DAQ, LSST transient detection, RF signal processing; data rates from ~1 PB/s on-detector to ~1 PB/day offline]
  • Massive data rates; on-detector, low-latency processing
  • Extreme environments: low-power, cryogenic, high-radiation
  • Computing challenges: need to investigate how to integrate heterogeneous computing platforms (ASICs, FPGAs, …)

  7. On-detector sophisticated algorithms — ML in the hardware trigger [https://arxiv.org/abs/1804.06913], [fastmachinelearning.org/hls4ml]
  • All-FPGA design; > 5000-parameter fully connected network evaluated in 100 ns
  • Flexible: many algorithm kernels for processing different architectures
  • Application and adoption growing across the LHC and beyond
  • Growing interest, with many ongoing developments:
  • CNNs, graphs, RNNs, auto-encoders, binary/ternary networks
  • Alternative HLS backends (Intel, Mentor, Cadence)
  • Co-processors, multi-FPGA designs
  • Intelligent ASICs (see Phil's talk)
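The key to fitting a network into a 100 ns FPGA budget is replacing floating point with fixed-point integer arithmetic. The sketch below illustrates that idea in plain Python; the 8-bit/4-fractional-bit format, function names, and example values are illustrative choices, not taken from the hls4ml implementation.

```python
# Minimal sketch of the fixed-point dense-layer arithmetic an FPGA
# inference engine relies on (illustrative: 8-bit values, 4 fractional bits).

FRAC_BITS = 4           # fractional bits of the fixed-point type
SCALE = 1 << FRAC_BITS  # 16: one unit in the last place is 1/16

def to_fixed(x, total_bits=8):
    """Quantize a float to a signed fixed-point integer, saturating."""
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return max(lo, min(hi, round(x * SCALE)))

def fixed_dense(inputs, weights, biases):
    """y = W x + b entirely in integer arithmetic, as synthesized logic would."""
    out = []
    for row, b in zip(weights, biases):
        acc = sum(w * x for w, x in zip(row, inputs))  # products carry 2*FRAC_BITS
        acc = (acc >> FRAC_BITS) + b                   # rescale, then add bias
        out.append(acc)
    return out

x = [to_fixed(v) for v in (0.5, -0.25)]
W = [[to_fixed(v) for v in row] for row in ((1.0, 0.5), (-0.5, 1.0))]
b = [to_fixed(v) for v in (0.0, 0.125)]
y = fixed_dense(x, W, b)
print([v / SCALE for v in y])  # dequantized outputs: [0.375, -0.375]
```

In an FPGA each multiply-accumulate above maps to a DSP slice or LUT logic, and all of them run in parallel, which is how latencies of order 100 ns become possible.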

  8. hls4…ml…4asic? Hardware acceleration with an emphasis on co-design and fast turnaround time
  • First project: autoencoder with an MNIST benchmark (28 × 28 × 8 bits @ 40 MHz) — original data → encoder (high-speed drivers, reprogrammable weights) → compressed data → decoder → reconstructed data
  • Enables edge compute, e.g. data compression: efficient bandwidth usage; reduced power consumption (data transfer)
  • Programmable and reconfigurable: reprogrammable weights
  • Hardware–software co-design: algorithm-driven architectural approach
  • Optimized mixed-signal/analog techniques: low power and low latency for extreme environments (ionizing radiation, deep cryogenic)
  • First tests of a 1-layer design: latency 9 ns; power ~2.5 W (FPGA, 28 nm) vs. ~40 mW (ASIC, 65 nm); area 0.5 mm × 0.5 mm
  • FNAL, NW, Columbia; work in progress
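A quick back-of-envelope shows why on-detector compression matters at these rates, using only the numbers on the slide (28 × 28 pixels, 8 bits/pixel, 40 MHz); the latent size in the comment is an illustrative assumption, not from the slide.

```python
# Bandwidth the on-detector encoder must absorb, from the slide's numbers.
pixels = 28 * 28
bits_per_event = pixels * 8          # 6272 bits per image
event_rate_hz = 40e6                 # 40 MHz
raw_gbps = bits_per_event * event_rate_hz / 1e9
print(f"raw input bandwidth: {raw_gbps:.2f} Gb/s")  # ~250.88 Gb/s

# An assumed latent size of 64 x 8-bit values would cut the off-detector
# link bandwidth by pixels / 64 ~ 12x (illustrative, not from the slide).
```

At roughly a quarter of a terabit per second per sensor, shipping raw data off-detector is the bottleneck, which is the motivation for the encoder living in the front-end ASIC.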

  9. Off-detector: heterogeneous computing
  [figure: flexibility-vs-efficiency spectrum — CPUs → GPUs → FPGAs → ASICs; control unit (CU), arithmetic logic unit (ALU), registers]
  • Opportunities for deploying accelerated heterogeneous compute for real-time analysis
  • How best to integrate into a given TDAQ workflow: ML / non-ML; service or direct connect; GPU, FPGA, ASIC
  • Advances in heterogeneous computing driven by machine learning
  • Proof of concept for ML with FPGAs as a service: https://arxiv.org/abs/1904.08986
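The "service" option on the slide means the DAQ process offloads inference to a shared accelerator asynchronously instead of running the model inline. A toy sketch of that pattern, with a thread pool standing in for the FPGA/GPU server (the fake model and all names are illustrative, not from the cited paper):

```python
# Toy "inference as a service" pattern: events are submitted to a shared
# co-processor pool and the CPU keeps working while results are pending.
from concurrent.futures import ThreadPoolExecutor

def accelerator_infer(batch):
    """Stand-in for a remote FPGA/GPU inference call (fake model)."""
    return [2 * x + 1 for x in batch]

service = ThreadPoolExecutor(max_workers=4)  # shared co-processor pool

def process_event(event):
    # Non-blocking offload: returns a future, so the DAQ thread can
    # move on to the next event while the accelerator works.
    return service.submit(accelerator_infer, event)

futures = [process_event([i, i + 1]) for i in range(3)]
results = [f.result() for f in futures]
print(results)  # [[1, 3], [3, 5], [5, 7]]
```

The design trade-off the slide raises is exactly this: a service amortizes one accelerator over many clients at the cost of network latency, while a direct connect gives fixed latency to a single client.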

  10. Autonomous, self-calibrating detector
  • In-situ training: FPGA / System-on-Chip
  • Off-line training: CPU / heterogeneous computing, fast streaming
  • Anomaly detection and weight updating
  • Transient detection algorithms
  • Reinforcement learning
  • Neuromorphic (spiking) algorithms
  • Hardware: FPGA / System-on-Chip
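To make "anomaly detection and weight updating" concrete, here is a hedged sketch of a streaming monitor that updates its own calibration constants (mean/variance) online, as an in-situ FPGA/SoC block might; the Welford update and the 4-sigma threshold are illustrative choices, not from the talk.

```python
# Streaming z-score monitor: self-calibrating baseline, flags outliers.
import math

class StreamingMonitor:
    def __init__(self, threshold=4.0):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
        self.threshold = threshold

    def update(self, x):
        """Welford online update; returns True if x looks anomalous."""
        if self.n > 1:
            sigma = math.sqrt(self.m2 / (self.n - 1))
            if sigma > 0 and abs(x - self.mean) / sigma > self.threshold:
                return True  # flag: e.g. raise an alert, park the event
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # running sum of squared deviations
        return False

mon = StreamingMonitor()
readings = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 50.0]
flags = [mon.update(v) for v in readings]
print(flags)  # only the 50.0 spike is flagged
```

The same loop structure applies whether the "calibration" being updated is a simple baseline, as here, or the weights of a small network retrained in situ.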

  11. Autonomous, self-tuning accelerator
  • Same architecture as the self-calibrating detector: in-situ training on an FPGA / System-on-Chip; off-line training on CPU / heterogeneous computing, with fast streaming
  • For accelerator applications, a constant tuning/feedback loop is required
  • Anomaly detection and weight updating; transient detection algorithms; reinforcement learning; neuromorphic (spiking) algorithms

  12. Tools for the dream
  New algorithms / electronics hardware and infrastructure:
  • Powerful, intelligent algorithms
  • FPGAs designed for ML and vice versa
  • Opportunities for heterogeneous hardware (e.g. Versal)
  • Push ML to the furthest front end (ML in ASICs, reconfigurable weights)
  • New types of algorithms beyond classification and regression
  Systems designed for operations and control:
  • Autonomous, self-calibrating
  • Automation for (a) detecting when conditions have changed and (b) deciding what actions to take
  • Fast DAQ paths with deep buffers for monitoring individual channels — how to deal with different time scales?
  • Training and recalibration in an "offline system" (GPU, …) or at small scale in situ (ARM processor, in-FPGA)
  • Analyze everything, no data loss
  • Modular, portable, multiple processing layers
  • Streaming fast analysis: accessible programming paradigms; SoC R&D
  • Data storage: affordable, new/different storage technologies for persistent ("parked") datasets
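The "deep buffers for monitoring individual channels" item can be sketched as a per-channel ring buffer: a fixed-depth window that always holds the most recent samples, so a slower monitoring path can read out history when a fast-path alert fires. The class, depth, and sample values below are all illustrative assumptions.

```python
# Per-channel monitoring buffer: a fixed-size ring holding the last N
# samples; old samples drop off automatically as new ones arrive.
from collections import deque

class ChannelBuffer:
    def __init__(self, depth=8):
        self.samples = deque(maxlen=depth)  # bounded: oldest entries evicted

    def push(self, sample):
        self.samples.append(sample)

    def snapshot(self):
        """What the monitoring path reads out when an alert fires."""
        return list(self.samples)

buf = ChannelBuffer(depth=4)
for s in range(10):
    buf.push(s)
print(buf.snapshot())  # last 4 samples: [6, 7, 8, 9]
```

The time-scale question on the slide then becomes a sizing question: the buffer depth must cover the latency of the slowest consumer that might ask for a snapshot.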

  13. Extra
