From Advanced Instrumentation Towards Supercomputing

slide-1
SLIDE 1

From Advanced Instrumentation Towards Supercomputing Andres Cicuttin

ICTP – MLAB Multidisciplinary Laboratory of The Abdus Salam International Centre for Theoretical Physics Trieste, Italy

  • A. Cicuttin, ICTP_May_2019
slide-2
SLIDE 2

Outline

1. Supercomputing and Custom Computing
  • Definitions
  • Time Computation vs. Space Computation
  • Problems and different approaches
2. Scientific Instrumentation based on FPGA
  • Based on Single FPGA (RVI and SoC FPGA)
  • Based on Multiple FPGAs (Distributed and massively parallel)
3. Abstract model for reconfigurable systems
  • Extended Memory mapping
  • Universal Direct Memory Access (UDMA) Instructions
  • Architecture and Implementation
  • Data packets and routing
slide-3
SLIDE 3

Supercomputing

Reconfigurable Computing Custom Computing

The reconfigurable hardware infrastructure for custom supercomputing should ideally be:

1) Versatile: Must allow the implementation of many different computing architectures and strategies.
2) Homogeneous: Any logical subsystem should behave in the same way regardless of where it is implemented.
3) Scalable: It should be implementable at different sizes while preserving its basic logical and physical structure. It should also be conceived to be compatible with different types of FPGA within a wide range of cost-performance trade-offs.
4) Efficient: Must achieve a large number of arithmetic/logic operations per unit of time, money and energy.
5) Portable: Must be, as much as possible, FPGA vendor independent.
6) Updateable: Can be updated with newer devices without changing the basic structure and preserving code compatibility as much as possible.
7) Upgradable: Can be easily upgraded by adding more RAM or storage memory, or by replacing the main devices with more powerful ones.

slide-4
SLIDE 4

The Custom Computing Problem

  • Which is the best reconfigurable hardware infrastructure?
  • Which language should be used to capture a computational problem and express its solution?
  • Which tools should be developed to configure the hardware to implement the best custom computer?
  • Which tools should be developed to compile the code for its efficient execution in the configured custom computer?

None of these questions can be solved separately. Answering them requires solid experimental knowledge and multidisciplinary contributions.

slide-5
SLIDE 5

Two Main Computational Paradigms

Scarcity of area & low circuit integration => The uProcessor paradigm:

  • Intensive reutilization of limited HW resources
  • Computation along time (time computation)

Abundance of area & high circuit integration => The FPGA paradigm:

  • Allocation of HW resources as needed
  • Computation along space (space computation)
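
As a rough illustration of the two paradigms, the Python sketch below evaluates the same expression, a*b + c*d, in two ways: by reusing a single operator step after step (time computation), and by allocating one operator per operation so that independent operations share a step (space computation). The function names and the step counts are a toy cost model assumed for illustration only, not measurements of any device.

```python
def time_computation(a, b, c, d):
    """One shared arithmetic unit reused sequentially: one operation per step."""
    steps = 0
    p = a * b          # step 1: the single unit computes the first product
    steps += 1
    q = c * d          # step 2: the same unit is reused for the second product
    steps += 1
    r = p + q          # step 3: the same unit is reused for the sum
    steps += 1
    return r, steps


def space_computation(a, b, c, d):
    """Dedicated operators laid out in space: independent operations run together."""
    steps = 0
    p, q = a * b, c * d  # step 1: two multipliers work side by side
    steps += 1
    r = p + q            # step 2: a dedicated adder combines the products
    steps += 1
    return r, steps
```

Both functions return the same value; only the number of steps differs, which is the trade between reusing limited hardware over time and spending area to shorten the computation.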

slide-6
SLIDE 6

Desirable features of Advanced Instrumentation

Application areas (columns): Scientific, Industrial, Commercial, Academic, Military

Performance: max, max
Accuracy, Precision: max, high, high
Reconfigurability: high, sometimes
Massively parallel: sometimes, sometimes, sometimes
Physically Distributed: sometimes, sometimes
Cost: low, low
Design time: sometimes, low, low
Reliability: high, high

slide-7
SLIDE 7

Advanced Instrumentation based on FPGA

Reconfigurable Virtual Instrumentation based on FPGA and SoC FPGA

Massively parallel and distributed instrumentation in large high energy physics experiments (Multiple units)

slide-8
SLIDE 8

Emulated Instruments Virtual Instrument Reconfigurable Instrument

Reconfigurable Virtual Instrumentation

slide-9
SLIDE 9

Reconfigurable Virtual Instrumentation

Transient recorder Function generator Oscilloscope Multimeter Spectrum Analyzer

slide-10
SLIDE 10

Reconfigurable Virtual Instrumentation based on FPGA: Global Architecture

[Block diagram labels:
  Personal Computer (User, Operator)
  Communication Ports: PP, RS232, USB, Ethernet
  RVI Mother Board: Actel ProASIC3E FPGA; External memory extension (SDRAM Module); Development/Debugging Facilities (LCDs, LEDs, Push Buttons); Extension Connectors (board-to-board)
  Daughter Boards: Digital Interfaces (A/D, D/A, Triggers in/out)
  External Physical World: Analog I/Os, Digital I/Os, Trigger I/O]

slide-11
SLIDE 11

Reconfigurable Instrumentation: Architectural approach and modular structure

To uProcessor SoC interconnect BUS

slide-12
SLIDE 12

Reconfigurable Virtual Instrumentation based on SoC FPGA: Global Architecture

[Block diagram labels: PC; SoC FPGA (uP, FPGA, FPGA-uP communication block); Middleware; External Memory; Ext HW Controllers; Time Critical External Hardware; Non Time Critical External Hardware]

slide-13
SLIDE 13

SoC FPGA Based Reconfigurable Virtual Instrumentation: Typical Global Architecture

[Block diagram labels:
  PC: Virtual consoles & Control, Computing, uP–PC Communication SW
  uP: User Core Program, uP–PC Communication SW (uP), FPGA–uP Communication SW (uP), External RAM Memory Controller
  FPGA–uP communication: Control Registers, FPGA2uP FIFOs, uP2FPGA FIFOs, True Dual Port RAM, Memory Mapped AXI Lite / AXI Full / AXI Stream, Native or Wishbone interface
  FPGA: User Core Logic Design, Registers, External Hardware Interface (Native or Wishbone interface), External DDR RAM Memory Controller
  External: FMC Connector, External Hardware (Application specific), Input/output signals]

slide-14
SLIDE 14

Advanced Instrumentation based on FPGA

Massively parallel and distributed instrumentation in large high-energy physics experiments

Artistic view of the 60 m long COMPASS two-stage spectrometer. The large gray box is the RICH-1 detector. Approximate size: 4 m x 4 m x 2 m

RICH Detector Reconfigurable Virtual Instrumentation based on FPGA and SoC FPGA

slide-15
SLIDE 15

RICH-1: Global Architecture

[Block diagram labels: DOLINA; RICH control PC (Ethernet); PCI; 8 DSP TDM networks; BORA-0, BORA-11, BORA-12, BORA-23; 24; 192 BORA boards; Fiber from TCS; Optical fibers; Chamber 0 to Chamber 7; Pixel (0,0) to Pixel (287, 287)]

slide-16
SLIDE 16

Advanced Instrumentation based on FPGA

Reconfigurable Virtual Instrumentation based on FPGA and SoC FPGA Massively parallel and distributed instrumentation in large high-energy physics experiments

slide-17
SLIDE 17

Reconfigurable Virtual Instrumentation based on FPGA

Reconfigurable Virtual Instrumentation based on FPGA and SoC FPGA Massively parallel and distributed instrumentation in large high-energy physics experiments

slide-18
SLIDE 18

Dolina, Side B

Dolina, Side A Data movement through distributed instrumentation

TDP RAMs uP PCI Bus FIFOs FPGA DSPs

slide-19
SLIDE 19

Reconfigurable Virtual Instrumentation based on FPGA

Reconfigurable Virtual Instrumentation based on FPGA and SoC FPGA Massively parallel and distributed instrumentation in large high-energy physics experiments

slide-20
SLIDE 20

Description of Complex Systems

Modularity Hierarchy levels Modules

slide-21
SLIDE 21

What activity takes place at a given hierarchical level?

[Figure labels: Functional blocks, Data exchange, Ports (DI, DO)]

slide-22
SLIDE 22

FPGA-Based Reconfigurable Instrument: Abstract Model

  • Hardware Configuration: instantiation of functional blocks
  • Software Programming: description of the HW activity
  • Memory mapping of registered ports (all HW resources)
  • Concurrent execution of Universal Direct Memory Access (UDMA) instructions

Ports: Ext_RAM, Ext_ROM, FIFO_a_in, FIFO_b_out, RAM_block_p, Register_h, Operand_i, Operand_j, Operator_m_Out_k, Ext_HW_in_port_x, Ext_HW_out_port_y, Register_k

Address: 0x00000001, 0x0000FFFF, 0x000A0000, 0x000AEEEE, 0x000AEEEE, 0x000AEEF0, 0x001A0000, 0x001AEEEE, 0x002A0000, 0x002A000A, 0x002A000B, 0x003A0001, 0x003A0001, 0x003A0001

slide-23
SLIDE 23

Universal Direct Memory Access Instruction

UDMA SA DA SAinc DAinc N <BC> <activate, suspend, abort>

  • SA: Source Address
  • DA: Destination Address
  • SAinc: Increment of Source Address
  • DAinc: Increment of Destination Address
  • N: Number of Words
  • <BC>: Boolean condition
  • <activate, suspend, abort>: Reaction

[Figure labels: Source (SA, SAinc), Destination (SD, SDinc)]
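
One possible reading of the instruction fields can be sketched in Python as follows. This is a minimal model under stated assumptions, not the author's implementation: the address space is a plain dictionary, the Boolean condition is a hypothetical callable, "activate" is assumed to gate the start of the transfer, and "suspend" and "abort" are both modelled simply as stopping the transfer (their difference, and the N = 0 permanent link, are not modelled).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class UDMA:
    sa: int        # SA: source address
    da: int        # DA: destination address
    sa_inc: int    # SAinc: increment of source address after each word
    da_inc: int    # DAinc: increment of destination address after each word
    n: int         # N: number of words
    condition: Optional[Callable[[], bool]] = None  # <BC>, hypothetical encoding
    reaction: str = "activate"                      # activate, suspend or abort


def execute(instr: UDMA, memory: Dict[int, int]) -> int:
    """Move up to instr.n words through a flat address space; return words moved."""
    cond = instr.condition
    # Assumption: with "activate", the transfer starts only if the condition holds.
    if cond is not None and instr.reaction == "activate" and not cond():
        return 0
    src, dst, moved = instr.sa, instr.da, 0
    for _ in range(instr.n):
        # Assumption: "suspend" and "abort" stop the transfer once the condition holds.
        if cond is not None and instr.reaction in ("suspend", "abort") and cond():
            break
        memory[dst] = memory.get(src, 0)
        src += instr.sa_inc
        dst += instr.da_inc
        moved += 1
    return moved
```

With SAinc = 0 the source address stays fixed, which is how a FIFO-like port would be read repeatedly; with both increments set to 1 the instruction behaves as a block copy.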

slide-24
SLIDE 24

Universal Direct Memory Access Instruction

Some examples

UDMA 0x0000F001 0x0000F00A 1 1 256    (RAM to RAM)
UDMA 0x0000F002 0x0002F00B 1 0 1024    (RAM to FIFO)
UDMA 0x0000F003 0x0004F00C 0 1 1024    (FIFO to RAM)
UDMA 0xAAAAF003 0x008FAA80 4 1 2000    (RAM to RAM)
UDMA 0xAAAA4004 0x000FAA40 0 0 0    (Permanent link)
UDMA 0xFFFF4004 0x000FAA00 4 1 1024 “timer > countmax” Abort    (Conditional data transfer)
UDMA 0xFFFF4004 0x000FAA00 4 1 1024 “counter1 == 31” Suspend    (Conditional data transfer)
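
The textual form used in these examples is regular enough to be parsed mechanically. The sketch below is one hedged way to do it in Python; the `UdmaText` record and `parse_udma` function are hypothetical names, and the condition is assumed to be written between straight double quotes so that the standard `shlex` module can keep it as a single token.

```python
import shlex
from typing import NamedTuple, Optional


class UdmaText(NamedTuple):
    sa: int
    da: int
    sa_inc: int
    da_inc: int
    n: int
    condition: Optional[str]   # text of the Boolean condition, if any
    reaction: Optional[str]    # "activate", "suspend" or "abort", if any


def parse_udma(line: str) -> UdmaText:
    """Parse 'UDMA SA DA SAinc DAinc N ["condition" Reaction]'."""
    tokens = shlex.split(line)
    if len(tokens) not in (6, 8) or tokens[0] != "UDMA":
        raise ValueError(f"not a UDMA instruction: {line!r}")
    sa, da = int(tokens[1], 16), int(tokens[2], 16)      # addresses are hexadecimal
    sa_inc, da_inc, n = (int(t) for t in tokens[3:6])    # increments and N are decimal
    if len(tokens) == 8:
        return UdmaText(sa, da, sa_inc, da_inc, n, tokens[6], tokens[7].lower())
    return UdmaText(sa, da, sa_inc, da_inc, n, None, None)
```

For instance, `parse_udma('UDMA 0xFFFF4004 0x000FAA00 4 1 1024 "timer > countmax" Abort')` yields a record whose `condition` is the string `timer > countmax` and whose `reaction` is `abort`; evaluating the condition is left to the hardware model.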

slide-25
SLIDE 25

System on Chip: The Wishbone Bus-Interface Standard Definitions

The four main components of the Wishbone system: Master and Slave interfaces, Syscon and Intercon.

  • SYSCON: drives the system clock and reset signals.
  • MASTER: IP Core interface that generates bus cycles.
  • SLAVE: IP Core interface that receives bus cycles.
  • INTERCON: an IP Core that connects all of the MASTER and SLAVE interfaces together.

WISHBONE MASTER signals: RST_I, CLK_I, ADR_O(), DAT_I(), DAT_O(), WE_O, SEL_O(), STB_O, ACK_I, CYC_O, TAGN_O, TAGN_I
WISHBONE SLAVE signals: RST_I, CLK_I, ADR_I(), DAT_I(), DAT_O(), WE_I, SEL_I(), STB_I, ACK_O, CYC_I, TAGN_I, TAGN_O
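
The roles of MASTER and SLAVE can be made concrete with a small behavioural model in Python. It is a simplified sketch, not a cycle-accurate Wishbone implementation: the slave is a hypothetical register file that acknowledges every strobed cycle immediately, the SEL and TAG signals are left out, and a direct method call stands in for a point-to-point INTERCON.

```python
class WishboneSlave:
    """Toy register-file slave that acknowledges each strobed cycle at once."""

    def __init__(self, size):
        self.regs = [0] * size

    def clock(self, cyc_i, stb_i, we_i, adr_i, dat_i):
        """Evaluate one clock edge and return (ack_o, dat_o)."""
        if not (cyc_i and stb_i):      # no valid bus cycle addressed to this slave
            return False, 0
        if we_i:                       # write cycle: store the data word
            self.regs[adr_i] = dat_i
            return True, 0
        return True, self.regs[adr_i]  # read cycle: return the stored word


class WishboneMaster:
    """Generates single read and write cycles on a point-to-point connection."""

    def __init__(self, slave):
        self.slave = slave

    def write(self, adr, data):
        # Assert CYC_O, STB_O and WE_O together with the address and data.
        ack, _ = self.slave.clock(True, True, True, adr, data)
        return ack

    def read(self, adr):
        # Assert CYC_O and STB_O with WE_O low, then sample DAT_I on ACK_I.
        ack, data = self.slave.clock(True, True, False, adr, 0)
        if not ack:
            raise RuntimeError("slave did not acknowledge the cycle")
        return data
```

A master created as `WishboneMaster(WishboneSlave(4))` can write a word to address 2 and read it back; a slave with wait states would return `ack_o` false for some clocks, which this sketch does not model.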

slide-26
SLIDE 26

The Wishbone Interconnection (INTERCON) is created by the SYSTEM INTEGRATOR, who has total control of its design. Possible interconnection types:

  • Point-To-Point
  • Data Flow
  • Shared Bus
  • Crossbar Switch

[Figure: INTERCON connecting several master and slave interfaces, with SYSCON]

slide-27
SLIDE 27

Interconnections II

WISHBONE MASTER WISHBONE SLAVE

Point-To-Point Data Flow

WISHBONE MASTER WISHBONE SLAVE WISHBONE MASTER WISHBONE SLAVE WISHBONE MASTER WISHBONE SLAVE IP Core “A” IP Core “B” IP Core “C”

slide-28
SLIDE 28

Interconnections III

WISHBONE MASTER “MA” WISHBONE SLAVE “SA”

Shared Bus

WISHBONE MASTER “MB” WISHBONE SLAVE “SB” WISHBONE SLAVE “SC” Common Bus

slide-29
SLIDE 29

Interconnections IV

WISHBONE MASTER “MA” WISHBONE SLAVE “SA”

Crossbar Switch

WISHBONE MASTER “MB” WISHBONE SLAVE “SB” WISHBONE SLAVE “SC”

NOTE: Dotted lines indicate one possible connection option

CROSSBAR SWITCH INTERCONNECTION

slide-30
SLIDE 30

UDMA controller for a system based on Wishbone compliant modules

[Figure: a UDMA CONTROLLER with master-side signals (RST_I, CLK_I, ADR_O, DAT_I, DAT_O, WE_O, SEL_O, STB_O, ACK_I, CYC_O) connected to WISHBONE SLAVE 1, WISHBONE SLAVE 2, ..., WISHBONE SLAVE j, each with slave-side signals (RST_I, CLK_I, ADR_I, DAT_I, DAT_O, WE_I, SEL_I, STB_I, ACK_O, CYC_I), and SYSCON]

  • UDMA instructions could be stored in a WB module.
  • One WB module must be a communication block, which could also store UDMA instructions in a reserved area.

slide-31
SLIDE 31

Summary of key concepts so far and their relations

[Concept map labels: Time computation, Space computation, uP Instruction set, UDMA instruction set, Software programming, Hardware configuration, Architecture, Implementation, Hierarchy, Modularity, Functional Block, Instantiation, Interconnection, Memory mapping]

slide-32
SLIDE 32

Communication through FPGAs in clusters of reconfigurable computational units

With the same physical connections but with different I/O configuration and activity programming, data packets can be transmitted over:

  • On-demand point-to-point connections
  • Buses
  • Time-Division Multiplexing on common signal paths
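
Of the three options, time-division multiplexing is the least obvious, so a short Python sketch may help. It assumes fixed round-robin time slots, one per channel, with an idle channel leaving its slot empty; the function names and the representation of the shared path as a list of (slot, word) pairs are illustrative assumptions.

```python
from itertools import zip_longest


def tdm_multiplex(channels):
    """Interleave words from several channels into round-robin time slots.

    Returns a list of (slot, word) pairs; an idle channel yields word None.
    """
    frames = zip_longest(*channels, fillvalue=None)
    return [(slot, word) for frame in frames for slot, word in enumerate(frame)]


def tdm_demultiplex(stream, n_channels):
    """Rebuild the per-channel word lists from the shared stream."""
    out = [[] for _ in range(n_channels)]
    for slot, word in stream:
        if word is not None:     # skip the empty slots of idle channels
            out[slot].append(word)
    return out
```

Because every channel owns a fixed slot, the receiver needs no addressing information in the data itself, at the cost of wasting the slots of idle channels.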

slide-33
SLIDE 33

Interconnection of Multiple FPGAs

FPGA

Router

Three main communication layers

  • Physical
  • Logical
  • System

Different Topologies

slide-34
SLIDE 34

Slave Communication Blocks

[Figure: two communication blocks (CommBlock), each with a UDMA controller.
  FPGA–uP CommBlock: Native or Wishbone interface (FPGA side); FIFOs FPGA2uP, FIFOs uP2FPGA, Registers, True Dual Port RAM with reserved areas; Memory Mapped AXI Lite / Full / Stream (uP side)
  FPGA–Router CommBlock: Native or Wishbone interface (FPGA side and Router side); FIFOs FPGA2Rout, FIFOs Rout2FPGA, Registers, True Dual Port RAM with reserved areas
  Contents shown: Flags/semaphores for protocols, UDMA instructions, Payload data, Standardized Data Packets]

slide-35
SLIDE 35

Standardized Data Packets

Packet layout (32-bit words):

  • Header: Header Keyword, Packet Type, Destination Address
  • Payload: Data_1, Data_2, Data_3, ..., Data_N-1, Data_N
  • Trailer: Checksum, Trailer Keyword

Other header fields shown: Source ID, Destination ID, Priority, Data Format, Data type, Protocol nr., Protocol rev., Check type, N (nr. of words), X and Y (Packet X of Y)

Packet types:

  • Command
  • Error Message
  • Status report
  • Raw Data
  • Bit Stream
  • Engineering frame
  • UDMA
  • Etc.

Header and Trailer depend on Packet type
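
A packet of this kind can be serialized with Python's standard `struct` module. The sketch below is only one possible encoding under explicit assumptions: the keyword values are made up, the header is reduced to four words (keyword, packet type, destination address, N), the words are little-endian, and the checksum is a plain 32-bit sum of the header and payload words.

```python
import struct

HEADER_KEYWORD = 0xAAAA5555    # hypothetical value
TRAILER_KEYWORD = 0x5555AAAA   # hypothetical value


def checksum(words):
    """Assumed check type: sum of all words, truncated to 32 bits."""
    return sum(words) & 0xFFFFFFFF


def pack_packet(packet_type, destination, payload):
    """Build header + payload + trailer as little-endian 32-bit words."""
    body = [HEADER_KEYWORD, packet_type, destination, len(payload), *payload]
    words = body + [checksum(body), TRAILER_KEYWORD]
    return struct.pack(f"<{len(words)}I", *words)


def unpack_packet(raw):
    """Validate keywords, length and checksum; return (type, destination, payload)."""
    if len(raw) % 4 or len(raw) < 24:
        raise ValueError("packet must hold at least six whole 32-bit words")
    words = struct.unpack(f"<{len(raw) // 4}I", raw)
    if words[0] != HEADER_KEYWORD or words[-1] != TRAILER_KEYWORD:
        raise ValueError("bad header or trailer keyword")
    n = words[3]
    if len(words) != n + 6:      # 4 header words + N payload words + 2 trailer words
        raise ValueError("length does not match N")
    if checksum(words[:-2]) != words[-2]:
        raise ValueError("checksum mismatch")
    return words[1], words[2], list(words[4:4 + n])
```

Keeping the keywords at both ends lets a receiver resynchronize on a byte stream, while N and the checksum let it reject truncated or corrupted packets before acting on them.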

slide-36
SLIDE 36

Standardized data packets and corresponding handling mechanisms for moving data across entire hybrid systems

(1) A UDMA-Packet is sent from “i” to “j” to move data from data source “j” to destination “k”:

UDMA SA DA SAinc DAinc N

(2) A Data-Packet is prepared and sent from data source “j” to destination “k”.

Corresponding Acknowledge-Packets can optionally be sent back to conclude transactions.

At this level of abstraction we don’t care about underlying networks and low-level communication layers. Data also include instructions, commands, error messages, etc.
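
The two-step transaction can be sketched in Python with three nodes named after the “i”, “j” and “k” of the text. Everything else is an assumption made for illustration: the `Node` class, its method names, the dictionary standing in for the network, and the choice to acknowledge the requester once the data has been delivered. Direct method calls replace the underlying communication layers, in line with the level of abstraction described.

```python
class Node:
    """Hypothetical node with a local memory, reachable by name over a network."""

    def __init__(self, name, network):
        self.name = name
        self.memory = {}
        self.log = []
        self.network = network
        network[name] = self

    def send_udma(self, source, dest, sa, da, sa_inc, da_inc, n):
        """Step 1: send a UDMA-Packet asking `source` to move n words to `dest`."""
        self.network[source].on_udma(self.name, dest, sa, da, sa_inc, da_inc, n)

    def on_udma(self, requester, dest, sa, da, sa_inc, da_inc, n):
        """Step 2: gather the words locally and send them as one Data-Packet."""
        words = [self.memory.get(sa + i * sa_inc, 0) for i in range(n)]
        self.network[dest].on_data(da, da_inc, words)
        # Optional Acknowledge-Packet back to the requester.
        self.network[requester].log.append(("ack", self.name))

    def on_data(self, da, da_inc, words):
        """Store the payload of a Data-Packet at the destination addresses."""
        for i, word in enumerate(words):
            self.memory[da + i * da_inc] = word
```

Node “i” never touches the data itself: it only issues the instruction, and the payload travels directly from “j” to “k”.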

slide-37
SLIDE 37

Preliminary conclusions I

  • Reconfigurable hardware abstract models and strategies developed for advanced scientific instrumentation based on FPGA can be adapted for high-performance reconfigurable computing.
  • Abundance of reconfigurable hardware resources leads to new computational paradigms inspired by the FPGA model, escaping the limitations of typical von Neumann and similar uP architectures.
  • A spatial dimension can be added to the temporal dimension of the dominant computing paradigm based on uP instruction set architectures.
  • Universal Direct Memory Access (UDMA) instructions appear as a suitable means to describe and program the computational activity of powerful hardware platforms based on modern reconfigurable hybrid devices such as SoC FPGA.

slide-38
SLIDE 38

Preliminary conclusions II

Recalling The Custom Computing Problem

  • Which is the best reconfigurable hardware infrastructure?
  • Which language should be used to capture a computational problem and codify its solution?
  • Which tools should be developed to configure the hardware to implement the best custom computer?
  • Which tools should be developed to compile the code for its efficient execution in the configured custom computer?

This is still a very complex problem that needs multidisciplinary contributions and positive knowledge experimentally obtained on scalable hardware infrastructures.

slide-39
SLIDE 39

Thank you for your attention!

slide-40
SLIDE 40

Opportunities for open collaboration on scientific supercomputing based on FPGA technologies

  • Synergies between Industry, Universities and Public Research Centers.
  • ICTP (UNESCO - IAEA) Programs
    – TRIL: Training and Research in Italian Laboratories
    – Associates (junior, regular, senior)
    – Federation Agreements
    – Scientific Calendar of international activities for training and research in Physics, Mathematics and Interdisciplinary areas.

slide-41
SLIDE 41

ICTP (UNESCO - IAEA) Programs

TRIL: Training and Research in Italian Laboratories

https://www.ictp.it/tril.aspx

This programme offers scientists from developing countries the opportunity to undertake training and research in an Italian laboratory in different branches of the physical sciences. The ICTP has established agreements of collaboration with more than 400 Italian research institutes, providing young scientists with numerous options. TRIL partners include:

  • CNR (Italian National Research Council) institutes
  • Elettra-Sincrotrone Trieste (Elettra Synchrotron Light Source)
  • ENEA (Italian National Agency for New Technologies, Energy and Sustainable Economic Development)
  • INFN (National Institute for Nuclear Physics)
  • INGV (Istituto Nazionale di Geofisica e Vulcanologia)
  • OGS (National Institute of Oceanography and Experimental Geophysics)

slide-42
SLIDE 42

ICTP (UNESCO - IAEA) Programs

ICTP Associateship: Junior (<36), Regular (<46), Senior (<63)

https://portal.ictp.it/assoc/associateship-scheme

The Associate Scheme is one of the ICTP's oldest programs, and was established to provide support for distinguished scientists in developing countries in an effort to lessen the brain-drain.

– The Junior Associateship award has a six-year duration throughout which the Junior Associate is entitled to spend up to 180 days (with a maximum duration of 60 days for any single visit) at the Centre, with three fares paid. A fare is granted for visits having a minimum duration of 30 days. For each visit the Centre provides a daily living allowance.

– The Regular Associateships are six-year awards intended exclusively for scientists between the ages of 36 and 45 from and working in developing countries.

– Senior Associateships are intended for scientists from a developing country who have acquired international scientific status. Awards have a six-year duration with a total allocation of 8000 Euro. These funds are made available for visits in the form of a daily living allowance and/or travel expenses. During the six years, Senior Associate Members may apply to visit the Centre as often and for as long as they wish, until the allocation is exhausted, although the maximum foreseen duration of any visit is 60 days.

slide-43
SLIDE 43

ICTP (UNESCO - IAEA) Programs

ICTP Federated Institutes

https://www.ictp.it/programmes/federated-institutes.aspx

The Federated Institutes programme offers young scientific staff, as well as post-doctoral and PhD students from institutes in developing countries, the opportunity to attend meetings at ICTP or to participate in group activities. Institutes wishing to be considered for the possibility of becoming an ICTP Federated Institute must satisfy the following criteria:

  • The institute must be in a developing country;
  • The institute must have active research programmes in at least one of the areas of interest to ICTP;
  • There should be at least a Masters but preferably a PhD programme in the fields of interest;
  • In case the institute is accepted as being Federated, the coordinator (applicant) must be an active member of the institute for the duration of the agreement.
  • Former Federated Institutes are eligible to apply again for Federation status. Extensions are not envisaged.

slide-44
SLIDE 44

ICTP (UNESCO - IAEA) Programs

ICTP Scientific Calendar

https://www.ictp.it/scientific-calendar.aspx

Each year, ICTP organizes more than 60 international conferences, workshops, and numerous seminars and colloquia for training and research in Physics, Mathematics and Interdisciplinary areas.

  • Those interested in attending an activity must complete an online application form.
  • To propose a conference, school or workshop, check the corresponding guidelines (https://www.ictp.it/call-for-proposals.aspx).
  • The deadline for proposals is typically the end of February for activities to take place in the following year. ICTP announces the call for proposals on its website.
  • Travel fellowships and financial support for ICTP conferences and workshops are available.

slide-45
SLIDE 45

ICTP (UNESCO - IAEA) Programs

ICTP invites proposals from the international scientific community for any of the following types of activities:

Schools/Colleges: These largely pedagogical events cover a relatively broad scientific field, normally through lectures at an expository level, and may include exercise sessions, discussion groups and computer laboratory sessions.

Advanced Schools/Workshops: These events deal with specific or specialized topics. In some cases, particularly when held periodically over time, the main purpose may be to cover developments of the last few years. A fraction of the audience may consist of former participants who should be actively involved in the programme, for instance through poster sessions. Typical length is 2 weeks.

Conferences: These activities last for a few days to a week and consist of presentations of recent results on timely and exciting subjects.

Extended Workshops: These less structured activities last from 2 to 3 months and cover selected research topics.

Outside Activities: Regional activities, to take place in an emerging or developing country, meant for promoting science in the host country and the surrounding region.

Co-sponsored Activities: Proposed activities that typically bring most of their own funding and organization, but seek an international venue and only modest support from ICTP.

slide-46
SLIDE 46

Thank you for your attention!
