SLIDE 1 Will FPGA reconfiguration change the synthesis problem?
Ghent University, Belgium Hardware and Embedded Systems group
Ghent University – Faculty of Engineering Sciences – Department of Electronics and Information Systems – 11 December 2015
SLIDE 2 Outline
- What is Parameterized Run-time Reconfiguration?
- The importance of the parameter choice
- Effects on logic synthesis
SLIDE 3 Outline
- What is Parameterized Run-time Reconfiguration?
- The importance of the parameter choice
- Effects on logic synthesis
SLIDE 4 FPGA Run-Time Reconfiguration?
- Today: configurability on a large time scale
– Prototyping
– System update
– ...
- We: configurability on a smaller time scale
– Dynamic circuit specialization
- Frequently changing (regular) inputs vs. infrequently changing parameters
- Parameters trigger a reconfiguration (through configuration manager)
– Goals:
- Improve performance
- Reduce area
- Minimize design effort
SLIDE 5 Conventional Dynamic Reconfiguration
[Figure: block diagram. On the CPU: Application Software, Configuration Manager and a config. DB holding F1 and F2. A Reconfiguration Request reaches the FPGA through the Configuration Interface. The FPGA has a Static region and a Dynamic region that holds F1 or F2.]
SLIDE 6 Conventional Tool Flow
[Figure: parallel tool flows, one per design]
Static HDL Design → Synthesis → Tech. Mapping → Place & Route → Static Config.
F1 HDL → Synthesis → Tech. Mapping → Place & Route → F1 Config.
F2 HDL → Synthesis → Tech. Mapping → Place & Route → F2 Config.
…
SLIDE 7 Dynamic Circuit Specialization not feasible!
- Application where part of the input data changes infrequently
– Conventional implementation (no reconfiguration): generic circuit, store data in memory, overwrite memory
– Dynamic circuit specialization: reconfigure with a configuration specialized for the data
- Example: Adaptive FIR filter (16-tap, 8-bit coefficients)
– ... 2^128 possible configurations!
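The configuration count follows from the width of the coefficient data: 16 taps of 8 bits each give 128 parameter bits, and every bit pattern would need its own specialized configuration. A minimal sketch of that arithmetic (the variable names are illustrative and not taken from the slides):

```python
# Count the specialized configurations of the adaptive FIR example.
TAPS = 16        # from the slide: 16-tap filter
COEFF_BITS = 8   # from the slide: 8-bit coefficients

param_bits = TAPS * COEFF_BITS   # total number of parameter bits
num_configs = 2 ** param_bits    # one configuration per bit pattern

print(f"{param_bits} parameter bits -> {num_configs:.3e} configurations")
```

Generating and storing that many configurations ahead of time is out of reach, which is the point of the slide title.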
SLIDE 8 Our solution: Parameterized Configuration
Parameters: A, B
Parameterized Configuration: { 0 1 0 A+B AB A 1 }
Specialized Configurations:
{ 0 1 0 0 0 0 1 }
{ 0 1 0 1 0 0 1 }
{ 0 1 0 1 0 1 1 }
{ 0 1 0 1 1 1 1 }
* K. Bruneel and D. Stroobandt, “Automatic Generation of Run-time Parameterizable Configurations,” FPL 2008.
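The relation between the parameterized configuration and its specializations can be sketched in a few lines. This assumes that "A+B" on the slide denotes OR and "AB" denotes AND; the names `PARAMETERIZED_CONFIG` and `specialize` are illustrative, not part of any tool.

```python
from typing import Callable, List

# Each configuration bit is a Boolean function of the parameters (A, B).
# Constant bits ignore the parameters; tunable bits depend on them.
TuningFn = Callable[[int, int], int]

PARAMETERIZED_CONFIG: List[TuningFn] = [
    lambda a, b: 0,
    lambda a, b: 1,
    lambda a, b: 0,
    lambda a, b: a | b,  # "A+B" on the slide (assumed to mean OR)
    lambda a, b: a & b,  # "AB" on the slide (assumed to mean AND)
    lambda a, b: a,      # "A"
    lambda a, b: 1,
]

def specialize(a: int, b: int) -> List[int]:
    """Evaluate every tuning function for one parameter assignment."""
    return [bit(a, b) for bit in PARAMETERIZED_CONFIG]

for a in (0, 1):
    for b in (0, 1):
        print(f"A={a} B={b} ->", specialize(a, b))
```

Under these assumptions the four parameter assignments reproduce the four specialized configurations listed on the slide.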
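SLIDE 9 Dynamic Circuit Specialization (micro-reconfiguration)
[Figure: block diagram. On the CPU: Application Software, Configuration Manager and a config. DB holding FIR. A Reconfiguration Request reaches the FPGA through the Configuration Interface. The FPGA has a Static region and a Dynamic region holding the FIR filter, shown with the specializations FIR(4,9) and FIR(2, 8).]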
SLIDE 10 Two stage approach
Off-line Stage
– In: Generic functionality
  - Specification of the generic functionality
  - Distinction between regular and parameter inputs
– Out: Parameterizable Configuration
  - Software function
  - Outputs specialized configurations for given parameter values
On-line Stage
– Evaluate parameterizable configuration
– Out: Specialized Configuration
– Repeat every time parameters change
[Figure: Generic Functionality → Off-line Stage → Parameterizable Configuration → On-line Stage → Specialized Configuration]
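A rough sketch of how the two stages could fit together in software. Everything here is hypothetical: `offline_stage` stands in for the tool flow and returns a toy function with two tunable bits, and `ConfigurationManager` is an illustrative class, not the actual manager.

```python
from typing import Callable, Optional, Tuple

Params = Tuple[int, ...]
Config = Tuple[int, ...]

def offline_stage() -> Callable[[Params], Config]:
    """Hypothetical off-line result: a software function that maps
    parameter values to configuration bits (toy example, 4 bits)."""
    def parameterizable_configuration(params: Params) -> Config:
        a, b = params
        return (0, 1, a | b, a & b)
    return parameterizable_configuration

class ConfigurationManager:
    """On-line stage: re-evaluate only when the parameters change."""

    def __init__(self, param_config: Callable[[Params], Config]) -> None:
        self.param_config = param_config
        self.current_params: Optional[Params] = None
        self.current_config: Optional[Config] = None
        self.evaluations = 0  # number of times the function was evaluated

    def request(self, params: Params) -> Optional[Config]:
        if params != self.current_params:
            self.current_config = self.param_config(params)
            self.current_params = params
            self.evaluations += 1
        return self.current_config

manager = ConfigurationManager(offline_stage())
for p in [(0, 1), (0, 1), (1, 1), (1, 1), (1, 1)]:
    manager.request(p)
print("evaluations:", manager.evaluations)
```

The expensive work happens once, off-line; the on-line stage only evaluates a function, and only when a parameter value actually changes.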
SLIDE 11 Param. Configuration Tool Flow
[Figure: tool flow with the adapted steps Synthesis* and Place* & Route*, producing the Param. Config.]
- Tunable truth table bits
– Adapted Tech. Mapper: TMAP
– Map to Tunable LUTs (TLUTs)
– [FPL2008], [ReConFig2008], [DATE2009]
– Adapted Tech. Mapper
– Adapted Placer
– Adapted Router
SLIDE 12 Outline
- What is Parameterized Run-time Reconfiguration?
- The importance of the parameter choice
- Effects on logic synthesis
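SLIDE 13 Parameterizable HDL design
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity multiplexer is
  port(
    sel  : in  std_logic_vector(2 downto 0);
    din  : in  std_logic_vector(7 downto 0);  -- "in" is a reserved word in VHDL
    dout : out std_logic                      -- "out" is a reserved word in VHDL
  );
end multiplexer;

architecture behaviour of multiplexer is
begin
  -- Select one of the eight data inputs.
  dout <= din(to_integer(unsigned(sel)));
end behaviour;
[Figure: 8-to-1 multiplexer with data inputs in0–in7 and select inputs sel0–sel2.]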
SLIDE 14 Synthesis*
Two types of inputs:
- Regular inputs
- Parameter inputs
[Figure: gate-level network of the multiplexer (gates labelled A and O) with inputs in0–in7 and sel0–sel2.]
SLIDE 15 Conventional technology mapping
[Figure: the multiplexer gate network (inputs in0–in7, sel0–sel2) covered by K-input subcircuits.]
Search for a covering with K-input subcircuits.
K-input LUT (K=3): can implement any Boolean function with up to K arguments.
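Why a K-input LUT covers any function of up to K arguments: it simply stores all 2^K truth-table bits and uses the inputs as an address. A minimal sketch, with illustrative helper names `make_lut` and `lut_read`:

```python
from itertools import product
from typing import Callable, List, Sequence

K = 3  # LUT size used on the slide

def make_lut(func: Callable[..., int], k: int = K) -> List[int]:
    """Fill the 2**k truth-table bits of a LUT from a Boolean function."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=k)]

def lut_read(table: Sequence[int], inputs: Sequence[int]) -> int:
    """Look up the output: the inputs form the address into the table."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return table[address]

def mux2(s: int, a: int, b: int) -> int:
    """2-to-1 multiplexer: select s, data inputs a and b."""
    return b if s else a

# A 2-to-1 multiplexer has three inputs, so it fits in one 3-input LUT.
mux_table = make_lut(mux2)
print(mux_table)
print("distinct 3-input functions:", 2 ** (2 ** K))
```

Technology mapping then amounts to cutting the gate network into pieces that each have at most K inputs, since any such piece fits in one table.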
SLIDE 16 TMAP: Tunable LUT mapping
[Figure: the multiplexer gate network (inputs in0–in7, sel0–sel2) covered by TLUT subcircuits.]
A Tunable LUT (TLUT) can implement any Boolean function with K regular inputs and any number of parameter inputs.
Search for a covering with subcircuits that have up to K regular inputs and any number of parameter inputs.
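One way to picture a TLUT: for each parameter assignment the subcircuit collapses to an ordinary K-input truth table, so every truth-table bit becomes a Boolean function of the parameters. The sketch below derives those tables by brute-force enumeration. It is an illustration, not the TMAP algorithm, and it uses K=4 (not the K=3 of the slides) so that a 4-to-1 multiplexer fits in a single TLUT; `tlut_tables` and `mux4` are hypothetical names.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

def tlut_tables(func: Callable[..., int], n_params: int,
                k: int) -> Dict[Tuple[int, ...], List[int]]:
    """For every parameter assignment, derive the k-input truth table.

    `func` takes the parameter inputs first, then the k regular inputs.
    Reading one table position across all parameter assignments gives
    that bit as a Boolean function of the parameters (a tuning function).
    """
    tables: Dict[Tuple[int, ...], List[int]] = {}
    for params in product((0, 1), repeat=n_params):
        tables[params] = [func(*params, *regs) & 1
                          for regs in product((0, 1), repeat=k)]
    return tables

def mux4(s1: int, s0: int, in0: int, in1: int, in2: int, in3: int) -> int:
    """4-to-1 multiplexer; s1 and s0 are treated as parameter inputs."""
    return (in0, in1, in2, in3)[2 * s1 + s0]

tables = tlut_tables(mux4, n_params=2, k=4)

# Tuning function of the truth-table bit at address 0b1000
# (in0 = 1, all other regular inputs 0), as a function of (s1, s0).
bit8 = {params: table[0b1000] for params, table in tables.items()}
print(bit8)
```

A conventional mapping of the same multiplexer would have to cover six inputs; with the select signals as parameters, only the four data inputs count against K.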
SLIDE 17 LUT structure and functionality
[Figure: LUT network with LUTs L0, L1, L3, L4 and L5 over the inputs in0–in7 and sel0–sel2, annotated with the Boolean function of each LUT.]
SLIDE 18 Place and Route
[Figure: placed and routed LUT network with LUTs L0, L1, L3, L4 and L5 and the inputs in0–in7 and sel0–sel2.]
SLIDE 19 Experiment: 16-tap FIR, 8-bit coefficients
                    Generic   Parameterizable configuration   Specialized
area (LUTs)         2999      1301 (-56%)                     1146
clock freq. (MHz)   84        115 (+37%)                      119
generation time               0.166                           35634
memory (kB)                   29                              2^128 conf.
- Reduced generation time (5 orders of magnitude)
– No NP-hard problems (place and route) at run-time
– Only evaluation of the tuning functions
- Less memory (only 29 kB)
– TMAP flow finds similarity between configurations
– Compressed form of all configurations
- Less area (-56%)
– More functionality in one TLUT
– Functionality is moved to the tuning functions
- Higher clock frequency (+37%)
– Fewer LUTs can be placed closer together
– Less congestion because there are fewer nets
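The headline figures can be rechecked from the raw numbers on the slide. The sketch below assumes that 0.166 and 35634 are generation times in the same (unstated) unit, for the parameterized and the conventionally specialized flow respectively; variable names are illustrative.

```python
import math

# Raw figures from the experiment slide.
generic_luts, param_luts = 2999, 1301
generic_mhz, param_mhz = 84, 115
# ASSUMPTION: both generation times are in the same, unstated unit.
param_gen_time, specialized_gen_time = 0.166, 35634

area_change = (param_luts - generic_luts) / generic_luts
freq_change = (param_mhz - generic_mhz) / generic_mhz
orders = math.log10(specialized_gen_time / param_gen_time)

print(f"area change:       {area_change:+.1%}")
print(f"clock freq change: {freq_change:+.1%}")
print(f"generation time:   {orders:.2f} orders of magnitude faster")
```

The results land close to the slide's -56%, +37% and "5 orders".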
SLIDE 20 When should we use parameterized reconfiguration?
Use the Functional Density as a measure for implementation efficiency:
$\text{Functional Density} = \frac{N}{A \cdot T}$
A: The area needed
T: The total execution time
N: The number of operations
* A. M. DeHon, Reconfigurable architectures for general-purpose computing, Massachusetts Institute of Technology, 1996.
SLIDE 21 Parameter Selection
[Figure: plot of Functional Density (Ops/s/LUTs) versus the average time between parameter changes (clock cycles).]
Profiler to trade off gain versus overhead of reconfiguration
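The trade-off can be sketched with a simple model, taking functional density as operations per unit area per unit time, N / (A × T), in line with the plot's Ops/s/LUTs axis. The assumptions are loud ones: one operation per clock cycle, a fixed reconfiguration time `T_RECONF` of 1 ms paid once per parameter change (a made-up value), and no area charged for the configuration manager. Area and clock figures are the FIR numbers from the experiment slide.

```python
def functional_density(n_ops: float, area: float, total_time: float) -> float:
    """Operations per unit of area per unit of time: N / (A * T)."""
    return n_ops / (area * total_time)

def fd_generic(n_ops: float, area: float, f_clk: float) -> float:
    """Generic circuit: no reconfiguration, one operation per clock cycle."""
    return functional_density(n_ops, area, n_ops / f_clk)

def fd_specialized(n_ops: float, area: float, f_clk: float,
                   t_reconf: float) -> float:
    """Specialized circuit: pays t_reconf once per parameter change."""
    return functional_density(n_ops, area, n_ops / f_clk + t_reconf)

def break_even_ops(area_g: float, f_g: float, area_s: float, f_s: float,
                   t_reconf: float) -> float:
    """Operations per parameter change at which both densities are equal."""
    denom = area_g - area_s * f_g / f_s
    if denom <= 0:
        raise ValueError("specialized circuit never catches up")
    return f_g * area_s * t_reconf / denom

AREA_G, F_G = 2999, 84e6    # generic FIR: LUTs, Hz (slide figures)
AREA_S, F_S = 1301, 115e6   # parameterized FIR: LUTs, Hz (slide figures)
T_RECONF = 1e-3             # ASSUMPTION: 1 ms per reconfiguration

n_star = break_even_ops(AREA_G, F_G, AREA_S, F_S, T_RECONF)
for n in (n_star / 10, n_star, n_star * 10):
    print(f"N={n:12.0f}  generic={fd_generic(n, AREA_G, F_G):10.1f}  "
          f"specialized={fd_specialized(n, AREA_S, F_S, T_RECONF):10.1f}")
```

Below the break-even point the reconfiguration overhead dominates and the generic circuit wins; above it the smaller, faster specialized circuit wins. A profiler would estimate where a given parameter falls.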
SLIDE 22 Outline
- What is Parameterized Run-time Reconfiguration?
- The importance of the parameter choice
- Effects on logic synthesis
SLIDE 23 Original logic synthesis solution (3-input LUT)
[Figure: gate-level network of the multiplexer (gates labelled A and O) with inputs in0–in7 and sel0–sel2.]
SLIDE 24 Making subtrees according to K regular inputs
[Figure: the gate network regrouped into subtrees (gates labelled A and O) over the inputs in0–in7 and sel0–sel2.]
SLIDE 25 Separate parameters from other inputs
[Figure: restructured gate network (gates labelled A and O) in which the sel inputs are separated from the inputs in0–in7.]
SLIDE 26 Changing the tree depth
[Figure: the gate network (gates labelled A and O, inputs in0–in7 and sel) with a changed tree depth.]
SLIDE 27 Conclusions
- Parameterized reconfiguration opens up new optimization possibilities using run-time reconfiguration
- Parameters are to be treated differently in Technology Mapping
- Therefore parameters and regular inputs should be treated differently in logic synthesis
- Cost of parameter calculations (Boolean functions) should also be taken into account
- New challenge in synthesis
SLIDE 28 Submit to IWLS
Paper abstract submission: March 11, 2016
www.iwls.org
SLIDE 29 Last slide
- Much of this work was done in the framework of the EU-FP7 project FASTER and is now continued in the EU-H2020 project (FETHPC) EXTRA
- Tools at https://github.com/UGent-HES/tlut_flow
- Questions?
- More information: http://hes.elis.ugent.be/