Lava 4 (relevant to take home exam) Stepping back to see the bigger picture
Where can more info. be found? What are the hot research topics?
Prefix
Given inputs x1, x2, x3, …, xn
Compute x1, x1*x2, x1*x2*x3, …, x1*x2*…*xn
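As a reference point for the networks that follow, the prefix problem can be stated in one line of plain Haskell. This is only a specification-style sketch: the name prefixRef is hypothetical, and it assumes the operator * is associative, which is what makes the parallel networks legal.

```haskell
-- Reference semantics for the prefix problem (hypothetical helper name).
-- Output i is the combination of inputs 1..i under an associative operator.
prefixRef :: (a -> a -> a) -> [a] -> [a]
prefixRef = scanl1
```

For example, prefixRef (*) [1,2,3,4] evaluates to [1,2,6,24].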
1
2
Not only binary and FP adders: also address calculation, priority encoding, etc.
3
[Figure: prefix network diagram; inputs ordered from least significant to most significant]
n=8, depth d=7, size s=7 (number of ops)
Pictures are generated by symbolic evaluation of Lava descriptions.
The drawing style is specific to parallel prefix.
4
5
serr _ [a] = [a]
serr op (a:b:bs) = a:cs
  where
    c = op(a,b)
    cs = serr op (c:bs)

*Main> simulate (serr plus) [1..10]
[1,3,6,10,15,21,28,36,45,55]
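The serial network can also be tried without Lava. The sketch below runs the same recursion on ordinary Haskell lists; it assumes the operator takes a pair (as on the slide), adds an empty-list case for totality, and defines a plain plus on Int in place of the Lava circuit and simulate.

```haskell
-- Serial prefix on ordinary lists (no Lava, no simulate).
serr :: ((a, a) -> a) -> [a] -> [a]
serr _  []       = []            -- added here for totality; not on the slide
serr _  [a]      = [a]
serr op (a:b:bs) = a : cs
  where
    c  = op (a, b)               -- combine the first two wires
    cs = serr op (c : bs)        -- carry the running result along the chain

-- Stand-in for the Lava adder: plain integer addition on a pair.
plus :: (Int, Int) -> Int
plus = uncurry (+)
```

In GHCi, serr plus [1..10] gives [1,3,6,10,15,21,28,36,45,55], matching the simulation result on the slide.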
6
32 inputs, depth 5, 80 operators
7
skl _ [a] = [a]
skl op as = init los ++ ros'
  where
    (los,ros) = (skl op las, skl op ras)
    ros' = fan op (last los : ros)
    (las,ras) = halveList as
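A plain-Haskell sketch of the same recursion makes it possible to check the figures quoted for 32 inputs. The definitions of fan and halveList below are assumptions about what the Lava helpers do (fan combines the first wire with each of the others; halveList splits at the midpoint). depthOp and sklSize are hypothetical helpers: the first is a small non-standard interpretation that tracks depth instead of values, the second counts operators by following the recursion.

```haskell
-- Assumed behaviour of the helpers used on the slide.
fan :: ((a, a) -> a) -> [a] -> [a]
fan _  []     = []
fan op (a:as) = a : [op (a, b) | b <- as]   -- first wire fans out to all others

halveList :: [a] -> ([a], [a])
halveList as = splitAt (length as `div` 2) as

skl :: ((a, a) -> a) -> [a] -> [a]
skl _  []  = []                              -- added for totality
skl _  [a] = [a]
skl op as  = init los ++ ros'
  where
    (las, ras) = halveList as
    (los, ros) = (skl op las, skl op ras)
    ros'       = fan op (last los : ros)

-- Non-standard interpretation: each wire carries its depth.
depthOp :: (Int, Int) -> Int
depthOp (a, b) = 1 + max a b

-- Operator count: two half-size networks plus one operator per right-half wire.
sklSize :: Int -> Int
sklSize n
  | n <= 1    = 0
  | otherwise = sklSize l + sklSize r + r
  where
    l = n `div` 2
    r = n - l
```

With these definitions, maximum (skl depthOp (replicate 32 0)) is 5 and sklSize 32 is 80, in line with "32 inputs, depth 5, 80 operators".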
8
9
fewer ops, at cost of being deeper. Fanout only 2
10
P is another half size network operating on only the thick wires
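One way to read this recursive picture is sketched below in plain Haskell. It rests on an assumption not stated on the slide: that the thick wires are the outputs of a first layer of operators on adjacent pairs, so that P is a prefix network on half as many wires and a final layer fills in the remaining outputs. The name halfRec and its helpers are hypothetical; the structure resembles what is commonly called the Brent-Kung construction.

```haskell
-- Pair up neighbours, recurse on the half-size problem, then fix up.
halfRec :: ((a, a) -> a) -> [a] -> [a]
halfRec _  []  = []
halfRec _  [a] = [a]
halfRec op as  = head as : interleave zs fixed
  where
    -- first layer: one operator per adjacent pair (a leftover wire is skipped)
    pairs (x:y:rest) = op (x, y) : pairs rest
    pairs _          = []
    -- P: half-size prefix network on the pair results
    zs    = halfRec op (pairs as)
    -- last layer: combine each P output with the next even-position input
    fixed = zipWith (curry op) zs (everyOther (drop 2 as))

    everyOther (x:_:rest) = x : everyOther rest
    everyOther xs         = xs

    interleave (z:zs') (f:fs) = z : f : interleave zs' fs
    interleave zs'     []     = zs'
    interleave []      _      = []
```

In this sketch each output of P feeds at most one extra operator besides being an output itself, which is consistent with a fanout of only 2.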
11
NOT the same as Sklansky; many books and papers are wrong about this (including slides from Digital Circuit Design course)
12
13
14
P is another half size network operating on only the thick wires.
This is an alternative view to the "forwards and backwards trees" that some of you saw in Jeppson's course.
15
Each S is a serial network like that shown earlier
16
4 2 3 … 4
This sequence of numbers determines how the outer "layer" looks.
17
4 2 3 … 4 4 2 3 … 4
sequence for widths of fans at bottom is closely related
18
4 2 3 … 4 3 2 3 … 5 sequence for widths of fans at bottom is closely related
19
4 2 3 … 4
So just look at all possibilities for this sequence, and for each one find the best possibility for the smaller P.
Then pick the best overall!
Dynamic programming
We need a measure function (e.g. number of operators).
Very similar to a "shortest paths" algorithm.
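The shape of such a search can be illustrated on a much smaller problem. The sketch below is a toy, not the prefix-network search: it chooses a sequence of block widths (each at most g) covering n wires so as to minimise a made-up measure, and memoises subproblems in a lazily built table. blockCost, bestSplit and the cost formula are all hypothetical; Data.Map comes from the containers package that ships with GHC.

```haskell
import qualified Data.Map as M

-- Hypothetical measure: cost of one block of width w.
blockCost :: Int -> Int
blockCost w = 3 + w * w

-- Cheapest sequence of widths (each between 1 and g) summing to n.
-- Assumes g >= 1 and n >= 0.
bestSplit :: Int -> Int -> (Int, [Int])
bestSplit g n = table M.! n
  where
    -- Lazy map values: each subproblem is evaluated at most once.
    table = M.fromList [(k, solve k) | k <- [0 .. n]]
    solve 0 = (0, [])
    solve k = minimum                      -- pick the best among all candidates
      [ (blockCost w + c, w : ws)
      | w <- [1 .. min g k]                -- candidate width of the first block
      , let (c, ws) = table M.! (k - w)    -- best solution for the remainder
      ]
```

With this made-up measure, bestSplit 4 10 evaluates to (35,[2,2,2,2,2]).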
20
21
wsoE f1 g ctx = getans (error "no fit") (prefix f1 ctx)
  where
    prefix f = memo pm
      where
        pm ([d],_,w) = trywire ([d],w)
        pm (is,_,w) | 2^h < length is = Fail
          where h = maxd(is,w)
        pm (is,xs,w) = ((bestOnE xs is f).dropFail)
                         [wrpC ds (prefix f) | ds <- topds g h (length is)]
          where . . . .
22
f1 is the measure function being optimised
23
g is max width of small S and F
24
context (is,xs,w): delays in, wire numbers (positions) in, allowed depth
25
use memoisation to avoid expensive recomputation
26
base case: single wire
27
Fail if it is simply impossible to fit a prefix network in the available depth
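The reasoning behind this test is that, with two-input operators, an output at depth h can depend on at most 2^h inputs, and the last output of a prefix network depends on all of them. A minimal sketch of that bound follows; fits and minDepth are hypothetical names, and the sketch ignores the input delays that the real maxd takes into account.

```haskell
-- Can n inputs possibly be combined within depth h using two-input operators?
fits :: Int -> Int -> Bool
fits h n = 2 ^ h >= n

-- Smallest depth that passes the test, i.e. the ceiling of log2 n (for n >= 1).
minDepth :: Int -> Int
minDepth n = length (takeWhile (< n) (iterate (* 2) 1))
```

For example, minDepth 33 and minDepth 64 are both 6, while fits 5 33 is False.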
28
For each candidate sequence: build the resulting network, where the call of (prefix f) gives the best network for the recursive call inside.
(Needed to think hard about controlling the size of the search space.)
29
parpre f1 g ctx = getans (error "no fit") (prefix f1 ctx)
  where
    prefix f = memo pm
      where
        pm ([d],_,w) = trywire ([d],w)
        pm (is,_,w) | 2^h < length is = Fail
          where h = maxd(is,w)
        pm (is,xs,w) = ((bestOnE xs is f).dropFail)
                         [wrpC ds (prefix f) | ds <- topds g h (length is)]
          where . . . .
Finally, pick the best among all these candidates
30
Result when minimising number of ops: depth 6, 33 inputs, fanout 7.
This network is Depth Size Optimal (DSO):
depth + number of ops = 2*(number of inputs) - 2
(known to be the smallest possible number of ops for the given depth and inputs)
6 + 58 = 2*33 - 2
31
64 inputs, depth 8, size 118 (also DSO), BUT not min. depth.
We need to move away from DSO if we want shallow networks.
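The DSO condition is simple enough to check mechanically. The helper below is a hypothetical one-liner that encodes the equation depth + number of ops = 2*(number of inputs) - 2 and nothing more.

```haskell
-- isDSO depth size inputs: does the network meet the DSO equation?
isDSO :: Int -> Int -> Int -> Bool
isDSO depth size inputs = depth + size == 2 * inputs - 2
```

Both isDSO 6 58 33 and isDSO 8 118 64 hold, whereas the 32-input network with depth 5 and 80 operators does not satisfy the equation.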
32
33
parpre1 f1 f2 g m ctx = getans (error "no fit") (prefix f1 ctx)
  where
    prefix f = memo pm
      where
        pm ([],_,w) = trywire ([],w)
        pm ([i],_,w) = trywire ([i],w)
        pm (is,_,w) | 2^h < length is = Fail
          where h = maxd(is,w)
        pm (is,xs,w) = ((bestOnE xs is f).dropFail)
                         [wrpC1 ds (prefix f) (prefix f2) | ds <- topds1 g h m lis]
34
extra base case for 0 inputs
35
now there are 2 recursive calls
36
37
Link to Wired allows more accurate estimates. Can then explore design space
38
Can also export to Cadence SoC Encounter
39
This is very low level. What about higher up, earlier in the design?
(Tentative assertion: these were general programming idioms with possible application at other levels of abstraction.)
What about the cases when such a structural approach is inappropriate?
Can we make refinement work?
Can we design appropriate GENERIC verification methods?
40
Connection patterns are an essential first step (and give some layout awareness when wanted).
We write circuit generators rather than circuit descriptions.
Everything is done behind the scenes by symbolic evaluation.
Full power of Haskell is available to the user (but we have some useful idioms to reduce the fear).
Circuit generators are short and sweet and LOOK LIKE circuit descriptions.
41
Non-standard interpretation is used after generation (as we have long done) and now also to guide synthesis.
Clever circuits are a good idiom. They can control choice of components, wiring and topology, and greatly increase the expressive power of the connection patterns approach.
Having a full functional language available is great once one has had some practice.
More idioms remain to be discovered.
Ideas are compatible with Intel's IDV.
42
Clever circuits give a way to allow non-functional properties to influence design (even early on).
This makes blocks context sensitive, which is vital as we move to deep sub-micron.
Separation of concerns is becoming less and less possible.
First experiments are (and will be) about module generation.
It remains to be seen if there are applications at higher levels.
Hopefully, a project on DSP Algorithm Design with Ericsson will explore this.
43
44
VHDL Verilog C
UML
45
46
Intel IDV (Seger)
Forte (Intel’s FV system)
IBM SystemML (now called HDML,
Masters projects possible Behavioural Lava (York) Lava + Wired etc.
Lustre, Esterel Cryptol
47
Property Checking Formal
48
Kunz (Infineon, Siemens, Bosch … OneSpin)
processor and SoC verification
SAT-based
Extremely impressive!
See also work at companies like NVIDIA, Freescale, … (see panel at FMCAD 2007, links page)
A problem is that there is a lot of unpublished work.
49
Intel (Seger's lecture)
Forte (STE)
niches (such as Floating Point Arith.)
IBM Sixth Sense
combines formal and semi-formal
emphasises scalability and automation
see great presentation by Baumgartner from FMCAD 2006 (links page)
Coverage (OneSpin look to have something very interesting, but it is not public)
Methodology, finding new FV "recipes"
Moving up in abstraction levels
Satisfiability Modulo Theories (SMT), First Order Logic
How to design (and verify) complete systems has become harder because of multicore
Getting control of non-functional properties (particularly power consumption)
50
Parallelisation of EDA algorithms
Protocol verification
Increasing automation of FV (e.g. transformation-based verification à la Sixth Sense)
How to build and use verification IP
Reuse
Post-silicon verification
51
The two different design flows that you have seen
What was good and bad about them
YOUR opinions based on your experience (which is influenced by previous expertise)
Formal Verification
evidence about its use (suitable niches, module verification)
limitations (a main one being scalability)
what it can give when it works
52