T42 Transputer Design in FPGA

SLIDE 1

T42: Transputer Design in FPGA
Year-Three Design Status Report

Uwe MIELKE a and Martin ZABEL b, in collaboration w/ Michael BRUESTLE c

a Electronics Engineer, Dresden, Germany, uwe.mielke@t-online.de
b Institute of Computer Engineering, Technische Universität Dresden, Germany, martin.zabel@tu-dresden.de
c Electronics Engineer, Vienna, Austria, michael_bruestle@yahoo.com

Communicating Process Architectures 2017

SLIDE 2

T42 in FPGA @ CPA 2016

Abstract: Our IMS-T425 binary compatible Transputer design has so far taken over 300 design days. Up to last year, minimal effort was spent on verification. Now a regression test bench has been put in place, which is intended to verify design conformance after any changes. This T42 Transputer Verification Suite is based on the TVS-1 work from Michael Bruestle and compares the register output of 54 selected instructions against a true T425 golden reference for up to thousands of data samples. It has already helped in T42 microcode debugging and hardware refinement.

SLIDE 3

Agenda

(1) Review
(2) Regression Test Bench
(3) Transputer Verification Suite
(4) TVS-1 Coverage
(5) Design Environment
(6) Achievements (2017)
(7) Outlook (Links ; Verification)
(8) Summary & Discussion

SLIDE 4

T42 in FPGA @ CPA 2014-16 (Review)

2-stage pipeline for T42 working in 2014

pre-fetch … is an autonomous FSM *)

  • 1. IF/ID: instruction fetch & decode
  • 2. EX: execute (using a single or multiple clocks per instruction); memory read/write is part of execute *)

micro code assembler & pre-fetch unit working in 2015
system control & memory interface working in 2016
>100 of 134 instructions (~600 lines of µCode) written

A lot of HW. How to prove everything is correct?

*) not a pipeline stage

SLIDE 5

Regression Test Bench!

  • Our T42-in-FPGA has taken over 300 design days up to now!
  • A lot of HW! But minimal effort has been spent on verification so far.
  • Visual inspection of simulations is cumbersome & error-prone!
  • The purpose of a Regression Test Bench is to verify a design against a (target) Specification (or a previously achieved stable state).
  • In our TVS-1 case the specification is the binary execution result of a part of the original T425 instruction set.
  • The chosen instructions are the most important ones for a user program (compiler) & can be verified by a simple IUT algorithm.
  • Lesson learned: verification takes as much effort as design!
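The regression principle on this slide can be sketched in a few lines of Python: the words produced by the design under test are compared one by one against a golden reference, and the first mismatch is reported. This is only an illustration; the function name `compare_with_golden` and the plain list-of-words representation are assumptions and not part of the T42 test bench.

```python
from typing import List, Optional, Tuple

WORD_MASK = 0xFFFFFFFF  # results are compared as 32-bit words


def compare_with_golden(dut_words: List[int],
                        golden_words: List[int]) -> Optional[Tuple[int, int, int]]:
    """Return None if all words match, else (index, got, expected) of the first mismatch."""
    if len(dut_words) != len(golden_words):
        raise ValueError("DUT output and golden reference differ in length")
    for index, (got, expected) in enumerate(zip(dut_words, golden_words)):
        if (got & WORD_MASK) != (expected & WORD_MASK):
            return index, got, expected
    return None


# Hypothetical data: four result words of one test.
golden = [0x00000003, 0x00000002, 0x00000001, 0x00000001]
broken = [0x00000003, 0x00000002, 0x00000000, 0x00000001]

print(compare_with_golden(list(golden), golden))  # None
print(compare_with_golden(broken, golden))        # (2, 0, 1)
```

Reporting the index of the first mismatch, rather than a bare pass/fail, points directly at the test that regressed.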

SLIDE 6

Transputer Verification Suite

  • TVS-1 *) uses a golden reference based on real T425 outputs.
  • Here: 54 IUT (instructions under test) with 1, 2 or 3 operands.
  • IUT assembler code and sample set are loaded into on-chip SRAMs before each run & comparison w/ golden reference file.
  • Adaptation for VHDL simulation was required: reduction of basic sample set from 128 *) to 32 values to achieve suitable run times.
  • Basic sample set contains 32 signed integers (32 bit): corner cases around MINT … Zero … MaxInt, several bit patterns and single-bit '1' and '0' values, some small and large integers.
  • Permutations: if Areg & Breg are loaded then 32x32=1024 sets are used.

Info: *) TVS-1 was written by Michael Bruestle in 2010 to support software development & verification of the Transputer Emulator Project (Gavin Crate).
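As a rough illustration of how such a sample set and its permutations could be built, the Python sketch below assembles 32 signed 32-bit values from the categories named on the slide and forms all Areg/Breg pairs. The concrete values are assumptions chosen for this example; the actual TVS-1 sample values are not listed in the slides.

```python
from itertools import product

MINT = -2**31        # most negative 32-bit integer
MAXINT = 2**31 - 1   # most positive 32-bit integer


def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def build_sample_set() -> list:
    """Hypothetical 32-value sample set following the categories on the slide."""
    corner = [MINT, MINT + 1, -2, -1, 0, 1, 2, MAXINT - 1, MAXINT]
    patterns = [to_signed32(p) for p in (
        0x55555555, 0xAAAAAAAA, 0x0F0F0F0F, 0xF0F0F0F0,
        0x00FF00FF, 0xFF00FF00, 0x0000FFFF, 0xFFFF0000)]
    single_one = [1 << k for k in (4, 8, 15, 16, 24, 30)]          # one bit set
    single_zero = [to_signed32(~(1 << k)) for k in (4, 8, 16, 24)]  # one bit cleared
    misc = [100, -100, 1_000_000, -1_000_000, 123_456_789]          # small and large
    return corner + patterns + single_one + single_zero + misc


samples = build_sample_set()
pairs = list(product(samples, samples))  # every (Areg, Breg) combination
print(len(samples), len(pairs))          # 32 1024
```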

SLIDE 7

TVS-1 Coverage (Instructions)

TVS-1 covers 54 instructions:

  • primary (3/16): ldc, adc, eqc, …
  • arithm. logic (16/17): add, gt, xor, …
  • long arithmetic (9/9): ladd, lsum, …
  • indexing (5/8): bsub, wcnt, …
  • error handling (2/8): ccnt1, csub0, …
  • general (7/8): csngl, xword, …
  • CRC and bits (5/5): bitcnt, …
  • floating point (5/6): unpack, …
  • ALT (2/12): alt, talt.

7 input files to meet different IUT requirements, e.g. for arithmetic, shift, range check, FP, …

    ; load test
    ldl CREG
    ldl BREG
    ldl AREG
    __IUT__
    stl AREG
    stl BREG
    stl CREG
    testerr
    stl ERROR
    ; send result
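The test wrapper above can be mimicked with a small Python model of the three-register evaluation stack, which shows why CREG is loaded first and AREG last. This is a simplified sketch, not the T42 data path: only `add` and `xor` are modelled, Creg is left unchanged by a pop, and `testerr` is assumed to yield 1 when the error flag was clear and 0 when it was set, clearing the flag afterwards. All Python names are hypothetical.

```python
MINT, MAXINT = -2**31, 2**31 - 1


def to_signed32(value):
    """Wrap value into the signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


class EvalStack:
    """Areg/Breg/Creg evaluation stack plus error flag (simplified model)."""

    def __init__(self):
        self.a = self.b = self.c = 0
        self.error = False

    def push(self, value):  # models ldl: the new value goes to Areg
        self.a, self.b, self.c = value, self.a, self.b

    def pop(self):          # models stl: Areg is stored, the stack moves up
        value = self.a
        self.a, self.b = self.b, self.c
        return value


def iut_add(s):
    """Checked add: Areg' = Breg + Areg, Breg' = Creg, error flag set on overflow."""
    total = s.b + s.a
    if not MINT <= total <= MAXINT:
        s.error = True
    s.a, s.b = to_signed32(total), s.c


def iut_xor(s):
    """Bitwise xor: Areg' = Breg xor Areg, Breg' = Creg."""
    s.a, s.b = to_signed32(s.b ^ s.a), s.c


def run_wrapper(iut, areg, breg, creg):
    """Mirror the wrapper: load C, B, A, run the IUT, store A, B, C and the error word."""
    s = EvalStack()
    for value in (creg, breg, areg):
        s.push(value)
    iut(s)
    out = {"AREG": s.pop(), "BREG": s.pop(), "CREG": s.pop()}
    s.push(0 if s.error else 1)  # assumed testerr behaviour (see lead-in)
    s.error = False
    out["ERROR"] = s.pop()
    return out


print(run_wrapper(iut_add, 5, 7, 9))       # {'AREG': 12, 'BREG': 9, 'CREG': 9, 'ERROR': 1}
print(run_wrapper(iut_add, MAXINT, 1, 0))  # overflow: AREG wraps to MINT, ERROR word is 0
```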

SLIDE 8

TVS-1 Coverage (Samples)

  • Original TVS-1 has ~360,000 tests (can be used over link only … maybe later, once T42-in-FPGA is running on an FPGA board w/ PC connection)
  • T42 TVS-1 will have ~25,000 tests (could be increased if necessary)

TVS-1 Benefit:

  • a TVS-1 run after (some) VHDL modifications will verify whether the design still meets the specification ... or whether there is any (bad) impact from recent changes

TVS-1 (original)

Test      Input set  Extra Constants  Areg Values  Breg Values  Creg Values  No. of Tests  Words per Set  In-Words
TEST.1    i32_1.bin  -                128          BBBBBBBB     CCCCCCCC     128           3              384
TEST.1.4  i32_1.bin  8                128          BBBBBBBB     CCCCCCCC     1,024         3              3072
TEST.2    i32_2.bin  -                128          128          CCCCCCCC     16,384        3              49152
TEST.3    i32_3.bin  -                128          128          8            131,072       3              393216
TEST.B    i32_B.bin  -                32           128          8            32,768        3              98304
TEST.F    i32_F.bin  -                64           BBBBBBBB     CCCCCCCC     64            3              192
TEST.P    i32_P.bin  14               8            72           14           112,896       4              451584
TEST.S    i32_S.bin  -                66           128          8            67,584        3              202752
Total                                                                        361,920
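The test counts in the original TVS-1 table are the products of the per-register sample counts, and the input words are the tests multiplied by the words per set. A short Python check of that arithmetic, with the table transcribed into a dictionary (the dictionary layout and the name `summarize` are assumptions for this sketch):

```python
from math import prod

# test name -> (sample counts that are multiplied, words per set), from the table
ORIGINAL = {
    "TEST.1":   ((128,), 3),
    "TEST.1.4": ((8, 128), 3),
    "TEST.2":   ((128, 128), 3),
    "TEST.3":   ((128, 128, 8), 3),
    "TEST.B":   ((32, 128, 8), 3),
    "TEST.F":   ((64,), 3),
    "TEST.P":   ((14, 8, 72, 14), 4),
    "TEST.S":   ((66, 128, 8), 3),
}


def summarize(table):
    """Return {test name: (number of tests, number of input words)}."""
    rows = {}
    for name, (counts, words_per_set) in table.items():
        tests = prod(counts)
        rows[name] = (tests, tests * words_per_set)
    return rows


rows = summarize(ORIGINAL)
print(rows["TEST.P"])                        # (112896, 451584)
print(sum(t for t, _ in rows.values()))      # 361920
```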

SLIDE 9

TVS-1 Coverage (Samples)

TVS-1 for T42

Output example:

    prep_iut.BAT: IUT is ADC prepared @ 11.08.2017 11:46:43,65
    tb_07_tvs1.vhd: simulation started...
    tb_07_tvs1.vhd: simulation Ok. - 4096.word - end of ..\sim\tb_07\golden_reference.mem reached.
    ghdl_sim.BAT: IUT is ADC finished @ 11.08.2017 11:47:11,96

    prep_iut.BAT: IUT is ADD prepared @ 12.08.2017 12:05:46,48
    tb_07_tvs1.vhd: simulation started...
    tb_07_tvs1.vhd: simulation Ok. - 16384.word - end of ../sim/tb_07/golden_reference.mem reached.
    modelsim.BAT: IUT is ADD finished @ 12.08.2017 12:13:25,95

TVS-1 (T-42)

Test      Input set  Extra Constants  Areg Values  Breg Values  Creg Values  No. of Tests  Words per Set  In-Words
TEST.1    i32_1.bin  -                128          BBBBBBBB     CCCCCCCC     128           3              384
TEST.1.4  i32_1.bin  8                128          BBBBBBBB     CCCCCCCC     1,024         3              3072
TEST.2    i32_2.bin  -                32           32           CCCCCCCC     1,024         3              3072
TEST.3    i32_3.bin  -                32           32           8            8,192         3              24576
TEST.B    i32_B.bin  -                32           32           2            2,048         3              6144
TEST.F    i32_F.bin  -                64           BBBBBBBB     CCCCCCCC     64            3              192
TEST.P    i32_P.bin  10               8            10           10           8,000         4              32000
TEST.S    i32_S.bin  -                32           32           4            4,096         3              12288
Total                                                                        24,576
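The run time of a single IUT can be read off the "prepared" and "finished" lines of the output example. A minimal Python sketch, assuming the timestamp format seen in the log (day.month.year, then time with a comma before the hundredths of a second); the helper name `run_seconds` is hypothetical:

```python
from datetime import datetime

# Format of the stamps after the "@", e.g. "11.08.2017 11:46:43,65"
STAMP_FORMAT = "%d.%m.%Y %H:%M:%S,%f"


def run_seconds(prepared_line: str, finished_line: str) -> float:
    """Seconds elapsed between a 'prepared @' line and a 'finished @' line."""
    def stamp(line: str) -> datetime:
        return datetime.strptime(line.split("@", 1)[1].strip(), STAMP_FORMAT)
    return (stamp(finished_line) - stamp(prepared_line)).total_seconds()


adc = run_seconds("prep_iut.BAT: IUT is ADC prepared @ 11.08.2017 11:46:43,65",
                  "ghdl_sim.BAT: IUT is ADC finished @ 11.08.2017 11:47:11,96")
add = run_seconds("prep_iut.BAT: IUT is ADD prepared @ 12.08.2017 12:05:46,48",
                  "modelsim.BAT: IUT is ADD finished @ 12.08.2017 12:13:25,95")
print(round(adc, 2), round(add, 2))  # roughly 28.31 s for ADC and 459.47 s for ADD
```

Note that the two runs in the example used different simulators (GHDL and ModelSim) and different word counts, so the two durations are not directly comparable.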

SLIDE 10

Design Environment SVN

  • Our T42-in-FPGA design environment was Xilinx ISE 14.7
  • ISE is not supported anymore by Xilinx (now used is: Vivado)
  • ISE still has some bugs … e.g. handling of "X" values …
  • Therefore a change was required to be more "state of the art"
  • Now: 3 different Simulators can be used (more is better!):
      • GHDL v0.33 (free) from Tristan Gingold … + GTKWave Viewer
      • Mentor ModelSim, as part of ALTERA FPGA lite 16.1
      • Xilinx iSim, as part of ISE 14.7
  • 2 of these simulators can be scripted (automated :-) ) nicely
  • Our improved Design Environment looks "CMOS alike" now ;-)
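To illustrate what scripting a simulator might look like, the Python sketch below only builds the command lines for a GHDL analyze / elaborate / run sequence (`ghdl -a`, `ghdl -e`, `ghdl -r` with `--vcd=` for a GTKWave-readable dump). It does not reproduce the project's batch files; the source file names, the top-level unit name and the function `ghdl_commands` are assumptions for this example.

```python
from typing import List


def ghdl_commands(sources: List[str], top: str, vcd_path: str) -> List[List[str]]:
    """Build argv lists for analysing, elaborating and running a VHDL test bench."""
    analyze = ["ghdl", "-a", *sources]               # compile all source files
    elaborate = ["ghdl", "-e", top]                  # elaborate the top-level unit
    run = ["ghdl", "-r", top, f"--vcd={vcd_path}"]   # simulate and dump waveforms
    return [analyze, elaborate, run]


# Hypothetical file and unit names.
commands = ghdl_commands(["t42_cpu_constpkg.vhd", "tb_07_tvs1.vhd"],
                         "tb_07_tvs1", "tb_07_tvs1.vcd")
for argv in commands:
    print(" ".join(argv))
```

Each list can be handed to `subprocess.run(argv, check=True)` on a machine where GHDL is installed, which stops the sequence at the first failing step.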

SLIDE 11

T42 Achievements 2017

  • Memory preparations, e.g. 1024x128bit_ucrom … Nov.2016
  • TVS-1 regression test bench adaptation started … Nov.2016
  • clean up: fix simulation warnings about "X" values … Jan.2017
  • 1st TVS-1 instruction tests running (ADC, …) … Feb.2017
  • all 54 TVS-1 instructions prepared, 50% running … Apr.2017
  • data path + uCROM review (more instructions) … May.2017
  • full design environment reorganization (SVN) … Jul.2017
  • 42 of 54 TVS-1 instructions verified … Aug.2017

Note: switch to 1024x128bit_ucrom, e.g. merging of all available instructions & routines in about ~600 lines of microcode, will be done in autumn …

SLIDE 12

Outlook 2017Q4+ Verification

still a lot of work to do: (partially pre-prepared already)

  • IUTs to debug: IN, OUT, ALTs, PARs, CRCs, …
  • IUTs in completion: MULs, DIVs, FP support
  • there are still ~80 instructions w/o verification concept yet!
  • scheduler µCode to debug: start next process, enqueue, dequeue …
  • scheduler µCode completion: HW event interaction w/ timer, links, boot, peek, poke, Halt-on-Error, analyse, …
  • verification pending: Timer & time slice, …
  • reverse engineering & verification: Links + control logic (in work)

by help of TUD & POC library: (pre-prepared)

  • connect external memory: cache & DDR-RAM controller

SLIDE 13

Outlook 2017Q4+ Links …

CHANNEL ADDRESSING by different Instructions

  • IN, OUT

    Areg = unsigned message length in byte (2nd push onto link stack via Zbus)
    Breg = pointer to the channel
    Creg = pointer to the message (1st push onto link stack via Zbus)

  • TESTHARDCHAN

    Areg = selected channel
    Breg = data (push into top of Link-DMA-Stack)
    Areg' := DBuffReg (read from bottom of Link-DMA-Stack)
    Breg' := Creg
    Iptr' := next instr.

  • Link-DMA-Stack …

    CountReg' := Breg
    PtrReg' := CountReg
    DBuffReg' := PtrReg

[Diagram labels: (Zbus), (Ubus), Breg, CountReg, PtrReg, DBuffReg, Areg]

SLIDE 14

T42 Summary

1. "patch work" Transputer implementation is available

  • integer CPU & timer, (links … in work)
  • simple memory path ("on-chip")
  • by TUD pre-prepared & tested: generic cache + DDR-RAM controller

2. semi automated micro code generation possible

  • one update run takes less than 1 minute

3. regression test bench for verification in place

  • one full verification run (e.g. 42 instructions) takes < 1 hour

4. already >100 (of 134) instructions in µCode written

  • ~70 tested & used (e.g. primary & initialisation instructions, move, lend)
  • ~40 instructions proven correct by TVS-1

best conditions to continue & finish the whole design!

SLIDE 15

Time for Questions …

… we're on the right track

SLIDE 16

BACKUP

SLIDE 17

T42 Schematic

[Block diagram. Blocks: T42-CPU; 8kB DPRAM (On-Chip); 2nd 8kB DPRAM (preliminary, instead of Caches); 1024x128bit uCode ROM; Timers; Link 0-3 & DMAs (in work); System Services. Buses: Fetch and Instr. Bus; Addr and Data Bus. Additional label: IMS T425.]

Note: T425 µCROM is 742x118bit (~90kBit); investigation done in Nov.2016 by Gavin Crate
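For orientation, the two microcode ROM sizes mentioned on this slide can be compared with a few lines of Python. The helper name `rom_bits` is an assumption; the dimensions are the ones given above.

```python
def rom_bits(words: int, width: int) -> int:
    """Total number of bits in a ROM with the given depth and word width."""
    return words * width


t425_bits = rom_bits(742, 118)   # original T425 microcode ROM
t42_bits = rom_bits(1024, 128)   # T42 uCode ROM in the FPGA design

print(t425_bits)  # 87556, consistent with the "~90kBit" note
print(t42_bits)   # 131072
```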

SLIDE 18

T42 VHDL Top View

t42cpu_all_top (structural), with packages t42_cpu_constpkg and t42_cpu_functpkg

Ctrl2Data (structural) Pipeline

DataPath:

  • ABCDEreg
  • ALU X+Y=Z
  • Wptr
  • Pointers
  • ConstBox
  • ByteAlign

CtrlPath:

  • uCodeROM
  • Idecode
  • Oreg (pipe)
  • Iptr (+Inc)
  • PreFetch

LinkPath:

  • Sync, ChOut, ChIn, ChEvent, Ifos

SysPath:

  • SysCtrl, Sbits, Timer, SysService

MemPath:

  • MemIF
  • MemMain (dpram2kx32)

preliminary … instead of cache:

  • DummyCache (dpram2kx32)

available+tested:

  • CacheCtrl (TUD)
  • DDRCtrl (TUD)

Remark: Blocks in red still N/A.

Target boards:

  • Target Board No.1 (89$): Avnet Micro Board, MemLPDDR (32Mx16 on board), XC6LX9
  • Target Board No.2 (199$): Digilent ATLYS, MemDDR2 (64Mx16 on board), XC6LX45
  • Target Board No.3 (99$): Digilent Arty, MemDDR3 (128Mx16 on board), XC7A35T

SLIDE 19

INMOS Patent Research

  • Inmos Innovation and Patents: The "In" in Inmos Stands for Innovation; James R. Adams, PhD. http://www.petritzfoundation.org/?mdocs-file=968
  • US-Pat-4704678 - INMOS 26Nov1982 - Function set for a microcomputer. https://patents.google.com/patent/US4704678A/en
  • US-Pat-4724517 - INMOS 26Nov1982 - Microcomputer with prefixing functions. https://patents.google.com/patent/US4724517A/en
  • US-Pat-4758948 - INMOS 19Jul1988 - Microcomputer. https://patents.google.com/patent/US4758948A/en
  • US-Pat-4989133 - INMOS 29Jan1991 - System for executing time dependent processes. https://patents.google.com/patent/US4989133A/en
  • US-Pat-4783734 - INMOS 08Nov1988 - Computer with variable length process communication. https://patents.google.com/patent/US4783734A/en
  • US-Pat-4794526 - INMOS 27Dec1988 - Microcomputer with priority scheduling. https://patents.google.com/patent/US4794526A/en