Implementing a MIPS processor using SME, by Carl-Johannes Johnsen



SLIDE 1

Implementing a MIPS processor using SME

Carl-Johannes Johnsen

Department of Computer Science University of Copenhagen

August 22, 2017

Carl-Johannes Johnsen Implementing a MIPS processor using SME

SLIDE 2

Background

The Machine Architecture class at DIKU teaches the theory of computer organization and design. However, it does not teach how to construct specialized hardware of the kind one might implement on an FPGA. In some applications FPGAs are more attractive than general-purpose CPUs: because they are less complex, they do not necessarily carry the same overhead in performance and power usage. However, FPGAs are programmed using Hardware Description Languages, which are tedious to write. SME changes this.

SLIDE 3

Introduction

The SME programming model is similar to the CSP model, except that it is globally synchronous, has broadcasting channels and a hidden clock. This makes it more suitable than CSP for generating hardware models. Additionally, SME can be transpiled into VHDL, which can then be synthesized and loaded onto an FPGA. I am implementing a MIPS processor, as taught in Machine Architecture, using SME and documenting the process so that it can be used as teaching material for a course on hardware development.
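The idea of a globally synchronous network with a hidden clock can be sketched in plain C. This is only an illustration under stated assumptions, not SME's API or scheduler: the `Bus` struct, the two example processes and `simulate` are hypothetical names. Each process reads the value a bus held at the start of the tick and writes a value that becomes visible only after the clock has ticked.

```c
#include <stdint.h>

/* Hypothetical double-buffered bus: readers see `current`,
   writers set `next`, and the clock publishes `next`. */
typedef struct {
    uint32_t current;
    uint32_t next;
} Bus;

/* Example process 1: increments its output bus every tick. */
void counter_tick(Bus *out) {
    out->next = out->current + 1;
}

/* Example process 2: doubles the value the counter published
   on the previous tick. */
void doubler_tick(const Bus *in, Bus *out) {
    out->next = in->current * 2;
}

/* The hidden clock: run every process once, then publish all writes.
   Because reads use `current`, the order of the two calls is irrelevant. */
void clock_tick(Bus *count, Bus *doubled) {
    counter_tick(count);
    doubler_tick(count, doubled);
    count->current = count->next;
    doubled->current = doubled->next;
}

/* Simulate `ticks` clock cycles from an all-zero state and
   return the doubler's visible output. */
uint32_t simulate(unsigned ticks) {
    Bus count = {0, 0};
    Bus doubled = {0, 0};
    for (unsigned i = 0; i < ticks; i++)
        clock_tick(&count, &doubled);
    return doubled.current;
}
```

In this sketch the doubler lags the counter by one tick, which is the kind of register-like behaviour a synchronous model gives between processes.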

SLIDE 4

Basic combinatorial

I started by implementing some basic combinatorial circuits, as this was a simple introduction to SME. The first one I made consisted of four processes, which simulate four basic gates: AND, NOT, OR and XOR.

[Figure: network of the AND, OR, NOT and XOR gate processes and a Tester process, connected by input and output buses.]

SLIDE 5

Basic combinatorial - Full adder example

    public interface InputA : IBus {
        bool bit { get; set; }
    }
    ...
    public class AndGate : SimpleProcess {
        [InputBus] InputB input1;
        [InputBus] InputC input2;
        [OutputBus] Internal3 output;

        protected override void OnTick() {
            output.bit = input1.bit && input2.bit;
        }
    }

    public class OrGate : SimpleProcess {
        ...
        protected override void OnTick() {
            output.bit = input1.bit || input2.bit;
        }
    }

    public class XorGate : SimpleProcess {
        ...
        protected override void OnTick() {
            output.bit = input1.bit ^ input2.bit;
        }
    }

[Figure: full adder built from two XOR gates, two AND gates and one OR gate, with inputs InputA, InputB and InputC and outputs Sum and Carry.]
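The gate network of the full adder can be checked on a host machine with a small C sketch. It assumes the usual full-adder wiring (two XOR, two AND, one OR) and treats InputC as the carry-in; the function and type names are illustrative and not taken from the SME project.

```c
#include <stdbool.h>

typedef struct {
    bool sum;
    bool carry;
} AdderResult;

/* The three gate types used by the full adder. */
bool and_gate(bool a, bool b) { return a && b; }
bool or_gate(bool a, bool b)  { return a || b; }
bool xor_gate(bool a, bool b) { return a != b; }

/* Full adder built only from the gates above.
   Assumption: c plays the role of the carry-in. */
AdderResult full_adder(bool a, bool b, bool c) {
    bool partial = xor_gate(a, b);           /* a XOR b */
    AdderResult r;
    r.sum = xor_gate(partial, c);            /* (a XOR b) XOR c */
    r.carry = or_gate(and_gate(a, b),        /* a AND b */
                      and_gate(partial, c)); /* (a XOR b) AND c */
    return r;
}
```

For every input combination, `sum` and `carry` together form the two-bit binary count of how many inputs are set.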

SLIDE 6

Components

After getting some experience with SME, I started working on the MIPS processor. I began by implementing each component as an SME process, so that I could verify the components individually.

    public interface ReadA : IBus {
        short addr { get; set; }
    }
    ...

    public interface WriteBus : IBus {
        bool enabled { get; set; }
        short addr { get; set; }
        uint data { get; set; }
    }

    public interface OutputA : IBus {
        uint data { get; set; }
    }
    ...

    public class Register : SimpleProcess {
        [InputBus] ReadA readA;
        [InputBus] ReadB readB;
        [InputBus] WriteBus write;

        [OutputBus] OutputA outputA;
        [OutputBus] OutputB outputB;

        uint[] data = new uint[32];

        protected override void OnTick() {
            // Register 0 always reads as zero, so writes to it are ignored.
            if (write.enabled && write.addr > 0)
                data[write.addr] = write.data;
            outputA.data = data[readA.addr];
            outputB.data = data[readB.addr];
        }
    }
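The behaviour of the register file during one tick can be mirrored in a host-side C sketch for testing. The struct and function names below are hypothetical, and the bounds handling (`< NUM_REGS`, `% NUM_REGS`) is an addition for safety in C that the listing above does not contain.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS 32

typedef struct { uint32_t regs[NUM_REGS]; } RegisterFile;
typedef struct { bool enabled; uint8_t addr; uint32_t data; } WritePort;
typedef struct { uint32_t a; uint32_t b; } ReadResult;

/* One tick: perform the write first, then serve both read ports,
   in the same order as the OnTick method above. */
ReadResult regfile_tick(RegisterFile *rf, WritePort w,
                        uint8_t read_a, uint8_t read_b) {
    /* Register 0 stays zero: writes to it are dropped. */
    if (w.enabled && w.addr > 0 && w.addr < NUM_REGS)
        rf->regs[w.addr] = w.data;

    ReadResult out;
    out.a = rf->regs[read_a % NUM_REGS];
    out.b = rf->regs[read_b % NUM_REGS];
    return out;
}
```

Because the write happens before the reads, a read of the register written in the same tick returns the new value in this sketch.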

SLIDE 7

Single Cycle

With all of the components implemented and verified, wiring up the processes is straightforward: the buses only need to be named consistently, so that each process refers to the buses it reads from and writes to.

[Figure: single-cycle datapath with PC, Instruction Memory, Splitter, Control unit, Register file, Sign extend, ALU control, ALU, Memory, Jump unit, Write buffer and Clock.]

SLIDE 8

Extending the processor

Then I wanted to extend the processor to handle additional instructions.

[Figure: the single-cycle datapath extended with a JAL unit, alongside PC, Instruction Memory, Splitter, Control unit, Register file, Sign extend, ALU control, ALU, Memory, Jump unit, Write buffer and Clock.]
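As an illustration of what supporting JAL (jump and link) involves, the following C sketch computes the two values the datapath has to produce. It assumes the textbook model without branch delay slots, where the return address is PC + 4 (MIPS hardware with delay slots links PC + 8); `JalResult` and `execute_jal` are hypothetical names, not part of the SME implementation.

```c
#include <stdint.h>

typedef struct {
    uint32_t next_pc; /* address of the jump target */
    uint32_t ra;      /* value written to register 31 ($ra) */
} JalResult;

/* instr is the 32-bit JAL instruction word: opcode 0x03 in
   bits 31..26 and a 26-bit word address in bits 25..0. */
JalResult execute_jal(uint32_t pc, uint32_t instr) {
    uint32_t target = instr & 0x03FFFFFFu;
    uint32_t pc_plus_4 = pc + 4;

    JalResult r;
    r.ra = pc_plus_4; /* assumption: no delay slot */
    /* Keep the top four bits of PC + 4, append the word address. */
    r.next_pc = (pc_plus_4 & 0xF0000000u) | (target << 2);
    return r;
}
```

Compared with a plain jump, the extra work is the write of PC + 4 to register 31, which is why the register file's write port needs an additional source.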

SLIDE 9

Pipelining

Following the procedure from the Machine Architecture class, I have pipelined the processor and handled the hazards introduced by pipelining with a hazard detection unit and a forwarding unit.

[Figure: five-stage pipeline (IF, ID, EX, MEM, WB) with PC, Instruction Memory, Control, Register File, ALU, Memory, Jump, Forwarding Unit and Hazard Detection.]
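The conditions that a forwarding unit and a hazard detection unit typically evaluate can be written down as two small C functions. This sketch follows the standard textbook formulation for a five-stage MIPS pipeline and is an assumption about the design, not a transcription of the SME processes; all names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    FWD_NONE   = 0, /* use the value read from the register file */
    FWD_MEM_WB = 1, /* forward from the MEM/WB pipeline register */
    FWD_EX_MEM = 2  /* forward from the EX/MEM pipeline register */
} Forward;

/* Select the source for one ALU operand whose register number is
   id_ex_src. The EX/MEM result is newer, so it takes priority. */
Forward forward_select(uint8_t id_ex_src,
                       bool ex_mem_reg_write, uint8_t ex_mem_rd,
                       bool mem_wb_reg_write, uint8_t mem_wb_rd) {
    if (ex_mem_reg_write && ex_mem_rd != 0 && ex_mem_rd == id_ex_src)
        return FWD_EX_MEM;
    if (mem_wb_reg_write && mem_wb_rd != 0 && mem_wb_rd == id_ex_src)
        return FWD_MEM_WB;
    return FWD_NONE;
}

/* Load-use hazard: the instruction in EX is a load whose destination
   is a source of the instruction in ID, so the pipeline must stall
   for one cycle before forwarding can help. */
bool load_use_stall(bool id_ex_mem_read, uint8_t id_ex_rt,
                    uint8_t if_id_rs, uint8_t if_id_rt) {
    return id_ex_mem_read &&
           (id_ex_rt == if_id_rs || id_ex_rt == if_id_rt);
}
```

Forwarding removes most data hazards without stalls, while the load-use case still costs a cycle because the loaded value does not exist until the MEM stage.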

SLIDE 10

Performance

To test both the single cycle processor and the pipelined processor, I have written some programs in MIPS assembly. To verify both the number of executed clock ticks and the results of the programs, I have also run them in the MIPS simulator MARS.

# CT = number of clock ticks, CR = clock rate.

                                      MARS                         SME
Program                       # CT  time (ms)  CR (Hz)   # CT        time (ms)   CR (Hz)
Towers of Hanoi, n = 5        719   585        ∼1229     720 - 1058  516 - 1190  ∼1395 - ∼889
Quicksort, n = 8              483   582        ∼829      484 - 763   375 - 895   ∼1290 - ∼852
Fib no optimization, n = 10   220   584        ∼376      221 - 251   191 - 356   ∼1157 - ∼753
Fib forward, n = 10           98    586        ∼167      100 - 130   119 - 209   ∼840 - ∼622
Fib hazard, n = 10            84    588        ∼142      86 - 126    113 - 212   ∼761 - ∼594
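The clock rates in the table appear to be the number of clock ticks divided by the run time. A minimal C sketch of that calculation, with a hypothetical function name, is:

```c
/* Effective simulated clock rate in Hz, given the number of
   executed clock ticks and the wall-clock run time in ms. */
double clock_rate_hz(unsigned long ticks, double time_ms) {
    return (double)ticks / (time_ms / 1000.0);
}
```

For the MARS run of Towers of Hanoi, `clock_rate_hz(719, 585.0)` gives roughly 1229 Hz, matching the table.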

SLIDE 11

Synthesizing

As mentioned, SME can be transpiled into VHDL, which can then be synthesized, placed and routed onto an FPGA.

[Figure: workflow covering SME simulation, ghdl simulation, Vivado behavioral simulation, Vivado post-implementation simulation, bitstream generation, the AXI interface, hardware export and the SDK.]

SLIDE 12

Synthesizing - Logic gates

Just as when I started working with SME, I wanted to gain some experience with VHDL and Vivado using a simple network. I therefore started with the Logic Gates network, mapping the top-level input and output wires to hardware switches and LEDs and then generating the bitstream.

SLIDE 13

Synthesizing - AXI interface

Then I wanted to be able to communicate with the generated hardware. This is possible on the Zynq chip by using an AXI interface. The Zynq chip on the ZedBoard consists of a dual-core ARM processor and an FPGA, and the AXI interface allows the ARM processor to communicate with the FPGA. Vivado has an AXI interface template consisting of a set of registers, which are exposed to the ARM processor as memory-mapped peripheral registers.

SLIDE 14

Synthesizing - AXI interface

    #include "platform.h"      /* init_platform(), cleanup_platform() */
    #include "xil_printf.h"    /* xil_printf() */
    #include "xparameters.h"
    #include "LogicGates_AXI.h"
    #include "xil_io.h"

    /* Base address of the AXI peripheral and the offsets of its registers. */
    int base = XPAR_LOGICGATES_AXI_0_S00_AXI_BASEADDR;
    int bit1 = LOGICGATES_AXI_S00_AXI_SLV_REG0_OFFSET;
    int bit2 = LOGICGATES_AXI_S00_AXI_SLV_REG1_OFFSET;
    int and = LOGICGATES_AXI_S00_AXI_SLV_REG2_OFFSET;
    int or = LOGICGATES_AXI_S00_AXI_SLV_REG3_OFFSET;
    int not = LOGICGATES_AXI_S00_AXI_SLV_REG4_OFFSET;
    int xor = LOGICGATES_AXI_S00_AXI_SLV_REG5_OFFSET;

    /* Print the two input bits followed by the four gate outputs. */
    void print_regs() {
        xil_printf("%d %d | %d %d %d %d\n",
            LOGICGATES_AXI_mReadReg(base, bit1),
            LOGICGATES_AXI_mReadReg(base, bit2),
            LOGICGATES_AXI_mReadReg(base, and),
            LOGICGATES_AXI_mReadReg(base, or),
            LOGICGATES_AXI_mReadReg(base, not),
            LOGICGATES_AXI_mReadReg(base, xor));
    }

    /* Write the two input bits to the hardware. */
    void write_regs(int bit1_data, int bit2_data) {
        LOGICGATES_AXI_mWriteReg(base, bit1, bit1_data);
        LOGICGATES_AXI_mWriteReg(base, bit2, bit2_data);
    }

    int main() {
        init_platform();

        /* Walk through all four input combinations. */
        write_regs(0,0);
        print_regs();

        write_regs(0,1);
        print_regs();

        write_regs(1,0);
        print_regs();

        write_regs(1,1);
        print_regs();

        cleanup_platform();
        return 0;
    }

Output (bit1 bit2 | and or not xor):

    0 0 | 0 0 1 0
    0 1 | 0 1 1 1
    1 0 | 0 1 0 1
    1 1 | 1 1 0 0

SLIDE 15

Synthesis - Single Cycle

Then I compared running the Towers of Hanoi program on the hardware implementation of the Single Cycle processor with running it in the SME simulation.

          FPGA                     SME
          # CT       time (ms)     # CT     time (ms)   ∼Speedup
n = 5     718        ∼0.1436       719      516         ×3593
n = 10    22572      ∼4.5144       22574    13012       ×2882
n = 20    23068776   ∼4613.7552    N/A      N/A         N/A
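The FPGA times and speedups above can be reproduced from the tick counts with a short C sketch. It assumes a clock of 5 MHz for the Single Cycle design, which is the value that reproduces the listed times (718 ticks in about 0.1436 ms); the function names are hypothetical.

```c
/* Run time in ms for a given number of clock ticks at clock_mhz MHz.
   One MHz corresponds to 1000 ticks per millisecond. */
double run_time_ms(unsigned long ticks, double clock_mhz) {
    return (double)ticks / (clock_mhz * 1000.0);
}

/* Speedup of the hardware run over the simulation. */
double speedup(double simulation_ms, double hardware_ms) {
    return simulation_ms / hardware_ms;
}
```

With these, `run_time_ms(718, 5.0)` is about 0.1436 ms and `speedup(516.0, run_time_ms(718, 5.0))` is about 3593, in line with the first row.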

SLIDE 16

Synthesis - Pipelined

I have not successfully implemented the Pipelined processor with the AXI interface. I do, however, have additional metrics on the different hardware implementations.

Design            Clockrate (MHz)  Memory (kb)  Utilization  Power (W)
Logic Gates       N/A              N/A          4 %          0.001
Logic Gates AXI   100              0.02         1 %          0.006
Single Cycle      5                0.19         14 %         0.001
Single Cycle AXI  5                1            22 %         0.147
Pipelined         68.98            0.19         24 %         0.019
Pipelined BRAM    71.43            64           7 %          0.041

SLIDE 17

Conclusion

  • Implemented a MIPS processor in SME
  • Extended the accepted instruction set
  • Pipelined the processor
  • Synthesized, placed and routed both processors

SLIDE 18

Future Work

  • Add an AXI interface to the Pipelined processor
  • See how many cores can be fitted onto one FPGA
  • Make a superscalar processor
  • Run a minimal operating system
