Untethered lowRISC, Memory Mapped IO and TileLink/AXI
Wei Song, 27/07/2015
Time Line
- Nov. 2014
– Rocket-Chip release from Berkeley.
- Apr. 2015
– First lowRISC release. Initial tagged memory support.
· Added tags in L1 D$, L2.
· Added a tag cache.
· Added 2 instructions to load/store tag.
· A tutorial about Rocket-chip.
- Now
– Memory Mapped IO.
- Oct. 2015 (expected)
– Untethered lowRISC release.
· Untethered SoC.
· Support Kintex KC705.
· Support MMIO.
· Support SD, UART, DDRAM.
· Open simulation environment.
Rocket-Chip Release (Berkeley)
[Block diagram. Labels: three Rocket Tiles (each a Rocket Core with I$ and D$), TileLink, L2 & Coherence Manager banks, Arbiter, MemIO Converter, Memory Controller, Host Interface, ARM, UART, SD, Ethernet.]
lowRISC Release (tagged memory)
[Block diagram. Labels: three Rocket Tiles (each a Rocket Core with I$ and D$), TileLink, L2 & Coherence Manager banks, Arbiter, Memory Controller, Tag Cache, Allocator, Tracker & Converter, Data Array, MetaData Array, Host Interface, ARM, UART, SD, Ethernet.]
- Tag in L1 D$, L2 $
- Tag Cache
- LTAG/STAG instructions
Latest Rocket-Chip (Berkeley)
[Block diagram. Labels: three Rocket Tiles (each a Rocket Core with I$ and D$), L2 & Coherence Manager banks, L2 Bus, Arbiter, TileLink/AXI, AXI Bus, AXI/MemIO, Memory Controller, Host Interface, ARM, UART, SD, Ethernet. Legend: Cached TileLink, Uncached TileLink, AXI, MemIO.]
- Multi-beat TileLink
- Standardize TileLink transactions
- Possible coherence support of L3
- Code refactoring
- AXI/AXI interface (NASTI)
Untethered lowRISC SoC (First Version)
[Block diagram. Labels: three Rocket Tiles (each a Rocket Core with I$ and D$), L2 & Coherence Manager banks, L2 Cache Bus, L2 IO Bus, Arbiter, TileLink/AXI, TileLink/AXI-Lite, AXI Bus, AXI-Lite, Tag Cache, Memory Controller, On-FPGA Boot Ram, UART, SD, Ethernet, DMA (coherent, incoherent), Boot Minion. Legend: Cached TileLink, Uncached TileLink, AXI.]
Current Status
[Block diagram of the untethered SoC. Labels: three Rocket Tiles (each a Rocket Core with I$ and D$), L2 & Coherence Manager banks, L2 Cache Bus, L2 IO Bus, Arbiter, TileLink/AXI, TileLink/AXI-Lite, AXI Bus, AXI-Lite, Tag Cache, Memory Controller, On-FPGA Boot Ram, UART, SD, Ethernet, DMA (coherent, incoherent), Boot Minion. Legend: Cached TileLink, Uncached TileLink, AXI.]
Memory Mapped IO
- Target
– IO load/store (B/HW/W/DW)
– In-order uncached load/store
– Side effect
- None for all writes in units of bytes
- None for all reads in units of words (32-bit AXI-Lite)
– No change in the current L2 coherence manager
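One reading of the side-effect rule above is that a narrow write enables only the byte lanes it touches, while a read always fetches a full 32-bit word and the narrower value is selected afterwards. The following is a minimal Python model of that reading, not the lowRISC hardware. The function names `write_strobe` and `extract_read_data` are hypothetical, little-endian byte lanes are assumed, and double-word accesses (which would need two bus words) are left out.

```python
BUS_BYTES = 4  # 32-bit AXI-Lite data bus, as stated on the slide


def _check(addr: int, size: int) -> int:
    """Validate the access and return its byte offset inside the bus word."""
    offset = addr % BUS_BYTES
    if size not in (1, 2, 4) or offset % size != 0:
        raise ValueError("unsupported or misaligned access")
    return offset


def write_strobe(addr: int, size: int) -> int:
    """Byte-lane mask for a `size`-byte write at `addr` (hypothetical helper).

    Only the written lanes are enabled, so a narrow write leaves the
    neighbouring bytes of a device register untouched.
    """
    offset = _check(addr, size)
    return ((1 << size) - 1) << offset


def extract_read_data(addr: int, size: int, word: int) -> int:
    """Select `size` bytes from a full 32-bit word returned by the bus.

    Assumes little-endian byte lanes (an assumption, not from the slides).
    """
    offset = _check(addr, size)
    return (word >> (8 * offset)) & ((1 << (8 * size)) - 1)
```

For example, a one-byte store to an address ending in `0x2` would enable only lane 2, while a one-byte load from the same address would still read the whole word and then pick out that lane.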
Untethered lowRISC SoC (First Version)
[Block diagram. Labels: three Rocket Tiles (each a Rocket Core with I$ and D$), L2 & Coherence Manager banks, L2 Cache Bus, L2 IO Bus, Arbiter, TileLink/AXI, TileLink/AXI-Lite, AXI/AXI-Lite, AXI Bus, AXI-Lite, Tag Cache, Memory Controller, On-FPGA Boot Ram, UART, SD, Ethernet, DMA (coherent, incoherent), Boot Minion. Legend: Cached TileLink, Uncached TileLink, AXI.]
L1 Data Cache
[Pipeline block diagram, Stage 1 to Stage 4. Modules and their sources:
- data [DataArray, rocket/nbdcache.scala]
- meta [MetadataArray, uncore/cache.scala]
- mshrs [MSHRFile, rocket/nbdcache.scala]
- mshr [MSHR, rocket/nbdcache.scala]
- wb [WriteBack, rocket/nbdcache.scala]
- prober [ProbeUnit, rocket/nbdcache.scala]
- dtlb [TLB, rocket/tlb.scala]
- code [DecodeLogic, rocket/decode.scala]
- amoalu [AMOALU, rocket/nbdcache.scala]
Signals shown: cpu.req, cpu.resp.valid, cpu.resp.bits.data, cpu.ptw, dtlb.ptw, s1_req, s1_req.addr, s1_addr, s1_data, s1_tag_eq_way, s1_recycled, s2_req, s2_data (uncorrected and corrected), s2_data_correctable, s2_tag_eq_way, s2_hit, s2_recycle, s3_req, mshrs.request, mshrs.replay, mshrs.meta_write, mshrs.wb_req, wb.meta/data_read, prober.meta/data_read, prober.meta_write, prober.release, mem.req, mem.grant, mem.probe, mem.finish, mem.release.]
L1 Data Cache (simplified)
[Simplified pipeline diagram, Stage 1 to Stage 4. Modules: data, meta, mshrs, mshr, dtlb, amoalu, arbiters. Signals shown: cpu.req, cpu.resp, s1_req, s1_req.addr, s1_addr, s1_data, s1_tag_eq_way, s2_req, s2_data, s2_hit, mshrs.request, mshrs.replay, mshrs.meta_write, mem.req, mem.grant.]
L1 Data Cache with IO Handler
[Simplified pipeline diagram with the IO path added, Stage 1 to Stage 4. Modules: data, meta, mshrs, mshr, dtlb, amoalu, arbiters, plus ioaddr and iomshr. Signals shown: cpu.req, cpu.resp, s1_req, s1_req.addr, s1_addr, s1_data, s1_tag_eq_way, s2_req, s2_req.addr, s2_data, s2_hit, mshrs.request, mshrs.replay, mshrs.meta_write, mem.req, mem.grant, io.req, io.grant, iomshr.replay, io_data, s1_io_data, s2_io_data, s2_io_replay.]
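The diagram labels suggest that an address check (`ioaddr`) steers IO requests to a dedicated `iomshr` instead of the normal cache lookup. A minimal Python sketch of that routing decision follows. It is an illustration only: the address window (`IO_BASE`, `IO_SIZE`), the `Request` type and the `route` function are all hypothetical, since the slides do not give the real address map or interfaces.

```python
from dataclasses import dataclass

# Hypothetical IO window; the real lowRISC address map is not in the slides.
IO_BASE = 0x8000_0000
IO_SIZE = 0x1000_0000


@dataclass
class Request:
    addr: int
    is_store: bool


def is_io_addr(addr: int) -> bool:
    """Model of the `ioaddr` check: does the address fall in the IO window?"""
    return IO_BASE <= addr < IO_BASE + IO_SIZE


def route(req: Request) -> str:
    """Name the path a request takes in this simplified model."""
    if is_io_addr(req.addr):
        return "iomshr"  # uncached, in-order IO path
    return "cache"       # normal tag lookup; a miss goes on to mshrs
```

Keeping the decision to a pure address comparison matches the stated goal of leaving the L2 coherence manager unchanged, because IO traffic never enters the cached path in this model.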
TileLink Channels
- Manager/Client
– Manager: coherence manager or next-level cache/device
– Client: upper-level cache
- 5 Channels
– Acquire: [C -> M]
- Read, uncached write (write-through, IO), permission update
– Grant: [M -> C]
- Ack to Acquire (with data when read)
– Finish: [C -> M]
- Finish a transaction
– Probe: [M -> C]
- Coherence probe (snoop, invalidate)
– Release: [C -> M]
- Write-back (replace or invalidate)
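The channel list above can be summarised as a small table of directions plus the message order of a simple read. The Python sketch below is only an illustration of the slide, not the Chisel definitions in the uncore library; `Channel`, `DIRECTION`, `sender` and `read_transaction` are hypothetical names.

```python
from enum import Enum, auto


class Channel(Enum):
    ACQUIRE = auto()
    GRANT = auto()
    FINISH = auto()
    PROBE = auto()
    RELEASE = auto()


# (sender, receiver) for each channel, as listed on the slide.
DIRECTION = {
    Channel.ACQUIRE: ("client", "manager"),
    Channel.GRANT: ("manager", "client"),
    Channel.FINISH: ("client", "manager"),
    Channel.PROBE: ("manager", "client"),
    Channel.RELEASE: ("client", "manager"),
}


def sender(channel: Channel) -> str:
    """Return which side drives the given channel."""
    return DIRECTION[channel][0]


def read_transaction() -> list:
    """Channel order for a simple read: Acquire, Grant (with data), Finish."""
    return [Channel.ACQUIRE, Channel.GRANT, Channel.FINISH]
```

Probe and Release do not appear in the read sequence because they belong to manager-initiated coherence actions and write-backs.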
Untethered lowRISC SoC (First Version)
[Block diagram. Labels: three Rocket Tiles (each a Rocket Core with I$ and D$), L2 & Coherence Manager banks, L2 Cache Bus, L2 IO Bus, Arbiter, TileLink/AXI, TileLink/AXI-Lite, AXI/AXI-Lite, AXI Bus, AXI-Lite, Tag Cache, Memory Controller, On-FPGA Boot Ram, UART, SD, Ethernet, DMA (coherent, incoherent), Boot Minion. Legend: Cached TileLink, Uncached TileLink, AXI.]
TileLink Crossbar
[Diagram: two L1 $ clients and two L2 Bank managers connected through a TileLink Crossbar, each port with its own Acquire, Grant, Finish, Probe and Release channels.]
Shared TileLink Crossbar
[Diagram: two L1 $ clients and two L2 Bank managers connected through a Shared TileLink Crossbar, each port with Acquire, Grant, Finish, Probe and Release channels.]
Use a SuperChannel to store all types of TileLink channels.
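The SuperChannel idea can be pictured as a tagged wrapper: every message carries its channel type, so one shared path can transport all five kinds and the receiver sorts them back out. The Python sketch below is a guess at the concept rather than the actual design; the `SuperChannel` fields and the `shared_crossbar` function are hypothetical, and arbitration and flow control are not modelled.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

CHANNEL_TYPES = ("Acquire", "Grant", "Finish", "Probe", "Release")


@dataclass
class SuperChannel:
    ctype: str    # which TileLink channel this message belongs to
    dest: int     # hypothetical destination port index
    payload: Any  # channel-specific message body


def shared_crossbar(messages: List[SuperChannel]) -> Dict[int, Dict[str, list]]:
    """Carry all messages over one shared path, then split them again
    by destination port and channel type."""
    out: Dict[int, Dict[str, list]] = {}
    for m in messages:
        if m.ctype not in CHANNEL_TYPES:
            raise ValueError(f"unknown channel type: {m.ctype}")
        out.setdefault(m.dest, {}).setdefault(m.ctype, []).append(m.payload)
    return out
```

The trade-off sketched here is fewer physical crossbars at the cost of the five channel types sharing one path.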
Current Status of TileLink/AXI
- TileLink/AXI (Berkeley, Rocket-chip)
– only a whole cache line
- TileLink/AXI-Lite (lowRISC)
– 1,2,4,8 byte write; 4,8 byte read
- AHB/APB (Berkeley, Z-Scale)
- Still needed:
– AXI/AXI-Lite compatible, auto width SerDes switch
- The AXI-Node from PULP
- Maybe in Chisel, for its parameterization capability
– AXI/Wishbone, TileLink/Wishbone
Remaining Issues
- Interrupt controller
- Open-source, license-compatible IPs
– UART (Flexpret, BSD)
– SD host controller
– Ethernet controller (Xilinx IP for now)
– Memory controller (difficult to get)
- Open Source EDA tools
– Current environment:
- VCS (DRAMSim, Front-end server, DirectC)
- Vivado+SDK (SDK not available for Kintex)
– Target environment:
- Verilator (SystemVerilog 2009, SystemC, VPI, DPI)
- Vivado only
After the Untethered SoC
- Implementing the hierarchical tag cache (hardware)
- Debug interface
- Integrating minions (PULP)
- Tag support in Rocket cores (Lucas)