
Motivation: Organize main memory to facilitate byte-addressability

EE 457 Unit 7b: Main Memory Organization. Motivation: organize main memory to facilitate byte-addressability while maintaining efficient fetching of the words in a cache block. __________________________ helps us achieve this.


  1. 7b.1 EE 457 Unit 7b: Main Memory Organization

     7b.2 Motivation
     • Organize main memory to
       – Facilitate byte-addressability while maintaining…
       – Efficient fetching of the words in a cache block
     • __________________________ helps us achieve this

     7b.3 Interleaving Analogy
     • Consider a journal consisting of 1000 pages (000-999) bound in
       – 10 volumes (0-9) of
       – 100 pages each (00-99)
     Method I (consecutive pages in a volume): Volume 0 holds pages 000, 001, …, 099; Volume 1 holds 100, 101, …, 199; …; Volume 9 holds 900, …, 999.
     Method II (consecutive pages in consecutive volumes, i.e. low-order interleaving): Volume 0 holds pages 000, 010, …, 990; Volume 1 holds 001, 011, …, 991; …; Volume 9 holds 009, 019, …, 999.

     7b.4 Interleaving Analogy
     • Example: Say article 73 runs from page 730-739
       – In Method I: Article 73 is _______________________
       – In Method II: The _____ page of _______ volume form article 73 as shown below
     • Which do you prefer?
       – If reading the article you may say Method I
       – If you have to make a copy of the article and you have 10 photocopy machines with 10 friends to help you might say ____________
     • Back to the scenario of reading the article, given those same 10 friends they could _____________________ for you so that you can still read in a continuous manner
     Page 730 is page 73 of volume 0
     Page 731 is page 73 of volume 1
     …
     Page 739 is page 73 of volume 9
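
The journal analogy can be sketched in a few lines of Python. This is only an illustration: the function names `method_1` and `method_2` are made up here, and the constants come straight from the slide (10 volumes of 100 pages).

```python
PAGES_PER_VOLUME = 100  # from the slide: 100 pages per volume
VOLUMES = 10            # from the slide: 10 volumes


def method_1(page):
    """Consecutive pages in a volume: the high digit picks the volume."""
    volume, page_in_volume = divmod(page, PAGES_PER_VOLUME)
    return volume, page_in_volume


def method_2(page):
    """Consecutive pages in consecutive volumes: the low digit picks the volume."""
    page_in_volume, volume = divmod(page, VOLUMES)
    return volume, page_in_volume


if __name__ == "__main__":
    # Article 73 runs from page 730 to page 739.
    for page in range(730, 740):
        print(page, "Method I:", method_1(page), "Method II:", method_2(page))
```

Under Method II every page of the article lands in a different volume, which is why ten helpers can work on it at the same time.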

  2. 7b.5 Byte Addressability
     1. Intel 8085: 16-bit addr., 8-bit data, byte addressable processor.
        Memory space: 2^16 = 64KB, A15-A0, D7-D0
        Figure: a single 64K x 8 memory addressed by A15-A0.
     2. Intel 8086: 20-bit addr., 16-bit data, byte addressable, little-endian proc.
        Memory space: 2^20 = 1MB, A19-A0
        [A19-A1, BHE (BE1), A0 (BE0)], D15-D0
        Figure: two ½ MB byte-wide memories addressed by A19-A1, enabled by BHE=0 and A0=0. Byte 41, Byte 40 = Word 40 on D[15:8], D[7:0].
     3. Intel 80386: 32-bit addr., 32-bit data, byte addressable, little-endian proc.
        Memory space: 2^32 = 4GB, A31-A0
        [A31-A2, BE3, BE2, BE1, BE0], D31-D0
        Figure: four 1 GB byte-wide memories addressed by A31-A2, enabled by BE3, BE2, BE1, BE0. Byte 43, Byte 42, Byte 41, Byte 40 = Word 40 on D[31:24] … D[7:0].

     7b.6 Byte Addressability
     4. Intel 80386: 32-bit addr., 32-bit data, byte addressable, big-endian proc.
        Memory space: 2^32 = 4GB, A31-A0
        [A31-A2, BE3, BE2, BE1, BE0], D31-D0
        Figure: four 1 GB byte-wide memories addressed by A31-A2, enabled by BE_, BE_, BE_, BE_. Byte 40, Byte 41, Byte 42, Byte 43 = Word 40 on D[31:24] … D[7:0].
     5. Little-Endian system, ______________ system: 32-bit addr., 32-bit data, byte addressable (Narrow, 32-bit data bus b/w mem. and cache)
        Memory space: 2^32 = 4GB, A31-A0
        [A31-A2, BE3, BE2, BE1, BE0], D31-D0
        Figure: two groups of four ½ GB byte-wide memories addressed by A31-A_ (select: _____), each group enabled by BE3-BE0 and connected through an XCVR to the narrow bus D[31:0].
     6. Same as 5 above, but __________________
        Figure: four groups of four ¼ GB byte-wide memories addressed by A31-A_, selected by A3,A2 = __, __, __, __, each group connected through an XCVR to D[31:0].

     7b.7 2-Way L.O.I.
     • System address bus uses
       – A1:A0 and size info to generate /BE3../BE0 (Byte Enables)
         • In a 32-bit data bus, we need 2 address bits to produce the 4 BE's
         • In a 64-bit data bus, we would need ___ address bits to produce ___ BE's
       – Lower order bits to select a "bank"
         • Only 1 address bit, A2, to select one of 2 banks
       – Upper bits connect to each memory chip
         • Each memory chip is just a collection of ½ GB requiring 29 address bits…we can connect appropriate 29 bits
     Figure: Bank 1 (A2=1) and Bank 0 (A2=0), each four ½ GB byte-wide chips enabled by BE3-BE0. System A31-A3 connects to chip A28-A0 (shift of 3 bits in address connections). One XCVR per bank drives the narrow bus D[31:0].

     7b.8 4-Way L.O.I.
     • System address bus uses
       – A1:A0 and size info to generate /BEi (Byte Enables)
       – Lower order bits to select a "bank"
       – Upper bits connect to each memory chip
     Figure: Bank 3, Bank 2, Bank 1, Bank 0 selected by A3,A2 = 11, 10, 01, 00, each built from ¼ GB chips. System A31-A4 connects to chip A27-0 (shift of 4 bits in address connections). One XCVR per bank drives D[31:0].
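
The address split described for 2-way and 4-way L.O.I. can be sketched as below. This is a minimal illustration, not the slides' own logic: the function name `decode` is hypothetical, the byte enables are returned as an active-high bit mask (the slides use active-low /BE signals), and a little-endian lane assignment is assumed, so the byte at offset 0 drives BE0.

```python
def decode(addr, size, ways):
    """Split a 32-bit byte address for a `ways`-way low-order-interleaved memory.

    Returns (bank, chip_addr, byte_enables).
    Assumptions: 32-bit data bus, little-endian lanes, active-high enable mask.
    """
    if ways not in (2, 4):
        raise ValueError("this sketch only covers 2-way and 4-way interleaving")
    offset = addr & 0x3                      # A1:A0 = byte offset within the word
    if size not in (1, 2, 4) or offset + size > 4:
        raise ValueError("access must fit inside one 32-bit word")
    byte_enables = ((1 << size) - 1) << offset   # A1:A0 + size -> BE3..BE0
    bank_bits = ways.bit_length() - 1        # 1 bit (A2) or 2 bits (A3,A2)
    bank = (addr >> 2) & (ways - 1)          # low-order bits select the bank
    chip_addr = addr >> (2 + bank_bits)      # upper bits go to every chip
    return bank, chip_addr, byte_enables


if __name__ == "__main__":
    for a in (0x40, 0x44, 0x48, 0x4C):
        print(hex(a), "2-way:", decode(a, 4, 2), "4-way:", decode(a, 4, 4))
```

With two banks the chips see A31-A3 (a shift of 3 bits, 29 chip address bits); with four banks they see A31-A4 (a shift of 4 bits, 28 chip address bits), matching the address connections in the figures.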

  3. 7b.9 Organization Options
     a.) One-word-wide memory organization
     b.) Wide Memory Organization
     c.) Interleaved Memory Organization
     Figure: in each option the CPU connects to the Cache, and the Cache connects over a Bus to Memory. In (b) a Multiplexer sits between the CPU and the cache. In (c) the memory is split into Mem. Bank 0, Bank 1, Bank 2, Bank 3.

     7b.10 Organization Comparison
     • Assume following latencies
       – Send address to MM: 1 clock
       – MM (DRAM) Access Time: 15 clocks
       – Transfer time for one word: 1 clock
     • Find time to access a cache line of 4-words
       a. Narrow Memory ____________________________ (assume mem. controller will auto-increment address)
       b. Wide Memory
       c. Interleaved Memory

     7b.11 Example
     • Consider a set-associative mapping and physical organization of main memory, cache data RAMs, and cache tag RAMs.
     • Specs:
       – 32-bit physical address, byte-addressable system
       – Cache Size = 64KB
       – Block Size = 4 words (16 bytes)
       – Set Size = 4 blocks (64 bytes)
     # of MM Blocks = _____________
     # of Cache Blocks = _____________
     # of Sets = _____________________________
     # of Groups = ___________________________
     Address fields: TAG | SET | WORD | BYTE, where the BYTE field (A1 – A0) generates /BE3 - /BE0. The figure also carries the label "Member".

     7b.12 Tag RAM Example
     Figure: four Tag RAMs (holding tags & valid bits for Way 0, Way 1, Way 2, Way 3). Each has an address input A driven by the Set field (______), a data input DI and data output DO carrying the Tag (________) + V-bit, and a comparator (=) producing Hit/Miss.
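
The two exercises on these slides can be worked numerically. The sketch below is one reading of them, not the lecture's official solution. It assumes the usual textbook timing model (address sent once, a narrow memory repeats the 15-clock access for each word, interleaved banks overlap their accesses and then transfer one word per clock) and a 4-byte word. All function names are hypothetical.

```python
SEND_ADDR = 1        # clocks to send the address to MM (from the slide)
ACCESS = 15          # clocks for one MM (DRAM) access (from the slide)
TRANSFER = 1         # clocks to transfer one word (from the slide)
WORDS_PER_LINE = 4   # cache line of 4 words (from the slide)


def narrow_time():
    # Assumption: address sent once (controller auto-increments),
    # then each word pays a full access plus a transfer.
    return SEND_ADDR + WORDS_PER_LINE * (ACCESS + TRANSFER)


def wide_time():
    # Assumption: all 4 words are accessed and transferred together.
    return SEND_ADDR + ACCESS + TRANSFER


def interleaved_time():
    # Assumption: 4 banks access in parallel, words go out one per clock.
    return SEND_ADDR + ACCESS + WORDS_PER_LINE * TRANSFER


def log2_exact(n):
    """Return log2(n) for a power of two, otherwise raise."""
    if n <= 0 or n & (n - 1):
        raise ValueError("expected a power of two")
    return n.bit_length() - 1


def cache_geometry(addr_bits=32, cache_bytes=64 * 1024,
                   block_bytes=16, ways=4, word_bytes=4):
    """Block counts and address field widths for the 7b.11 specs."""
    cache_blocks = cache_bytes // block_bytes
    sets = cache_blocks // ways
    byte_bits = log2_exact(word_bytes)
    word_bits = log2_exact(block_bytes // word_bytes)
    set_bits = log2_exact(sets)
    return {
        "mm_blocks": 2 ** addr_bits // block_bytes,
        "cache_blocks": cache_blocks,
        "sets": sets,
        "tag_bits": addr_bits - set_bits - word_bits - byte_bits,
        "set_bits": set_bits,
        "word_bits": word_bits,
        "byte_bits": byte_bits,
    }


if __name__ == "__main__":
    print("narrow:", narrow_time(), "wide:", wide_time(),
          "interleaved:", interleaved_time())
    print(cache_geometry())
```

Under these assumptions the set and word fields together take 12 bits, which agrees with the A13-A2 address shown on the cache data RAMs in the follow-on figure.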

  4. 7b.13 MM & Data RAM Example
     Figure (main memory organization): four banks selected by A3,A2 = 11, 10, 01, 00. Each bank is four 256MB byte-wide chips addressed by A31-A4, enabled by /BE3 - /BE0, and connected through a 32-bit bidirectional XCVR.
     Figure (cache data RAMs): Way 0, Way 1, Way 2, Way 3, each built from four 4 KB byte-wide RAMs on D[31:24], D[23:16], D[15:8], D[7:0], addressed by A13-A2 (Set + Word) and enabled by /BE3 - /BE0. The 80386 + Buffers drive A31-A2.

     7b.14 DRAM TECHNOLOGIES

     7b.15 Memory Chip Organization
     • Memory technologies share the same layout but differ in their cell implementation
       – ___________
       – ___________
     • Memories require the row bits be sent first and are used to select one row (aka "____ line")
       – Uses a hardware component known as a decoder
     • All cells in the selected row access their data bits and output them on their respective "___________"
     • The column address is sent next and used to select the desired 8 bit lines (i.e. 1 byte)
       – Uses a hardware component known as a mux
     Figure: a cell array with 1K bit lines (labeled BL[0], BL[1024]) and word lines WL[0], WL[1], …, WL[1023]. A 10-bit row address feeds the row address decoder. Amplifiers & column mux, driven by the column address, select Data[7:0] in/out. Note in the figure: SRAM and DRAM differ in how each cell is made, but the organization is roughly the same.

     7b.16 Memory Module Organization
     • Memory module is designed to always access data in chunks the size of the data bus (64-bit data bus = 64-bit accesses)
     • Parallelizes memory access by accessing the byte at the same location in all (8) memory chips at once
     • Only the desired portion will be forwarded to the registers
     • Note the difference between system processor address and local memory chip addresses
     • Each chip on the module reads 1 byte and outputs it to form a collectively larger word on the data bus (i.e. 8-bytes = 64-bits)
     Figure: a processor with a 64-bit data bus reads 1 DWord at address 0x000c: A[31:0] = 0000…1100. A[31:3] (0000..01) goes to every chip as its local address, while A[2:0] + SIZE select byte lanes 7 … 0 on D[63:56] … D[7:0]. The module returns bytes 50 98 a7 fb c6 13 8A 57 and the processor core / registers receive 0x5098a7fb. The figure contrasts the byte address from the individual chip perspective with the byte address from the system/processor perspective (XXX 000).
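
The difference between the system address and the local chip address can be sketched as follows. This is an illustration under stated assumptions, not the module's actual wiring: the function names are hypothetical, the 8 lanes and 10 row bits come from the slides, and the 7 column bits (1K bit lines / 8 bits per byte) and the row-above-column ordering of the chip address are assumptions.

```python
BUS_BYTES = 8   # 64-bit data bus -> 8 byte lanes, one per chip (from the slide)
ROW_BITS = 10   # 1K word lines, 10-bit row address (from the slide)
COL_BITS = 7    # assumption: 1K bit lines / 8 bit lines per byte = 128 columns


def module_access(addr, size):
    """Map a system byte address to (chip_addr, byte lanes used)."""
    first_lane = addr & (BUS_BYTES - 1)      # A[2:0]
    if size < 1 or first_lane + size > BUS_BYTES:
        raise ValueError("access must fit inside one 64-bit bus word")
    chip_addr = addr >> 3                    # A[31:3], same for all 8 chips
    lanes = list(range(first_lane, first_lane + size))
    return chip_addr, lanes


def chip_row_col(chip_addr):
    """Split a local chip address into (row, column).

    Assumption: the row is the upper field and the column the lower field.
    """
    col = chip_addr & ((1 << COL_BITS) - 1)
    row = (chip_addr >> COL_BITS) & ((1 << ROW_BITS) - 1)
    return row, col


if __name__ == "__main__":
    # The slide's example: 1 DWord (4 bytes) at system address 0x000c.
    chip_addr, lanes = module_access(0x000C, 4)
    print("chip address:", chip_addr, "lanes:", lanes)
    print("row, col:", chip_row_col(chip_addr))
```

For the slide's DWord at 0x000c, every chip sees local address 1 and the upper four byte lanes carry the requested data. The other four bytes are read too but are not forwarded to the registers.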
