


  1. CS252 Graduate Computer Architecture, Lecture 1: Introduction
January 22, 2002
Prof. David E. Culler, Computer Science 252, Spring 2002
CS252/Culler 1/22/02 Lec 1.1

Outline
• Why Take CS252?
• Fundamental Abstractions & Concepts
• Instruction Set Architecture & Organization
• Administrivia
• Pipelined Instruction Processing
• Performance
• The Memory Abstraction
• Summary

Why take CS252?
• To design the next great instruction set?...well...
  – instruction set architecture has largely converged
  – especially in the desktop / server / laptop space
  – dictated by powerful market forces
• Tremendous organizational innovation relative to established ISA abstractions
• Many new instruction sets or equivalent
  – embedded space, controllers, specialized devices, ...
• Design, analysis, implementation concepts vital to all aspects of EE & CS
  – systems, PL, theory, circuit design, VLSI, comm.
• Equip you with an intellectual toolbox for dealing with a host of systems design challenges

Example Hot Developments ca. 2002
• Manipulating the instruction set abstraction
  – itanium: translate ISA64 -> micro-op sequences
  – transmeta: continuous dynamic translation of IA32
  – tensilica: synthesize the ISA from the application
  – reconfigurable HW
• Virtualization
  – vmware: emulate full virtual machine
  – JIT: compile to abstract virtual machine, dynamically compile to host
• Parallelism
  – wide issue, dynamic instruction scheduling, EPIC
  – multithreading (SMT)
  – chip multiprocessors
• Communication
  – network processors, network interfaces
• Exotic explorations
  – nanotechnology, quantum computing

Forces on Computer Architecture
[Figure: Computer Architecture at the center, acted on by Technology, Programming Languages, Applications, Operating Systems, and History; annotated (A = F / M)]

Amazing Underlying Technology Change
[Figure only]

  2. A take on Moore's Law
[Figure: transistors per chip (log scale, 1,000 to 100,000,000) versus year (1970 to 2005). Labeled points: i4004, i8008, i8080, i8086, i80286, i80386, R2000, R3000, Pentium, R10000. Eras marked: bit-level parallelism, instruction-level, thread-level (?)]

Technology Trends
• Clock Rate: ~30% per year
• Transistor Density: ~35%
• Chip Area: ~15%
• Transistors per chip: ~55%
• Total Performance Capability: ~100%
• by the time you graduate...
  – 3x clock rate (3-4 GHz)
  – 10x transistor count (1 Billion transistors)
  – 30x raw capability
• plus 16x DRAM density, 32x disk density

Measurement and Evaluation
Architecture is an iterative process
-- searching the space of possible designs
-- at all levels of computer systems
[Figure: cycle of Design, Analysis, and Creativity; Cost / Performance Analysis sorts the output into Good Ideas, Mediocre Ideas, and Bad Ideas]

Performance Trends
[Figure: performance (log scale, 0.1 to 100) versus year (1965 to 1995) for Supercomputers, Mainframes, Minicomputers, and Microprocessors]

What is "Computer Architecture"?
[Figure: layers of abstraction, top to bottom: Application, Operating System, Compiler, Firmware, Instruction Set Architecture, Instr. Set Proc., I/O system, Datapath & Control, Digital Design, Circuit Design, Layout]
• Coordination of many levels of abstraction
• Under a rapidly changing set of forces
• Design, Measurement, and Evaluation

Coping with CS 252
• Students with too varied background?
  – In past, CS grad students took written prelim exams on undergraduate material in hardware, software, and theory
  – 1st 5 weeks reviewed background, helped 252, 262, 270
  – Prelims were dropped => some unprepared for CS 252?
• In class exam on Tues Jan. 29 (30 mins)
  – Doesn't affect grade, only admission into class
  – 2 grades: Admitted or audit/take CS 152 1st
  – Improve your experience if recapture common background
• Review: Chapters 1, CS 152 home page, maybe "Computer Organization and Design (COD) 2/e"
  – Chapters 1 to 8 of COD if never took prerequisite
  – If took a class, be sure COD Chapters 2, 6, 7 are familiar
  – Copies in Bechtel Library on 2-hour reserve
• FAST review this week of basic concepts
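The growth rates on the Technology Trends slide appear to compose multiplicatively. The sketch below checks that reading. It assumes that transistors per chip grow as density times area and that total capability grows as clock rate times transistors per chip; the slide does not state either relation. The helper names `compose` and `years_to_reach` are hypothetical.

```python
import math

# Annual growth rates quoted on the Technology Trends slide.
CLOCK_RATE = 0.30
TRANSISTOR_DENSITY = 0.35
CHIP_AREA = 0.15


def compose(*rates):
    """Combine annual growth rates multiplicatively (an assumption, not from the slide)."""
    factor = 1.0
    for rate in rates:
        factor *= 1.0 + rate
    return factor - 1.0


def years_to_reach(multiple, annual_rate):
    """Years of compound growth needed to reach the given multiple."""
    return math.log(multiple) / math.log(1.0 + annual_rate)


# Density x area gives about 55% per year, matching "Transistors per chip".
transistors_per_chip = compose(TRANSISTOR_DENSITY, CHIP_AREA)

# Clock x transistors gives about 100% per year, matching "Total Performance Capability".
total_capability = compose(CLOCK_RATE, transistors_per_chip)

if __name__ == "__main__":
    print(f"transistors per chip: {transistors_per_chip:.1%} per year")
    print(f"total capability:     {total_capability:.1%} per year")
    print(f"3x clock rate:        {years_to_reach(3, CLOCK_RATE):.1f} years")
    print(f"10x transistor count: {years_to_reach(10, 0.55):.1f} years")
    print(f"30x raw capability:   {years_to_reach(30, 1.00):.1f} years")
```

Under these assumptions the three "by the time you graduate" multiples each take roughly four to five and a half years of compound growth, which is consistent with the slide's framing.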

  3. The Instruction Set: a Critical Interface
[Figure: the instruction set as the interface between software (above) and hardware (below)]

Review of Fundamental Concepts
• Instruction Set Architecture
• Machine Organization
• Instruction Execution Cycle
• Pipelining
• Memory
• Bus (Peripheral Hierarchy)
• Performance Iron Triangle

Instruction Set Architecture
"... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation."
– Amdahl, Blaauw, and Brooks, 1964
[Figure: SOFTWARE above the instruction set]
-- Organization of Programmable Storage
-- Data Types & Data Structures: Encodings & Representations
-- Instruction Formats
-- Instruction (or Operation Code) Set
-- Modes of Addressing and Accessing Data Items and Instructions
-- Exceptional Conditions

Organization
[Figure: Logic Designer's View; ISA Level above FUs & Interconnect]
• Capabilities & Performance Characteristics of Principal Functional Units
  – (e.g., Registers, ALU, Shifters, Logic Units, ...)
• Ways in which these components are interconnected
• Information flows between components
• Logic and means by which such information flow is controlled
• Choreography of FUs to realize the ISA
• Register Transfer Level (RTL) Description

Review: MIPS R3000 (core)
[Figure: register file r0 (always 0), r1, ..., r31, plus PC, lo, hi]
Programmable storage
  ° 2^32 x bytes
  ° 31 x 32-bit GPRs (R0=0)
  ° 32 x 32-bit FP regs (paired DP)
  ° HI, LO, PC
Data types? Format? Addressing Modes?
Arithmetic logical
  Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU,
  AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUI
  SLL, SRL, SRA, SLLV, SRLV, SRAV
Memory Access
  LB, LBU, LH, LHU, LW, LWL, LWR
  SB, SH, SW, SWL, SWR
Control
  J, JAL, JR, JALR
  BEq, BNE, BLEZ, BGTZ, BLTZ, BGEZ, BLTZAL, BGEZAL
32-bit instructions on word boundary

Review: Basic ISA Classes
Accumulator:
    1 address      add A           acc ← acc + mem[A]
    1+x address    addx A          acc ← acc + mem[A + x]
Stack:
    0 address      add             tos ← tos + next
General Purpose Register:
    2 address      add A B         EA(A) ← EA(A) + EA(B)
    3 address      add A B C       EA(A) ← EA(B) + EA(C)
Load/Store:
    3 address      add Ra Rb Rc    Ra ← Rb + Rc
                   load Ra Rb      Ra ← mem[Rb]
                   store Ra Rb     mem[Rb] ← Ra
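To make the ISA classes above concrete, the following sketch runs the same statement, C = A + B, in each style and counts the instructions used. It is an illustration under stated assumptions, not the lecture's own example: the `lda`, `sta`, `push`, and `pop` mnemonics are hypothetical additions needed to complete each program, and memory is addressed by symbolic name rather than through a register as in the slide's `load Ra Rb`.

```python
def run_accumulator(mem):
    """1-address style: every operation goes through the accumulator."""
    acc = mem["A"]            # lda A  (hypothetical): acc <- mem[A]
    acc = acc + mem["B"]      # add B: acc <- acc + mem[B]
    mem["C"] = acc            # sta C  (hypothetical): mem[C] <- acc
    return 3                  # instructions executed


def run_stack(mem):
    """0-address style: add names no operands, it uses the top two stack entries."""
    stack = []
    stack.append(mem["A"])    # push A (hypothetical)
    stack.append(mem["B"])    # push B (hypothetical)
    tos = stack.pop()
    nxt = stack.pop()
    stack.append(tos + nxt)   # add: tos <- tos + next
    mem["C"] = stack.pop()    # pop C  (hypothetical)
    return 4


def run_three_address(mem):
    """3-address style with memory operands: EA(C) <- EA(A) + EA(B)."""
    mem["C"] = mem["A"] + mem["B"]   # add C A B
    return 1


def run_load_store(mem):
    """Load/store style: arithmetic touches registers only."""
    reg = [0] * 4
    reg[1] = mem["A"]         # load R1, A
    reg[2] = mem["B"]         # load R2, B
    reg[3] = reg[1] + reg[2]  # add R3 R1 R2: Ra <- Rb + Rc
    mem["C"] = reg[3]         # store R3, C
    return 4


def demo():
    """Run C = A + B in every style; return {style: (value of C, instruction count)}."""
    styles = {
        "accumulator": run_accumulator,
        "stack": run_stack,
        "three_address": run_three_address,
        "load_store": run_load_store,
    }
    results = {}
    for name, run in styles.items():
        mem = {"A": 5, "B": 7, "C": 0}
        count = run(mem)
        results[name] = (mem["C"], count)
    return results


if __name__ == "__main__":
    for name, (value, count) in demo().items():
        print(f"{name:14s} C = {value:2d} in {count} instructions")
```

All four styles compute the same result. They differ in how many instructions are needed and in how many operands each instruction must encode, which is the trade-off the instruction-format slides take up next.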

  4. Instruction Formats
[Figure: Variable, Fixed, and Hybrid instruction layouts]
• Addressing modes
  – each operand requires address specifier => variable format
• code size => variable length instructions
• performance => fixed length instructions
  – simple decoding, predictable operations
• With load/store instruction arch, only one memory address and few addressing modes
• => simple format, address mode given by opcode

MIPS Addressing Modes & Formats
• Simple addressing modes
• All instructions 32 bits wide
    Register (direct)    op rs rt rd       operand in register
    Immediate            op rs rt immed    operand is immed
    Base+index           op rs rt immed    Memory[register + immed]
    PC-relative          op rs rt immed    Memory[PC + immed]
• Register Indirect?

Cray-1: the original RISC
Register-Register
    bits 15-9: Op    bits 8-6: Rd    bits 5-3: Rs1    bits 2-0: R2
Load, Store and Branch
    bits 15-9: Op    bits 8-6: Rd    bits 5-3: Rs1    bits 2-0 plus a following 15-0: Immediate

VAX-11: the canonical CISC
Variable format, 2 and 3 address instruction
    Byte 0: OpCode    Byte 1: A/M    ...    Byte n: A/M    ...    Byte m: A/M
• Rich set of orthogonal address modes
  – immediate, offset, indexed, autoinc/dec, indirect, indirect+offset
  – applied to any operand
• Simple and complex instructions
  – synchronization instructions
  – data structure operations (queues)
  – polynomial evaluation

Review: Load/Store Architectures
[Figure: MEM connected to reg]
° 3 address GPR
° Register to register arithmetic
° Load and store with simple addressing modes (reg + immediate)
° Simple conditionals
    compare ops + branch z
    compare&branch
    condition code + branch on condition
° Simple fixed-format encoding
    op r r r
    op r r immed
    op offset
° Substantial increase in instructions
° Decrease in data BW (due to many registers)
° Even more significant decrease in CPI (pipelining)
° Cycle time, Real estate, Design time, Design complexity

MIPS R3000 ISA (Summary)
• Instruction Categories
  – Load/Store
  – Computational
  – Jump and Branch
  – Floating Point
    » coprocessor
  – Memory Management
  – Special
Registers: R0 - R31, PC, HI, LO
3 Instruction Formats: all 32 bits wide
    OP rs rt rd sa funct
    OP rs rt immediate
    OP jump target
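The three 32-bit MIPS formats above can be packed and unpacked with a few shifts and masks. The sketch below assumes the standard MIPS field widths (6-bit op, 5-bit rs/rt/rd/sa, 6-bit funct, 16-bit immediate, 26-bit jump target), which the slides do not spell out. The function names are hypothetical, and the effective-address helper models only the register + immediate mode.

```python
def encode_r(rs, rt, rd, sa, funct, op=0):
    """R-type: OP rs rt rd sa funct (6/5/5/5/5/6 bits)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def encode_i(op, rs, rt, immediate):
    """I-type: OP rs rt immediate (6/5/5/16 bits); the immediate is truncated to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, target):
    """J-type: OP jump target (6/26 bits)."""
    return (op << 26) | (target & 0x03FFFFFF)


def decode_fields(word):
    """Split a 32-bit word into every field; which fields matter depends on op."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "sa": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
        "immediate": word & 0xFFFF,
        "target": word & 0x03FFFFFF,
    }


def sign_extend16(value):
    """Interpret a 16-bit field as a signed two's-complement number."""
    return value - 0x10000 if value & 0x8000 else value


def effective_address(regs, word):
    """Register + immediate addressing: address = regs[rs] + sign-extended immediate."""
    fields = decode_fields(word)
    return (regs[fields["rs"]] + sign_extend16(fields["immediate"])) & 0xFFFFFFFF


if __name__ == "__main__":
    add = encode_r(rs=1, rt=2, rd=3, sa=0, funct=0x20)    # add  $3, $1, $2
    addi = encode_i(op=0x08, rs=1, rt=2, immediate=-1)    # addi $2, $1, -1
    lw = encode_i(op=0x23, rs=1, rt=2, immediate=8)       # lw   $2, 8($1)
    print(f"add  = 0x{add:08X}")
    print(f"addi = 0x{addi:08X}")
    print(f"lw   = 0x{lw:08X}")
    regs = [0] * 32
    regs[1] = 0x1000
    print(f"lw address = 0x{effective_address(regs, lw):X}")
```

Because every format places op in the same six bits, and rs and rt in the same positions, a decoder can extract all fields in parallel before it knows the instruction type. That is the "simple decoding" benefit the fixed-length format slide refers to.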
