CS 356 Unit 0 Class Introduction Basic Hardware Organization 0.2 - PowerPoint PPT Presentation

0.1 CS 356 Unit 0 Class Introduction Basic Hardware Organization

0.2 What is This Course About? • Introduction to Computer Systems – a.k.a. Computer Organization or Architecture • Filling in the "systems" details – How is software generated (compilers, libraries) and executed (OS, etc.) – How does computer hardware work and how does it execute the software I write? • Lays a foundation for future CS courses – CS 350 (Operating Systems), ITP/CS 439 (Compilers), CS 353/EE 450 (Networks), EE 457 (Computer Architecture)

0.3 Today's Digital Environment • Applications, Networks • C++ / Java / Python, Algorithms • Assembly / Machine Code, OS / Libraries • Processor / Memory / GPU / FPGAs • Digital Logic • Transistors / Circuits • Voltage / Currents [Figure: layered stack of the levels above, with the label "Our Focus in CS 356"]

0.4 Why is System Knowledge Important? • Increase productivity – Debugging – Build/compilation • High-level language abstractions break down at certain points • Improve performance – Take advantage of hardware features – Avoid pitfalls presented by the hardware • Basis of understanding security and exploits

0.5 What Will You Learn • Binary representation systems • Assembly • Processor organization • Memory subsystems (caching, virtual memory) • Compiler optimization and linking

0.6 Administration + Syllabus • Course Website: usc-cs356.github.io (Install Course VM) • Textbook: Computer Systems: A Programmer’s Perspective, Bryant and O’Hallaron, 2015 • Grading: – 30 points for assignments (5 assignments, equally weighted) – 40 points for midterms (25 for best MT, 15 for worst) – 30 points for final • Piazza • Expectations for getting help – Not allowed to search online! We know some code is available online (those caught using or even referencing online code will be submitted to SJACS to be assigned an F) – Acknowledge TA/CP help with comments in your code – Don’t discuss solutions with other students

0.7 ABSTRACTIONS & REALITY

0.8 Abstraction vs. Reality • Abstraction is good until reality intervenes – Bugs can result – It is important to understand underlying HW implementations – Sometimes abstractions don't provide the control or performance you need

0.9 Reality 1 • ints are not integers and floats aren't reals • Is x^2 >= 0? – Floats: Yes – Ints: Not always • 40,000*40,000 = 1,600,000,000 • 50,000*50,000 = -1,794,967,296 • Is (x+y)+z = x+(y+z)? – Ints: Yes – Floats: Not always • (1e20 + -1e20) + 3.14 = 3.14 • 1e20 + (-1e20 + 3.14) = around 0
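The two cases on this slide can be reproduced in a short sketch. Python integers never overflow, so the helper `wrap_int32` below is a hypothetical function added here to mimic 32-bit two's-complement `int` arithmetic; it is not part of the course material. The floating-point half uses Python's native 64-bit doubles.

```python
def wrap_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement int.

    Hypothetical helper: it imitates what a 32-bit int would hold after
    the multiplication, since Python's own ints are unbounded.
    """
    value &= 0xFFFFFFFF                      # keep only the low 32 bits
    if value >= 0x80000000:                  # top bit set means negative
        return value - 0x100000000
    return value


# Integer case: x*x is not always >= 0 once the result wraps.
small = wrap_int32(40_000 * 40_000)          # fits below 2**31 - 1
large = wrap_int32(50_000 * 50_000)          # 2,500,000,000 does not fit

# Floating-point case: addition is not associative.
left = (1e20 + -1e20) + 3.14                 # the big terms cancel first
right = 1e20 + (-1e20 + 3.14)                # 3.14 is absorbed by -1e20

print(small, large)
print(left, right)
```

The 3.14 disappears in the second expression because it is far smaller than the spacing between adjacent doubles near 1e20, so `-1e20 + 3.14` rounds back to `-1e20`.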

0.10 Reality 1: Examples

0.11 Reality 1: Examples

0.12 Reality 2 • Knowing some assembly is critical • You'll probably never write much (any?) code in assembly as compilers are often better than even humans at optimizing code • But knowing assembly is critical when – Tracking down some bugs – Taking advantage of certain HW features that a compiler may not be able to use – Implementing system software (OS/compilers/libraries) – Understanding security and vulnerabilities

0.13 Reality 2: Example

0.14 Reality 3 • Memory matters! – Memory is not infinite – Memory can impact performance more than computation for many applications – Source of many bugs both for single-threaded and especially parallel programs – Source of many security vulnerabilities

0.15 Reality 4 • There's more to performance than asymptotic complexity – Constant factors matter! – Even operation counts do not predict performance • How long an instruction takes to execute is not deterministic … it depends on what other instructions have been executed before it – Understanding how to optimize for the processor organization and memory can lead to up to an order of magnitude performance increase

0.16 COMPUTER ORGANIZATION AND ARCHITECTURE: Drivers and Trends

0.17 Computer Components • Processor – Executes the program and performs all the operations • Main Memory – Stores data and program (instructions) – Different forms: • RAM = read and write but volatile (loses values when power is off) • ROM = read-only but non-volatile (maintains values when power is off) – Significantly slower than the processor • Input / Output Devices – Generate and consume data from the system – MUCH, MUCH slower than the processor [Figure: recipe analogy ("Combine 2c. Flour", "Mix in 3 eggs") for instructions and data. The processor (arithmetic + logic + control circuitry) reads instructions and operates on data; memory (RAM) holds the software program (instructions) and data (operands); input devices, output devices, and a disk drive exchange data with the system.]

0.18 Architecture Issues • Fundamentally, computer architecture is all about the different ways of answering the question: “What do we do with the ever-increasing number of transistors available to us?” • The goal of a computer architect is to take the increasing transistor budget of a chip (i.e., Moore’s Law) and produce an equivalent increase in computational ability

0.19 Moore’s Law, Computer Architecture & Real-Estate Planning • Moore’s Law = Number of transistors able to be fabricated on a chip grows exponentially with time • Computer architects decide, “What should we do with all of this capability?” • Similarly real-estate developers ask, “How do we make best use of the land area given to us?” USC University Park Development Master Plan http://re.usc.edu/docs/University%20Park%20Development%20Project.pdf
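As a rough illustration of what "grows exponentially with time" means, the sketch below projects a transistor count under a fixed doubling period. The two-year doubling period and the starting count are assumptions chosen for the example; the slide does not state either, and the function name `projected_transistors` is hypothetical.

```python
def projected_transistors(start_count: float, years: float,
                          doubling_period_years: float = 2.0) -> float:
    """Exponential growth: the count doubles once per doubling period.

    doubling_period_years = 2.0 is an assumed value, not from the slides.
    """
    return start_count * 2 ** (years / doubling_period_years)


# Assumed starting point of one million transistors.
for years in (0, 2, 10, 20):
    print(years, projected_transistors(1_000_000, years))
```

Under these assumptions ten years gives five doublings, a 32x increase, which is why architects keep revisiting the question of how to spend the extra transistors.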

0.20 Transistor Physics • Cross-section of transistors on an IC • Moore’s Law is founded on our ability to keep shrinking transistor sizes – Gate/channel width shrinks – Gate oxide shrinks • Transistor feature size is referred to as the implementation “technology node”

0.21 Technology Nodes

0.22 Growth of Transistors on Chip

0.23 Implications of Moore’s Law • What should we do with all these transistors? – Put additional simple cores on a chip – Use transistors to make cores execute instructions faster – Use transistors for more on-chip cache memory • Cache is an on-chip memory used to store data the processor is likely to need • Faster than main memory (RAM), which is on a separate chip and much larger (thus slower)

0.24 Memory Wall Problem • Processor performance is increasing much faster than memory performance – Processor: 55%/year – Memory: 7%/year – The widening difference is the processor-memory performance gap (Source: Hennessy and Patterson, Computer Architecture: A Quantitative Approach (2003))
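The two growth rates compound, so the gap widens quickly. The sketch below works through that arithmetic using the 55%/year and 7%/year figures from the slide; the function name `performance_gap` and the assumption that both rates stay constant over the period are illustrative simplifications.

```python
PROCESSOR_GROWTH = 0.55   # per year, from the slide
MEMORY_GROWTH = 0.07      # per year, from the slide


def performance_gap(years: int) -> float:
    """Ratio of processor speedup to memory speedup after `years`,
    assuming both rates hold constant (a simplification)."""
    return ((1 + PROCESSOR_GROWTH) / (1 + MEMORY_GROWTH)) ** years


for years in (1, 5, 10):
    print(years, round(performance_gap(years), 1))
```

With these rates the gap grows by about 45% each year, which compounds to roughly a 40x difference after a decade.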

0.25 Cache Example • Small, fast, on-chip memory to store copies of recently-used data • When the processor attempts to access data it will check the cache first – If the cache has the desired data, it can supply it quickly – If the cache does not have the desired data, it must go to the main memory (RAM) to access it [Figure: two diagrams of processor, cache, system bus, and RAM, one labeled "Cache has desired data" and the other "Cache does not have desired data"]
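The hit-or-miss check described above can be sketched as a tiny simulator. Everything concrete here is an assumption made for illustration: a direct-mapped cache with 64 lines of 64 bytes, a 256x256 array of 4-byte elements stored row by row, and the function name `count_hits`. Real caches are larger and usually set-associative.

```python
def count_hits(addresses, num_lines=64, block_size=64):
    """Simulate a direct-mapped cache; return (hits, misses).

    num_lines and block_size are assumed values, not from the slides.
    """
    lines = [None] * num_lines           # block number held by each line
    hits = misses = 0
    for addr in addresses:
        block = addr // block_size       # memory block containing addr
        index = block % num_lines        # the one line this block may use
        if lines[index] == block:
            hits += 1                    # cache has the desired data
        else:
            misses += 1                  # go to RAM, replace the old block
            lines[index] = block
    return hits, misses


ROWS, COLS, ELEM = 256, 256, 4           # assumed array shape and element size

# Byte address of element (r, c) in a row-major layout.
row_major = [(r * COLS + c) * ELEM for r in range(ROWS) for c in range(COLS)]
col_major = [(r * COLS + c) * ELEM for c in range(COLS) for r in range(ROWS)]

row_hits, row_misses = count_hits(row_major)
col_hits, col_misses = count_hits(col_major)
print("row-major:   ", row_hits, "hits,", row_misses, "misses")
print("column-major:", col_hits, "hits,", col_misses, "misses")
```

Both traversals touch the same elements the same number of times, yet the row-major order reuses each fetched block for 16 consecutive elements while the column-major order evicts every block before it is reused. This is one way operation counts alone can fail to predict performance.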

0.26 Reality 3 & 4 Example

0.27 Pentium 4 [Figure: die photo with labeled regions: L2 Cache, L1 Data, L1 Instruc.]

0.28 Increase in Clock Frequency

0.29 Intel Nehalem Quad Core

0.30 Progression to Parallel Systems • If power begins to limit clock frequency, how can we continue to achieve more and more operations per second? – By running several processor cores in parallel at lower frequencies – Two cores @ 2 GHz vs. 1 core @ 4 GHz yield the same theoretical maximum ops./sec. • For various applications like graphics and computationally intensive workloads this is taken to an extreme by GPUs
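The two-core comparison above is simple arithmetic, sketched below. The function name `peak_ops_per_sec` and the default of one operation per core per cycle are assumptions for the example; real cores can issue several operations per cycle, and real programs rarely reach the theoretical maximum.

```python
def peak_ops_per_sec(cores: int, clock_hz: int, ops_per_cycle: int = 1) -> int:
    """Theoretical maximum throughput, assuming every core completes
    ops_per_cycle operations on every clock cycle (an idealization)."""
    return cores * clock_hz * ops_per_cycle


GHZ = 1_000_000_000

dual_core = peak_ops_per_sec(cores=2, clock_hz=2 * GHZ)
single_core = peak_ops_per_sec(cores=1, clock_hz=4 * GHZ)
print(dual_core, single_core, dual_core == single_core)
```

The equality only holds if the workload can be split evenly across both cores, which is why the slide calls it a theoretical maximum.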

0.31 GPU Chip Layout • 2560 Small Cores • Upwards of 7.2 billion transistors • 8.2 TFLOPS • 320 Gbytes/sec Source: NVIDIA Photo: http://www.theregister.co.uk/2010/01/19/nvidia_gf100/

0.32 Intel Haswell Quad Core

0.33 8th Gen Coffee-Lake Hex-Core Intel Processor https://www.researchgate.net/figure/Die-Map-of-a-Hexa-Core-Coffee-Lake-Processor_fig6_332543387
