CMPSC 311 - Introduction to Systems Programming Module: Systems Programming



SLIDE 1

CMPSC 311 - Introduction to Systems Programming

CMPSC 311- Introduction to Systems Programming Module: Systems Programming

Professor Patrick McDaniel Fall 2015

SLIDE 2

WARNING

  • Warning: for those not in the class, there is an unusually large number of people trying to get in (4x more than any other year). I cannot make any promises that everyone will get into the class due to others dropping.
SLIDE 3

Software Systems

  • A platform, application, or other structure that:
  • is composed of multiple modules …
  • the system’s architecture defines the interfaces of and relationships between the modules
  • usually is complex …
  • in terms of its implementation, performance, management
  • hopefully meets some requirements …
  • Performance
  • Security
  • Fault tolerance
  • Data consistency

These are properties of computer systems that people design, optimize, and test for. Some refer to them as "ilities" (pronounced "ill-it-tees").

SLIDE 4

100,000 Foot View of Systems

[Diagram, listed from the bottom layer up:]

  • hardware: CPU, memory, storage, network, GPU, clock, audio, radio, peripherals
  • HW/SW interface (x86 + devices)
  • operating system
  • OS / app interface (system calls)
  • C standard library (glibc), under the C application
  • C++ STL / boost / standard library, under the C++ application
  • JRE, under the Java application

SLIDE 5

A layered view

[Diagram: "your system" sits between its clients above and the layers below.]

  • understands and relies on layers below
  • provides service to layers above

SLIDE 6

A layered view

[Diagram: "your system" sits between its clients above and the layers below.]

  • constrained by performance, footprint, behavior of the layers below
  • more useful, portable, reliable abstractions

SLIDE 7

Example system

  • Operating system
  • a software layer that abstracts away the messy details of hardware into a useful, portable, powerful interface
  • modules:
  • file system, virtual memory system, network stack, protection system, scheduling subsystem, ...
  • each of these is a major system of its own!
  • design and implementation has many engineering tradeoffs
  • e.g., speed vs. (portability, maintainability, simplicity)

SLIDE 8

Another example system

  • Web server framework
  • a software layer that abstracts away the messy details of OSs, HTTP protocols, database and storage systems to simplify building powerful, scalable Web services
  • modules:
  • HTTP server, HTML template system, database storage, user authentication system, ...
  • also has many, many tradeoffs
  • programmer convenience vs. performance
  • simplicity vs. extensibility

Note: we will focus on the OS system this semester.

SLIDE 9

Systems and Layers

  • Layers are collections of system functions that provide some abstraction to the service/app above
  • Hides the specifics of the implementation of the layer
  • Hides the specifics of the layers below
  • Abstraction may be provided by software or hardware
  • Examples from the OS layer
  • processes
  • files
  • virtual memory

SLIDE 10

A real world abstraction ...

  • What does this thing do?

What about this?

SLIDE 11

What makes a good abstraction?

  • An abstraction should match the “cognitive model” of users of the system, interface, or resources

“Cognitive science is concerned with understanding
 the processes that the brain uses to accomplish
 complex tasks including perceiving, learning,
 remembering, thinking, predicting, inference,
 problem solving, decision making, planning, and
 moving around the environment.”

- Jerome Busemeyer
SLIDE 12

How humans think (vastly simplified)

  • Our brains receive sensor data to perceive and categorize the environment (pattern matching and classification)
  • Things that are easy to assimilate (learn) are close to things we already know
  • The simpler and more generic the object, the easier (most of the time) it is to classify
  • See human factors, physiology, and psychology classes ...

SLIDE 13

A good abstraction …

  • Why do computers have a desktop with files, folders, trash bins, panels, switches …
  • … and why not streets with buildings, rooms, alleys, dump-trucks, levers, …

SLIDE 14

In class exercise …

  • In groups of three to four:
  • Desktops are outlawed by the computer police
  • You are to come up with alternate abstractions for:
  • Data objects (i.e., replacements for files and directories)
  • Be ready to explain in 30 seconds your “environment”, what the metaphors are, and why they are appropriate given users’ cognitive models …

  • Bonus for being innovative and timely
SLIDE 15

Computer system abstractions

  • What are the basic abstractions that we use (and don’t even think about) for modern computer systems?

SLIDE 16

Processes

  • Processes are independent programs running concurrently within the operating system
  • The execution abstraction a process provides is that it has sole control of the entire computer (a single stack and execution context)

Tip: if you want to see what processes are running on your UNIX system, use the “ps” command, e.g., “ps -ax”.

SLIDE 17

Files

  • A file is an abstraction of a read-only, write-only, or read/write data object.
  • A data file is a collection of data on some media
  • often on secondary storage (hard disk)
  • Files can be much more: in UNIX nearly everything is a file
  • Devices like printers, USB buses, disks, etc.
  • System services like sources of randomness (RNG)
  • Terminal (user input/output devices)

Tip: /dev directory of UNIX contains real and virtual devices, e.g., “ls /dev”.

SLIDE 18

Virtual Memory

  • The virtual memory abstraction provides control over an imaginary address space
  • Has a virtual address space which is unique to the process
  • The OS/hardware work together to map the address on to ...
  • Physical memory addresses
  • Addresses on disk (swap space)
  • Advantages
  • Avoids interference from other processes
  • swap allows more memory use than physically available

SLIDE 19

Byte-Oriented Memory Organization

  • Programs Refer to Virtual Addresses
  • Conceptually very large array of bytes
  • Actually implemented with hierarchy of different memory types
  • System provides address space private to particular “process”
  • Program being executed
  • Program can clobber its own data, but not that of others
  • Compiler + Run-Time System Control Allocation
  • Where different program objects should be stored
  • All allocation within single virtual address space

SLIDE 20

Machine Words

  • Machine Has “Word Size”
  • Nominal size of integer-valued data, including addresses
  • Many traditional machines use 32-bit (4-byte) words
  • Limits addresses to 4GB
  • Becoming too small for memory-intensive applications
  • Recent systems use 64-bit (8-byte) words
  • Potential address space ≈ 1.8 × 10^19 bytes
  • x86-64 machines support 48-bit addresses: 256 Terabytes
  • Machines support multiple data formats
  • Fractions or multiples of word size
  • Always integral number of bytes

SLIDE 21

Word-Oriented Memory Organization

  • Addresses Specify Byte Locations
  • Address of first byte in word
  • Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

Byte addr.   32-bit word addr.   64-bit word addr.
0000-0003    0000                0000
0004-0007    0004                0000
0008-0011    0008                0008
0012-0015    0012                0008

SLIDE 22

APIs

  • An Application Programming Interface (API) is a set of methods (functions) that is used to manipulate an abstraction
  • This is the “library” of calls to use the abstraction
  • Some are easy (e.g., printf)
  • Some are more complex (e.g., network sockets)
  • Mastering systems programming is the art and science of mastering the APIs, including:
  • How are they used?
  • What are the performance characteristics?
  • What are the resource uses?
  • What are their limitations?
SLIDE 23

Example: Java Input/Output

  • Set of abstractions that allow for different kinds of input and output
  • Streams …
  • Tokenizers …
  • Readers …
  • Writers …
  • Professional Java programmers know when and how to use these to achieve their goals

SLIDE 24

Systems programming

  • The programming skills, engineering discipline, and knowledge you need to build a system using these abstractions:
  • programming: C (the abstraction for ISA)
  • discipline: testing, debugging, performance analysis
  • knowledge: long list of interesting topics
  • concurrency, OS interfaces and semantics, techniques for consistent data management, algorithms, distributed systems, ...
  • most important: deep understanding of the “layer below”

SLIDE 25

Programming languages

  • Assembly language / machine code
  • (approximately) directly executed by hardware
  • tied to a specific machine architecture, not portable
  • no notion of structure, few programmer conveniences
  • possible to write really, really fast code
  • Compilation of a programming language results in executable code to be run by hardware.
  • gcc (C compiler) produces target machine executable code (ISA)
  • javac (Java compiler) produces Java Virtual Machine executable code

SLIDE 26

Programming languages

  • Structured but low-level languages (C, C++)
  • hides some architectural details, is kind of portable, has a few useful abstractions, like types, arrays, procedures, objects
  • permits (forces?) programmer to handle low-level details like memory management, locks, threads
  • low-level enough to be fast and to give the programmer control over resources
  • double-edged sword: low-level enough to be complex, error-prone
  • shield: engineering discipline

SLIDE 27

Programming languages

  • High-level languages (Python, Ruby, JavaScript, ...)
  • focus on productivity and usability over performance
  • powerful abstractions shield you from low-level gritty details (bounded arrays, garbage collection, rich libraries, ...)
  • usually interpreted, translated, or compiled via an intermediate representation
  • slower (by 1.2x-10x), less control

SLIDE 28

Discipline

  • Cultivate good habits, encourage clean code
  • coding style conventions
  • unit testing, code coverage testing, regression testing
  • documentation (code comments!, design docs)
  • code reviews
  • Will take you a lifetime to learn
  • but oh-so-important, especially for systems code
  • avoid write-once, read-never code

SLIDE 29

Knowledge

  • Tools
  • gcc, gdb, g++, objdump, nm, gcov/lcov, valgrind, IDEs, race detectors, model checkers, ...
  • Lower-level systems
  • UNIX system call API, relational databases, map/reduce, Django, ...
  • Systems foundations
  • transactions, two-phase commit, consensus, handles, virtualization, cache coherence, applied crypto, ...