Processes and Threads, Prof. Van Renesse and Sirer, CS 4410



SLIDE 1

Processes and Threads

  • Prof. Van Renesse and Sirer

CS 4410 Cornell University

SLIDE 2

Fun Starts Here!

What is involved in “starting a program” or “running a program”?

  • both of which are misnomers…

How can I run multiple processes on one computer?

It’s all about the “design and efficient implementation of abstractions”.

SLIDE 3

What is a Process?

A process is an abstraction of a computer

SLIDE 4

Abstractions

A file is an abstract disk. A socket is an abstract network endpoint. A window is an abstract screen. …

Abstractions hide implementation details but expose (most of) the power.

SLIDE 5

Process Abstraction

[Diagram: a process comprises an environment, an address space, registers / CPU state, and control]

SLIDE 6

Process Interface

  • CreateProcess(initial state) → processID
  • SendSignal(processID, signal)
  • GetStatus(processID) → runningStatus
  • Terminate(processID)
  • WaitFor(processID) → completionStatus
  • ListProcesses() → [ pid1, pid2, … ]

SLIDE 7

Kernel implements processes!

[Diagram: P1, P2, P3 run in user mode above the OS kernel, which runs in supervisor mode]

The kernel is only part of the operating system.

SLIDE 8

Emulation…

One option is to simulate multiple instances of the same (or other) hardware.

Useful for debugging, emulation of ancient hardware, etc.

But too inefficient for modern-day daily use.

SLIDE 9

CPU runs each process directly

But somehow each process has its own:

  • Registers
  • Memory
  • I/O resources
  • “thread of control”

SLIDE 10

(Simplified) RAM Layout

[Diagram: RAM from 0x0, kernel at 0x80000000; P1, P2, P3 regions delimited by base/bound registers, set in supervisor mode]

SLIDE 11

Typical Address Space Layout

(similar for kernel and processes)

[Diagram: address space layout with code, data, and stack segments]

SLIDE 12

Process Control Block

  • Process Identifier
  • Process arguments (for identification)
  • Process status (runnable, waiting, zombie, …)
  • User Identifier (for security)
    • beware: superuser ≠ supervisor
  • Registers
  • Interrupt Vector
  • Pending Interrupts
  • Base / Bound
  • Scheduling / accounting info
  • I/O resources
SLIDE 13

Abstract life of a process

[State diagram: New →(admitted)→ Runnable →(dispatch)→ Running; Running →(interrupt / descheduling)→ Runnable; Running ↔ Waiting; Running → Zombie]

SLIDE 14

createProcess(initial state)

Allocate memory for address space
Initialize address space

  • program vs fork
  • program ≠ process

Allocate ProcessID
Allocate process control block
Put process control block on the run queue

SLIDE 15

How does a process terminate?

External:

n Terminate(ProcessID) n SendSignal(signal) with no handler set up n Using up quota

Internal:

n Exit(processStatus) n Executing an illegal instruction n Accessing illegal memory addresses

SLIDE 16

For now: one process running at a time (single core machine)

Kernel runs
Switch to process 1
Trap to kernel
Switch to another (or same) process
Trap to kernel
etc.

[Timeline of context switches: P1, K, P2, K, P2, K, K, P1]

SLIDE 17

Processor Status Word

  • Supervisor Bit or Level
  • Interrupt Priority Level or Enabled Bit
  • Condition Codes (result of compare ops)
  • …

The supervisor can update any part, but user code can only update the condition codes.

Has to be saved and restored like any other register!

SLIDE 18

Time-Sharing

Illusion: multiple processes run at the same time.

Reality: only one process runs at a time.

  • For no more than a few tens of milliseconds

But this can happen in parallel with another process waiting for I/O!

Why time-share?

SLIDE 19

Kernel Operation (conceptual)

Initialize devices
Initialize “First Process”
Forever:

  • while device interrupts pending
    • handle device interrupts
  • while system calls pending
    • handle system calls
  • if run queue is non-empty
    • select a runnable process and switch to it
  • otherwise
    • wait for a device interrupt

SLIDE 20

Invariants

Supervisor mode à PC points to kernel code Equivalently: PC points to user code à user mode User code runs with interrupts enabled For simplicity: Kernel code runs with interrupts

disabled (for now)

SLIDE 21

Dispatch: kernel à process

Software:

n CurProc := &PCB of current process n Set user base/bound register n Restore process registers n Execute ReturnFromInterrupt instruction

Hardware:

w Sets user mode w Enables interrupts w Restores program counter

SLIDE 22

Trap process à kernel

Hardware:

n Disables interrupts n Sets supervisor mode n Saves user PC and SP on kernel stack

w why not on process stack?

n Sets kernel stack pointer n Sets PC to kernel-configured position

Software:

n Save process registers in PCB of CurProc n Back to kernel main loop

SLIDE 23

Causes for traps

  • Clock interrupt
  • Device interrupt
  • System call
  • Privileged instruction
  • Divide by zero
  • Bad memory access
  • …

SLIDE 24

System calls

How does a process specify what system call to invoke and what parameters to use?

How does the kernel protect itself and other processes?

How does the kernel return a result to the process?

How does the kernel prevent accidentally returning privacy-sensitive data?

SLIDE 25

Class Projects

  • Implement sleep(delay) system call
  • Implement a debugger
  • Implement SendSignal(pid, signal)

SLIDE 26

How Much To Abstract

  • Unix and Windows provide processes that look like idealized machines, with nice-looking file abstractions, network abstractions, graphical windows, etc.
  • Xen, KVM, etc. provide processes that look just like real hardware
    • virtualization
  • Requires different kinds of things from kernels
    • Unix/Windows: implement files, network protocols, window management
    • Xen/KVM/…: emulate hardware

SLIDE 27

Virtual Machine Abstraction

[Diagram: a Virtual Machine Monitor kernel hosting a Unix kernel (running P1, P2, P3) and a Windows NT kernel (running P4, P5)]

SLIDE 28

Things to emulate

  • Supervisor mode
  • Base/Bound registers
  • Device registers
  • …

Hardware can help:

  • Multi-level supervisor
  • Multi-level base/bound
  • …

[Diagram: emulated physical memory map: flash/ROM, blocks of RAM, device registers, screen bitmap]

SLIDE 29

Processes Under Unix/Linux

  • Fork() system call to create a new process
    • Old process is called the parent, new process the child
  • int fork() clones the invoking process:
    • Allocates a new PCB and process ID
    • Allocates a new address space
    • Copies the parent’s address space into the child’s
    • In the parent, fork() returns the PID of the child
    • In the child, fork() returns zero
  • int fork() returns TWICE!
SLIDE 30

Example

int main(int argc, char **argv) {
    int parentPid = getpid();
    int pid = fork();
    if (pid == 0) {
        printf("The child of %d is %d\n", parentPid, getpid());
        exit(0);
    } else {
        printf("My child is %d\n", pid);
        exit(0);
    }
}

What does this program print?

SLIDE 31

Bizarre But Real

$ cc a.c
$ ./a.out
The child of 23873 is 23874
My child is 23874

[Diagram: parent and child both return from fork() through the kernel’s retsys, with register v0 = 23874 in the parent and v0 = 0 in the child]

SLIDE 32

Exec()

  • Fork() gets us a new address space
  • int exec(char *programName) completes the picture
    • throws away the contents of the calling address space
    • replaces it with the program in the file named by programName
    • starts executing at header.startPC
    • PCB otherwise remains the same (same PID)
  • Pros: clean, simple
  • Con: duplicate operations
SLIDE 33

What is a program?

  • A program is a file containing executable code (machine instructions) and data (information manipulated by these instructions) that together describe a computation
  • Resides on disk
  • Obtained through compilation and linking
SLIDE 34

Preparing a Program

[Diagram: source files → compiler/assembler → object files → linker (plus static libraries such as libc) → program]

An executable file in a standard format, such as ELF on Linux or Microsoft PE on Windows, containing: header, code, initialized data, BSS, symbol table, line numbers, external references.

SLIDE 35

Running a program

  • Every OS provides a “loader” that is capable of converting a given program into an executing instance, a process
    • A program in execution is called a process
  • The loader:
    • reads and interprets the executable file
    • allocates memory for the new process and sets the process’s memory to contain code & data from the executable
    • pushes “argc”, “argv”, “envp” on the stack
    • sets the CPU registers properly & jumps to the entry point

SLIDE 36

Process != Program

[Diagram: executable file (header, code, initialized data, BSS, symbol table, line numbers, external references) vs. process address space (code, initialized data, BSS, heap, stack, DLLs’ mapped segments)]

Program is passive:

  • Code + data

Process is a running program:

  • stack, registers, program counter

Example: We both run IE:

  • Same program
  • Separate processes
SLIDE 37

Process Termination, part 1

  • Process executes its last statement and calls the exit syscall
    • Process’s resources are deallocated by the operating system
  • Parent may terminate execution of a child process (kill)
    • Child has exceeded allocated resources
    • Task assigned to child is no longer required
    • If parent is exiting
      • Some OSes don’t allow the child to continue if the parent terminates
      • All children terminated: cascading termination

SLIDE 38

Process Termination, part 2

Process first goes into “zombie state”
Parent can wait for zombie children

  • Syscall: wait() → (pid, exit status)

After wait() returns, the PCB of the child is garbage collected.

SLIDE 39

Class Project

Write a simple command line interpreter

SLIDE 40

Multiple Cores

Modern computers often have several if not dozens of cores.

Each core has its own registers, but cores share memory and devices.

Cores can run user processes in parallel.
Cores need to synchronize access to PCBs and devices.

SLIDE 41

Multi-Core Architecture

[Diagram: cores 1–3, RAM, flash/ROM, screen buffer, and disk connected by a bus]

SLIDE 42

Abstraction… multi-threaded process

[Diagram: one process with threads 1–3, each a CPU + registers, sharing the environment and the address space (memory)]

SLIDE 43

Why?

  • Make it simpler and more efficient for a process to take advantage of multicore machines
    • Instead of starting multiple processes, each with its own address space and a single thread running on a single core
  • Not just for CPU parallelism: I/O parallelism can be achieved even if I/O operations are “blocking”
  • Program structuring: for example, servers dealing with concurrent incoming events
  • Might well have more threads than cores!!
SLIDE 44

Processes and Address Spaces

What happens when Apache wants to run multiple concurrent computations?

[Diagram: user address spaces for Emacs, Mail, and Apache, each spanning 0x00000000–0x7fffffff, with the kernel at 0x80000000–0xffffffff]

SLIDE 45

Processes and Address Spaces

Two heavyweight address spaces for two concurrent computations?

[Diagram: Emacs, Mail, and two separate Apache address spaces, each 0x00000000–0x7fffffff, below the kernel at 0x80000000–0xffffffff]

SLIDE 46

Processes and Address Spaces

We can eliminate duplicate address spaces and place concurrent computations in the same address space.

[Diagram: Emacs, Mail, and a single Apache address space holding both concurrent computations, below the kernel at 0x80000000–0xffffffff]

SLIDE 47

Architecture

Process consists of:

  • One address space containing chunks of memory
  • Shared I/O resources
  • Multiple threads
    • Each with its own registers, in particular PC and SP
    • Each has its own stack in the address space
    • Code and data are shared

Other terms for threads:

  • Lightweight Process
  • Thread of Control
  • Task

SLIDE 48

Memory Layout

[Diagram: address space with code, data, and one stack per thread (stacks 1–3); each thread has its own SP and PC]

SLIDE 49


Sharing

  • What’s shared between threads?
    • They all share the same code and data (address space)
    • They all share the same privileges
    • They share almost everything in the process
  • What don’t they share?
    • Each has its own PC, registers, stack pointer, and stack

SLIDE 50

Threads

Lighter weight than processes
Threads need to be mutually trusting

  • Why?

Ideal for programs that want to support concurrent computations where lots of code and data are shared between computations

  • Servers, GUI code, …

SLIDE 51


Separation of Thread and Process concepts

  • Concurrency (multi-threading) is useful for:
    • improving program structure
    • handling concurrent events (e.g., web requests)
    • building parallel programs
  • So, multi-threading is useful even on a uniprocessor
  • To be useful, thread operations have to be fast
SLIDE 52

How to implement?

  • Two extreme solutions:
    • “Kernel threads”:
      • Allocate a separate PCB for each thread
      • Assign each PCB the same base/bound registers
      • Also copy I/O resources, etc.
    • “User threads”:
      • Build a miniature OS in user space
  • User threads are (generally) more efficient
    • Why?
  • Kernel threads simplify system call handling and scheduling
    • Why?

SLIDE 53

User Thread Implementation

User process supports:

  • a “Thread Control Block” table with one entry per thread
  • “context switch” operations that save/restore thread state in the TCB
    • Much like kernel-level context switches
  • a yield() operation by which a thread releases its core and allows another thread to use it
    • Automatic pre-emption not always supported
  • a thread scheduler

SLIDE 54

System calls

With user threads, a process may have multiple system calls outstanding simultaneously (one per thread).

The kernel PCB must support this.

SLIDE 55

Things to Think about

Scheduling:

  • Which runnable process / thread runs when?

Coordination:

  • How do cores / threads synchronize access to shared memory and devices?