Virtual Memory 3 / I/O

SLIDE 1

Virtual Memory 3 / I/O

SLIDE 2

last time

working set, Zipf usage models
LRU page replacement
approximating LRU by sampling accessed bits or marking pages invalid

nit: I said Linux marked pages invalid to test for references; probably not on x86, where it instead periodically scans the referenced bits set by the processor (but marking invalid would work, and is needed on some platforms)

observation: when LRU fails

SLIDE 3

on the paging assignment

“(and pointing to the the original physical page) (pointing to the same physical page) Do those two lines mean the same thing?????”

yes: with copy-on-write, the child uses the same pages as the parent
differences are in how the reference count and read-only status are maintained when the parent page was already copy-on-write from a previous fork

SLIDE 4

anonymous feedback (1)

“hi can u stop changing the assignment description. just get it right the first time because every time you change, it screws with my understanding of what i’m supposed to do and i’m just super confused.”

I could try to add bullets instead of editing bullets if that’s better… (and I did make more serious edits if you started before the assignment wasn’t marked tentative, …)

“also your instructions suck. they don’t make sense.”

okay

SLIDE 5

anonymous feedback (2)

“Super unfair how Grimshaw’s class gets 1 more week than we do on FAT homework because they don’t have this paging assignment. While we struggle on this assignment, they get more time to figure out the next one”

our FAT assignment is due 16 November (checkpoint for ours is due 9 November)
theirs is due 8 November
should get some testing code with our version

SLIDE 6

problems with LRU

question: when does LRU perform poorly?

only reading things once
repeated scans of large amounts of data

both are common access patterns for files
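
A minimal sketch (not from the slides) of the repeated-scan case: exact LRU over a small frame table, fed a scan that is one page larger than the cache. The names `lru_hits` and `scan_hits` and the size limits are made up for illustration.

```c
#include <stddef.h>

#define MAX_FRAMES 64   /* illustrative limit on cache size */
#define MAX_REFS   256  /* illustrative limit on trace length */

/* Count hits for an exact-LRU cache with `frames` frames (1..MAX_FRAMES). */
size_t lru_hits(const int *refs, size_t n, size_t frames) {
    int page[MAX_FRAMES];
    size_t last_use[MAX_FRAMES];
    size_t used = 0, hits = 0;

    for (size_t t = 0; t < n; t++) {
        size_t i, victim = 0;
        for (i = 0; i < used; i++)
            if (page[i] == refs[t]) break;
        if (i < used) {            /* hit: refresh recency */
            hits++;
            last_use[i] = t;
            continue;
        }
        if (used < frames) {       /* miss, but a free frame exists */
            victim = used++;
        } else {                   /* miss: evict the least recently used page */
            for (i = 1; i < used; i++)
                if (last_use[i] < last_use[victim]) victim = i;
        }
        page[victim] = refs[t];
        last_use[victim] = t;
    }
    return hits;
}

/* Hits when pages 0..pages-1 are scanned in order, `rounds` times. */
size_t scan_hits(size_t pages, size_t rounds, size_t frames) {
    int refs[MAX_REFS];
    size_t n = 0;
    for (size_t r = 0; r < rounds; r++)
        for (size_t p = 0; p < pages && n < MAX_REFS; p++)
            refs[n++] = (int)p;
    return lru_hits(refs, n, frames);
}
```

With 4 frames and a 5-page scan, each page is evicted just before it is needed again, so every access misses; with 5 frames, everything after the first pass hits.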

SLIDE 9

CLOCK-Pro: special casing for one-use pages

by default, Linux tries to handle scanning of files

one read of file data, e.g. play a video, load file into memory

basic idea: don’t consider pages active until the second access
single scans of a file won’t “pollute” the cache

without this change: reading large files slows down other programs
recently read part of a large file steals space from active programs

SLIDE 10

CLOCK-Pro: special casing for one-use pages

(diagram: active list and inactive list)

active list: referenced? stays active; not ref’d? moves to inactive list
inactive list: referenced once? ignore (until next scan); referenced twice? move to active list
evict page at bottom of inactive list

inactive list holds either file pages referenced once, or pages referenced multiple times but not recently
“new” file pages start on the inactive list

initial guess: file pages will be used at most once, then can be discarded
once pages become active, any reference keeps them active
count two references for inactive pages: be more reluctant
this is the current Linux algorithm for file pages
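
A rough sketch of the per-page transitions described above, not the Linux implementation. The type and function names are hypothetical, and resetting the count when a page is demoted is an assumption.

```c
#include <stdbool.h>

enum page_list { LIST_INACTIVE, LIST_ACTIVE, LIST_EVICTED };

struct file_page {
    enum page_list list;
    int refs_seen;   /* scans in which this inactive page was found referenced */
};

/* New file pages start on the inactive list with no references counted. */
struct file_page new_file_page(void) {
    struct file_page p = { LIST_INACTIVE, 0 };
    return p;
}

/* One scan step. `referenced`: referenced bit was set since the last scan.
   `at_bottom`: page is at the bottom of the inactive list (next to evict). */
void scan_page(struct file_page *p, bool referenced, bool at_bottom) {
    if (p->list == LIST_ACTIVE) {
        if (!referenced) {             /* not used recently: demote */
            p->list = LIST_INACTIVE;
            p->refs_seen = 0;          /* assumption: must earn two references again */
        }
        return;                        /* any reference keeps it active */
    }
    if (p->list != LIST_INACTIVE)
        return;                        /* already evicted */
    if (referenced) {
        if (++p->refs_seen >= 2)       /* second reference: promote */
            p->list = LIST_ACTIVE;
        return;                        /* first reference: ignore until next scan */
    }
    if (at_bottom)
        p->list = LIST_EVICTED;        /* oldest inactive page, still unused */
}
```

A page touched once by a file scan never reaches the active list, so it is evicted without displacing pages other programs are using.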

SLIDE 17

default Linux page replacement summary

Figure: https://linux-mm.org/PageReplacementDesign

SLIDE 18

default Linux page replacement summary

identify inactive pages (guess: not going to be accessed soon)

file pages which haven’t been accessed more than once, or any pages which haven’t been accessed recently

keep some minimum threshold of inactive pages
add to inactive list in background

detecting references: scan referenced bits (I thought Linux marked pages as invalid, but that was wrong: not on x86)
detect enough references: move to active

oldest inactive page still not used → evict that one
otherwise: give it a second chance

SLIDE 19

being proactive

previous assumption: load on demand
why is something loaded?

page fault
maybe because application starts

can we do better?

SLIDE 20

readahead

program accesses page 4 of a file, page 5, page 6. What’s next?
page 7: idea: guess this

on page fault, does it look like contiguous accesses?

called readahead
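
One possible way to detect the “page 4, page 5, page 6” pattern, as a sketch only. The threshold of three accesses and the names `ra_state`, `ra_init`, `ra_access` are assumptions, not what Linux does.

```c
#define RA_THRESHOLD 3   /* assumed: consecutive pages needed before guessing */

struct ra_state {
    long prev_page;   /* last page accessed */
    int run;          /* length of the current contiguous run (0 = no accesses yet) */
};

void ra_init(struct ra_state *s) {
    s->prev_page = 0;
    s->run = 0;
}

/* Record an access to `page`. Returns the page to prefetch, or -1 for no guess. */
long ra_access(struct ra_state *s, long page) {
    if (s->run > 0 && page == s->prev_page + 1)
        s->run++;         /* continues the sequential run */
    else
        s->run = 1;       /* run broken (or first access): start over */
    s->prev_page = page;
    return s->run >= RA_THRESHOLD ? page + 1 : -1;
}
```

After accesses to pages 4, 5, 6 this returns 7; a jump to an unrelated page resets the run.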

SLIDE 22

readahead heuristics (1)

exercise: devise an algorithm to detect when to do readahead.

when to start reads? how much to read ahead?

want to detect contiguous accesses to mmap’d pages
can mark pages invalid temporarily to detect references to present pages
can add if statement to detect when new pages are brought in

SLIDE 23

Linux readahead heuristics — how much

how much to readahead?

Linux heuristic: count number of cached pages before the current one
guess we should read about that many more
minimum/maximum to avoid extremes

goal: readahead more when applications are using file more
goal: don’t readahead as much with low memory

SLIDE 24

Linux readahead heuristics — when

track “readahead windows” (pages read because of guess):

                       |<----- async_size ---------|
    |------------------- size -------------------->|
    |==================#===========================|
    ^start             ^page marked with PG_readahead

when async_size pages left, read next chunk
marked page = detect reads to this page
idea: keep up with application, but not too far ahead
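
The window arithmetic can be sketched as below. The field names follow the diagram; `ra_mark` and `ra_on_access` are hypothetical helpers, and keeping the next window the same size is a simplifying assumption.

```c
#include <stdbool.h>

struct ra_window {
    long start;        /* first page of the window */
    long size;         /* pages in the window */
    long async_size;   /* pages left when the next read should start */
};

/* Index of the page that would carry the PG_readahead mark. */
long ra_mark(const struct ra_window *w) {
    return w->start + w->size - w->async_size;
}

/* Called on an access to `page`. Returns true when the access hit the marked
   page, meaning the next chunk should be read now; the window then advances. */
bool ra_on_access(struct ra_window *w, long page) {
    if (page != ra_mark(w))
        return false;
    w->start += w->size;   /* next window begins where this one ends */
    return true;           /* assumption: size and async_size stay unchanged */
}
```

For a window of 8 pages starting at page 0 with async_size 2, the mark sits on page 6, so the next read starts while two already-read pages remain.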

SLIDE 25

thrashing

what if there’s just not enough space?

for program data, files currently being accessed

always reading things from disk
causes performance collapse: disk is really slow
known as thrashing

SLIDE 26

‘fair’ page replacement

so far: page replacement about least recently used
what about sharing fairly between users?

SLIDE 27

sharing fairly?

process A
4MB of stack+code, 16MB of heap
shared cached 16MB file X

process B
4MB of stack+code, 16MB of heap
shared cached 16MB file X

process C
4MB of stack+code, 4MB of heap
cached 32MB file Y

process D+E
4MB of stack+code, 64MB of heap
but all heap is shared copy-on-write

SLIDE 28

accounting pages

shared pages make it difficult to count memory usage

Linux cgroups accounting: last touch
count shared file pages for the process that last ‘used’ them
…as detected by page fault for page

SLIDE 29

Linux cgroup limits

Linux “control groups” of processes
can set memory limits for group of processes:

low limit: don’t ‘steal’ pages when group uses less than this
always take pages someone else is using instead (unless no choice)

high limit: never let group use more than this
replace pages from this group before anything else

SLIDE 30

Linux cgroups

Linux mechanism: separate processes into groups:

cgroup website: webserver, webapp, …
cgroup login: bash (shell), ls, …

can set memory and CPU and … shares for each group

SLIDE 31

Linux cgroup memory limits

memory usage scale: 0 GB … low limit … high limit … max … memory capacity

usage below low limit: do not take from this group for other groups (even if pages not recently used)
usage between low and high limit: if other processes need memory, take from this group
usage above high limit: actively deallocate pages cgroup is using
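
The three regions amount to a small decision function. This is only a sketch of the policy as the slide describes it: the enum and function names are invented, and how usage exactly equal to a limit is treated is an assumption.

```c
enum reclaim_policy {
    RECLAIM_PROTECTED,   /* below low limit: leave this group's pages alone */
    RECLAIM_IF_NEEDED,   /* between limits: take pages when others need memory */
    RECLAIM_ACTIVELY     /* above high limit: deallocate this group's pages now */
};

/* Classify a cgroup by its current memory usage (all values in the same unit). */
enum reclaim_policy cgroup_policy(unsigned long usage,
                                  unsigned long low_limit,
                                  unsigned long high_limit) {
    if (usage < low_limit)
        return RECLAIM_PROTECTED;
    if (usage > high_limit)
        return RECLAIM_ACTIVELY;
    return RECLAIM_IF_NEEDED;
}
```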

SLIDE 32

recall: kernel buffering (reads)

layers: program, operating system, devices (keyboard, disk)

keyboard → operating system: keypress happens, read
buffer: keyboard input waiting for program
program: read char from terminal …via buffer

program: read char from file …via buffer
operating system → disk: read block of data from disk
buffer: recently read data from disk

SLIDE 37

recall: kernel buffering (writes)

layers: program, operating system, devices (network, disk)

program: print char to remote machine
buffer: output waiting for network
operating system → network: (when ready) send data

program: write char to file
buffer: data waiting to be written on disk
operating system → disk: (when ready) write block of data to disk

SLIDE 42

recall: layering

application
standard library (cout/printf, and their own buffers)
system calls (read/write)
kernel’s file interface (kernel’s buffers)
device drivers
hardware interfaces

SLIDE 43

ways to talk to I/O devices

user program: read/write/mmap/etc. (file interface)

regular files: filesystems
device files: device drivers

SLIDE 44

devices as files

talking to device? open/read/write/close
typically similar interface within the kernel
device driver implements the file interface

SLIDE 45

example device files from a Linux desktop

/dev/snd/pcmC0D0p: audio playback
configure, then write audio data

/dev/sda, /dev/sdb: SATA-based SSD and hard drive
usually access via filesystem, but can mmap/read/write directly

/dev/input/event3, /dev/input/event10: mouse and keyboard
can read list of keypress/mouse movement/etc. events

/dev/dri/renderD128: builtin graphics
DRI = direct rendering infrastructure

SLIDE 46

devices: extra operations?

read/write/mmap not enough

audio output device: set format of audio?
terminal: whether to echo back what user types?
CD/DVD: open the disc tray? is a disc present?
…

POSIX: ioctl (general I/O control), tcgetattr/tcsetattr (for terminal settings), …
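
As an illustration of the terminal case, a sketch that turns echo on or off. `tcgetattr`, `tcsetattr`, `TCSANOW` and the `ECHO` flag are standard termios; the helper names `with_echo` and `set_echo` are made up here.

```c
#include <stdbool.h>
#include <termios.h>

/* Return a copy of the settings `t` with terminal echo switched on or off. */
struct termios with_echo(struct termios t, bool on) {
    if (on)
        t.c_lflag |= ECHO;
    else
        t.c_lflag &= ~(tcflag_t)ECHO;
    return t;
}

/* Apply the echo setting to the terminal open on `fd`.
   Returns 0 on success, -1 on failure (for example, fd is not a terminal). */
int set_echo(int fd, bool on) {
    struct termios t;
    if (tcgetattr(fd, &t) != 0)      /* read current settings from the driver */
        return -1;
    t = with_echo(t, on);
    return tcsetattr(fd, TCSANOW, &t);   /* write them back immediately */
}
```

A password prompt would call `set_echo(0, false)` before reading and `set_echo(0, true)` afterwards.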

SLIDE 47

Linux example: file operations

(selected subset: table of pointers to functions)

struct file_operations {
    ...
    ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
    ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
    ...
    long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
    ...
    int (*mmap) (struct file *, struct vm_area_struct *);
    unsigned long mmap_supported_flags;
    int (*open) (struct inode *, struct file *);
    ...
    int (*release) (struct inode *, struct file *);
    ...
};

SLIDE 48

special case: block devices

devices like disks often have a different interface
unlike normal file interface, works in terms of ‘blocks’ instead of bytes

used by filesystems: store directories on devices
filesystems are specialized to know disks aren’t byte-based

want to work with page cache: bytes not convenient
read/write a page at a time

implement read/write to use page cache, not direct
common code to translate from working with bytes to blocks
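
The byte-to-block translation is mostly integer arithmetic. A sketch, assuming a 4096-byte block; the struct and function names are hypothetical.

```c
#include <stddef.h>

#define BLOCK_SIZE 4096UL   /* assumed block (page) size in bytes */

struct block_span {
    unsigned long first_block;   /* block holding the first byte */
    unsigned long last_block;    /* block holding the last byte */
    size_t offset_in_first;      /* where the range starts inside first_block */
};

/* Which blocks does the byte range [offset, offset + len) touch? Requires len > 0. */
struct block_span bytes_to_blocks(unsigned long offset, size_t len) {
    struct block_span s;
    s.first_block = offset / BLOCK_SIZE;
    s.last_block = (offset + len - 1) / BLOCK_SIZE;
    s.offset_in_first = (size_t)(offset % BLOCK_SIZE);
    return s;
}
```

A 10-byte read at offset 4090 straddles blocks 0 and 1, so both must be brought in even though only a few bytes of each are wanted.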

SLIDE 49

Linux example: block device operations

struct block_device_operations {
    int (*open) (struct block_device *, fmode_t);
    void (*release) (struct gendisk *, fmode_t);
    int (*rw_page)(struct block_device *, sector_t, struct page *, bool);
    int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
    ...
};

read/write a page for a sector number (= block number)

SLIDE 50

device driver flow

“top half”: thread making read/write/etc.

get I/O request
(read/write/… system call or page cache miss/eviction…)
check if satisfied from buffers
(e.g. previous keypresses to keyboard)
send or queue I/O operation
put thread to sleep (if needed)
(after wakeup) store and return request result

“bottom half”: trap handler

get interrupt from device
update buffers
wake up thread (if needed)
send more to device (if needed)

device hardware: receives the I/O operations, sends the interrupts

SLIDE 53

xv6: device files

struct devsw {
  int (*read)(struct inode*, char*, int);
  int (*write)(struct inode*, char*, int);
};
extern struct devsw devsw[];

table of devices
device file uses entry in devsw array
filesystem stores name to index lookup

similar scheme used on ‘real’ Unix/Linux
files referencing major/minor device number
table of device numbers in kernel

SLIDE 54

xv6: console devsw

code run at boot:

  devsw[CONSOLE].write = consolewrite;
  devsw[CONSOLE].read = consoleread;

CONSOLE is a constant
consoleread/consolewrite: run when you read/write console
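
The same table-of-function-pointers idea can be tried in user space. This is a sketch, not xv6 code: the “echo” device, `DEV_ECHO`, and the `dev_*` functions are all invented for illustration.

```c
#include <stddef.h>
#include <string.h>

#define NDEV 4
#define DEV_ECHO 1   /* hypothetical device number */

struct devsw_entry {
    int (*read)(char *dst, int n);
    int (*write)(const char *src, int n);
};

static struct devsw_entry dev_table[NDEV];   /* unregistered entries stay NULL */
static char echo_buf[64];
static int echo_len;

/* "Driver" for a toy device that remembers the last thing written to it. */
static int echo_write(const char *src, int n) {
    if (n > (int)sizeof echo_buf) n = (int)sizeof echo_buf;
    memcpy(echo_buf, src, (size_t)n);
    echo_len = n;
    return n;
}

static int echo_read(char *dst, int n) {
    if (n > echo_len) n = echo_len;
    memcpy(dst, echo_buf, (size_t)n);
    return n;
}

/* Run once at startup, like the boot-time devsw assignments. */
void dev_init(void) {
    dev_table[DEV_ECHO].read = echo_read;
    dev_table[DEV_ECHO].write = echo_write;
}

/* Generic entry points: look up the device number, call through the table. */
int dev_write(int dev, const char *src, int n) {
    if (dev < 0 || dev >= NDEV || n < 0 || !dev_table[dev].write) return -1;
    return dev_table[dev].write(src, n);
}

int dev_read(int dev, char *dst, int n) {
    if (dev < 0 || dev >= NDEV || n < 0 || !dev_table[dev].read) return -1;
    return dev_table[dev].read(dst, n);
}
```

The generic code never names a specific driver; adding a device means filling in one more table entry.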

SLIDE 57

xv6: console top half (read)

int
consoleread(struct inode *ip, char *dst, int n)
{
  ...
  target = n;
  acquire(&cons.lock);
  while(n > 0){
    while(input.r == input.w){
      if(myproc()->killed){
        ...
        return -1;
      }
      sleep(&input.r, &cons.lock);
    }
    ...
  }
  release(&cons.lock);
  ...
}

SLIDE 59

xv6: console top half (read)

int
consoleread(struct inode *ip, char *dst, int n)
{
  ...
  target = n;
  acquire(&cons.lock);
  while(n > 0){
    ...
    c = input.buf[input.r++ % INPUT_BUF];
    ...
    *dst++ = c;
    --n;
    if(c == '\n')
      break;
  }
  release(&cons.lock);
  ...
  return target - n;
}

SLIDE 61

xv6: console top half

wait for buffer to fill
no special work to request data: keyboard input always sent

copy from buffer
check if done (newline or enough chars), if not repeat
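
The buffer shared by the two halves is a ring indexed by ever-increasing read and write counters. A single-threaded sketch in the style of xv6’s `input` struct; `input_put` and `input_read` are hypothetical names, and where the real driver would sleep on an empty buffer this version just returns.

```c
#define INPUT_BUF 128

struct input_buf {
    char buf[INPUT_BUF];
    unsigned r;   /* read index: next char the top half will consume */
    unsigned w;   /* write index: next slot the interrupt handler will fill */
};

/* Bottom-half side: store one char if there is room. Returns 1 if stored. */
int input_put(struct input_buf *in, char c) {
    if (in->w - in->r >= INPUT_BUF)
        return 0;                          /* full: drop the char */
    in->buf[in->w++ % INPUT_BUF] = c;
    return 1;
}

/* Top-half side: copy up to n chars to dst, stopping after a newline.
   Returns the number of chars copied (0 if the buffer is empty). */
int input_read(struct input_buf *in, char *dst, int n) {
    int target = n;
    while (n > 0 && in->r != in->w) {
        char c = in->buf[in->r++ % INPUT_BUF];
        *dst++ = c;
        --n;
        if (c == '\n')
            break;
    }
    return target - n;
}
```

Because the indices only ever increase and are reduced modulo `INPUT_BUF` on use, `r == w` always means empty.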

SLIDE 63

xv6: console interrupt (one case)

void
trap(struct trapframe *tf)
{
  ...
  switch(tf->trapno) {
  ...
  case T_IRQ0 + IRQ_KBD:
    kbdintr();
    lapiceoi();
    break;
  ...
  }
  ...
}

kbdintr: actually read from keyboard device
lapiceoi: tell CPU “I’m done with this interrupt”

SLIDE 66

xv6: console interrupt reading

kbdintr function actually reads from device
adds data to buffer (if room)
wakes up sleeping thread (if any)

SLIDE 67

connecting devices

(diagram) processor and interrupt controller attached to the memory bus; also on the bus: other processors…, actual memory, other devices, and a device controller

device controller contains:
control registers (status, read?, write?, …) at addresses such as 0x80004800, 0x80004808, 0x80004810, …
buffers/queues
connection to external hardware?

control registers have memory addresses
looks like write to memory
actually changes value in device controller

control registers might not really be registers
e.g. maybe writing to the write? “control register” actually just sends the value to the external hardware

buffers/queues will also have memory addresses

interrupt controller: way to send “please interrupt” signal
component of processor decides when to handle
(deals with ordering, interrupt disabling, which of several processors handles it, etc.)
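
In C, control registers are usually reached through a `volatile` pointer so the compiler performs every load and store. A sketch with a made-up register layout and ready bit; a kernel would point `regs` at the device’s real address (such as the 0x80004800 in the diagram) rather than at ordinary memory.

```c
#include <stdint.h>

/* Hypothetical layout: three 8-byte registers at consecutive addresses. */
struct dev_regs {
    volatile uint64_t status;
    volatile uint64_t read;
    volatile uint64_t write;
};

#define STATUS_READY 0x1u   /* assumed meaning of status bit 0 */

/* Send one value to the device. Returns 0 on success, -1 if it is not ready. */
int dev_send(struct dev_regs *regs, uint64_t value) {
    if ((regs->status & STATUS_READY) == 0)   /* looks like a memory read */
        return -1;
    regs->write = value;                      /* looks like a memory write */
    return 0;
}
```

For testing, `regs` can point at a plain struct in memory; on real hardware the store to `write` would be picked up by the device controller instead of being kept in RAM.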

SLIDE 72

bus adaptors

(diagram) processor and interrupt controller attached to the memory bus; also on the bus: other processors…, actual memory, other devices or other bus adaptors, and a bus adaptor

bus adaptor connects to a different bus
on that bus: other devices, and a device controller (control registers: status, read?, write?, …; buffers/queues; external hardware?)

SLIDE 73

devices as magic memory (1)

devices expose memory locations to read/write
use read/write instructions to manipulate device

example: keyboard controller
read from magic memory location: get last keypress/release
reading location clears buffer for next keypress/release
get interrupt whenever new keypress/release you haven’t read

SLIDE 76

devices as magic memory (2)

example: display controller
write pixels to magic memory location: displayed on screen
other memory locations control format/screen size

example: network interface
write to buffers
write “send now” signal to magic memory location: send data
read from “status” location, buffers to receive

SLIDE 77

what about caching?

caching “last keypress/release”?
I press ‘h’, OS reads ‘h’, does that get cached?
…I press ‘e’, OS reads what?

solution: OS can mark memory uncachable
x86: bit in page table entry can say “no caching”
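
On 32-bit x86 the “no caching” bit is the page-level cache disable flag, bit 4 of the page table entry. A sketch of building such an entry; the flag values match the x86 format (xv6 names them the same way), while `uncached_pte` is a made-up helper.

```c
#include <stdint.h>

#define PTE_P   0x001u   /* present */
#define PTE_W   0x002u   /* writeable */
#define PTE_PWT 0x008u   /* write-through */
#define PTE_PCD 0x010u   /* cache disable */

/* Build a 32-bit page table entry that maps the page containing `phys_addr`
   (e.g. device registers) as present, writeable, and not cached. */
uint32_t uncached_pte(uint32_t phys_addr) {
    return (phys_addr & ~0xFFFu) | PTE_P | PTE_W | PTE_PCD;
}
```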

SLIDE 80

aside: I/O space

x86 has “I/O addresses”
like memory addresses, but accessed with different instructions
in and out instructions

historically: separate I/O bus
more recent processors/devices would just use memory addresses
no need for more instructions, buses
other reasons to have devices and memory close (later)

SLIDE 81

xv6 keyboard access

two control registers:

KBSTATP: status register (I/O address 0x64)
KBDATAP: data buffer (I/O address 0x60)

st = inb(KBSTATP);         // in instruction: read from I/O address
if ((st & KBS_DIB) == 0)   // bit KBS_DIB indicates data in buffer?
  return -1;
data = inb(KBDATAP);       // read from data; *clears* buffer
/* interpret data to learn what kind of keypress/release */

SLIDE 82

programmed I/O

“programmed I/O”: write to or read from device buffers directly
OS runs loop to transfer data to or from device

might still be triggered by interrupt
know/wait for “is device ready”
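
A sketch of such a transfer loop: poll a status register until the device is ready, then store one byte to the data register, and repeat. The register layout, the ready bit, and the `max_spins` give-up limit are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device with a status register and a one-byte data register. */
struct pio_regs {
    volatile uint8_t status;
    volatile uint8_t data;
};

#define PIO_READY 0x01u   /* assumed: device can accept another byte */

/* Copy `len` bytes to the device one at a time. Returns the number of bytes
   sent; stops early if the device stays busy for `max_spins` polls. */
size_t pio_write(struct pio_regs *dev, const uint8_t *src, size_t len,
                 unsigned max_spins) {
    size_t sent = 0;
    while (sent < len) {
        unsigned spins = 0;
        while ((dev->status & PIO_READY) == 0)   /* wait for "is device ready" */
            if (++spins >= max_spins)
                return sent;
        dev->data = src[sent++];                 /* CPU moves each byte itself */
    }
    return sent;
}
```

The processor is occupied for the whole transfer, which is the main cost of programmed I/O.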

SLIDE 83

approximating LRU: SEQ

(diagram: active list and inactive list)

guess: oldest active page is really inactive: move it to the inactive list
inactive page referenced? not really inactive: move to active list
evict page at bottom of inactive list (know: not referenced ‘recently’)
“new” pages start in active list

detecting references? scan reference bits
or mark invalid + get fault

this is the current Linux algorithm for non-file pages
extra details needed: how big is the inactive list?

SLIDE 90

swapping timeline

[diagram: memory holding program A pages and program B pages; one page is evicted and another is loaded in its place. timeline: program A page fault, OS starts read, …, interrupt]

OS needs to choose a page to replace
    hopefully the copy on disk is already up-to-date?
first step of replacement: mark the evicted page invalid in each page table
    this example: only process B
    real case: possibly many page tables

other processes can run while the page is being read

OS will get an interrupt when the disk is done
process A's page table is updated and A is restarted from the point of the fault
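
The two halves of the timeline (the work done at the fault, then the work done at the disk interrupt) can be sketched with a toy model. Everything here is hypothetical and heavily simplified: `struct machine`, `start_page_in`, and `finish_page_in` are invented names, the disk read itself is not modeled, and the sketch assumes the victim's copy on disk is already up-to-date so no write-back is needed.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PROCS 2
#define NUM_VPAGES 4

/* Toy page table entry: just a valid bit and a physical frame number. */
struct pte {
    bool valid;
    int frame;
};

struct machine {
    struct pte tables[NUM_PROCS][NUM_VPAGES];
    /* Bookkeeping for the one in-flight page-in (a real OS tracks many). */
    bool read_pending;
    int pending_proc, pending_vpage, pending_frame;
};

/* Fault half: reclaim victim_frame and "start" the disk read.
 * Returns how many page table entries were invalidated. */
int start_page_in(struct machine *m, int proc, int vpage, int victim_frame) {
    int invalidated = 0;
    /* First step of replacement: mark the evicted page invalid in EACH
     * page table that maps it (possibly many, if the page is shared). */
    for (int p = 0; p < NUM_PROCS; p++) {
        for (int v = 0; v < NUM_VPAGES; v++) {
            struct pte *e = &m->tables[p][v];
            if (e->valid && e->frame == victim_frame) {
                e->valid = false;
                invalidated++;
            }
        }
    }
    m->read_pending = true;
    m->pending_proc = proc;
    m->pending_vpage = vpage;
    m->pending_frame = victim_frame;
    /* A real OS would issue the disk read here and run other processes. */
    return invalidated;
}

/* Interrupt half: the disk is done, so map the page for the faulting process. */
void finish_page_in(struct machine *m) {
    assert(m->read_pending);
    struct pte *e = &m->tables[m->pending_proc][m->pending_vpage];
    e->frame = m->pending_frame;
    e->valid = true;
    m->read_pending = false;
    /* The faulting process can now be restarted at the faulting instruction. */
}
```

Splitting the handler in two mirrors the point of the slide: between the two calls the faulting process is waiting, and the CPU is free to run something else.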

55

slide-95
SLIDE 95

POSIX: everything is a file

the file: one interface for

devices (terminals, printers, …)
regular files on disk
networking (sockets)
local interprocess communication (pipes, sockets)

basic operations: open(), read(), write(), close()
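
As a rough illustration of "one interface", the sketch below copies bytes between two file descriptors using only `open()`, `read()`, `write()`, and `close()`. The helper names `write_all`, `copy_fd`, and `copy_path_to_fd` are made up for this example; the code does not care whether a descriptor refers to a regular file, a terminal, a pipe, or a socket.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write all len bytes, retrying on partial writes. Returns 0, or -1 on error. */
int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy from in_fd to out_fd until end-of-file. Returns bytes copied, or -1. */
long copy_fd(int in_fd, int out_fd) {
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total; /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(out_fd, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
}

/* Open before use, copy, then close explicitly. */
long copy_path_to_fd(const char *path, int out_fd) {
    int in_fd = open(path, O_RDONLY); /* setup and access checks happen here */
    if (in_fd < 0)
        return -1;
    long total = copy_fd(in_fd, out_fd);
    close(in_fd);
    return total;
}
```

For example, `copy_path_to_fd("notes.txt", STDOUT_FILENO)` would print a file (the file name is only a placeholder), and the same `copy_fd` works unchanged when both descriptors come from `pipe()`.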

56

slide-96
SLIDE 96

the file interface

open before use

setup, access control happens here

byte-oriented

real device isn’t? operating system needs to hide that

explicit close

57

slide-98
SLIDE 98

kernel buffering (reads)

[diagram: program, operating system, devices (keyboard, disk)]

keyboard: keypress happens, OS reads it
    buffer: keyboard input waiting for program
    program: read char from terminal … via buffer

disk: OS reads a block of data from disk
    buffer: recently read data from disk
    program: read char from file … via buffer

58

slide-103
SLIDE 103

kernel buffering (writes)

[diagram: program, operating system, devices (network, disk)]

network: program prints char to remote machine
    buffer: output waiting for network
    OS: (when ready) send data

disk: program writes char to file
    buffer: data waiting to be written on disk
    OS: (when ready) write block of data to disk

59

slide-108
SLIDE 108

read/write operations

read/write: move data into/out of buffer
block (make process wait) if buffer is empty (read)/full (write)

(default behavior, possibly changeable)

actual I/O operations — wait for device to be ready

trigger process to stop waiting if needed
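
One way the default blocking behavior is "possibly changeable" on POSIX systems is the `O_NONBLOCK` flag. The sketch below uses a pipe as the kernel buffer; the helper names `set_nonblocking` and `try_read` are hypothetical. With the flag set, a `read()` on an empty buffer fails with `EAGAIN` instead of making the process wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch fd to non-blocking mode, keeping its other flags. Returns 0 or -1. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try one read on a non-blocking fd.
 * Returns 0 and stores the byte count in *n_out if data (or EOF) was
 * available, 1 if the kernel buffer was empty and a blocking read would
 * have waited, and -1 on any other error. */
int try_read(int fd, char *buf, size_t len, ssize_t *n_out) {
    ssize_t n = read(fd, buf, len);
    if (n >= 0) {
        *n_out = n;
        return 0;
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 1;
    return -1;
}
```

On a freshly created pipe, `try_read` returns 1 until something is written to the other end; without `set_nonblocking`, the same `read()` would simply put the process to sleep until data arrived.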

60

slide-109
SLIDE 109

layering

application
standard library: cout/printf, and their own buffers
system calls: read/write
kernel's file interface: kernel's buffers
device drivers
hardware interfaces
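
The standard library's own buffer sits above the kernel's, and the difference can be observed directly. In this sketch (the function names `drain` and `stdio_buffer_demo` are invented for illustration), text written with `fprintf` to a fully buffered stream does not reach the kernel's pipe buffer until `fflush` triggers the underlying `write` system call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Count the bytes currently readable from a non-blocking fd. */
static long drain(int fd) {
    char buf[256];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            return total; /* empty (EAGAIN), end-of-file, or error */
        total += n;
    }
}

/* Reports how many bytes the kernel has received before and after fflush.
 * Returns 0 on success, -1 if setup failed. */
int stdio_buffer_demo(long *before_flush, long *after_flush) {
    int p[2];
    if (pipe(p) != 0)
        return -1;
    FILE *out = fdopen(p[1], "w");
    if (out == NULL || fcntl(p[0], F_SETFL, O_NONBLOCK) != 0) {
        if (out != NULL)
            fclose(out);
        else
            close(p[1]);
        close(p[0]);
        return -1;
    }
    setvbuf(out, NULL, _IOFBF, 4096); /* library-level buffer, fully buffered */
    fprintf(out, "hello");            /* lands in the library's buffer only */
    *before_flush = drain(p[0]);
    fflush(out);                      /* library calls write(); kernel buffer fills */
    *after_flush = drain(p[0]);
    fclose(out);
    close(p[0]);
    return 0;
}
```

Calling `stdio_buffer_demo(&before, &after)` should leave `before` at 0 and `after` at 5, one layer's buffer emptying into the next.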

61