 
[Figure: system layers, with "security and protection" highlighted. From top to bottom: applications (application 1, …) / OS's interface (threads, address spaces, processes, files, sockets, …) / operating system / hardware interface (interrupts, memory addresses, special registers, memory-mapped devices, I/O buses, …) / hardware (CPU, memory, keyboard/mouse, monitor, disks, network, …). The text "segmentation fault" appears in the figure.]

goal: protection
  run multiple applications, and …
    keep them from crashing the OS
    keep them from crashing each other
    (keep parts of OS from crashing other parts?)

mechanism 1: dual-mode operation
  processor has two modes: kernel (privileged) and user
  some operations require kernel mode
  OS controls what runs in kernel mode

mechanism 2: address translation
[Figure: Program A addresses and Program B addresses each pass through a mapping (set by OS) into real memory, which holds Program A code, Program A data, Program B code, Program B data, OS data, …. Legend entries: "kernel-mode only" and "trigger error".]

aside: alternate mechanisms
  dual-mode operation and address translation are common today
    …so we'll talk about them a lot
  not the only ways to implement operating system features
    (plausibly not even the most efficient…)

hardware support for running OS: exception
  problem: OS needs to respond to events
    keypress happens?
    program using CPU for too long?
    …
  need hardware support because CPU is running application instructions

exceptions and dual-mode operation
  rule: user code always runs in user mode
  rule: only OS code ever runs in kernel mode
  on exception: changes from user mode to kernel mode
    …and is the only mechanism for doing so
  how OS controls what runs in kernel mode

exception terminology
  CS 3330 terms:
    interrupt: triggered by external event
      timer, keyboard, network, …
    fault: triggered by program doing something "bad"
      invalid memory access, divide-by-zero, …
    trap: triggered by explicit program action
      system calls
    abort: something in the hardware broke

xv6 exception terms
  everything is called a trap
  or sometimes an interrupt
  no real distinction in name about kinds

real world exception terms
  it's all over the place…
  context clues

kernel services
  allocating memory? (change address space)
  reading/writing to file? (communicate with hard drive)
  read input? (communicate with keyboard)
  all need privileged instructions!
  need to run code in kernel mode

hardware mechanism: deliberate exceptions
  some instructions exist to trigger exceptions
  still works like normal exception
    starts executing OS-chosen handler
    …in kernel mode
  allows program to request privileged operations
    OS handler decides what program can request
    OS handler decides format of requests

system call timeline (x86-64 Linux)
  in user mode (the standard library); 'privileged' operations prohibited:
      mov $SYS_write, %rax         /* set arguments in registers */
      mov $FILENO_stdout, %rdi     /* 1st argument: file descriptor */
      mov $buffer, %rsi            /* 2nd argument: buffer address */
      mov $BUFFER_LEN, %rdx        /* 3rd argument: length */
      syscall                      /* special instruction: trigger exception */
      testq %rax, %rax             /* now use return value */
      ...
  hardware knows to go to the handler because of pointer set during boot
  in kernel mode (the "kernel"); 'privileged' operations allowed (change memory layout, I/O, exceptions):
      syscall_handler:
      /* ... save registers and actually do write and set return value ... */
      iret                         /* special instruction: go back to "user" code */

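The same request can be made from C without writing assembly. The sketch below is an illustration, not part of the slides: it assumes Linux with glibc, whose syscall() function loads the system call number and arguments into registers and executes the special instruction. The name raw_write is made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_write */
#include <unistd.h>        /* syscall() */

/* Ask the kernel to write len bytes from buf to fd without going
 * through the usual write() wrapper.
 * Returns the number of bytes written, or -1 on error (errno is set). */
long raw_write(int fd, const void *buf, unsigned long len)
{
    assert(buf != NULL);
    /* glibc places SYS_write and the three arguments in registers,
     * then enters the kernel; the kernel's result comes back here. */
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "ok\n", 3) should print "ok" on standard output and return 3. Passing an invalid file descriptor such as -1 makes the kernel reject the request, which shows the OS handler deciding what a program may ask for.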
the classic Unix design
[Figure: layers from top to bottom: applications and utility programs (login, the shell, …) / standard library functions and shell commands / standard libraries (libc, the C standard library, …) / system call interface / kernel (CPU scheduler, filesystems, networking, virtual memory, device drivers, signals, pipes, swapping, …) / hardware interface / hardware (memory management unit, device controllers, …). Two brackets labeled "the OS?" mark candidate extents of the OS.]

aside: is the OS the kernel?
  OS = stuff that runs in kernel mode?
  OS = stuff that runs in kernel mode + libraries to use it?
  OS = stuff that runs in kernel mode + libraries + utility programs (e.g. shell, finder)?
  OS = everything that comes with machine?
  no consensus on where the line is
  each piece can be replaced separately…

xv6
  we will be using a teaching OS called "xv6"
  based on Sixth Edition Unix
  modified to be multicore and use 32-bit x86 (not PDP-11)

xv6 setup/assignment
  first assignment: adding two simple xv6 system calls
  includes xv6 download instructions and link to xv6 book

xv6 technical requirements
  you will need a Linux environment
    we will supply one (VM on website), or get your own
    (it's probably possible to use OS X, but you need a cross-compiler and we don't have instructions)
  …with qemu installed
    qemu (for us) = emulator for 32-bit x86 system
    Ubuntu/Debian package qemu-system-i386

first assignment
  get xv6 compiled and working
  …toolkit uses an emulator
    could run on real hardware or a standard VM, but a lot of details
    also, emulator lets you use GDB

xv6: what's included
  Unix-like kernel
    very small set of syscalls
    some less featureful (e.g. exit without exit status)
  userspace library
    very limited
  userspace programs
    command line, ls, mkdir, echo, cat, etc.
    some self-testing programs

xv6: echo.c
    #include "types.h"
    #include "stat.h"
    #include "user.h"

    int
    main(int argc, char *argv[])
    {
      int i;

      for (i = 1; i < argc; i++)
        printf(1, "%s%s", argv[i], i+1 < argc ? " " : "\n");
      exit();
    }

xv6 demo

write syscall in xv6
  user mode (+ stack):
    user program
    function call: write() syscall wrapper
    trigger exception (int $64)
  HW does lookup in interrupt table
  HW switches stacks + calls
  kernel mode (+ stack):
    assembly func: vector64()
    asm saves regs (struct trapframe)
    C function: trap()
    C function: syscall(), using syscall # from eax
    C function: sys_write(), read args from user stack
    return via trap()
    return from interrupt
  HW switches stacks

write syscall in xv6: user mode
  main.c:
      ...
      write(1, "Hello, World!\n", 14);
      ...
  syscall.h / traps.h:
      ...
      #define SYS_write 16
      ...
      #define T_SYSCALL 64
      ...
  usys.S (partial, after macro replacement):
      .globl write
      write:
        movl $SYS_write, %eax
        int $T_SYSCALL
        ret
  int (interrupt): trigger an exception, similar to a keypress
    parameter (64 in this case): type of exception
  xv6 syscall calling convention:
    eax = syscall number
    otherwise: same as 32-bit x86 calling convention (arguments on stack)

write syscall in xv6: interrupt table setup
  trap.c (run on boot):
      ...
      lidt(idt, sizeof(idt));
      ...
      SETGATE(idt[T_SYSCALL], 1, SEG_KCODE<<3, vectors[T_SYSCALL], DPL_USER);
      ...
  lidt: function (in x86.h) wrapping lidt instruction
    sets the interrupt descriptor table to idt
    idt = array of pointers to handler functions for each exception type
    (plus a few bits of information about those handler functions)
  (from mmu.h):
      // Set up a normal interrupt/trap gate descriptor.
      // - istrap: 1 for a trap gate, 0 for an interrupt gate.
      //   interrupt gate clears FL_IF, trap gate leaves FL_IF alone
      // - sel: Code segment selector for interrupt/trap handler
      // - off: Offset in code segment for interrupt/trap handler
      // - dpl: Descriptor Privilege Level -
      //        the privilege level required for software to invoke
      //        this interrupt/trap gate explicitly using an int instruction.
      #define SETGATE(gate, istrap, sel, off, d) \
      ...
  what the SETGATE call does:
    idt[T_SYSCALL], DPL_USER: set the T_SYSCALL interrupt to be callable from user mode via int instruction
      (otherwise: triggers fault like privileged instruction)
    SEG_KCODE<<3: set it to use the kernel "code segment"
      meaning: run in kernel mode
      (yes, a code segment specifies more than that; nothing we care about)
    1: do not disable interrupts during syscalls
      e.g. keypress/timer handling can interrupt slow syscall
    vectors[T_SYSCALL]: OS function for processor to run
      set to pointer to assembly function vector64
  vectors.S (hardware jumps here):
      vector64:
        pushl $0
        pushl $64
        jmp alltraps
  trapasm.S:
      alltraps:
        ...
        call trap
        ...
        iret
  trap.c:
      void
      trap(struct trapframe *tf)
      {
        ...
  vector64 eventually calls C function trap
    trap returns to alltraps
    alltraps restores registers from tf, then returns to user mode

write syscall in xv6: interrupt table setup
  istrap = 1 for T_SYSCALL: interrupts are not disabled during syscalls
    non-system call exceptions: interrupts disabled
    pro: slow system calls don't stop timers, keypresses, etc. from working
    con: makes writing system calls safely more complicated
      (what if keypress handler runs during system call?)

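The privilege check attached to each table entry can be modeled in ordinary C. This is a toy simulation under stated assumptions, not xv6 code: struct toy_gate, toy_setgate, toy_boot, and toy_software_int are invented names, T_TIMER = 32 is a hypothetical second vector added only for contrast, and the real descriptor layout is ignored. It shows only the rule from the slide: a software int n is honored when the caller's privilege level is allowed by the entry's dpl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define T_SYSCALL  64   /* value shown on the slides (traps.h) */
#define T_TIMER    32   /* hypothetical vector, for contrast only */
#define DPL_KERNEL 0    /* most privileged level */
#define DPL_USER   3    /* least privileged level */
#define NVECTORS   256

/* Toy stand-in for one interrupt descriptor table entry. */
struct toy_gate {
    void (*handler)(void); /* code the processor runs, like vectors[n] */
    bool istrap;           /* true: leave interrupts enabled on entry */
    int dpl;               /* least privileged level allowed to use int n */
};

static struct toy_gate toy_idt[NVECTORS];
int last_vector = -1;      /* records which handler ran last */

static void on_syscall(void) { last_vector = T_SYSCALL; }
static void on_timer(void)   { last_vector = T_TIMER; }

/* Plays the role of SETGATE in this model; not the real macro. */
void toy_setgate(int n, bool istrap, void (*handler)(void), int dpl)
{
    assert(n >= 0 && n < NVECTORS);
    toy_idt[n] = (struct toy_gate){ handler, istrap, dpl };
}

/* What the OS would do once, at boot. */
void toy_boot(void)
{
    toy_setgate(T_TIMER, false, on_timer, DPL_KERNEL);
    toy_setgate(T_SYSCALL, true, on_syscall, DPL_USER);
}

/* Model a software "int n" executed at privilege level cpl
 * (0 = kernel mode, 3 = user mode). Returns true if the handler ran,
 * false if the request was refused (on real hardware: a fault). */
bool toy_software_int(int n, int cpl)
{
    if (n < 0 || n >= NVECTORS || toy_idt[n].handler == NULL)
        return false;
    if (cpl > toy_idt[n].dpl)   /* caller is less privileged than allowed */
        return false;
    toy_idt[n].handler();
    return true;
}
```

After toy_boot(), toy_software_int(T_SYSCALL, DPL_USER) runs the system call handler, while toy_software_int(T_TIMER, DPL_USER) is refused. That is the sense in which only the T_SYSCALL entry is callable from user mode.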
write syscall in xv6: the trap function
  trap.c:
      void
      trap(struct trapframe *tf)
      {
        if (tf->trapno == T_SYSCALL) {
          if (myproc()->killed)
            exit();
          myproc()->tf = tf;
          syscall();
          if (myproc()->killed)
            exit();
          return;
        }
        ...
      }
  struct trapframe: set by assembly
    interrupt type, application registers, …
    example: tf->eax = old value of eax
  myproc(): pseudo-global variable
    represents currently running process
    much more on this later in semester
  syscall(): actual implementations
    uses myproc()->tf to determine what operation to do for program

write syscall in xv6: the syscall function
  syscall.c:
      static int (*syscalls[])(void) = {
      ...
      [SYS_write] sys_write,
      ...
      };
      ...
      void
      syscall(void)
      {
        ...
        num = curproc->tf->eax;
        if (num > 0 && num < NELEM(syscalls) && syscalls[num]) {
          curproc->tf->eax = syscalls[num]();
        } else {
        ...
      }
  array of functions: one per syscall
    '[number] value': syscalls[number] = value
  (if system call number in range) call sys_… function from table
  store result in user's eax register
    result assigned to eax
    (assembly code this returns to copies tf->eax into %eax)

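The table-driven dispatch can be tried outside the kernel. The following is a minimal sketch, assuming a one-register stand-in for the trapframe; toy_trapframe, toy_sys_write, toy_syscalls, and toy_syscall are invented names. The slide elides the else branch, so storing -1 there is this sketch's own choice for reporting an unknown number. Standard C writes designated initializers with an equals sign, where the xv6 source uses an older GNU form without it.

```c
#include <assert.h>
#include <stddef.h>

#define SYS_write 16   /* value shown on the slides (syscall.h) */
#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

/* Toy trapframe: only the saved eax register matters here. */
struct toy_trapframe { int eax; };

/* Stand-in handler: pretend 14 bytes were written. */
static int toy_sys_write(void) { return 14; }

/* "[number] = value" sets toy_syscalls[number] = value.
 * Entries that are not listed are NULL. */
static int (*toy_syscalls[])(void) = {
    [SYS_write] = toy_sys_write,
};

/* On entry tf->eax holds the system call number;
 * on exit it holds the result the user program will see in eax. */
void toy_syscall(struct toy_trapframe *tf)
{
    assert(tf != NULL);
    int num = tf->eax;
    if (num > 0 && (size_t)num < NELEM(toy_syscalls) && toy_syscalls[num]) {
        tf->eax = toy_syscalls[num]();
    } else {
        tf->eax = -1;  /* assumption of this sketch: report an error */
    }
}
```

With tf.eax set to SYS_write before the call, tf.eax holds 14 afterwards. A number outside the table, or an in-range number with no handler, leaves -1 there. Reusing the same saved register for the request and the reply is what lets the return path hand the result back without extra machinery.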
write syscall in xv6: sys_write
  sysfile.c:
      int
      sys_write(void)
      {
        struct file *f;
        int n;
        char *p;

        if (argfd(0, 0, &f) < 0 || argint(2, &n) < 0 || argptr(1, &p, n) < 0)
          return -1;
        return filewrite(f, p, n);
      }
  argfd, argint, argptr: utility functions that read arguments from user's stack
    return -1 on error (e.g. stack pointer invalid)
    (more on this later)
    (note: 32-bit x86 calling convention puts all args on stack)
  filewrite: actual internal function that implements writing to a file
    (the terminal counts as a file)

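Reading an argument from the user's stack, with the validity check that makes the helpers return -1, can be sketched as follows. This is a simplified model rather than the xv6 helpers: user_mem is a hypothetical 64-byte "user address space", and toy_store32, toy_fetchint, and toy_argint are invented names. The offsets follow the 32-bit x86 calling convention, where the return address sits at the stack pointer and the arguments follow at esp+4, esp+8, and so on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define USER_MEM_SIZE 64   /* hypothetical size of the toy user memory */

static uint8_t user_mem[USER_MEM_SIZE];

/* Test helper: place a 32-bit value at a valid "user address". */
void toy_store32(uint32_t addr, int32_t value)
{
    assert((uint64_t)addr + 4 <= USER_MEM_SIZE);
    memcpy(&user_mem[addr], &value, 4);
}

/* Read a 32-bit value at a "user address".
 * Returns 0 on success, -1 if any of the 4 bytes is out of range. */
int toy_fetchint(uint64_t addr, int32_t *out)
{
    if (addr + 4 > USER_MEM_SIZE)
        return -1;
    memcpy(out, &user_mem[addr], 4);
    return 0;
}

/* Fetch the n-th 32-bit argument (n = 0 is the first argument).
 * The address is computed in 64 bits so that a bogus stack pointer
 * cannot wrap around to a small, valid-looking address. */
int toy_argint(uint32_t user_esp, int n, int32_t *out)
{
    assert(n >= 0 && out != NULL);
    return toy_fetchint((uint64_t)user_esp + 4 + 4 * (uint64_t)n, out);
}
```

If a fake stack at address 20 holds 1, 1234, and 14 at offsets 4, 8, and 12, then toy_argint(20, 2, &v) yields 14, the length argument of the write call on the earlier slide. A stack pointer of 60 or 0xFFFFFFFC makes the helper return -1 instead of reading outside the user's memory.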
xv6intro homework
  get familiar with xv6 OS
  add a new system call: writecount()
    returns total number of times write call happened

homework steps
  system call implementation: sys_writecount
    hint in writeup: imitate sys_uptime
    need a counter for number of writes
  add writecount to several tables/lists
    (list of handlers, list of library functions to create, etc.)
    recommendation: imitate how other system calls are listed
  create a userspace program that calls writecount
    recommendation: copy from given programs
  repeat, adding setwritecount

note on locks
  some existing code uses acquire/release
  you do not have to do this
  only for multiprocessor support

backup slides

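time multiplexing really
[Figure: timeline of the CPU running loop.exe, ssh.exe, firefox.exe, loop.exe, ssh.exe in turn. Between each pair the operating system runs: it is entered when an exception happens and left by a return from exception.]
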
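The timeline in the figure can be reproduced with a few lines of C. This is only a sketch: the function name toy_timeline is made up, and the round-robin order is an assumption that happens to match the sequence shown. Each "[OS]" marker stands for an exception followed by a return from exception into the next program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write a timeline of n_slices time slices, shared round-robin among
 * nprogs programs, into out (capacity cap bytes).
 * Returns 0 on success, -1 if the buffer is too small. */
int toy_timeline(const char *const progs[], int nprogs, int n_slices,
                 char *out, size_t cap)
{
    assert(progs != NULL && nprogs > 0 && out != NULL && cap > 0);
    size_t used = 0;
    out[0] = '\0';
    for (int i = 0; i < n_slices; i++) {
        /* Before every slice except the first, the OS runs briefly. */
        int w = snprintf(out + used, cap - used, "%s%s",
                         i ? " [OS] " : "", progs[i % nprogs]);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        used += (size_t)w;
    }
    return 0;
}
```

For the three programs loop.exe, ssh.exe, and firefox.exe and five slices, the buffer ends up holding "loop.exe [OS] ssh.exe [OS] firefox.exe [OS] loop.exe [OS] ssh.exe", the same order as in the figure.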