Y86 / Binary Ops

Changelog

Changes made in this version not seen in first lecture:
  7 September 2017: slide 32: was missing rrmovq near end of decoded instructions
  7 September 2017: slide 37: correct text about division speed: four-byte division is weirdly not much slower than 1-byte division on Skylake (but 64-bit division is much slower)

while: levels of optimization

    while (b < 10) { foo(); b += 1; }

Version 1:

    start_loop:
        cmpq $10, %rbx
        jge end_loop        # rbx >= 10?
        call foo
        addq $1, %rbx
        jmp start_loop
    end_loop:
        ...

Version 2:

        cmpq $10, %rbx
        jge end_loop        # rbx >= 10
    start_loop:
        call foo
        addq $1, %rbx
        cmpq $10, %rbx
        jne start_loop      # rbx != 10?
    end_loop:
        ...

Version 3:

        cmpq $10, %rbx
        jge end_loop        # rbx >= 10
        movq $10, %rax
        subq %rbx, %rax
        movq %rax, %rbx
    start_loop:
        call foo
        decq %rbx
        jne start_loop      # rbx != 0
        movq $10, %rbx
    end_loop:
        ...
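The three assembly versions of the while loop can be mirrored in C with goto, which makes it easier to check that they call foo the same number of times. This is only a sketch: the function names loop_v1, loop_v2, loop_v3, and count_calls and the calls counter are made up for illustration, and the local variable a stands in for %rax.

```c
#include <assert.h>
#include <stddef.h>

static int calls;                      /* how many times foo() has run */
static void foo(void) { calls++; }

/* Version 1: test at the top, unconditional jump back at the bottom. */
long loop_v1(long b) {
start_loop:
    if (b >= 10) goto end_loop;        /* cmpq $10, %rbx ; jge end_loop */
    foo();
    b += 1;
    goto start_loop;                   /* jmp start_loop */
end_loop:
    return b;
}

/* Version 2: one test before the loop, then a != test at the bottom. */
long loop_v2(long b) {
    if (b >= 10) goto end_loop;
start_loop:
    foo();
    b += 1;
    if (b != 10) goto start_loop;      /* cmpq $10, %rbx ; jne start_loop */
end_loop:
    return b;
}

/* Version 3: convert b into a countdown so the loop needs no compare. */
long loop_v3(long b) {
    long a;                            /* plays the role of %rax */
    if (b >= 10) goto end_loop;
    a = 10;
    a -= b;                            /* a = remaining iterations */
    b = a;
start_loop:
    foo();
    b -= 1;                            /* decq sets ZF when b reaches 0 */
    if (b != 0) goto start_loop;
    b = 10;                            /* restore the value b would have had */
end_loop:
    return b;
}

/* Hypothetical helper: run one version and report how often foo() ran. */
int count_calls(long (*loop)(long), long b) {
    assert(loop != NULL);
    calls = 0;
    loop(b);
    return calls;
}
```

Starting from b = 3, each version should call foo seven times and leave b at 10; starting from b >= 10, none of them should call foo at all.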
last time

  microarchitecture vs. instruction set architecture (ISA)
  condition codes: ZF (zero), SF (sign), OF (overflow), CF (carry)
  jump tables: jmp *table(%rax)
    read address of next instruction from table
  cmovCC: conditional move
  Y86: movq → { rrmovq, irmovq, mrmovq, rmmovq }

pre-quiz next week

  textbooks are definitely available
  quiz on reading for next week
  get a textbook if you don't have one

bomb HW

  grades are on the gradebook
  please check: possible you registered a bomb with an invalid computing ID
  some transient weirdness with gradebook if you had used multiple bombs, now fixed
strlen/strsep lab

  next week: in-lab quiz to write two functions:
    strlen: length of nul-terminated string
    strsep (simplified): divide string into 'tokens'

strsep (1)

    char *strsep(char **ptrToString, char delimiter);

    char string[] = "this is a test";
    char *ptr = string;
    char *token;
    while ((token = strsep(&ptr, ' ')) != NULL) {
        printf("[%s]", token);
    }
    /* output: [this][is][a][test] */
    /* final value of buffer: "this\0is\0a\0test" */

strsep (2)

    char *strsep(char **ptrToString, char delimiter);

    char string[] = "this is a test";
    char *ptr = string;
    char *token;
    token = strsep(&ptr, ' ');
    /* token points to &string[0], string "this" */
    /* ptr points to &string[5]: "is a test" */
    /* ' ' after "this" replaced by '\0' */

Y86-64 instruction set

  based on x86
  omits most of the 1000+ instructions
  leaves:
    addq, subq, andq, xorq
    jmp, jCC, cmovCC
    call, ret
    pushq, popq
    movq (renamed)
    hlt (renamed)
    nop
  much, much simpler encoding
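Returning to the two lab functions described earlier, one possible C sketch of the behavior shown in the strsep examples follows. The names my_strlen and my_strsep are made up here to avoid clashing with the C library versions, and setting *ptrToString to NULL after the last token is an assumption chosen so that the example loop terminates; the lab's exact specification may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Length of a nul-terminated string, not counting the terminator. */
size_t my_strlen(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Simplified strsep: returns the next token and advances *ptrToString.
 * The first delimiter found is overwritten with '\0'.
 * Assumption: after the final token, *ptrToString becomes NULL, and the
 * next call returns NULL. */
char *my_strsep(char **ptrToString, char delimiter) {
    assert(ptrToString != NULL);
    char *token = *ptrToString;
    if (token == NULL) {
        return NULL;                 /* no tokens left */
    }
    char *p = token;
    while (*p != '\0' && *p != delimiter) {
        p++;
    }
    if (*p == '\0') {
        *ptrToString = NULL;         /* this was the last token */
    } else {
        *p = '\0';                   /* terminate the token in place */
        *ptrToString = p + 1;        /* the rest starts after the delimiter */
    }
    return token;
}
```

Because the token is terminated in place, the function modifies the caller's buffer, which is why the slide example uses a char array rather than a string literal.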
Y86-64: specifying addresses

  Valid:   rmmovq %r11, 10(%r12)
  Invalid: rmmovq %r11, 10(%r12,%r13)
  Invalid: rmmovq %r11, 10(,%r12,4)
  Invalid: rmmovq %r11, 10(%r12,%r13,4)

Y86-64: accessing memory (1)

  r12 ← memory[10 + r11] + r12

  Invalid:
    addq 10(%r11), %r12

  Instead:
    mrmovq 10(%r11), %r11    /* overwrites %r11 */
    addq %r11, %r12
Y86-64: accessing memory (2)

  r12 ← memory[10 + 8 * r11] + r12

  Invalid:
    addq 10(,%r11,8), %r12

  Instead:
    /* replace %r11 with 8*%r11 */
    addq %r11, %r11
    addq %r11, %r11
    addq %r11, %r11
    mrmovq 10(%r11), %r11
    addq %r11, %r12

Y86-64 constants (1)

    irmovq $100, %r11

  only instruction with non-address constant operand

Y86-64 constants (2)

  r12 ← r12 + 1

  Invalid:
    addq $1, %r12

  Instead, need an extra register:
    irmovq $1, %r11
    addq %r11, %r12
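The workaround for the missing scaled addressing mode (three addq doublings, then a plain displacement load) can be traced in C. This is a sketch under stated assumptions: memory is modeled as a byte array, the helper names mrmovq and add_scaled are made up for illustration, and the 8-byte load uses the host's byte order rather than modeling Y86-64's encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of mrmovq D(rB), rA: load 8 bytes from memory[D + rB]. */
static int64_t mrmovq(const uint8_t *memory, int64_t D, int64_t rB) {
    int64_t value;
    memcpy(&value, memory + D + rB, sizeof value);
    return value;
}

/* Computes r12 + memory[10 + 8 * r11] using only operations that have
 * a Y86-64 equivalent. The caller must make sure the address is in range. */
int64_t add_scaled(const uint8_t *memory, int64_t r11, int64_t r12) {
    assert(memory != NULL && r11 >= 0);
    r11 += r11;                      /* addq %r11, %r11  -> 2 * r11 */
    r11 += r11;                      /* addq %r11, %r11  -> 4 * r11 */
    r11 += r11;                      /* addq %r11, %r11  -> 8 * r11 */
    r11 = mrmovq(memory, 10, r11);   /* mrmovq 10(%r11), %r11 */
    r12 += r11;                      /* addq %r11, %r12 */
    return r12;
}
```

As on the slide, the original index in %r11 is destroyed along the way, so real code that still needs the index would have to copy it to another register first.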
Y86-64: operand uniqueness

  only one kind of value for each operand
  instruction name tells you the kind
  (why movq was 'split' into four names)

Y86-64: condition codes

  ZF: value was zero?
  SF: sign bit was set? i.e. value was negative?
  for this course: no OF, CF (to simplify assignments)
  set by addq, subq, andq, xorq
  not set by anything else

Y86-64: using condition codes

  subq SECOND, FIRST (value = FIRST - SECOND)

    j__ / cmov__   condition code bit test   value test
    le             SF = 1 or ZF = 1          value ≤ 0
    l              SF = 1                    value < 0
    e              ZF = 1                    value = 0
    ne             ZF = 0                    value ≠ 0
    ge             SF = 0                    value ≥ 0
    g              SF = 0 and ZF = 0         value > 0

  missing OF (overflow flag); CF (carry flag)
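The condition-code table can be written out as a small C model. This is only a sketch: the Flags struct, the subq helper, and the cond_* functions are names invented for illustration, and registers are modeled as uint64_t so that wraparound is well defined in C. Since OF is left out, as in the course's simplified Y86-64, the tests give the wrong answer when the subtraction overflows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool zf;   /* result was zero */
    bool sf;   /* sign bit of the result was set */
} Flags;

/* Model of subq SECOND, FIRST: FIRST = FIRST - SECOND, and set ZF/SF. */
Flags subq(uint64_t second, uint64_t *first) {
    assert(first != NULL);
    *first = *first - second;
    Flags f = { .zf = (*first == 0), .sf = ((*first >> 63) != 0) };
    return f;
}

/* One function per row of the table. */
bool cond_le(Flags f) { return f.sf || f.zf; }
bool cond_l(Flags f)  { return f.sf; }
bool cond_e(Flags f)  { return f.zf; }
bool cond_ne(Flags f) { return !f.zf; }
bool cond_ge(Flags f) { return !f.sf; }
bool cond_g(Flags f)  { return !f.sf && !f.zf; }
```

For example, after subq(5, &r12) with r12 holding 3, the result is negative, so the l and le conditions hold while ge and g do not.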
Y86-64: conditionals (1)

  cmp and test are not available
  instead: use side effect of normal arithmetic

  instead of
    cmpq %r11, %r12
    jle somewhere

  maybe:
    subq %r11, %r12
    jle somewhere
  (but changes %r12)

push/pop

  pushq %rbx
    %rsp ← %rsp − 8
    memory[%rsp] ← %rbx

  popq %rbx
    %rbx ← memory[%rsp]
    %rsp ← %rsp + 8

  stack (grows toward lower addresses):
    ...
    memory[%rsp + 16]
    memory[%rsp + 8]
    memory[%rsp]         value to pop
    memory[%rsp - 8]     where to push
    memory[%rsp - 16]
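The two-step definitions of pushq and popq can be modeled directly in C. This is a minimal sketch with made-up names (Machine, MEM_SIZE, and the pushq/popq helper functions), a tiny byte-array memory, and host byte order for the stored values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 256 };              /* hypothetical tiny memory */

typedef struct {
    uint64_t rsp;                     /* stack pointer (byte address) */
    uint8_t memory[MEM_SIZE];
} Machine;

/* pushq: %rsp <- %rsp - 8, then memory[%rsp] <- value */
void pushq(Machine *m, uint64_t value) {
    assert(m->rsp >= 8 && m->rsp <= MEM_SIZE);
    m->rsp -= 8;
    memcpy(&m->memory[m->rsp], &value, sizeof value);
}

/* popq: result <- memory[%rsp], then %rsp <- %rsp + 8 */
uint64_t popq(Machine *m) {
    assert(m->rsp + 8 <= MEM_SIZE);
    uint64_t value;
    memcpy(&value, &m->memory[m->rsp], sizeof value);
    m->rsp += 8;
    return value;
}
```

Starting with rsp at MEM_SIZE (an empty stack in this model), pushing two values and popping twice returns them in reverse order and restores rsp.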
call/ret

  call LABEL
    push PC (next instruction address) on stack
    jmp to LABEL address

  ret
    pop address from stack
    jmp to that address

  stack (grows toward lower addresses):
    ...
    memory[%rsp + 16]
    memory[%rsp + 8]
    memory[%rsp]         address ret jumps to
    memory[%rsp - 8]     where call stores return address
    memory[%rsp - 16]

Y86-64 state

  %rXX: 15 registers
    %r15 missing, replaced with "no register"
    smaller parts of registers missing
  ZF (zero), SF (sign), OF (overflow)
    book has OF, we'll not use it
    CF (carry) missing (no unsigned jumps)
  Stat: processor status (halted?)
  PC: program counter (AKA instruction pointer)
  main memory

typical RISC ISA properties

  fewer, simpler instructions
  separate instructions to access memory
  fixed-length instructions
  more registers
  no "loops" within single instructions
  no instructions with two memory operands
  few addressing modes

Y86-64 instruction formats

    instruction              byte 0   byte 1   bytes 2-9
    halt                     0 0
    nop                      1 0
    rrmovq/cmovCC rA, rB     2 cc     rA rB
    irmovq V, rB             3 0      F  rB    V
    rmmovq rA, D(rB)         4 0      rA rB    D
    mrmovq D(rB), rA         5 0      rA rB    D
    OPq rA, rB               6 fn     rA rB
    jCC Dest                 7 cc     Dest (bytes 1-8)
    call Dest                8 0      Dest (bytes 1-8)
    ret                      9 0
    pushq rA                 A 0      rA F
    popq rA                  B 0      rA F
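To make the format table concrete, here is a sketch of encoding two instructions into bytes. The helper names encode_irmovq and encode_opq are invented for illustration. Two details are not on the slide and are taken from the textbook's Y86-64 description, so treat them as assumptions here: register numbers (%r11 is 0xB, %r12 is 0xC, 0xF means "no register"), fn = 0 for addq, and constants stored as 8 little-endian bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* irmovq V, rB  ->  3 0 | F rB | V (8 bytes, assumed little-endian) */
size_t encode_irmovq(uint8_t *out, uint64_t V, unsigned rB) {
    assert(out != NULL && rB < 0xF);
    out[0] = 0x30;
    out[1] = (uint8_t)(0xF0 | rB);          /* rA field is F: no register */
    for (int i = 0; i < 8; i++) {
        out[2 + i] = (uint8_t)(V >> (8 * i));
    }
    return 10;                              /* bytes written */
}

/* OPq rA, rB  ->  6 fn | rA rB  (assumed: fn 0 = addq) */
size_t encode_opq(uint8_t *out, unsigned fn, unsigned rA, unsigned rB) {
    assert(out != NULL && fn <= 3 && rA < 0xF && rB < 0xF);
    out[0] = (uint8_t)(0x60 | fn);
    out[1] = (uint8_t)((rA << 4) | rB);
    return 2;                               /* bytes written */
}
```

Under those assumptions, irmovq $1, %r11 would encode as 30 FB followed by 01 and seven zero bytes, and addq %r11, %r12 as 60 BC.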