cs 3330 introduction

CS 3330 introduction. Layers of abstraction: higher-level language: C (x += y); assembly: x86-64 (add %rbx, %rax); machine code: Y86 (60 03, base sixteen); hardware design language: HCLRS; gates / transistors / wires / registers.


  1. goals/other topics: understand how hardware works for… program performance; what compilers are/do; weird program behaviors

  2. weird program behaviors: what is a segmentation fault really? how does the operating system interact with programs? if you want to handle them: writing OSs

  3. interlude: powers of two

         power   value            prefix
         2^0     1
         2^1     2
         2^2     4
         2^3     8
         2^4     16
         2^5     32
         2^6     64
         2^7     128
         2^8     256
         2^9     512
         2^10    1 024            K (or Ki)
         2^11    2 048
         2^12    4 096
         2^13    8 192
         2^14    16 384
         2^15    32 768
         2^16    65 536
         …
         2^20    1 048 576        M (or Mi)
         …
         2^30    1 073 741 824    G (or Gi)
         2^31    2 147 483 648
         2^32    4 294 967 296

  4. powers of two: forward (20 = M, 30 = G)

         2^35 = 2^5 · 2^30 = 32G
         2^21 = 2^1 · 2^20 = 2M
         2^9  = 512
         2^14 = 2^4 · 2^10 = 16K

  10. powers of two: backward

         16G  = 16 · 2^30  = 2^(30+4) = 2^34
         128K = 128 · 2^10 = 2^(10+7) = 2^17
         4M   = 4 · 2^20   = 2^(20+2) = 2^22
         256T = 256 · 2^40 = 2^(40+8) = 2^48
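The forward and backward conversions above can be sketched in C. This is only an illustration: the function names `prefix_exponent`, `to_exponent`, and `from_exponent` are made up for this example, and the code assumes the count is a power of two and that prefixes stop at T.

```c
#include <stdint.h>

/* Exponent contributed by a binary prefix: K = 2^10, M = 2^20, G = 2^30, T = 2^40.
   Returns -1 for an unrecognized prefix. (Hypothetical helper for this sketch.) */
int prefix_exponent(char prefix) {
    switch (prefix) {
    case 'K': return 10;
    case 'M': return 20;
    case 'G': return 30;
    case 'T': return 40;
    default:  return -1;
    }
}

/* "Backward": 16G -> 34, because 16 = 2^4 and G = 2^30.
   count must be a power of two; returns -1 otherwise. */
int to_exponent(uint64_t count, char prefix) {
    int base = prefix_exponent(prefix);
    if (base < 0 || count == 0 || (count & (count - 1)) != 0) {
        return -1;
    }
    int extra = 0;
    while (count > 1) {   /* count the trailing power of two: 16 -> 4 */
        count >>= 1;
        extra++;
    }
    return base + extra;
}

/* "Forward": exponent 35 -> count 32 with prefix 'G'.
   *prefix is '\0' when the exponent is below 10.
   Assumes 0 <= exponent < 104 so the shift stays in range. */
uint64_t from_exponent(int exponent, char *prefix) {
    static const char prefixes[] = { '\0', 'K', 'M', 'G', 'T' };
    int group = exponent / 10;
    if (group > 4) {
        group = 4;        /* nothing larger than T in this sketch */
    }
    *prefix = prefixes[group];
    return (uint64_t)1 << (exponent - 10 * group);
}
```

For example, `to_exponent(256, 'T')` gives 48, and `from_exponent(14, &p)` gives 16 with `p` set to `'K'`.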

  14. lecturers: Graham and I are co-teaching two lecture sections, mostly alternating: one week me, one week Graham; same(ish) lecture in each section

  15. coursework: labs (grading: did you make reasonable progress? collaboration permitted); homework assignments (introduced by lab (mostly); due Tuesday night before next lab; complete individually); exams; weekly quizzes

  16. on lecture/lab/HW synchronization: labs/HWs not quite synchronized with lectures; main problem: want to cover material before you need it in lab/HW

  17. quizzes? linked off course website (demo); after each week; primarily based on lecture material from previous week; some questions from reading for next week; one quiz dropped; first quiz: after this week

  18. quiz demo

  19. attendance? lecture: strongly recommended; we will try to record lectures (best-effort; sometimes technical difficulties). lab: generally electronic, remote-possible submission

  20. late policy: exceptional circumstance? contact us. otherwise, for homeworks only: -10% if 0 to 48 hours late; -15% if 48 to 72 hours late; -100% otherwise. late quizzes, labs: no (we release answers); talk to us if illness, etc.

  21. TAs/Office Hours: office hours will be posted on calendar on the website; should be plenty; use them

  22. your TODO list: department account and/or C environment working; department accounts should happen by this weekend; before lab next week

  23. grading: Quizzes: 10%; Midterms (2): 30%; Final Exam (cumulative): 20%; Homework + Labs: 40%

  26. memory: array of bytes (byte = 8 bits); CPU interprets based on how accessed

         address       value
         0xFFFFFFFF    0x14
         0xFFFFFFFE    0x45
         0xFFFFFFFD    0xDE
         …             …
         0x00042006    0x06
         0x00042005    0x05
         0x00042004    0x04
         0x00042003    0x03
         0x00042002    0x02
         0x00042001    0x01
         0x00042000    0x00
         0x00041FFF    0x03
         0x00041FFE    0x60
         …             …
         0x00000002    0xFE
         0x00000001    0xE0
         0x00000000    0xA0

  29. endianness: addresses 0x00042000 through 0x00042003 hold the bytes 0x00, 0x01, 0x02, 0x03

         int *x = (int*)0x42000;
         cout << *x << endl;   // or printf("%d\n", *x);

     little endian (least significant byte has lowest address): 0x03020100 = 50462976
     big endian (most significant byte has lowest address): 0x00010203 = 66051
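A minimal sketch of the two interpretations, written so the result does not depend on the machine running it. The helper names `load_little_endian`, `load_big_endian`, and `host_is_little_endian` are invented for this example; the byte values are the ones stored at 0x00042000 through 0x00042003 on the slide.

```c
#include <stdint.h>

/* Interpret 4 bytes as a little-endian integer:
   the byte at the lowest address is the least significant. */
uint32_t load_little_endian(const uint8_t *bytes) {
    return (uint32_t)bytes[0]
         | (uint32_t)bytes[1] << 8
         | (uint32_t)bytes[2] << 16
         | (uint32_t)bytes[3] << 24;
}

/* Interpret 4 bytes as a big-endian integer:
   the byte at the lowest address is the most significant. */
uint32_t load_big_endian(const uint8_t *bytes) {
    return (uint32_t)bytes[0] << 24
         | (uint32_t)bytes[1] << 16
         | (uint32_t)bytes[2] << 8
         | (uint32_t)bytes[3];
}

/* Returns 1 if the machine running this code stores the least
   significant byte of an integer at the lowest address. */
int host_is_little_endian(void) {
    uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

With `uint8_t bytes[4] = {0x00, 0x01, 0x02, 0x03}`, `load_little_endian(bytes)` is 0x03020100 (50462976) and `load_big_endian(bytes)` is 0x00010203 (66051), matching the two values on the slide. Dereferencing an `int *` pointing at those bytes gives whichever of the two the host uses; x86-64 is little endian.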

  34. program memory (x86-64 Linux), listed from highest address to lowest

         0xFFFF FFFF FFFF FFFF
             Used by OS
         0xFFFF 8000 0000 0000
             Stack (near 0x7F…; stack grows down)
             Heap / other dynamic
             Writable data
             Code + Constants
         0x0000 0000 0040 0000

     stack contents as listed on the slide: (next thing on stack), local variables, callee saved registers, return address, …, argument 7, argument 6, …; “top” has smallest address

  38. compilation pipeline: main.c (C code) → compile → main.s (assembly) → assemble → main.o (object file, machine code) → linking, together with printf.o (object file) → main.exe (executable, machine code)

     main.c:

         #include <stdio.h>
         int main(void) {
             printf("Hello, World!\n");
         }

  41. compilation commands

         compile:   gcc -S file.c         ⇒ file.s (assembly)
         assemble:  gcc -c file.s         ⇒ file.o (object file)
         link:      gcc -o file file.o    ⇒ file (executable)
         c+a:       gcc -c file.c         ⇒ file.o
         c+a+l:     gcc -o file file.c    ⇒ file
         …

  42. what’s in those files?

     hello.c:

         #include <stdio.h>
         int main(void) {
             puts("Hello, World!");
             return 0;
         }

     hello.s:

         .text
         main:
             sub $8, %rsp         # Linux x86-64 calling convention: stack addr. must be multiple of 16
             mov $.Lstr, %rdi
             call puts
             xor %eax, %eax       # sets eax to 0 (shorter machine code than mov)
             add $8, %rsp
             ret
         .data
         .Lstr: .string "Hello, World!"

     hello.s (Intel syntax):

         .text
         main:
             sub RSP, 8
             mov RDI, .Lstr
             call puts
             xor EAX, EAX
             add RSP, 8
             ret
         .data
         .Lstr: .string "Hello, World!"

     hello.o (actually binary, but shown as hexadecimal):

         text (code) segment: 48 83 EC 08 BF 00 00 00 00 E8 00 00 00 00 31 C0 48 83 C4 08 C3
         data segment: 48 65 6C 6C 6F 2C 20 57 6F 72 6C 00
         relocations: take 0s at text, byte 6 and replace with address of data segment, byte 0; take 0s at text, byte 11 and replace with address of puts
         symbol table: main = text, byte 0

     hello.exe (hello.o + stdio.o):

         … 48 83 EC 08 BF A7 02 04 00 E8 08 4A 04 00 31 C0 48 83 C4 08 C3 …
         …(code from stdio.o) …
         48 65 6C 6C 6F 2C 20 57 6F 72 6C 00 …
         …(data from stdio.o) …

  53. hello.s

             .section .rodata.str1.1,"aMS",@progbits,1
         .LC0:
             .string "Hello, World!"
             .text
             .globl main
         main:
             subq $8, %rsp
             movl $.LC0, %edi
             call puts
             movl $0, %eax
             addq $8, %rsp
             ret

  54. exercise (1)

     main.c:

         #include <stdio.h>
         void sayHello(void) {
             puts("Hello, World!");
         }
         int main(void) {
             sayHello();
         }

     Which files contain the memory address of sayHello? A. main.s (assembly); B. main.o (object); C. main.exe (executable); D. B and C; E. A, B and C; F. something else

  55. exercise (2): for the same main.c, which files contain the literal ASCII string of Hello, World!? A. main.s (assembly); B. main.o (object); C. main.exe (executable); D. B and C; E. A, B and C; F. something else

  56. dynamic linking (very briefly): dynamic linking is done when the application is loaded; other type of linking: static (gcc -static). idea: don’t have N copies of printf on disk. load executable file + its libraries into memory when app starts. often extra indirection: call functionTable[number_for_printf]; linker fills in functionTable instead of changing calls. example: ls.exe and emacs.exe each contain call functionTable[number_for_printf]; libc.so contains printf: …
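The extra indirection described above can be imitated in plain C with an array of function pointers. This is a loose analogy, not how a real dynamic linker is written: only `functionTable` comes from the slide, while `double_it`, `negate`, `fill_function_table`, and `call_through_table` are hypothetical names standing in for library functions, the loader step, and a call site.

```c
#include <stddef.h>

typedef int (*library_function)(int);

/* Slot numbers play the role of number_for_printf on the slide. */
enum { NUMBER_FOR_DOUBLE_IT, NUMBER_FOR_NEGATE, FUNCTION_COUNT };

/* Stand-ins for functions that would live in a shared library. */
static int double_it(int x) { return 2 * x; }
static int negate(int x) { return -x; }

/* Empty (all NULL) until the "loader" step below fills it in. */
static library_function functionTable[FUNCTION_COUNT];

/* Plays the role of the dynamic linker: it fills in the table once,
   instead of patching every call instruction in the program. */
void fill_function_table(void) {
    functionTable[NUMBER_FOR_DOUBLE_IT] = double_it;
    functionTable[NUMBER_FOR_NEGATE] = negate;
}

/* A call site: the code itself never changes, it always reads
   the target address out of the table and calls it. */
int call_through_table(int number, int argument) {
    return functionTable[number](argument);
}
```

After `fill_function_table()`, `call_through_table(NUMBER_FOR_DOUBLE_IT, 21)` returns 42. The design point is that only the table has to be written at load time, so the program's code can stay identical no matter where the library ends up in memory.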

  57. ldd /bin/ls

         $ ldd /bin/ls
             linux-vdso.so.1 => (0x00007ffcca9d8000)
             libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f851756f000)
             libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f85171a5000)
             libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f8516f35000)
             libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8516d31000)
             /lib64/ld-linux-x86-64.so.2 (0x00007f8517791000)
             libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f8516b14000)

  58. relocation types: machine code doesn’t always use addresses as is, e.g. “call function 4303 bytes later”; linker needs to compute “4303”; extra ‘type’ field on relocation list; e.g. call puts is 0x48 (4-byte offset to puts function)
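A rough sketch of what applying the two kinds of relocation could look like, using the hello.o text bytes shown earlier. Everything here is illustrative: `relocate_absolute32` and `relocate_pc_relative32` are invented names, real object files describe relocations in their own table formats, and the load address used in the usage note is a made-up value chosen only so the arithmetic reproduces the bytes from the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value at code[offset..offset+3], least significant byte first. */
static void store_le32(uint8_t *code, size_t offset, uint32_t value) {
    for (int i = 0; i < 4; i++) {
        code[offset + i] = (uint8_t)(value >> (8 * i));
    }
}

/* Absolute relocation: the placeholder is replaced by the address itself. */
void relocate_absolute32(uint8_t *code, size_t offset, uint32_t target_address) {
    store_le32(code, offset, target_address);
}

/* PC-relative relocation: the placeholder is replaced by the distance from
   the end of the 4-byte field (the next instruction) to the target.
   code_address is where code[0] will sit in memory. */
void relocate_pc_relative32(uint8_t *code, size_t offset,
                            uint64_t code_address, uint64_t target_address) {
    uint64_t next_instruction = code_address + offset + 4;
    store_le32(code, offset, (uint32_t)(target_address - next_instruction));
}
```

Counting from zero, the placeholder after the `BF` byte starts at offset 5 and the one after `E8` at offset 10. Calling `relocate_absolute32(text, 5, 0x000402A7)` produces `BF A7 02 04 00`. If the text were loaded at a hypothetical address 0x401000 with `puts` at 0x445A16, `relocate_pc_relative32(text, 10, 0x401000, 0x445A16)` would produce `E8 08 4A 04 00`, since 0x445A16 - 0x40100E = 0x44A08.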

  59. AT&T versus Intel syntax by example (AT&T on the left, Intel on the right)

         movq $42, (%rbx)               mov QWORD PTR [rbx], 42
         subq %rax, %r8                 sub r8, rax
         movq $42, 100(%rbx,%rcx,4)     mov QWORD PTR [rbx+rcx*4+100], 42
         jmp *%rax                      jmp rax
         jmp *1000(%rax,%rbx,8)         jmp QWORD PTR [RAX+RBX*8+1000]

  60. AT&T versus Intel syntax (1): AT&T syntax: movq $42, (%rbx); Intel syntax: mov QWORD PTR [rbx], 42; effect (pseudo-C): memory[rbx] <- 42

  61. AT&T syntax example (1): movq $42, (%rbx)   // memory[rbx] ← 42
     destination last; ()s represent value in memory; constants start with $; registers start with %; q (‘quad’) indicates length (8 bytes); l: 4; w: 2; b: 1; sometimes can be omitted

  66. AT&T versus Intel syntax (2): AT&T syntax: movq $42, 100(%rbx,%rcx,4); Intel syntax: mov QWORD PTR [rbx+rcx*4+100], 42; effect (pseudo-C): memory[rbx + rcx * 4 + 100] <- 42

  70. AT&T syntax: addressing

         100(%rbx)          : memory[rbx + 100]
         100(,%rbx,8)       : memory[rbx * 8 + 100]
         100(%rcx,%rbx,8)   : memory[rcx + rbx * 8 + 100]
         100                : memory[100]
         100(%rbx,%rcx)     : memory[rbx + rcx + 100]

     note: 100(%rbx,8) is not the way to write memory[rbx * 8 + 100]; when the base is omitted, the leading comma is required, as in 100(,%rbx,8)

  71. AT&T versus Intel syntax (3): AT&T syntax: subq %rax, %r8; Intel syntax: sub r8, rax; effect: r8 ← r8 - rax; same for cmpq

  72. AT&T syntax: addresses

         addq 0x1000, %rax      // rax ← rax + memory[0x1000]; Intel syntax: add rax, QWORD PTR [0x1000]
         addq $0x1000, %rax     // rax ← rax + 0x1000; Intel syntax: add rax, 0x1000

     no $: probably memory address

  73. AT&T syntax in one slide: destination last; () means value in memory; disp(base, index, scale) same as memory[disp + base + index * scale]; omit disp (defaults to 0) and/or omit base (defaults to 0) and/or omit scale (defaults to 1); $ means constant; plain number/label means value in memory
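The address formula for disp(base, index, scale) can be written as a one-line C function. This is a sketch for checking the arithmetic by hand; the name `effective_address` and its parameter order are choices made for this example, and an omitted operand is passed as its default (0 for disp and base, 1 for scale).

```c
#include <stdint.h>

/* Address named by the AT&T operand disp(base, index, scale):
   disp + base + index * scale.
   Pass 0 for an omitted disp or base, 0 for an omitted index,
   and 1 for an omitted scale. */
uint64_t effective_address(int64_t disp, uint64_t base,
                           uint64_t index, uint64_t scale) {
    return (uint64_t)disp + base + index * scale;
}
```

For instance, with rbx = 0x1000 and rcx = 3, the operand `100(%rbx,%rcx,4)` corresponds to `effective_address(100, 0x1000, 3, 4)`, which is 0x1000 + 100 + 12 = 4208. The operand `100(,%rbx,8)` with rbx = 2 corresponds to `effective_address(100, 0, 2, 8)`, which is 116.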

  74. extra detail: computed jumps

         jmpq *%rax                  // Intel syntax: jmp RAX; goto RAX (go to that address)
         jmpq *1000(%rax,%rbx,8)     // Intel syntax: jmp QWORD PTR [RAX + RBX*8 + 1000]; read address from memory at RAX + RBX * 8 + 1000, then go to that address
