Project Application Development: Building an IJVM emulator

Sebastian Österlund (s.osterlund@vu.nl), Department of Computer Science, Vrije Universiteit Amsterdam, June 2020


  1. Decoding
      Instructions come in three forms:
      1. No argument (IADD, ISUB)
      2. Byte argument (BIPUSH 0x42, BIPUSH 0x1)
      3. Two bytes of arguments, or one short (GOTO 0x0010, IINC 0x2 0x1)
      Examples:
          0x10 0x1        decodes to  BIPUSH 0x1
          0x84 0x1 0x5    decodes to  IINC 0x1 0x5
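The three instruction forms can be told apart from the opcode byte alone. A minimal sketch, assuming the opcode values that appear in this deck (0x10 for BIPUSH, 0x84 for IINC, 0x64 for ISUB, 0x59 for DUP, 0x99 for IFEQ, 0xA7 for GOTO) and the usual IJVM value 0x60 for IADD. `operand_bytes` is a hypothetical helper, not part of the course skeleton, and the table is deliberately incomplete:

```c
#include <stdint.h>

/* Opcode values: an illustrative subset, see the assumptions above. */
enum {
    OP_BIPUSH = 0x10,
    OP_DUP    = 0x59,
    OP_IADD   = 0x60,
    OP_ISUB   = 0x64,
    OP_IINC   = 0x84,
    OP_IFEQ   = 0x99,
    OP_GOTO   = 0xA7
};

/* Number of argument bytes that follow the opcode,
 * or -1 if the opcode is not in this small table. */
int operand_bytes(uint8_t opcode)
{
    switch (opcode) {
    case OP_IADD:
    case OP_ISUB:
    case OP_DUP:
        return 0;               /* no argument */
    case OP_BIPUSH:
        return 1;               /* one byte argument */
    case OP_GOTO:
    case OP_IFEQ:
    case OP_IINC:
        return 2;               /* one short, or two byte arguments */
    default:
        return -1;              /* unknown to this sketch */
    }
}
```

A decoder loop could then advance the program counter by `1 + operand_bytes(opcode)` for every instruction that does not branch.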

  3. Decoding
      Note: some arguments are signed while others are not.

  4. Signed vs. unsigned
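The difference between signed and unsigned arguments only shows up when the raw bytes are converted to a number. A small sketch, assuming big-endian shorts and assuming (as in the usual IJVM description) that the BIPUSH byte and branch offsets are signed while indices are unsigned. The function names are made up for illustration:

```c
#include <stdint.h>

/* Interpret one argument byte as a signed value (e.g. a BIPUSH argument).
 * 0xFF becomes -1 rather than 255. */
int32_t read_signed_byte(const uint8_t *p)
{
    return (int8_t)p[0];
}

/* Interpret two bytes as an unsigned big-endian short (e.g. an index). */
uint16_t read_unsigned_short(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Interpret the same two bytes as a signed short (e.g. a branch offset). */
int16_t read_signed_short(const uint8_t *p)
{
    return (int16_t)read_unsigned_short(p);
}
```

For the bytes `0xFF 0xF6`, the unsigned reading is 0xFFF6 (65526) while the signed reading is -10, which is what a backwards jump needs.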

  5. IJVM memory model
      - pc: program counter
          - incremented after each instruction
          - modified by branching instructions
      - sp: stack pointer
      - memory consists of local frames
          - used for methods
          - each keeps a link to the previous frame
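To make pc and sp concrete, here is a minimal sketch of a machine state with a fixed-size operand stack. The stack size, the field layout and all names (`machine_t`, `machine_init`, `push`, `pop`) are illustrative assumptions. Frames and the link to the previous frame are left out on purpose:

```c
#include <stdint.h>

#define STACK_WORDS 256

typedef int32_t word_t;

typedef struct {
    int pc;                     /* index of the next instruction byte */
    int sp;                     /* index of the top of stack, -1 when empty */
    word_t stack[STACK_WORDS];
} machine_t;

void machine_init(machine_t *m)
{
    m->pc = 0;
    m->sp = -1;
}

/* Push a word. Returns 0 on success, -1 if the stack is full. */
int push(machine_t *m, word_t value)
{
    if (m->sp + 1 >= STACK_WORDS) {
        return -1;
    }
    m->stack[++m->sp] = value;
    return 0;
}

/* Pop a word into *out. Returns 0 on success, -1 if the stack is empty. */
int pop(machine_t *m, word_t *out)
{
    if (m->sp < 0) {
        return -1;
    }
    *out = m->stack[m->sp--];
    return 0;
}
```

Returning an error code instead of crashing makes stack underflow in a broken program easy to report.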

  12. IJVM programs
      - IJVM executes bytecode, which is not really human-readable.
      - Programs are written in Java ASsembly (JAS).
      - We have written a custom assembler for this course (goJASM).

  13. Assembler
      - Translates text into machine code
      - Translates variable names into indices and labels into offsets

  14. JAS program
          .main
          L1:
              BIPUSH 0xA      // push 10 to stack
          L2:
              BIPUSH 0x1
              ISUB            // Subtract 1
              DUP
              IFEQ END        // Jump to end if zero
              BIPUSH 0x31
              OUT             // Print 1
              GOTO L2         // Repeat loop

          END:
              HALT
          .end-main

  15. IJVM program
          ./gojasm -o count.ijvm count.jas
          xxd count.ijvm
          00000000: 1dea dfad 0001 0000 0000 0000 0000 0000  ................
          00000010: 0000 0010 100a 1001 6459 9900 0910 31fd  ........dY....1.
          00000020: a7ff f6ff                                ....
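The dump starts with the bytes 1d ea df ad, and its multi-byte values read naturally as big-endian. A short sketch of how a loader might check this, assuming that 0x1DEADFAD is a file signature (magic number). That interpretation, and the helper names `read_be32` and `has_ijvm_magic`, are assumptions rather than something the slides state:

```c
#include <stddef.h>
#include <stdint.h>

#define IJVM_MAGIC 0x1DEADFADu  /* assumed signature: first four bytes of the dump */

/* Read a 32-bit big-endian word from a byte buffer. */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 1 if the buffer is long enough and starts with the signature. */
int has_ijvm_magic(const uint8_t *buf, size_t len)
{
    return len >= 4 && read_be32(buf) == IJVM_MAGIC;
}
```

Assembling the word byte by byte keeps the code independent of the byte order of the machine the emulator runs on.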

  18. Recap
      So far we have seen: stack (machines), instruction decoding, assembly, branching.

  19. Methods
      Last part: methods

  20. Methods
      - re-usable subroutines, for example add(a, b)
      - also supports recursion
      - last thing you have to implement (but can be challenging...)

          .main
              BIPUSH 0x0      // OBJREF
              BIPUSH 0x2
              BIPUSH 0x3
              INVOKEVIRTUAL add
              BIPUSH 0x30
              IADD
              OUT
              HALT
          .end-main

          .method add(a, b)
              ILOAD a
              ILOAD b
              IADD
              IRETURN
          .end-method

  21. Invocation

  22. Return

  23. IJVM
      IJVM: done!

  24. Outline
      1. Introduction
      2. IJVM
      3. The C Programming Language
      4. Assignment

  25. Where are we?
      1. Introduction
      2. IJVM
      3. The C Programming Language
      4. Assignment

  26. History
      - Developed by Dennis Ritchie at Bell Labs, 1969-1973
      - Used for developing UNIX
      - No. 2 on the TIOBE programming language popularity index
      - Still used for most systems programming nowadays!

  27. Famous programs written in C
      - Operating system kernels: Windows, OSX, Linux, Unix
      - Wolfenstein 3D, DOOM, Quake (1/2/3)
      - Git
      - Most Linux tools (ls, awk, etc.)
      - Embedded systems (your microwave?)
      - Much, much more

  28. Hello World

  29. C vs. C++
      - No classes
      - No generics/templates
      - No references (only pointers)
      - Much smaller standard library
      - No operator overloading
      - No namespaces
      - No streams (cin/cout)
      - No exception handling
      - No RAII

  30. C vs. C++
      Upside: C gives you more control and is (arguably) easier to master.

  31. C vs. C++
      Note: C++ is largely a superset of C.
      This means that most (but not all) C code compiles as C++.

  32. C is notorious for memory errors
      Source of most (serious) errors and vulnerabilities!
      (In fact Sebastian’s research is on memory safety)

  33. Basics
          #include <stdio.h> // Include standard I/O header

          // main is called on start, with the number of command
          // line arguments stored in argc, and the pointers to the
          // arguments in argv. argv is an array of char pointers.
          int main(int argc, char **argv) {
              char *binary_name = argv[0]; // argv[0] is the program's own name
              if (argc < 3) { // program name plus 2 arguments
                  printf("expected 2 command line arguments, got %d\n", argc - 1);
                  return 1; // return status code 1 for error
              }
              char *name = argv[1];
              char *code = argv[2];
              printf("Hello %s, you entered the code %s\n", name, code);
              return 0; // return status code 0 for normal execution
          }

  34. What we will cover
      1. Arrays
      2. Pointers
      3. Memory management
      4. Strings
      5. Structs
      6. I/O
      7. Style

  35. Warning
      I will be very brief. Feel free to ask questions at any time!

  36. Constants in C
      Define a constant as
          #define MINUTES_PER_HOUR 60
      This is essentially a text replacement done by the preprocessor.
      The C++ construct (const int MINUTES_PER_HOUR = 60;) does not give you a compile-time constant in C: it cannot be used in a case label, for example.

  37. Type alias
      Define a type using: typedef existingtype newtype
          typedef int32_t word_t;
      Note: the (u)intX_t types are defined in stdint.h
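A small sketch that puts a preprocessor constant and a type alias to work together. `hours_to_minutes` is a made-up function for illustration only:

```c
#include <stdint.h>

#define MINUTES_PER_HOUR 60     /* replaced textually before compilation */

typedef int32_t word_t;         /* word_t is now another name for int32_t */

/* Convert hours to minutes using the constant and the alias. */
word_t hours_to_minutes(word_t hours)
{
    return hours * MINUTES_PER_HOUR;
}
```

Because `word_t` is only an alias, changing the underlying type later means editing the typedef and nothing else.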

  38. Pointers
      A memory item that holds an address.
      Take the address of a variable using &:
          char *myptr;
          char mychar = 0x42;
          myptr = &mychar;
      myptr now holds the address of mychar.
      Declaration and initialisation in one line:
          char *myptr = &mychar;
      Can be confusing! View char * as the type.
      Dereference using *:
          *myptr == 0x42;
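Pointers are what let a function change a variable that belongs to its caller. A minimal sketch with two hypothetical helpers:

```c
/* Write through a pointer: the caller's variable changes. */
void set_to_0x42(char *target)
{
    *target = 0x42;
}

/* Swap two ints by passing their addresses. */
void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

A caller passes addresses with `&`, for example `swap(&x, &y);`.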

  39. Arrays
      A contiguous list of elements of a certain type.
      Many ways to define:
          int myarr[1024];
          int myarr[] = {1,2,3};
          int *myarr = malloc(1024 * sizeof(int));
      Access an element:
          First element: myarr[0];
          Element at index n: myarr[n];
      Behaves much like a pointer: myarr[42] is the same as 42[myarr], which is the same as *(myarr + 42)
      No bounds checking!
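Since there is no bounds checking and a pointer does not carry a length, functions that take an array usually take its length as a separate parameter. A short sketch (`sum` is an illustrative name):

```c
#include <stddef.h>

/* Sum the first len elements. The length is passed explicitly
 * because the pointer alone says nothing about the array's size. */
int sum(const int *arr, size_t len)
{
    int total = 0;
    for (size_t i = 0; i < len; i++) {
        total += *(arr + i);    /* same as arr[i] */
    }
    return total;
}
```

Passing a `len` larger than the real array is not caught by the compiler or at run time: the function simply reads past the end.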

  40. Types of memory management
      - Static memory management (global variables)
      - Automatic memory management (stack)
      - Manual memory management: manual allocation/deallocation on the heap

  41. Heap
      - Allocate using void *malloc(size_t n); (stdlib.h)
      - Allocates n bytes of contiguous memory
      - Returns a pointer
      - Valid until void free(void *ptr) is called on that pointer

  42. Pointer/array
          int *myArray = malloc(sizeof(int) * 5);
          free(myArray);
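In real code the result of malloc should be checked before use, since it returns NULL when the allocation fails. A minimal sketch, with `make_range` as a hypothetical helper:

```c
#include <stdlib.h>

/* Allocate an array of n ints filled with 0..n-1.
 * Returns NULL if the allocation fails; the caller must free() the result. */
int *make_range(size_t n)
{
    int *arr = malloc(n * sizeof *arr);
    if (arr == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        arr[i] = (int)i;
    }
    return arr;
}
```

Writing `sizeof *arr` instead of `sizeof(int)` keeps the size correct even if the element type changes later.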

  43. Strings
      Unlike C++, C has no string type!
      A string is an array of characters, terminated by a NUL ('\0') character.
          char mystr[] = {'h', 'e', 'l', 'l', 'o', '\0'};
          char *mystr = "hello";
      You can use it just like an array: mystr[2] == 'l'
      Or point to a substring:
          char *mystr2 = &mystr[1]; // "ello"
      Built-in functions in the C standard library (string.h): strdup(), strcat(), etc.
      Do not use '==' to compare strings: it compares addresses, not contents. Use strcmp().
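strcmp() returns 0 when two strings are equal, which is easy to get backwards. A small sketch that wraps the comparison (`same_text` is a made-up name):

```c
#include <string.h>

/* Returns 1 if both strings contain the same characters, 0 otherwise. */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

For example, `same_text(&mystr[1], "ello")` is true for the string "hello" because the substring pointer shares the same terminator.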

  44. Structs
      A composite data type (record).
      Physically groups a list of variables under one name.
          struct tag_name {
              type member1;
              type member2;
              /* declare as many members as desired,
               * but the entire structure size must
               * be known to the compiler. */
          };
          struct tag_name mystruct;
      Access fields: mystruct.member1
      If you have a pointer: mystructptr->member1
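A concrete instance of the pattern, using an invented `struct point` to show both access styles:

```c
struct point {
    int x;
    int y;
};

/* Move a point in place. The struct is passed by pointer,
 * so fields are reached with -> instead of the dot. */
void translate(struct point *p, int dx, int dy)
{
    p->x += dx;     /* same as (*p).x += dx */
    p->y += dy;
}
```

Usage: declare `struct point p = {1, 2};`, call `translate(&p, 3, 4);`, then read `p.x` and `p.y` with the dot.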

  45. I/O
      Writing:
          printf("Hello\n");
          fprintf(stderr, "OH NO!\n");
      Read the printf documentation! (man 3 printf)
      Reading:
          int c = getc(stdin); // int, not char, so that EOF can be detected
          fread(buf, sizeof(char), 1, stdin);
          scanf(...);

  46. I/O
      Raw access using fopen(), fread(), fwrite()
          #include <stdio.h>

          int main(void) {
              FILE *f = fopen("myfile.txt", "r");
              if (f == NULL) {
                  return 1; // could not open the file
              }
              char buf[10];
              // fread returns the number of items actually read
              size_t n = fread(buf, sizeof(char), sizeof(buf), f);
              printf("read %zu bytes\n", n);
              fclose(f);
              return 0;
          }
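For binary input such as an .ijvm file, the same calls can be wrapped in a helper that reports how much was read. A sketch under the assumption that the whole file fits in the caller's buffer. `read_file` is a hypothetical name, and "rb" is used so that no newline translation happens on platforms that distinguish text and binary mode:

```c
#include <stdio.h>

/* Read up to cap bytes from the file at path into buf.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_file(const char *path, unsigned char *buf, size_t cap)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(buf, 1, cap, f);
    fclose(f);
    return (long)n;
}
```

The buffer is `unsigned char` so that bytes such as 0xFF are not mistaken for negative numbers when they are inspected later.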

  47. Using variables from another file
      Access a variable from another file (translation unit) using the extern keyword:
          extern int mycounter;
      static variables cannot be accessed from other files (they are private).

  48. Switch statement
          switch (c) {
          case '+':
              a = pop();
              b = pop();
              push(a + b);
              break;
          case '-':
              a = pop();
              b = pop();
              push(b - a);
              break;
          case '*':
              a = pop();
              b = pop();
              push(a * b);
              break;
          case EOF:
              // Terminate on CTRL-D
              printf("= %d\n", top());
              exit(0);
              break;
          case '\n':
              printf("= %d\n", top());
              break;
          default:
              push(c - '0'); // Push ASCII digit as integer
          }

  49. Code style
      “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” – Martin Fowler

  50. Code style
      - Code should explain itself to the reader
      - Keep functions short (roughly 20 lines or fewer)
      - Use meaningful names
      - Do not repeat yourself (DRY)
      - Avoid global variables if possible

  51. What is wrong here?
          void bad(int* a, int len_a, int* b, int len_b){
              int sum_a = 0;
              for(int i = 0 ; i < len_a; i++){
                  sum_a+= a[i];
              }
              int average_a = sum_a / len_a;
              int sum_b = 0;
              for(int i = 0 ; i < len_b; i++){
                  sum_b+= a[i];
              }
              int average_b = sum_b / len_b;

              ...
          }

  52. Better
          double average(int* a, int len) {
              double sum = 0;
              for(int i = 0 ; i < len; i++){
                  sum+= a[i];
              }
              return sum / len;
          }

          void better(int* a, int len_a, int* b, int len_b){
              double average_a = average(a,len_a);
              double average_b = average(b,len_b);
              ...
          }
      - If you are copying and pasting code, you are doing something wrong.
      - Do not repeat yourself (DRY)
      - Easier to read

  53. What is wrong here?
          void iadd(int* pc, word_t* stack, int* sp, bool* halted) {
              ...
          }

          void isub(int* pc, word_t* stack, int* sp, bool* halted) {
              ...
          }

          void bipush(byte a, int* pc, word_t* stack, int* sp, bool* halted) {
              ...
          }

  54. Better
          typedef struct {
              int pc;
              word_t* stack;
              int sp;
              bool halted;
          } jvm_state;

          void iadd(jvm_state* state) {
              ...
          }

          void isub(jvm_state* state) {
              ...
          }

          void bipush(byte a, jvm_state* state) {
              ...
          }
      - If you are copying and pasting code, you are doing something wrong.
      - Do not repeat yourself (DRY)
      - Easier to read

  55. What does this do?
          #include <stdio.h>

          int main(void){
              int f, c;
              int l = 0;
              int u = 300;
              int s = 20;
              for(f = l ; f <= u ; f+=s ) {
                  c = 5 * (f - 32) / 9;
                  printf("%d\t%d\n",f,c);
              }
              return 0;
          }

  56. Better
          #include <stdio.h>

          int fahr_to_celsius(int fahr) {
              return 5 * (fahr - 32) / 9;
          }

          void print_fahrenheit_to_celsius_table(void) {
              int fahr;
              int lower_fahr = 0;
              int upper_fahr = 300;
              int step_fahr = 20;
              for(fahr = lower_fahr; fahr <= upper_fahr; fahr+= step_fahr) {
                  printf("%d\t%d\n",fahr,fahr_to_celsius(fahr));
              }
          }

          int main(void){
              print_fahrenheit_to_celsius_table();
              return 0;
          }

  57. A word about style
      - Use more restrictive compiler flags: -Wall -Werror -Wpedantic -Wextra
      - The compiler can prevent many common bugs when given pedantic flags (see the Makefile for the complete list)
      - Avoid memory leaks (test with valgrind and the LLVM sanitizers)

  58. Outline
      1. Introduction
      2. IJVM
      3. The C Programming Language
      4. Assignment

  59. Overview

  60. Skeleton
      How to check out the skeleton:
          https://github.com/VU-Programming/pad-skeleton-c.git
      How to build:
          make
          ./ijvm mybinary.ijvm
      How to test:
          make test1
          ./test1
          make testbasic
          make testall
