

slide-1
SLIDE 1

bitwise (finish) / SEQ part 1

1

slide-2
SLIDE 2

Changelog

Changes made in this version not seen in first lecture:

14 September 2017: slide 16-17: the x86 arithmetic shift instruction is sar, not sra

1

slide-3
SLIDE 3

last time

bitwise strategies:

  • construct/apply mask = number with 1s to mark important bits
      AND/& : keep only marked
      OR/| : set marked
      XOR/^ : flip marked
  • shift bits to desired positions
  • divide and conquer: find subproblems
  • bitwise-like parallelism: multiple copies of operation in different parts of number
      example: OR all pairs of bits, not just last and second-to-last

2
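The strategies above can be sketched in C as follows. This is only an illustration: the helper names (make_mask, keep_marked, set_marked, flip_marked, or_all_pairs) are made up for this example and do not come from the lecture, and 32-bit values are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with 1s in bit positions [low, low + count). */
static uint32_t make_mask(unsigned low, unsigned count) {
    assert(count >= 1 && count <= 32 && low <= 32 - count);
    /* count == 32 is special-cased: shifting a 32-bit value by 32 is undefined */
    uint32_t ones = (count == 32) ? UINT32_MAX : ((UINT32_C(1) << count) - 1);
    return ones << low;
}

static uint32_t keep_marked(uint32_t x, uint32_t mask) { return x & mask; } /* AND */
static uint32_t set_marked(uint32_t x, uint32_t mask)  { return x | mask; } /* OR  */
static uint32_t flip_marked(uint32_t x, uint32_t mask) { return x ^ mask; } /* XOR */

/* Bitwise-like parallelism: OR every adjacent pair of bits in one step.
   Bit 2k of the result is (bit 2k | bit 2k+1) of x; odd bits are cleared. */
static uint32_t or_all_pairs(uint32_t x) {
    return (x | (x >> 1)) & UINT32_C(0x55555555);
}
```

For example, or_all_pairs(0x6) handles both pairs of the low four bits at once instead of looking only at the last and second-to-last bit.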

slide-4
SLIDE 4

exercise

Which of these will swap the last and second-to-last bit of an unsigned int x? (abcdef becomes abcdfe)

/* version A */
return ((x >> 1) & 1) | (x & (~1));
/* version B */
return ((x >> 1) & 1) | ((x << 1) & (~2)) | (x & (~3));
/* version C */
return (x & (~3)) | ((x & 1) << 1) | ((x >> 1) & 1);
/* version D */
return (((x & 1) << 1) | ((x & 3) >> 1)) ^ x;

3

slide-5
SLIDE 5

version A

/* version A */
return ((x >> 1) & 1) | (x & (~1));
//     ^^^^^^^^^^^^^^
//     abcdef --> 0abcde --> 00000e
//                      ^^^^^^^^^^
//                      abcdef --> abcde0
//     ^^^^^^^^^^^^^^^^^^^^^^^^^^^
//     00000e | abcde0 = abcdee

4

slide-6
SLIDE 6

version B

/* version B */
return ((x >> 1) & 1) | ((x << 1) & (~2)) | (x & (~3));
//     ^^^^^^^^^^^^^^
//     abcdef --> 0abcde --> 00000e
//                      ^^^^^^^^^^^^^^^^^
//                      abcdef --> bcdef0 --> bcde00
//                                          ^^^^^^^^^^
//                                          abcdef --> abcd00

5

slide-7
SLIDE 7

version C

/* version C */
return (x & (~3)) | ((x & 1) << 1) | ((x >> 1) & 1);
//     ^^^^^^^^^^
//     abcdef --> abcd00
//                  ^^^^^^^^^^^^^^
//                  abcdef --> 00000f --> 0000f0
//                                   ^^^^^^^^^^^^^^
//                                   abcdef --> 0abcde --> 00000e

6

slide-8
SLIDE 8

version D

/* version D */
return (((x & 1) << 1) | ((x & 3) >> 1)) ^ x;
//      ^^^^^^^^^^^^^^
//      abcdef --> 00000f --> 0000f0
//                       ^^^^^^^^^^^^^^
//                       abcdef --> 0000ef --> 00000e
//     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
//     0000fe ^ abcdef --> abcd(f XOR e)(e XOR f)

7

slide-9
SLIDE 9

int lastBit = x & 1;
int secondToLastBit = x & 2;
int rest = x & ~3;
int lastBitInPlace = lastBit << 1;
int secondToLastBitInPlace = secondToLastBit >> 1;
return rest | lastBitInPlace | secondToLastBitInPlace;

8
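One way to check the exercise is to compare each version against the step-by-step code above on small inputs. The sketch below does that; the function names (version_a through version_d, swap_reference, swaps_correctly) are made up for this example. Working the four traces by hand gives the same verdict as the check: only version C swaps the two bits.

```c
#include <assert.h>

static unsigned version_a(unsigned x) { return ((x >> 1) & 1) | (x & (~1)); }
static unsigned version_b(unsigned x) { return ((x >> 1) & 1) | ((x << 1) & (~2)) | (x & (~3)); }
static unsigned version_c(unsigned x) { return (x & (~3)) | ((x & 1) << 1) | ((x >> 1) & 1); }
static unsigned version_d(unsigned x) { return (((x & 1) << 1) | ((x & 3) >> 1)) ^ x; }

/* Reference: extract each piece by name, then move it into place. */
static unsigned swap_reference(unsigned x) {
    unsigned lastBit = x & 1;
    unsigned secondToLastBit = x & 2;
    unsigned rest = x & ~3u;
    return rest | (lastBit << 1) | (secondToLastBit >> 1);
}

/* Returns 1 if candidate agrees with the reference on every 4-bit input.
   This is a spot check on small inputs, not a proof for all values. */
static int swaps_correctly(unsigned (*candidate)(unsigned)) {
    assert(candidate != 0);
    for (unsigned x = 0; x < 16; x++) {
        if (candidate(x) != swap_reference(x)) {
            return 0;
        }
    }
    return 1;
}
```

For instance, x = 1 already separates the versions: A and B return 0 and D returns 3, where the swapped value is 2.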


slide-11
SLIDE 11

aside: homework

random types of lists (of shorts)

  • sentinel-terminated array: special value at end
  • range: structure of pointer + size
  • linked list

convert first to second type

append second type to second type
    modify the list pointed to by first argument

remove_if_equal all elements equal to a value from second type
    modify the list pointed to by first argument

10


slide-13
SLIDE 13

some lists

short sentinel = -9999;
short *x;
x = malloc(sizeof(short)*4);
x[3] = sentinel;
...

[figure: x (on stack or in regs) points to an array on the heap: x[0] x[1] x[2] x[3] = 1 2 3 −9999]

typedef struct range_t {
    unsigned int length;
    short *ptr;
} range;
range x;
x.length = 3;
x.ptr = malloc(sizeof(short)*3);
...

[figure: x (on stack or in regs) holds len: 3 and ptr; ptr points to 1 2 3 on the heap]

typedef struct node_t {
    short payload;
    struct node_t *next;
} node;
node *x;
x = malloc(sizeof(node));
...

[figure: x (on stack or in regs) points to node *x on the heap, holding payload: 1 and ptr]

11

slide-14
SLIDE 14

multiplication

10 << 2 == 10 * 4 == 10 + 10 + 10 + 10
10 << 3 == 10 * 8
(10 << 3) + (10 << 2) == 10 * 12

-10 << 2 == -10 * 4 == (-10)+(-10)+(-10)+(-10)
-10 << 3 == -10 * 8
(-10 << 3) + (-10 << 2) == -10 * 12

12
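The identities above can be tried out in C. One caveat for this sketch: standard C (through C17, at least) leaves the left shift of a negative signed value undefined, so the signed helper here shifts the unsigned bit pattern and converts back, which assumes the usual two's complement conversion. The names times12 and times12_signed are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* x * 12 == x * 8 + x * 4 == (x << 3) + (x << 2) */
static uint32_t times12(uint32_t x) {
    return (x << 3) + (x << 2);
}

/* Signed variant: operate on the bit pattern, then reinterpret.
   Assumes the unsigned-to-signed conversion wraps (two's complement),
   which is what common compilers do. */
static int32_t times12_signed(int32_t x) {
    assert(x >= INT32_MIN / 12 && x <= INT32_MAX / 12); /* result must fit */
    return (int32_t)times12((uint32_t)x);
}
```

The same bit-level shifts and adds produce the right answer for both 10 and -10, which is the point of the slide: two's complement multiplication does not need a separate signed case.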

slide-15
SLIDE 15

more division

int divide_by_32(int x) {
    return x / 32;
}

// INCORRECT generated code
divide_by_32:
    shrl $5, %edi   // ← this is WRONG
    mov %edi, %eax

example input with wrong output: −32

exercise: what does this assembly return? what is the correct result?

13

slide-16
SLIDE 16

wrong division

−32           = 1111 1111 1111 1111 1111 1111 1110 0000
result of shr = 0000 0111 1111 1111 1111 1111 1111 1111 = 134 217 727

result of division = −1

14
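The mismatch can be reproduced in C by applying a logical shift to the bit pattern of a negative number. A minimal sketch, assuming 32-bit int; the function names are made up for this example, and divide_by_32 here is the correct C version rather than the wrong assembly.

```c
#include <assert.h>
#include <stdint.h>

/* What shrl $5 computes: a logical shift of the raw bit pattern. */
static uint32_t logical_shift_right_5(int32_t x) {
    return (uint32_t)x >> 5;
}

/* What C requires for x / 32. */
static int32_t divide_by_32(int32_t x) {
    return x / 32;
}

/* The slide's example input, checked both ways. */
static void check_slide_example(void) {
    assert(logical_shift_right_5(-32) == 134217727u); /* 0x07FFFFFF */
    assert(divide_by_32(-32) == -1);
}
```

For non-negative inputs the two functions agree, which is why the wrong code can look correct in casual testing.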


slide-18
SLIDE 18

dividing negative by two

start with −x
flip all bits and add one to get +x
right shift by one to get +x/2
flip all bits and add one to get −x/2

same as right shift by one, adding 1s instead of 0s
    except for rounding

15


slide-20
SLIDE 20

arithmetic right shift

x86 instruction: sar (arithmetic shift right)

sar $amount, %reg (or variable: sar %cl, %reg)

[figure: bits of %reg (initial value) and %reg (final value); the bits vacated on the left are filled with copies of the sign bit]

16


slide-22
SLIDE 22

right shift in C

int shift_signed(int x) {
    return x >> 5;
}
unsigned shift_unsigned(unsigned x) {
    return x >> 5;
}

shift_signed:
    movl %edi, %eax
    sarl $5, %eax
    ret
shift_unsigned:
    movl %edi, %eax
    shrl $5, %eax
    ret

17



slide-25
SLIDE 25

divide with proper rounding

C division: rounds towards zero (truncate)
arithmetic shift: rounds towards negative infinity
solution: “bias” adjustments (described in textbook)

divide_by_32: // GCC generated code
    leal 31(%rdi), %eax  // eax ← edi + 31
    testl %edi, %edi     // set cond. codes based on %edi
    cmovns %edi, %eax    // if (edi sign bit = 0) eax ← edi
    sarl $5, %eax        // arithmetic shift
    ret

19
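The bias adjustment in the listing can be written out in C. This is a sketch that mirrors the assembly one instruction at a time, not the compiler's actual source; it assumes that >> on a negative int is an arithmetic shift, which the C standard leaves implementation-defined. The function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Add 31 only when x is negative, then shift arithmetically. */
static int32_t divide_by_32_bias(int32_t x) {
    int32_t biased = (x < 0) ? x + 31 : x; /* leal 31(%rdi) + cmovns */
    return biased >> 5;                    /* sarl $5 (assumed arithmetic) */
}

/* Plain arithmetic shift: rounds towards negative infinity. */
static int32_t floor_divide_by_32(int32_t x) {
    return x >> 5;
}

/* The two differ exactly on negative values that are not multiples of 32. */
static void check_rounding(void) {
    assert(divide_by_32_bias(-33) == -33 / 32);
    assert(floor_divide_by_32(-33) == -2);
}
```

Adding 31 (one less than the divisor) pushes every negative value that is not already a multiple of 32 past the next multiple, so the shift's round-down lands on the truncated quotient.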

slide-26
SLIDE 26

standards and shifts in C

signed right shift is implementation-defined
    compilers can choose which type of shift to do
    all compilers I know of: arithmetic (copy sign bit)

unsigned right shift is always logical (fill with zeroes)

shift amount ≥ width of type: undefined behavior
    x86 assembly: only uses lower bits of shift amount

20

slide-27
SLIDE 27

miscellaneous bit manipulation

common bit manipulation instructions are not in C:

  • rotate (x86: ror, rol): like shift, but wrap around
  • index of first/last bit set (x86: bsf, bsr)
  • population count (some x86: popcnt): number of bits set

21
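Since C has no operators for these, they are usually written by hand or obtained from compiler builtins. Below is a portable sketch for 32-bit values; the function names are made up for this example, and a compiler may or may not turn them into the single x86 instructions named above.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate: like a shift, but bits that fall off one end re-enter at the other.
   The "& 31" keeps both shift amounts below the type width (no undefined shift). */
static uint32_t rotate_left(uint32_t x, unsigned n) {
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

static uint32_t rotate_right(uint32_t x, unsigned n) {
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Population count: x & (x - 1) clears the lowest set bit each iteration. */
static unsigned population_count(uint32_t x) {
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Index of the lowest set bit (the job bsf does); x must be nonzero. */
static unsigned lowest_set_bit_index(uint32_t x) {
    assert(x != 0);
    unsigned index = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        index++;
    }
    return index;
}
```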

slide-28
SLIDE 28

registers

[figure: the PC register, with a register input and a register output]

updates every clock cycle

22

slide-29
SLIDE 29

state in Y86-64

[figure: Y86-64 state components: PC; Instr. Mem.; register file (srcA, srcB, R[srcA], R[srcB], dstE, next R[dstE], dstM, next R[dstM]); Data Mem.; ZF/SF; Stat. They are connected by blocks labeled "logic", "logic (with ALU)", "logic to reg", and "logic to PC".]

23


slide-34
SLIDE 34

memories

[figure: Instr. Mem. with an address input and a data output; Data Mem. with an address input, an input to write, write enable?, read enable?, and a data output]

[timing diagram over time: address input, input to write, value in memory]

24


slide-37
SLIDE 37

register file

[figure: register file holding %rax, %rdx, …; inputs: read reg #s, write reg #s, data to write; outputs: reg values]

register number input, register value output

[timing diagram over time: register number input, data input, value in register]

write register #15: write is ignored
read register #15: value is always 0

25
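The register #15 rule can be modelled in a few lines of C. This is a software sketch of the behaviour described on the slide, not of the hardware; the type and function names (register_file, regfile_read, regfile_write, REG_NONE) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REGISTERS 16
#define REG_NONE 15 /* register #15: reads give 0, writes are ignored */

typedef struct {
    uint64_t values[NUM_REGISTERS];
} register_file;

/* Read port: register number in, register value out. */
static uint64_t regfile_read(const register_file *rf, unsigned reg) {
    assert(reg < NUM_REGISTERS);
    return (reg == REG_NONE) ? 0 : rf->values[reg];
}

/* Write port: models the update that takes effect at the clock edge. */
static void regfile_write(register_file *rf, unsigned reg, uint64_t value) {
    assert(reg < NUM_REGISTERS);
    if (reg != REG_NONE) {
        rf->values[reg] = value;
    }
}
```

Having a "no register" number lets an instruction that needs no read or write still supply something on every port, so the ports need no separate enable inputs.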


slide-41
SLIDE 41

ALUs

[figure: ALU with inputs A and B, an operation select input, and output A OP B]

Operations needed:

  • add: addq, addresses
  • sub: subq
  • xor: xorq
  • and: andq
  • more?

26

slide-42
SLIDE 42

simple ISA 1: addq

addq %rXX, %rYY

encoding: %rXX %rYY (two 4-bit register #s)

1 byte instructions, no opcode

no other instructions

27

slide-43
SLIDE 43

addq CPU

[figure: addq CPU datapath: PC; Instr. Mem.; split into %rXX and %rYY; register file (srcA, srcB, R[srcA], R[srcB], dstE, next R[dstE], dstM, next R[dstM]); add (contains ALU); a "plus one" block for the PC; Data Mem. and ZF/SF also drawn]

/* 0x00: */ addq %rax, %rdx
/* 0x01: */ addq %rbx, %rdx

initially:     PC = 0x00, rax = 1, rbx = 2, rdx = 3
after cycle 1: PC = ????, rax = 1, rbx = 2, rdx = 4
after cycle 2: PC = ????, rax = ??, rbx = ??, rdx = ??

answer:

initially:     PC = 0x00, rax = 1, rbx = 2, rdx = 3
after cycle 1: PC = 0x01, rax = 1, rbx = 2, rdx = 4
after cycle 2: PC = 0x02, rax = 1, rbx = 2, rdx = 6

28
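The cycle-by-cycle trace can be reproduced with a small simulator. This sketch makes two assumptions that the slide does not state: %rXX sits in the high four bits of the instruction byte and %rYY in the low four, and registers use the standard Y86-64 numbering (%rax = 0, %rdx = 2, %rbx = 3). The names addq_cpu and addq_cpu_step are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t pc;
    uint64_t regs[16];
} addq_cpu;

/* One clock cycle of the addq-only CPU. */
static void addq_cpu_step(addq_cpu *cpu, const uint8_t *instr_mem, uint64_t mem_size) {
    assert(cpu->pc < mem_size);
    uint8_t instruction = instr_mem[cpu->pc]; /* read instruction memory */
    unsigned src = instruction >> 4;          /* split: %rXX (assumed high nibble) */
    unsigned dst = instruction & 0xF;         /* split: %rYY (assumed low nibble) */
    uint64_t sum = cpu->regs[src] + cpu->regs[dst]; /* add (the ALU) */
    cpu->regs[dst] = sum;                     /* next R[dstE] */
    cpu->pc = cpu->pc + 1;                    /* plus one */
}
```

Under those assumptions the two-instruction program is the bytes 0x02 and 0x32. Stepping twice from rax = 1, rbx = 2, rdx = 3 gives rdx = 4 and then rdx = 6, with PC going 0x00, 0x01, 0x02 as in the answer above.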


slide-51
SLIDE 51

Simple ISA 2: jmp

jmp label

encoding: 8-byte little-endian address

8 byte instructions, no opcode

29

slide-52
SLIDE 52

jmp CPU

[figure: jmp CPU datapath: PC; Instr. Mem.; register file (srcA, srcB, R[srcA], R[srcB], dstE, next R[dstE], dstM, next R[dstM]); Data Mem.; ZF/SF]

/* 0x00: */ jmp 0x10
/* 0x08: */ jmp 0x00
/* 0x10: */ jmp 0x08

initially:     PC = 0x00
after cycle 1: PC = 0x10
after cycle 2: PC = 0x08
after cycle 3: PC = 0x00

30


slide-55
SLIDE 55

multiplexers

[figure: MUX with inputs a, b, c, d, a select input, and one output]

select = 0 or 1 or 2 or 3
output = a or b or c or d

truth table:

select bit 1   select bit 0   output (many bits)
0              0              a
0              1              b
1              0              c
1              1              d

31
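In software terms a multiplexer is a selection between inputs. The sketch below gives a 4-input version that follows the truth table, plus a 2-input version built only from the AND/OR/NOT mask operations covered earlier, which is closer to how the gates work. The names mux4 and mux2_bitwise are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* 4-input MUX: select 0, 1, 2, 3 picks a, b, c, d. */
static uint64_t mux4(unsigned select, uint64_t a, uint64_t b, uint64_t c, uint64_t d) {
    assert(select < 4);
    unsigned bit1 = (select >> 1) & 1;
    unsigned bit0 = select & 1;
    if (bit1 == 0) {
        return (bit0 == 0) ? a : b;
    }
    return (bit0 == 0) ? c : d;
}

/* 2-input MUX without branches: build an all-0s or all-1s mask from the
   select bit, then keep a where the mask is 0 and b where it is 1. */
static uint64_t mux2_bitwise(unsigned select, uint64_t a, uint64_t b) {
    uint64_t mask = (uint64_t)0 - (uint64_t)(select & 1); /* 0 or all 1s */
    return (a & ~mask) | (b & mask);
}
```

Because the mask covers every bit position, one select bit steers a whole multi-bit value, matching the "output (many bits)" column.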


slide-58
SLIDE 58

Simple ISA 3: Jmp or No-Op

actual subset of Y86-64

jmp LABEL: encoded as 0x70 + address
nop: encoded as 0x10

32

slide-59
SLIDE 59

jmp+nop CPU

[figure: jmp+nop CPU datapath: PC; Instr. Mem.; split into opcode and dest; a "+ 1 (nop size)" block; a MUX feeding the PC, with select = 1 if jmp, 0 if nop; register file, Data Mem., and ZF/SF also drawn; signals labeled icode, valC, valP]

encodings:

nop:      1 0
jmp Dest: 7 0 Dest

33
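The PC-update logic of this CPU fits in one C function. In this sketch the MUX is the final conditional expression. The helper read_le64 and the function next_pc are made-up names; unlike hardware, which would read the destination bytes unconditionally, the code only reads them for jmp so that it never indexes past the array.

```c
#include <assert.h>
#include <stdint.h>

#define OPCODE_NOP 0x10
#define OPCODE_JMP 0x70

/* Assemble an 8-byte little-endian value (lowest byte first). */
static uint64_t read_le64(const uint8_t *bytes) {
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | bytes[i];
    }
    return value;
}

/* Compute the next PC for the jmp+nop CPU. */
static uint64_t next_pc(const uint8_t *instr_mem, uint64_t mem_size, uint64_t pc) {
    assert(pc < mem_size);
    uint8_t opcode = instr_mem[pc];                  /* split: opcode */
    assert(opcode == OPCODE_NOP || opcode == OPCODE_JMP);
    int is_jmp = (opcode == OPCODE_JMP);             /* MUX select */
    assert(!is_jmp || pc + 9 <= mem_size);
    uint64_t valP = pc + 1;                          /* + 1 (nop size) */
    uint64_t valC = is_jmp ? read_le64(&instr_mem[pc + 1]) : 0; /* split: dest */
    return is_jmp ? valC : valP;                     /* MUX: 1 if jmp, 0 if nop */
}
```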


slide-64
SLIDE 64

exercise: nop/add CPU

Let’s say we wanted to make a nop+add CPU. Where would we need MUXes?

  • A. before one or both of the register file ‘register number to read’ inputs
  • B. before the PC register’s input
  • C. before one of the register file ‘register number to write’ inputs
  • D. before one of the register file ‘register value to write’ inputs
  • E. before the instruction memory’s address input

34

slide-65
SLIDE 65

Summary

  • each instruction takes one cycle
  • divided into stages for design convenience
  • read values from previous cycle
  • send new values to state components
  • control what is sent with MUXes

35

slide-66
SLIDE 66

Backup Slides

36

slide-67
SLIDE 67

conditional movs

absoluteValueJumps:
    andq %rdi, %rdi
    jge same            ; if rdi >= 0, goto same
    irmovq $0, %rax     ; rax <- 0
    subq %rdi, %rax     ; rax <- rax (0) - rdi
    ret
same:
    rrmovq %rdi, %rax
    ret

absoluteValueCMov:
    irmovq $0, %rax
    subq %rdi, %rax     ; rax <- -rdi
    andq %rdi, %rdi
    cmovge %rdi, %rax   ; if (rdi >= 0) rax <- rdi
    ret

37

slide-68
SLIDE 68

Stages: pushq/popq

stage       pushq                       popq
fetch       icode : ifun ← M1[PC]       icode : ifun ← M1[PC]
            rA : rB ← M1[PC + 1]        rA : rB ← M1[PC + 1]
            valP ← PC + 2               valP ← PC + 2
decode      valA ← R[rA]                valA ← R[%rsp]
            valB ← R[%rsp]              valB ← R[%rsp]
execute     valE ← valB + (−8)          valE ← valB + 8
memory      M8[valE] ← valA             valM ← M8[valA]
write back  R[%rsp] ← valE              R[%rsp] ← valE
                                        R[rA] ← valM
PC update   PC ← valP                   PC ← valP

38
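The stage table translates almost line for line into C. The sketch below keeps the table's signal names (valA, valB, valE, valM, valP) and skips instruction fetch from memory by taking rA as a parameter. The machine struct, the word-indexed data memory, and the names do_pushq and do_popq are simplifications made up for this example; %rsp is register 4, as in Y86-64.

```c
#include <assert.h>
#include <stdint.h>

#define RSP 4        /* Y86-64 register number of %rsp */
#define MEM_WORDS 64 /* data memory as 8-byte words: address = index * 8 */

typedef struct {
    uint64_t regs[16];
    uint64_t mem[MEM_WORDS];
    uint64_t pc;
} machine;

static void do_pushq(machine *m, unsigned rA) {
    uint64_t valP = m->pc + 2;      /* fetch: 2-byte instruction */
    uint64_t valA = m->regs[rA];    /* decode */
    uint64_t valB = m->regs[RSP];
    uint64_t valE = valB - 8;       /* execute: valB + (-8) */
    assert(valE % 8 == 0 && valE / 8 < MEM_WORDS);
    m->mem[valE / 8] = valA;        /* memory: M8[valE] <- valA */
    m->regs[RSP] = valE;            /* write back */
    m->pc = valP;                   /* PC update */
}

static void do_popq(machine *m, unsigned rA) {
    uint64_t valP = m->pc + 2;      /* fetch */
    uint64_t valA = m->regs[RSP];   /* decode: both reads are %rsp */
    uint64_t valB = m->regs[RSP];
    uint64_t valE = valB + 8;       /* execute */
    assert(valA % 8 == 0 && valA / 8 < MEM_WORDS);
    uint64_t valM = m->mem[valA / 8]; /* memory: valM <- M8[valA] */
    m->regs[RSP] = valE;            /* write back, in the table's order */
    m->regs[rA] = valM;
    m->pc = valP;                   /* PC update */
}
```

Note that popq reads memory at the old stack pointer (valA) while pushq writes at the new one (valE), which is why popq needs %rsp in both valA and valB.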


slide-70
SLIDE 70

connections in Y86-64

[figure: Y86-64 datapath: PC; Instr. Mem.; register file (srcA, srcB, R[srcA], R[srcB], dstE, next R[dstE], dstM, next R[dstM]); Data Mem.; ZF/SF; blocks labeled "logic", "logic (with ALU)", "logic to reg", and "logic to PC"]

example instructions labeled on the connections:

  • addq %r8, %r9
  • pushq %r8 (and %rsp)
  • addq %r8, %r9
  • mrmovq 1000(%r9), %r8
  • rmmovq %r8, 1000(%r9)
  • call function (saves next PC)
  • addq %r9, %r8
  • irmovq $1000, %r8
  • popq %rax
  • mrmovq 1000(%r9), %r8
  • popq %rax (update %rsp)
  • most instructions (instruction length)
  • ret
  • call function
  • jmp label

39


slide-83
SLIDE 83

stages

[figure: Y86-64 datapath (PC; Instr. Mem.; register file; Data Mem.; ZF/SF; logic blocks) with regions marked by stage]

fetch, decode, execute, memory, write back, PC update

40

slide-84
SLIDE 84

Systematic construction

MUX       OPq        ret          callq        pushq      …
next PC   PC + len   memory out   from instr   PC + len   …
srcB      rB         -            -            %rsp       …

41


slide-86
SLIDE 86

Stages

conceptual division of instruction:

  • fetch: read instruction memory, split instruction
  • decode: read register file
  • execute: arithmetic (including of addresses)
  • memory: read or write data memory
  • write back: write to register file
  • PC update: compute next value of PC

43