Coherence and Consistency
The Meaning of Programs
An ISA is a programming language.
To be useful, programs written in it must have meaning, or semantics.
Any sequence of instructions must have a meaning.
The semantics of
simple: R[4] = R[8] + R[12]
(i.e., its address)
number of bytes between the two addresses
the sequential, one-at-a-time execution
instructions
have precise meanings
executed in that order.
    $s0, $0, 0
check:
    addi $s0, $s0, 1
    bge $s0, $a0, done
    lw $t1, 0($s3)
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
done:

    $s0, $0, 0
    addi $s0, $s0, 1
    bge $s0, $a0, done
    lw $t1, 0($s3)
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    lw $t1, 0($s3)
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    lw $t1, 0($s3)
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
set of symbols (with no cycles)
arrangement of the symbols that is consistent with the ordered pairs
disagree
inconsistent with c->b
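The idea of an arrangement being consistent with a set of ordered pairs can be sketched in a few lines of Python. This is only an illustration: the function name `is_consistent` and the example pairs (a before b, c before b) are assumptions chosen to match the c->b fragment above, not something taken from the slides.

```python
def is_consistent(arrangement, ordered_pairs):
    """Return True if every pair (x, y) has x placed before y in arrangement."""
    position = {symbol: index for index, symbol in enumerate(arrangement)}
    return all(position[x] < position[y] for x, y in ordered_pairs)


# Hypothetical partial order over three symbols: a before b, and c before b.
pairs = [("a", "b"), ("c", "b")]

# Two arrangements that both respect the pairs but disagree about a vs. c.
print(is_consistent(["a", "c", "b"], pairs))
print(is_consistent(["c", "a", "b"], pairs))

# This arrangement places b before c, so it is inconsistent with c->b.
print(is_consistent(["a", "b", "c"], pairs))
```

The first two arrangements are both acceptable total orders for the same partial order, which is why two observers can legitimately disagree about the relative order of a and c.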
the previous store to address A
some previous store
stored by some previous store, S1. If another load, L2, comes after L1, the value it returns will be the value stored by a store, S2, and S2 will either be S1 or come after S1.
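For a single processor, the rule above falls out of one-at-a-time execution. The following minimal sketch models that; the `run` function and the tuple encoding of loads and stores are hypothetical names introduced only for this illustration.

```python
def run(program):
    """Execute (op, address, value) tuples one at a time, in program order.

    Returns the values seen by the loads, in the order the loads execute.
    """
    memory = {}
    loaded = []
    for op, address, value in program:
        if op == "store":
            memory[address] = value
        elif op == "load":
            # A load returns whatever the most recent store to the address wrote.
            loaded.append(memory.get(address))
    return loaded


program = [
    ("store", "A", 1),     # S1
    ("load", "A", None),   # L1 sees S1
    ("store", "A", 2),     # S2 comes after S1
    ("load", "A", None),   # L2 sees S2, never anything older than S1
]
print(run(program))
```

Because memory is updated strictly in program order, a later load can never observe a store older than the one an earlier load observed.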
    $s0, $0, 0
    $s3, $0, 0
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[0] = 1
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[4] = 2
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[8] = 3
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    $s0, $0, 0
    $s3, $0, 0
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[0] = 1
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[4] = 2
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[8] = 3
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done

    $s0, $0, 1000
    $s3, $0, 0
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[0] = 1001
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[4] = 1002
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[8] = 1003
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
have obvious meaning for instructions on different CPUs
    sw $s0, 0($s3) ; Mem[0] = 1
    sw $s0, 0($s3) ; Mem[4] = 2
    sw $s0, 0($s3) ; Mem[8] = 3
    sw $s0, 0($s3) ; Mem[0] = 1001
    sw $s0, 0($s3) ; Mem[4] = 1002
    sw $s0, 0($s3) ; Mem[8] = 1003
previous store to address A
accesses to an address A.
the processors.
previous (in that total order) store to address A
1003 or vice versa. Exactly one of these
execution.
    sw $s0, 0($s3) ; Mem[0] = 1
    sw $s0, 0($s3) ; Mem[4] = 2
    sw $s0, 0($s3) ; Mem[8] = 3
    sw $s0, 0($s3) ; Mem[0] = 1001
    sw $s0, 0($s3) ; Mem[4] = 1002
    sw $s0, 0($s3) ; Mem[8] = 1003
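To see what a total order over these stores allows, the sketch below enumerates every interleaving of the two CPUs' stores that preserves each CPU's own program order and records the final memory contents. It is a rough model under stated assumptions: the names `interleavings`, `cpu0`, and `cpu1` are hypothetical, and a single global interleaving is a stronger requirement than a separate order per address.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# (address, value) pairs for the stores issued by each CPU in the trace.
cpu0 = [(0, 1), (4, 2), (8, 3)]
cpu1 = [(0, 1001), (4, 1002), (8, 1003)]

outcomes = set()
for order in interleavings(cpu0, cpu1):
    memory = {}
    for address, value in order:
        memory[address] = value  # the later store in the order wins
    outcomes.add((memory[0], memory[4], memory[8]))

for outcome in sorted(outcomes):
    print(outcome)
```

Every outcome holds, at each address, the value from exactly one of the two CPUs. Which CPU "wins" can differ from address to address, but no address ever ends up with a value that neither CPU stored.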
1: A = 10;
2: A_is_valid = true;

while(1)
3:   if (A_is_valid)
4:     break;
5: B = A;
multiple addresses.
1: A = 10;
2: A_is_valid = true;

while(1)
3:   if (A_is_valid)
4:     break;
5: B = A;
accesses to an address A.
all the processors.
previous (in that total order) store to address A
1: A = 10;
2: A_is_valid = true;

while(1)
3:   if (A_is_valid)
4:     break;
5: B = A;
happen.
global coordination to determine the global
seen out of order.
consistency.
implement inter-CPU communication?
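The A / A_is_valid example can be explored with a small model of which values B may receive, depending on the order in which the first thread's stores become visible to the second. This is a sketch, not a hardware model: the function `possible_B` is hypothetical, and it assumes A starts at 0 and A_is_valid starts out false.

```python
def possible_B(store_order):
    """Return the set of values B can take.

    store_order lists the first thread's stores in the order in which they
    become visible to the second thread.
    """
    memory = {"A": 0, "A_is_valid": False}  # assumed initial values
    results = set()
    for name, value in store_order:
        memory[name] = value
        # The second thread may run its check (3) at this point; if the flag
        # is set it breaks out (4) and copies A into B (5).
        if memory["A_is_valid"]:
            results.add(memory["A"])
    return results


in_order = [("A", 10), ("A_is_valid", True)]
reordered = [("A_is_valid", True), ("A", 10)]

print(sorted(possible_B(in_order)))
print(sorted(possible_B(reordered)))
```

When the stores become visible in program order, the flag can only be observed after A holds 10. If the two stores are seen out of order, the second thread can observe the flag while A still holds its old value.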
    $s0, $0, 0
    $s3, $0, 0
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[0] = 1
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[4] = 2
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
    sw $s0, 0($s3) ; Mem[8] = 3
    addi $s3, $s3, 4
    add $s1, $s1, $t1
    j check
    addi $s0, $s0, 1
    bge $s0, $a0, done
exactly where they are needed
memory accesses to an address A.
the previous (in that total order) store to address A
1: A = 10;
2: A_is_valid = true;

while(1)
3:   if (A_is_valid)
4:     break;
5: B = A;
0x1000: B 0x1000: A
perspective, the “freshest” version is always visible.
processor to circumvent the cache to see DRAM’s copy.
0x1000: A 0x1000: B
Store 0x1000 Read 0x1000 Store 0x1000
0x1000: ?? 0x1000: C
the same. Only reading is allowed
Reading and writing are allowed
Exclusive 0x1000: Z
Store 0x1000
0x1000: A
Shared 0x1000: A
Store 0x1000
0x1000: A Shared 0x1000:A
Read 0x1000
Invalid 0x1000: A
Store 0x1000
0x1000: A Invalid 0x1000: A Owned 0x1000: C
Read 0x1000 Store 0x1000
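The state changes in the examples above can be approximated with a small invalidation-based model. This is a sketch under assumptions, not the protocol from the slides: the state names Invalid, Shared, and Owned come from the text, while the `System` class, its methods, and the exact transition rules (write back on a remote read, invalidate other copies on a write) are hypothetical simplifications.

```python
INVALID, SHARED, OWNED = "Invalid", "Shared", "Owned"


class System:
    """Toy model: one value per address, one private cache per CPU."""

    def __init__(self, num_cpus):
        self.memory = {}
        self.caches = [{} for _ in range(num_cpus)]  # address -> [state, value]

    def state(self, cpu, address):
        return self.caches[cpu].get(address, [INVALID, None])[0]

    def read(self, cpu, address):
        if self.state(cpu, address) == INVALID:
            # Miss: an owner, if any, writes its value back and drops to Shared.
            for other in self.caches:
                line = other.get(address)
                if line and line[0] == OWNED:
                    self.memory[address] = line[1]
                    line[0] = SHARED
            self.caches[cpu][address] = [SHARED, self.memory.get(address)]
        return self.caches[cpu][address][1]

    def write(self, cpu, address, value):
        # Every other copy is invalidated before this cache takes ownership.
        for index, other in enumerate(self.caches):
            if index != cpu and address in other:
                other[address][0] = INVALID
        self.caches[cpu][address] = [OWNED, value]


system = System(2)
system.write(0, 0x1000, "A")       # CPU 0 owns the line
print(system.read(1, 0x1000))      # CPU 1 reads; both copies become Shared
system.write(1, 0x1000, "C")       # CPU 1 takes ownership; CPU 0 is invalidated
print(system.state(0, 0x1000), system.state(1, 0x1000))
print(system.read(0, 0x1000))      # CPU 0 misses and sees the freshest value
```

In this model at most one cache can hold a line as Owned at any time, and a reader that holds the line as Invalid must fetch the owner's value, so a stale copy is never returned.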