Memory consistency in C++. Computer Architecture. J. Daniel García Sánchez.

SLIDE 1

Memory consistency in C++

Computer Architecture

J. Daniel García Sánchez (coordinator)
David Expósito Singh
Francisco Javier García Blas

ARCOS Group
Computer Science and Engineering Department
University Carlos III of Madrid

License: CC BY-NC-ND

– Computer Architecture – ARCOS Group – http://www.arcos.inf.uc3m.es 1/41

SLIDE 2

Memory consistency in C++ Memory model

1. Memory model
2. Atomic types
3. Ordering relationships
4. Consistency models
5. Barriers
6. Conclusion

SLIDE 3

Memory consistency in C++ Memory model

C++ and memory consistency

C++11 defines its own concurrency model as part of the language.

Goal: avoid the need to write code in lower-level languages (C, assembler, ...) to obtain better performance.

Atomic types.
Low-level synchronization mechanisms.

Makes it possible to build lock-free data structures.

SLIDE 4

Memory consistency in C++ Memory model

Objects and memory locations

Object: a region of storage.

A sequence of one or more bytes.

Memory location: an object of scalar type, or a maximal sequence of adjacent bit-fields of nonzero width.
An object is stored in one or more memory locations.

SLIDE 5

Memory consistency in C++ Memory model

Example

Structure:

struct {
  int i;
  char c;
  int d : 10;
  int e : 16;
  double f;
};

Memory locations:

1. i
2. c
3. d, e (adjacent bit-fields share one memory location)
4. f

SLIDE 6

Memory consistency in C++ Memory model

Rules

Two threads may access different memory locations simultaneously.
Two threads may access the same memory location simultaneously if both accesses are reads.
If two threads access the same memory location simultaneously and at least one access is a write, there is a potential race condition.

Whether it is an actual race depends on whether an ordering between the two accesses is established.
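As a hedged illustration of these rules, the sketch below gives the example structure a hypothetical name, `sample`, and adds a helper, `write_distinct_locations`; neither name comes from the slides. Two threads write `i` and `c`, which are different memory locations, so no synchronization between them is needed. `d` and `e` share one memory location and are therefore written by a single thread only.

```cpp
#include <cassert>
#include <thread>

// Hypothetical named version of the example structure.
struct sample {
  int i;
  char c;
  int d : 10;
  int e : 16;
  double f;
};

// Writes i and c from two different threads. They are distinct memory
// locations, so the concurrent writes do not race with each other.
sample write_distinct_locations() {
  sample s{};
  std::thread t1{[&s] { s.i = 42; }};
  std::thread t2{[&s] { s.c = 'x'; }};
  t1.join();
  t2.join();
  // d and e form a single memory location: writing them from two threads
  // at once would be a data race, so both are written here, after the joins.
  s.d = 3;
  s.e = 7;
  return s;
}
```

Calling `write_distinct_locations()` returns a structure with all four written fields holding the values assigned above.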

SLIDE 7

Memory consistency in C++ Memory model

Ordering and race conditions

Classic solution: use synchronization mechanisms (for example, mutexes).

They guarantee mutual exclusion.
Based on OS services → might be costly.

Alternative: use atomic operations to enforce ordering.

If no ordering is established between two accesses to the same memory location, at least one of the accesses is not atomic, and at least one of them is a write, there is a data race and the program's behavior is undefined.
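The two approaches can be sketched side by side. This is a minimal example under assumptions: the function names `count_with_mutex` and `count_with_atomic` and the choice of a shared counter are illustrative, not taken from the slides. Without the mutex or the atomic, the two unsynchronized increments of `counter` would be a data race.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Classic solution: a mutex serializes the increments.
int count_with_mutex(int per_thread) {
  int counter = 0;
  std::mutex m;
  auto work = [&] {
    for (int k = 0; k < per_thread; ++k) {
      std::lock_guard<std::mutex> guard{m};
      ++counter;
    }
  };
  std::thread t1{work};
  std::thread t2{work};
  t1.join();
  t2.join();
  return counter;
}

// Alternative: an atomic read-modify-write makes each increment indivisible.
int count_with_atomic(int per_thread) {
  std::atomic<int> counter{0};
  auto work = [&] {
    for (int k = 0; k < per_thread; ++k) {
      counter.fetch_add(1);
    }
  };
  std::thread t1{work};
  std::thread t2{work};
  t1.join();
  t2.join();
  return counter.load();
}
```

Both functions return exactly twice `per_thread`, because no increment can be lost in either version.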

SLIDE 8

Memory consistency in C++ Memory model

Modification order

Modification order: the sequence of all writes to an object.

All threads must agree on the modification order of each object; if two threads could see different modification orders for an object, there is a data race.
Modifications do not need to become visible at the same instant in all threads.

A read that follows a write in the same thread observes the written value or a later value in the modification order.
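One consequence of a single modification order per object can be checked with a small experiment. The sketch below is an assumption-laden illustration (the name `reader_sees_monotonic_values` is hypothetical): a writer stores increasing values and a reader verifies that the values it observes never go backwards, even with relaxed ordering. The reader may skip values, because modifications need not be visible at the same instant.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns true if the reader never observed the value going backwards.
bool reader_sees_monotonic_values(int last) {
  std::atomic<int> value{0};
  bool monotonic = true;

  std::thread writer{[&] {
    for (int k = 1; k <= last; ++k) {
      value.store(k, std::memory_order_relaxed);
    }
  }};
  std::thread reader{[&] {
    int previous = 0;
    while (previous < last) {
      int current = value.load(std::memory_order_relaxed);
      if (current < previous) monotonic = false;  // would contradict the modification order
      previous = current;
    }
  }};
  writer.join();
  reader.join();
  return monotonic;
}
```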

SLIDE 10

Memory consistency in C++ Atomic types

Atomic operations

They are indivisible operations.

If one thread performs an atomic read of a variable, another thread performs an atomic write to the same variable, and no other thread accesses it:

The read returns either the value before the write or the written value.

If either operation (read or write) is not atomic, the behavior is undefined.

The read may even yield a value that is neither the previous nor the written one.
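A minimal sketch of the atomic case, with a hypothetical function name `read_during_write`: one thread writes 42 while another reads once. Which of the two values the reader gets depends on scheduling, but it is always one of them.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One thread writes 42 while another reads once; returns what the reader saw.
int read_during_write() {
  std::atomic<int> value{0};
  int seen = -1;
  std::thread writer{[&] { value.store(42); }};
  std::thread reader{[&] { seen = value.load(); }};
  writer.join();
  reader.join();
  return seen;  // either 0 (before the write) or 42 (the written value)
}
```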

SLIDE 11

Memory consistency in C++ Atomic types

Atomic types

The generic type atomic<T> defines atomic variables of type T, where T is:

An integral type.
A pointer type.
Type bool.
Floating-point types (float, double) have no dedicated specialization in C++11: atomic<float> only offers the generic operations (load, store, exchange, compare-exchange), with no atomic arithmetic. C++20 adds fetch_add() and fetch_sub() for them.
A user-defined type fulfilling some constraints (it must be trivially copyable).

All atomic<T> types have a member is_lock_free().

It reports whether their implementation is lock-free.

Additionally there is the type atomic_flag:

The only type that is guaranteed to be lock-free.
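As a sketch of the user-defined case, the code below wraps a small structure in atomic<T>. The type `point` and both helper functions are hypothetical names chosen for illustration. Whether the result of `is_lock_free()` is true depends on the platform, so the example does not assume either answer.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>

// Hypothetical user-defined type; it is trivially copyable, as atomic<T> requires.
struct point {
  int x;
  int y;
};
static_assert(std::is_trivially_copyable<point>::value,
              "atomic<T> requires a trivially copyable T");

// Stores a point atomically and reads it back.
point store_and_load(point p) {
  std::atomic<point> a{point{0, 0}};
  a.store(p);
  return a.load();
}

// Whether atomic<point> is lock-free depends on the platform.
bool point_is_lock_free() {
  std::atomic<point> a{point{0, 0}};
  return a.is_lock_free();
}
```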

SLIDE 12

Memory consistency in C++ Atomic types

Operations on atomic types

Operations on atomics may optionally specify a memory order.

The default is memory_order_seq_cst.

Store operations:

memory_order_relaxed, memory_order_release, memory_order_seq_cst.

Load operations:

memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_seq_cst.

Read-modify-write operations:

memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_release, memory_order_acq_rel, memory_order_seq_cst.
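The following single-threaded sketch shows where the memory order argument goes for each kind of operation. The function name `orders_by_operation_kind` is hypothetical, and the particular orders are just one valid choice per kind; with a single thread they do not change the result.

```cpp
#include <atomic>
#include <cassert>

// Applies one operation of each kind with an explicit memory order.
int orders_by_operation_kind() {
  std::atomic<int> a{0};
  a.store(10, std::memory_order_release);                  // store
  int seen = a.load(std::memory_order_acquire);            // load: 10
  int before = a.fetch_add(5, std::memory_order_acq_rel);  // read-modify-write: returns 10
  a.store(a.load() + 1);  // no argument: memory_order_seq_cst; a becomes 16
  return seen + before + a.load();  // 10 + 10 + 16
}
```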

SLIDE 13

Memory consistency in C++ Atomic types

atomic_flag

The simplest possible atomic type.

Two possible states: set or clear.
It is always lock-free.
It must always be explicitly initialized to the clear state.

std::atomic_flag f1 = ATOMIC_FLAG_INIT;

Operations:

Clear:

f1.clear();

Set and return the previous value:

f1.test_and_set();

Both operations may take a memory order.
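A short single-threaded sketch of how the two operations interact. The structure `flag_trace` and the function `trace_flag` are hypothetical names used only to report the values returned by test_and_set().

```cpp
#include <atomic>
#include <cassert>

// Values returned by three consecutive test_and_set() calls,
// with a clear() before the last one.
struct flag_trace {
  bool first;
  bool second;
  bool after_clear;
};

flag_trace trace_flag() {
  std::atomic_flag f = ATOMIC_FLAG_INIT;  // clear
  flag_trace t{};
  t.first = f.test_and_set();   // flag was clear: returns false, flag is now set
  t.second = f.test_and_set();  // flag was set: returns true
  f.clear(std::memory_order_release);
  t.after_clear = f.test_and_set(std::memory_order_acquire);  // false again
  return t;
}
```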

SLIDE 14

Memory consistency in C++ Atomic types

Example: A spin lock

A lock that does not use OS services.

Useful for very short critical sections, when you want to avoid the cost of context switches.

Spin lock mutex

#include <atomic>

class spinlock_mutex {
private:
  std::atomic_flag f = ATOMIC_FLAG_INIT;  // starts clear (unlocked)
public:
  void lock() {
    // Spin until test_and_set() reports that the flag was clear.
    while (f.test_and_set()) {}
  }
  void unlock() {
    f.clear();  // release the lock
  }
};
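A possible way to use such a lock is sketched below. The class is repeated so that the example is self-contained, and two details are assumptions rather than part of the slide: the explicit acquire and release orders, and the helper `count_with_spinlock`. Since the class provides lock() and unlock(), it works with std::lock_guard.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

class spinlock_mutex {
  std::atomic_flag f = ATOMIC_FLAG_INIT;
public:
  void lock() {
    while (f.test_and_set(std::memory_order_acquire)) {}  // spin
  }
  void unlock() { f.clear(std::memory_order_release); }
};

// Two threads increment a plain int under the spin lock.
int count_with_spinlock(int per_thread) {
  spinlock_mutex m;
  int counter = 0;  // not atomic: protected by m
  auto work = [&] {
    for (int k = 0; k < per_thread; ++k) {
      std::lock_guard<spinlock_mutex> guard{m};
      ++counter;
    }
  };
  std::thread t1{work};
  std::thread t2{work};
  t1.join();
  t2.join();
  return counter;
}
```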

SLIDE 15

Memory consistency in C++ Atomic types

atomic_bool

More operations than atomic_flag.
Can be initialized and assigned from a bool.
Cannot be copied from another atomic<bool>.
Modification: a.store(b, order).
Query: a.load(order).
Exchange (store a new value and return the previous one): a.exchange(b, order).
Implicit conversion to bool, equivalent to a.load() with sequential consistency.

Example

std::atomic<bool> a{false};
bool x = a.load(std::memory_order_acquire);
a.store(true);
x = a.exchange(false, std::memory_order_acq_rel);

SLIDE 16

Memory consistency in C++ Atomic types

Compare and exchange

Compares the atomic value with an expected value.

If both are equal, the desired value is stored in the atomic.
If they are not equal, the atomic is left unmodified and the expected value is updated with the current value of the atomic.
It always returns a success/failure indication (bool).

Two versions:

1. a.compare_exchange_weak(e, d):

Allows spurious failures (for example, after a context switch) on some architectures.
May behave as if *this != e even when they are equal.

2. a.compare_exchange_strong(e, d):

Does not allow spurious failures.
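A typical use of the weak version is a retry loop. The sketch below builds an atomic multiplication, an operation that atomic integers do not provide directly; the function name `atomic_multiply` is hypothetical. Because a failed compare-exchange refreshes the expected value, the loop simply retries, which also absorbs spurious failures.

```cpp
#include <atomic>
#include <cassert>

// Atomically replaces a with a * factor and returns the new value.
int atomic_multiply(std::atomic<int>& a, int factor) {
  int expected = a.load();
  // On failure, compare_exchange_weak reloads expected with the current value,
  // so the loop retries with fresh data (also after a spurious failure).
  while (!a.compare_exchange_weak(expected, expected * factor)) {
  }
  return expected * factor;
}
```

For example, with an atomic holding 6, `atomic_multiply(a, 7)` leaves 42 in the atomic and returns 42.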

SLIDE 17

Memory consistency in C++ Atomic types

atomic_address

Atomic access to a memory address. atomic_address appeared in drafts of C++11; the published standard provides this functionality through atomic<T*>.
Cannot be copied.
Can be initialized and assigned from a (void*) pointer.
Interface similar to atomic<bool>:

is_lock_free(), load(), store(), exchange(), compare_exchange_weak(), compare_exchange_strong().

Additional operations:

fetch_add(), fetch_sub().

Accept a memory order. Return the value before the change.

+=, -=.

Return the value after the change.

In atomic_address, all arithmetic is in bytes. With atomic<T*>, arithmetic is in units of T, as with ordinary pointers.
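The difference between the fetch operations and the compound assignments can be seen with atomic<T*>. This is a single-threaded sketch; the function name `advance_pointer` and the array size are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Moves an atomic pointer over an array and returns its final offset in elements.
std::ptrdiff_t advance_pointer() {
  static int data[8] = {};
  std::atomic<int*> p{data};
  int* before = p.fetch_add(2);  // returns the old pointer; p is now data + 2
  int* after = (p += 3);         // returns the new pointer: data + 5
  p.fetch_sub(1, std::memory_order_relaxed);  // p is now data + 4
  // Arithmetic is in units of int, exactly as with an ordinary int*.
  return (before == data && after == data + 5) ? (p.load() - data) : -1;
}
```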

SLIDE 18

Memory consistency in C++ Atomic types

atomic<integral>

Can be applied to all integral types.
General operations:

is_lock_free(), load(), store(), exchange(), compare_exchange_weak(), compare_exchange_strong().

Arithmetic operations:

fetch_add(), fetch_sub(), fetch_and(), fetch_or(), fetch_xor().
+=, -=, &=, |=, ^=.
++x, x++, --x, x--.
There are no other arithmetic operations (*, /, %).
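A brief single-threaded sketch of these operations, with a hypothetical function name `integral_operations`. It highlights that the fetch_ functions and the postfix operators return the previous value, while the prefix operators return the new one.

```cpp
#include <atomic>
#include <cassert>

// Returns the final value after a sequence of atomic integer operations.
unsigned integral_operations() {
  std::atomic<unsigned> a{0};
  unsigned old_value = a.fetch_or(0x5u);  // returns the previous value, 0; a == 5
  a &= 0x4u;                              // a == 4
  a ^= 0x1u;                              // a == 5
  unsigned pre = ++a;                     // returns the new value, 6
  unsigned post = a++;                    // returns the old value, 6; a == 7
  assert(old_value == 0 && pre == 6 && post == 6);
  return a.load();                        // 7
}
```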

SLIDE 20

Memory consistency in C++ Ordering relationships

synchronizes-with relation

A relationship between operations on atomic types.
A suitably tagged write to an atomic variable (for example, a release store) synchronizes-with a suitably tagged read of that variable (for example, an acquire load) that reads the value:

i. Stored by that write.
ii. Stored by a subsequent write from the same thread that performed the write.
iii. Stored by a sequence of read-modify-write operations on the variable, from any thread, in which the first operation read the value stored by the write.

SLIDE 21

Memory consistency in C++ Ordering relationships

happens-before relationship

Specifies which operations see the effects of other operations.

Within a thread, an operation happens-before another operation if it appears in a preceding statement.

In general, there is no order between two operations in the same statement.

Between two threads, an operation in one thread happens-before an operation in another thread if:

i. There is a synchronizes-with relationship between the two operations.

ii. There is a chain of happens-before and synchronizes-with relationships linking the two operations.

SLIDE 22

Memory consistency in C++ Ordering relationships

Ordering: Sequential consistency

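Example

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

std::vector<int> v;
std::atomic_bool f(false);

void writer() {
  v.push_back(1);  // #1
  f = true;        // #2
}

void reader() {
  while (!f.load()) {  // #3
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  std::cout << v[0] << std::endl;  // #4
}

Diagram: writer thread: v.push_back(1), then f = true. Reader thread: f.load() returns false, f.load() returns true, then std::cout << v[0].

#1 happens-before #2, #2 synchronizes-with the load at #3 that returns true, and #3 happens-before #4.

Only possible result: v[0] == 1.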

SLIDE 24

Memory consistency in C++ Consistency models

Sequential consistency

memory_order_seq_cst.
The program is consistent with a sequential view.
If all operations on atomics are sequentially consistent, the multi-threaded program behaves as if all the operations were performed in some particular order in a single thread.
Sequentially consistent operations cannot be reordered with respect to each other.
It is the simplest model to reason about.
It is the most costly model in terms of performance.

SLIDE 25

Memory consistency in C++ Consistency models

Access

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x, y;
std::atomic<int> z;

void f() { x.store(true, std::memory_order_seq_cst); }
void g() { y.store(true, std::memory_order_seq_cst); }

void h() {
  while (!x.load(std::memory_order_seq_cst)) {}
  if (y.load(std::memory_order_seq_cst)) ++z;
}

void i() {
  while (!y.load(std::memory_order_seq_cst)) {}
  if (x.load(std::memory_order_seq_cst)) ++z;
}

Threads launching

int main() {
  x = false;
  y = false;
  z = 0;
  std::thread t1{f};
  std::thread t2{g};
  std::thread t3{h};
  std::thread t4{i};
  t1.join();
  t2.join();
  t3.join();
  t4.join();
  assert(z.load() != 0);
  return 0;
}

SLIDE 26

Memory consistency in C++ Consistency models

Sequential consistency: Analysis

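Diagram: one possible execution. Thread f: x.store(true). Thread h: x.load() returns false, then true; y.load() returns false. Thread g: y.store(true). Thread i: y.load() returns true; x.load() cannot return false, it must return true, so z is incremented.

Because all operations are sequentially consistent, every thread sees the two stores in the same order, so at least one of h and i increments z and the assertion never fires.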

SLIDE 28

Memory consistency in C++ Consistency models

Non-sequentially consistent orders

There is no global order of events.

Each thread may have a different view.

Threads might not agree on the order of events.

But...

All threads must agree on the modification order of each variable.

Alternatives:

Relaxed ordering.
Release/acquire ordering.

SLIDE 29

Memory consistency in C++ Consistency models

Relaxed ordering

memory_order_relaxed.
Relaxed operations on atomics do not participate in synchronizes-with relationships.
Operations on the same variable within the same thread still obey the happens-before relationship.

Accesses to the same atomic variable within the same thread cannot be reordered.
Once a thread has seen a value of a variable, it cannot later see an older value of that variable.
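Relaxed ordering is often enough when only the atomicity of each operation matters, as in a counter whose total is read after the threads finish. The sketch below is one such case under stated assumptions: the name `relaxed_event_count` is hypothetical, and the final read relies on join(), not on the atomic, for synchronization.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Counts events from several threads with relaxed increments.
int relaxed_event_count(int threads, int per_thread) {
  std::atomic<int> events{0};
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&events, per_thread] {
      for (int k = 0; k < per_thread; ++k) {
        events.fetch_add(1, std::memory_order_relaxed);
      }
    });
  }
  for (auto& th : pool) th.join();
  // join() synchronizes with the end of each thread, so this load sees
  // every increment.
  return events.load(std::memory_order_relaxed);
}
```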

SLIDE 30

Memory consistency in C++ Consistency models

Example

Data access

#include <atomic>
#include <thread>

std::atomic<bool> x, y;
std::atomic<int> z;

void f() {
  x.store(true, std::memory_order_relaxed);
  y.store(true, std::memory_order_relaxed);
}

void g() {
  while (!y.load(std::memory_order_relaxed)) {}
  // Relaxed: x may still be read as false even though y was read as true.
  if (x.load(std::memory_order_relaxed)) {
    ++z;
  }
}

int main() {
  x = false;
  y = false;
  z = 0;
  std::thread t1{f};
  std::thread t2{g};
  t1.join();
  t2.join();
  return 0;
}

Diagram: thread f: x.store(true), then y.store(true). Thread g: y.load() returns false, then true; x.load() may return true or false.

SLIDE 31

Memory consistency in C++ Consistency models

Release/acquire ordering

memory_order_acquire, memory_order_release, memory_order_acq_rel.
Intermediate level of synchronization.
A release operation writing a value synchronizes-with an acquire operation reading that value.
Impact:

Different threads may see different orders.
Not all orders are possible.
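The most common use of this pairing is publishing ordinary data through an atomic flag. The following is a minimal sketch, assuming a hypothetical function `publish_and_consume` and a plain int as payload: the release store makes the earlier write to `payload` visible to the thread whose acquire load reads true.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// The producer publishes a plain int; the consumer returns what it read.
int publish_and_consume() {
  int payload = 0;  // ordinary, non-atomic data
  std::atomic<bool> ready{false};
  int received = -1;

  std::thread producer{[&] {
    payload = 42;  // written before the release store
    ready.store(true, std::memory_order_release);
  }};
  std::thread consumer{[&] {
    while (!ready.load(std::memory_order_acquire)) {}  // wait for the flag
    received = payload;  // ordered after the write by synchronizes-with
  }};
  producer.join();
  consumer.join();
  return received;
}
```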

SLIDE 32

Memory consistency in C++ Consistency models

Access

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x, y;
std::atomic<int> z;

void f() { x.store(true, std::memory_order_release); }
void g() { y.store(true, std::memory_order_release); }

void h() {
  while (!x.load(std::memory_order_acquire)) {}
  if (y.load(std::memory_order_acquire)) ++z;
}

void i() {
  while (!y.load(std::memory_order_acquire)) {}
  if (x.load(std::memory_order_acquire)) ++z;
}

Threads launching

int main() {
  x = false;
  y = false;
  z = 0;
  std::thread t1{f};
  std::thread t2{g};
  std::thread t3{h};
  std::thread t4{i};
  t1.join();
  t2.join();
  t3.join();
  t4.join();
  assert(z.load() != 0);
  return 0;
}

SLIDE 33

Memory consistency in C++ Consistency models

Analysis

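Diagram: thread f: x.store(true, release). Thread h: x.load(acquire) returns false, then true; y.load(acquire) returns false. Thread g: y.store(true, release). Thread i: y.load(acquire) returns true; x.load(acquire) may return true or false.

Multiple orders are possible: the stores to x and y are made by different threads and no release → acquire relationship orders one before the other, so h and i may observe them in different orders and the assertion can fire.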

SLIDE 34

Memory consistency in C++ Consistency models

Combining orderings

For this pattern, an effect equivalent to sequential consistency can be obtained at lower cost.

Access

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x, y;
std::atomic<int> z;

void f() {
  x.store(true, std::memory_order_relaxed);
  y.store(true, std::memory_order_release);
}

void g() {
  while (!y.load(std::memory_order_acquire)) {}
  if (x.load(std::memory_order_relaxed)) ++z;
}

int main() {
  x = false;
  y = false;
  z = 0;
  std::thread t1{f};
  std::thread t2{g};
  t1.join();
  t2.join();
  assert(z.load() != 0);
  return 0;
}

Diagram: thread f: x.store(true, relaxed), then y.store(true, release). Thread g: y.load(acquire) returns false, then true; x.load(relaxed) returns true.

SLIDE 36

Memory consistency in C++ Barriers

Barriers

Force ordering without modifying data.

Example

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x, y;
std::atomic<int> z;

void f() {
  x.store(true, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_release);
  y.store(true, std::memory_order_relaxed);
}

void g() {
  while (!y.load(std::memory_order_relaxed)) {}
  std::atomic_thread_fence(std::memory_order_acquire);
  if (x.load(std::memory_order_relaxed)) ++z;
}

Threads

int main() {
  x = false;
  y = false;
  z = 0;
  std::thread t1(f);
  std::thread t2(g);
  t1.join();
  t2.join();
  assert(z.load() != 0);
  return 0;
}

SLIDE 37

Memory consistency in C++ Barriers

Barriers: Analysis

Diagram: thread f: x.store(true, relaxed), fence(release), y.store(true, relaxed). Thread g: y.load(relaxed) returns true, fence(acquire), then x.load(relaxed) returns true.
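The same fence pairing also orders ordinary, non-atomic data. The sketch below is an illustration under assumptions (the function `publish_with_fences` and the int payload are hypothetical): the release fence sits between the data write and the relaxed flag store, and the acquire fence sits between the relaxed flag load and the data read.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Publishes a plain int using relaxed flag operations plus fences.
int publish_with_fences() {
  int payload = 0;  // ordinary, non-atomic data
  std::atomic<bool> ready{false};
  int received = -1;

  std::thread producer{[&] {
    payload = 7;
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
  }};
  std::thread consumer{[&] {
    while (!ready.load(std::memory_order_relaxed)) {}
    std::atomic_thread_fence(std::memory_order_acquire);
    received = payload;  // the fences order this read after the write
  }};
  producer.join();
  consumer.join();
  return received;
}
```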

SLIDE 39

Memory consistency in C++ Conclusion

Summary

The C++ memory model defines the memory access rules for a correct program.

It allows portable programming with lock-free data structures.

Atomic types perform memory operations with a specified ordering.

The default ordering is sequential consistency.

The synchronizes-with and happens-before relationships define constraints on the ordering of operations.
Barriers force orderings without modifying data.

SLIDE 40

Memory consistency in C++ Conclusion

References

C++ Concurrency in Action. Practical multithreading. Anthony Williams. Chapter 5.
