TDDD38/726G82 - Advanced programming in C++: Sum Types in C++



slide-1
SLIDE 1

TDDD38/726G82 - Advanced programming in C++

Sum Types in C++

Christoffer Holm

Department of Computer and Information Science

slide-2
SLIDE 2

1 Intro
2 Union
3 STL types
4 Implementation
5 Second Implementation

slide-5
SLIDE 5

3 / 66

Intro

Goals

- C++ is statically typed
- Can we simulate dynamic typing though?

slide-7
SLIDE 7

4 / 66

Intro

Type categories

Algebraic Data Types
- Product types
  - A type containing several other types at once
  - struct and class types are product types
  - so are std::pair and std::tuple
- Sum types

slide-9
SLIDE 9

4 / 66

Intro

Type categories

Algebraic Data Types
- Product types
- Sum types
  - A sum type is a type that can take on one of several types at a time
  - I.e. a type which can only store one value, but that value might be chosen from more than one type

slide-10
SLIDE 10

5 / 66

Intro

Product Type

[Figure: a product type Type that holds both an int n and a char c at the same time]

slide-11
SLIDE 11

6 / 66

Intro

Sum Type

[Figure: a sum type Type with a single member value that can be an int, a char or a char const*. The animation shows value holding 5, then 'a', then "some text".]

slide-16
SLIDE 16

7 / 66

Intro

This sounds like Python (sort of)

- use sum types to simulate dynamic types
- but how do they work in C++?

slide-18
SLIDE 18

9 / 66

Union

Unions

union Sum_Type {
  int n;
  char c;
  char const* s;
};

int main() {
  Sum_Type obj;
  obj.n = 5;
  obj.c = 'a';
  obj.s = "some text";
}

slide-19
SLIDE 19

10 / 66

Union

Unions

- Unions look like struct or class
- They work very differently though
- Only one field can be set at any one time

slide-21
SLIDE 21

11 / 66

Union

Problems with unions

int main() {
  Sum_Type obj;
  obj.n = 5;
  cout << obj.c << endl;
}

Undefined Behaviour!

slide-22
SLIDE 22

12 / 66

Union

Problems with unions

- The only field that is safe to access is the one last set
- Accessing any other field is undefined behaviour
- Once we assign to a new field the old one will be overwritten
slide-23
SLIDE 23

13 / 66

Union

(Possible) Memory model of unions

union Sum_Type {
  int n;         // 4 bytes
  char c;        // 1 byte
  char const* s; // 8 bytes
};

sizeof(Sum_Type) == 8

slide-24
SLIDE 24

14 / 66

Union

(Possible) Memory model of unions

[Figure: animation of the 8 bytes backing obj, with the active field marked at each step]

Sum_Type obj;
obj.s = "some text";   // all 8 bytes hold the pointer, s is active
obj.n = 5;             // the bytes used by n are overwritten with 5, n is active
obj.c = 'a';           // the byte used by c is overwritten with 'a', c is active
cout << obj.n << endl; // reads n while c is the active field

Undefined Behaviour!

slide-34
SLIDE 34

15 / 66

Union

(Possible) Memory model of unions

- All fields in a union are stored in the same memory
- The size of the union is (at least) the size of the largest field, to make sure that everything fits
- Accessing any field other than the one most recently assigned to is undefined behaviour
- The memory model presented here is just a common implementation; the standard does not prescribe a specific layout

slide-36
SLIDE 36

17 / 66

STL types

Sum Types in STL

- std::optional
- std::variant
- std::any

slide-37
SLIDE 37

17 / 66

STL types

Sum Types in STL

- std::optional
  - either stores a value or nothing
  - can only store values of one type, plus an additional value called nullopt
  - often used as a return type so that errors can be reported as nullopt
  - will store the value inline inside the object
- std::variant
- std::any

slide-38
SLIDE 38

17 / 66

STL types

Sum Types in STL

- std::optional
- std::variant
  - a safe alternative to unions
  - will always hold a value of one of several types
  - only possible to access the value as the type it is
  - will throw exceptions when used incorrectly
- std::any

slide-39
SLIDE 39

17 / 66

STL types

Sum Types in STL

- std::optional
- std::variant
- std::any
  - contains a value of any type
  - is extremely general
  - but is very expensive
  - the value is usually stored on the heap (implementations may store small values inline) and is handled polymorphically
  - so std::any should be avoided if at all possible

slide-40
SLIDE 40

18 / 66

STL types

std::optional

#include <optional>
// ...
template <typename T>
std::optional<T> read(istream& is) {
  T data;
  if (is >> data) {
    return data;
  }
  return {};
}

slide-41
SLIDE 41

19 / 66

STL types

std::optional

int main() {
  std::optional<int> result{read<int>(cin)};
  if (result) {
    cout << result.value() << endl;
    result = nullopt;
  } else {
    cout << "Error!" << endl;
  }
}

slide-42
SLIDE 42

20 / 66

STL types

std::variant

#include <variant>
// ...
int main() {
  std::variant<int, double> data{15};
  cout << std::get<int>(data) << endl;
  data = 12.5;
  cout << std::get<1>(data) << endl;
}

slide-43
SLIDE 43

21 / 66

STL types

std::variant

// will initialize data to contain 0
std::variant<int, double> data{};
try {
  // will throw since data contains int
  cout << std::get<double>(data) << endl;
} catch (std::bad_variant_access& e) { }
// will assign 12.5 as an int
// so data will contain 12
std::get<int>(data) = 12.5;

slide-44
SLIDE 44

22 / 66

STL types

std::variant

- possible to assign a value to the variant with operator=
- use std::get to access the value as the correct type
- the std::variant will keep track of the current value and type
- throws a std::bad_variant_access whenever the user tries to access the incorrect type

slide-45
SLIDE 45

23 / 66

STL types

std::any

#include <any>
// ...
int main() {
  std::any var;
  var = 5; // int
  cout << std::any_cast<int>(var) << endl;
  var = new double{5.3}; // double*
  cout << *std::any_cast<double*>(var) << endl;
  delete std::any_cast<double*>(var);
}

slide-46
SLIDE 46

24 / 66

STL types

std::any

std::any var;
if (var.has_value()) { ... }
var = 7;
if (var.type() == typeid(int)) { ... }
try {
  cout << std::any_cast<double>(var) << endl;
} catch (std::bad_any_cast& e) { }

slide-47
SLIDE 47

25 / 66

STL types

std::any

- std::any allows us to store whatever we want
- uses dynamic allocations and typeid to keep track of data and type
- is quite inefficient and not that useful
- prefer std::variant instead whenever possible

slide-52
SLIDE 52

27 / 66

Implementation

Variant

- let us implement a simplified variant type
- our variant will store int or std::string
- two versions; one with union and one without
- we will also introduce a new way to handle memory

slide-53
SLIDE 53

28 / 66

Implementation

Union-like classes

struct my_union {
  union {
    int n;
    double d;
  };
};

int main() {
  my_union m{0};
  cout << m.n << endl;
  m.d = 5.0;
  cout << m.d << endl;
}

slide-54
SLIDE 54

29 / 66

Implementation

Union-like classes

- it is possible to create multiple variables that occupy the same memory
- this is done through the use of a so-called anonymous union
- an anonymous union will create each field inside the class as if they were members, but they will share the same memory space

slide-55
SLIDE 55

30 / 66

Implementation

Non-trivial union-like classes

struct my_union {
  union {
    int n;
    std::string s;
  };
};

int main() {
  my_union u{};
  u.s = "hello";
  cout << u.s << endl;
}

slide-56
SLIDE 56

30 / 66

Implementation

Non-trivial union-like classes

union.cc:14:12: error: use of deleted function 'my_union::my_union()'
   my_union u;
            ^
union.cc:3:8: note: 'my_union::my_union()' is implicitly deleted because the default definition would be ill-formed:
 struct my_union
        ^~~~~~~~
union.cc:14:12: error: use of deleted function 'my_union::~my_union()'
   my_union u;
            ^
union.cc:3:8: note: 'my_union::~my_union()' is implicitly deleted because the default definition would be ill-formed:
 struct my_union
        ^~~~~~~~

slide-57
SLIDE 57

31 / 66

Implementation

Non-trivial union-like classes

- the compiler is unable to generate constructors and destructors for unions with non-trivial fields
- this is because the compiler is unable to determine if a field's destructor and constructor should be called
- since only one type can be active at once the compiler can't know which one it is (if any)
- due to this, we must define the special member functions ourselves
slide-58
SLIDE 58

32 / 66

Implementation

Non-trivial union-like classes

struct my_union {
  my_union() : n{0} { }
  ~my_union() { }
  union {
    int n;
    std::string s;
  };
};

int main() {
  my_union u{};
  u.s = "hello";
  cout << u.s << endl;
}

Segmentation Fault. Why though?!

slide-61
SLIDE 61

33 / 66

Implementation

Non-trivial union-like classes

- only one field is active at once
- in the constructor we initialize n
- thus leaving s uninitialized
- when we assign to s we are assigning to an uninitialized string

slide-62
SLIDE 62

33 / 66

Implementation

Non-trivial union-like classes

- assignment assumes that both strings are correctly initialized
- we would have to call a constructor on s
- ... but this can only be done at initialization?
- there is one other way to call constructors after the fact!

slide-63
SLIDE 63

34 / 66

Implementation

Placement new

struct my_union {
  my_union() : n{0} { }
  ~my_union() { }
  union {
    int n;
    std::string s;
  };
};

int main() {
  my_union u{};
  new (&u.s) std::string;
  u.s = "hello";
  cout << u.s << endl;
}

slide-64
SLIDE 64

35 / 66

Implementation

Placement new

- placement new is a call to new with an extra parameter
- this extra parameter is a pointer to memory where an object should be placed
- this will not allocate any memory
- but will instead call a constructor of a specified type on the specified memory location
- this is a way to manually handle lifetime without any dynamic allocations!

slide-65
SLIDE 65

36 / 66

Implementation

But what about destruction?

int main() {
  my_union u{};
  // call constructor
  new (&u.s) std::string;
  u.s = "hello";
  cout << u.s << endl;
  // explicitly call destructor
  u.s.std::string::~string();
}

slide-66
SLIDE 66

37 / 66

Implementation

But what about destruction?

- unions do not track which field is active
- so the compiler will be unable to call the appropriate destructor
- the my_union destructor is unable to know which field is active
- therefore we have to manually call the destructor of s to ensure that no memory leaks occur
- calling the string destructor will only work if the union actually contains a string

slide-67
SLIDE 67

38 / 66

Implementation

Extra note

- u.s.std::string::~string() is the way we call the destructor
- if we have using std::string or using namespace std in our code we can simplify this to u.s.~string()
- std::string is in reality an alias for std::basic_string<char> so we can also write u.s.~basic_string()

slide-68
SLIDE 68

39 / 66

Implementation

OK, but how do I get correct destruction automatically?

struct my_union {
  // an enum class value must be qualified with its type name
  my_union() : n{0}, tag{Type::INT} { }
  ~my_union() { }
  union {
    int n;
    std::string s;
  };
  enum class Type { INT, STRING };
  Type tag;
};

slide-69
SLIDE 69

40 / 66

Implementation

OK, but how do I get correct destruction automatically?

- the only way to correctly destroy objects is if we ourselves keep track of what the current type is
- we create a so-called tagged union
- we have some kind of data member that tracks which type is currently stored
- we will of course have to update this tag whenever we change the type

slide-70
SLIDE 70

41 / 66

Implementation

Now we are ready for our own implementation

class Variant {
public:
  // ...
private:
  enum class Type { INT, STRING };
  Type tag;
  union {
    int n;
    string s;
  };
};

slide-71
SLIDE 71

41 / 66

Implementation

Now we are ready for our own implementation

class Variant {
public:
  Variant(int n = 0);
  Variant(string const& s);
  ~Variant();
  Variant& operator=(int other) &;
  Variant& operator=(string const& other) &;
  int& num();
  string& str();
  // ...
};

slide-72
SLIDE 72

42 / 66

Implementation

Union-based implementation

- we create our variant as a tagged union
- use the tag data member to keep track of which type is currently stored
- we have assignment and getters as our interface
- will have to always check the type before performing operations
slide-73
SLIDE 73

43 / 66

Implementation

Constructors

Variant::Variant(int n)
  : n{n}, tag{Type::INT} { }

Variant::Variant(string const& s)
  : s{s}, tag{Type::STRING} { }

slide-74
SLIDE 74

44 / 66

Implementation

Constructors

- the constructors will initialize the appropriate field in the union
- they will also initialize tag to the appropriate value

slide-75
SLIDE 75

45 / 66

Implementation

Destructor

Variant::~Variant() {
  if (tag == Type::STRING) {
    s.~string();
  }
}

slide-76
SLIDE 76

46 / 66

Implementation

Destructor

- if the currently assigned value is of type int then nothing needs to be done
- however, if the active type is string we have to manually call the destructor on that field

slide-77
SLIDE 77

47 / 66

Implementation

Assignment operators

Variant& Variant::operator=(int other) & {
  if (tag == Type::STRING) {
    s.~string();
  }
  n = other;
  tag = Type::INT;
  return *this;
}

slide-78
SLIDE 78

47 / 66

Implementation

Assignment operators

Variant& Variant::operator=(string const& other) & {
  if (tag == Type::INT) {
    new (&s) string;
  }
  s = other;
  tag = Type::STRING;
  return *this;
}

slide-79
SLIDE 79

48 / 66

Implementation

Assignment operators

- if we are assigning a string we must guarantee that s is an initialized string object
- if the active field is not the string, we have to use placement new to construct a string in s
- if we are assigning an int we must potentially destroy s (if s was the previous active field)
- therefore we check the type and call the destructor if necessary

slide-80
SLIDE 80

49 / 66

Implementation

Getters

int& Variant::num() {
  if (tag == Type::INT) {
    return n;
  }
  throw /* ... */;
}

slide-81
SLIDE 81

49 / 66

Implementation

Getters

string& Variant::str() {
  if (tag == Type::STRING) {
    return s;
  }
  throw /* ... */;
}

slide-82
SLIDE 82

50 / 66

Implementation

Getters

- the getters should only return valid values
- therefore we throw some kind of exception if the active field is of incorrect type

slide-83
SLIDE 83

51 / 66

Implementation

Test program

Variant v{}; // will set n = 0
cout << v.num() << endl;
// active field is int
v = 5;
cout << v.num() << endl;
// active field is int, we must
// construct a string inside the variant
v = "this is a long string";
cout << v.str() << endl;
// the destructor must destroy the string here

slide-85
SLIDE 85

53 / 66

Second Implementation

Placement new

std::string s{};
char data[sizeof(std::string)];
union { int n; std::string s; } u;
int array[sizeof(std::string) / sizeof(int)];
int i{};

new (&s) std::string;    // OK
new (data) std::string;  // OK
new (&u.s) std::string;  // OK
new (array) std::string; // NOT OK
new (&i) std::string;    // NOT OK

slide-86
SLIDE 86

54 / 66

Second Implementation

Placement new

- We can place our object in any memory that is:
  - a union
  - a char array with enough space (and suitable alignment)
  - or an object of the same type as the one we are trying to construct

slide-87
SLIDE 87

55 / 66

Second Implementation

Placement new in C-arrays

char data[sizeof(std::string)];
std::string* p {new (data) std::string};
*p = "hello world";
p->~string();

slide-88
SLIDE 88

56 / 66

Second Implementation

Second version (no union)

class Variant {
public:
  // ...
private:
  enum class Type { INT, STRING };
  // alignas makes the buffer suitably aligned for a string
  alignas(string) char data[sizeof(string)];
  Type tag;
};

slide-89
SLIDE 89

56 / 66

Second Implementation

Second version (no union)

class Variant {
public:
  Variant(int n = 0);
  Variant(string const& s);
  ~Variant();
  Variant& operator=(int other) &;
  Variant& operator=(string const& other) &;
  int& num();
  string& str();
  // ...
};

slide-90
SLIDE 90

57 / 66

Second Implementation

Constructors

Variant::Variant(int n)
  : data{}, tag{Type::INT} {
  new (data) int{n};
}

slide-91
SLIDE 91

57 / 66

Second Implementation

Constructors

Variant::Variant(string const& s)
  : data{}, tag{Type::STRING} {
  new (data) string{s};
}

slide-92
SLIDE 92

58 / 66

Second Implementation

Now, how do we retrieve our objects from the array?

*reinterpret_cast<string*>(&data)

Undefined Behaviour!

slide-94
SLIDE 94

59 / 66

Second Implementation

Aliasing

int x{};
// aliases to x
int* p{&x};
int& r{x};
// modifying x through aliases
*p = 5; // OK
r = 7;  // OK

slide-95
SLIDE 95

59 / 66

Second Implementation

Aliasing

int x{};
float* p{reinterpret_cast<float*>(&x)};
*p = 3.7; // NOT OK

slide-96
SLIDE 96

60 / 66

Second Implementation

Strict aliasing rule

An object of type T can be aliased if the alias has one of the following types:
- T*
- T&
- char*
- (unsigned char* and std::byte*)

slide-97
SLIDE 97

61 / 66

Second Implementation

Strict aliasing rule

- accessing objects through pointers or references is known as aliasing
- so when aliasing an object of type T the following must be true:
  - it must be accessed through a T pointer or reference
  - or it must be accessed through a char pointer
  - otherwise this is undefined behaviour
- This is known as the strict aliasing rule

slide-98
SLIDE 98

62 / 66

Second Implementation

The fix

*std::launder(reinterpret_cast<string*>(&data));

slide-99
SLIDE 99

63 / 66

Second Implementation

std::launder

- std::launder is defined in <new>
- it gives us a pointer to the object that actually lives at the given address, so the compiler may not assume that the pointer still refers to the char array
- Note: only correct if we are trying to point to an actually constructed object of the specified type

slide-100
SLIDE 100

64 / 66

Second Implementation

Getters

int& Variant::num() {
  if (tag == Type::INT) {
    return *std::launder(
      reinterpret_cast<int*>(&data));
  }
  throw /* ... */;
}

slide-101
SLIDE 101

64 / 66

Second Implementation

Getters

string& Variant::str() {
  if (tag == Type::STRING) {
    return *std::launder(
      reinterpret_cast<string*>(&data));
  }
  throw /* ... */;
}

slide-102
SLIDE 102

65 / 66

Second Implementation

Destructor

Variant::~Variant() {
  if (tag == Type::STRING) {
    str().~string();
  }
}

slide-103
SLIDE 103

66 / 66

Second Implementation

Assignment operators

Variant& Variant::operator=(int other) & {
  if (tag == Type::STRING) {
    str().~string();
    // construct an int in the buffer before num() refers to it
    new (data) int;
  }
  tag = Type::INT;
  num() = other;
  return *this;
}

slide-104
SLIDE 104

66 / 66

Second Implementation

Assignment operators

Variant& Variant::operator=(string const& other) & {
  if (tag == Type::INT) {
    new (data) std::string;
  }
  tag = Type::STRING;
  str() = other;
  return *this;
}

slide-105
SLIDE 105

www.liu.se