Advanced Software Engineering with C++ Templates
Templates
Thomas Gschwind <thg at zurich dot ibm dot com>
§ Polymorphisms
§ Declaration and Use
§ Ambiguities
§ Specialization
§ Classes and Members
§ An Example (pvector)
§ “Ad-hoc”
§ Dynamic
(using a virtual method table)
§ Static or Parametric
§ Templates allow a function or class to be implemented for a set of types and not just for a “single” hard-coded type
§ Templates support generic programming
§ Templates specify types as additional compile-time parameters
§ They are checked and resolved statically (at compile time)
§ The definition must be available to the compiler
template<class T>
T min(T a, T b) { return a<b ? a : b; }

Using class for the template parameter is “old style”; typename is “more” correct, but many people still prefer class. With an antiquated C++ compiler you may have to use class.
§ When invoking a template, we can specify the type to use as an additional (compile-time) parameter
template<typename T>
T min(T a, T b) { return a<b?a:b; }

const double pi=3.141593;

void f() {
  min<double>(2.718282, 1.0);
  min<char>('a', 'z');
  min<int>(1, 26);
  min<double>(pi, 2.718282);
  min<int>('a', 26);
  min<double>(2.718282, 1);
}
§ In most cases, template parameters are deduced by the compiler
template<typename T>
inline T min(T a, T b) { return a<b?a:b; }

const double pi=3.141593;

void f() {
  min(2.718282, 1.0);  // ok
  min('a', 'z');       // ok
  min(1, 26);          // ok
  min(pi, 2.718282);   // ok
  min('a', 26);        // error, ambiguous
  min(2.718282, 1);    // error, ambiguous
}
§ Unlike “normal” functions, there is no implicit conversion for arguments whose type is deduced as a template parameter
§ Explicitly specifying the template argument resolves the ambiguity:
  min<int>('a', 26);
  min<const double>(pi, 2.718282);
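The deduction rules above can be tried out with a minimal sketch. The name smaller is hypothetical; it stands in for the slide's min only to avoid a clash with std::min.

```cpp
// Hypothetical stand-in for the slide's min(); renamed to avoid std::min.
template<typename T>
T smaller(T a, T b) { return a < b ? a : b; }

// Deduction succeeds when both arguments have the same type (T = int).
inline int deduced_int() { return smaller(1, 26); }

// Mixed argument types need an explicit template argument:
// smaller('a', 26) on its own does not compile (T = char or T = int?).
inline int explicit_int() { return smaller<int>('a', 26); }   // 'a' is converted to int

// With T fixed to double, the int argument 1 is converted to 1.0.
inline double explicit_double() { return smaller<double>(2.718282, 1); }
```

Once the template argument is given explicitly, the usual implicit conversions apply to the function arguments again.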
§ Templates and non-templates can be mixed
§ We can define a template-based function min
§ And define a non-template function min at the same time
§ Non-templates are preferred over templates if no type conversion is necessary

template<typename T>
T min(T a, T b) { return a<b ? a : b; }
double min(double a, double b) { return a<b ? a : b; }
§ We can create separate helper functions
§ This approach not only looks tedious but is also error-prone, clumsy, …

int min(int x, int y) { return min<int>(x, y); }
double min(double x, double y) { return min<double>(x, y); }
§ What happens if we use it with (C-style) strings?
§ Based on the behavior of string comparison, we would expect a lexicographical comparison of the arguments
§ However, the template compares the addresses where the strings are stored and returns the string at the smaller address

cout << min("Hello", "World") << endl;

[Figure: memory with the two string literals "Hello" and "World" stored at the addresses 0x1000 and 0x2000]
§ Templates and non-templates can be mixed
§ Define a non-template function min for C strings

template<typename T>
T min(T a, T b) { return a<b ? a : b; }
char *min(char *a, char *b) { return strcmp(a, b)<0 ? a : b; }
const char *min(const char *a, const char *b) { return strcmp(a, b)<0 ? a : b; }

#include "min.h"
void foo(char *x, char *y, const char *z) {
  cout << min(x,y) << endl;               // yes
  cout << min(x,z) << endl;               // yes
  cout << min<const char*>(x,z) << endl;  // compiles but no
}

In the last call we are asking for the template, so we get the template …
§ C++ allows us to specialize an existing template for specific types

template<typename T>
T min(T a, T b) { return a<b ? a : b; }
template<>
char *min<char *>(char *a, char *b) { return strcmp(a, b)<0 ? a : b; }
template<>
const char *min<const char *>(const char *a, const char *b) { return strcmp(a, b)<0 ? a : b; }

#include "min.h"
void foo(char *x, char *y, const char *z) {
  cout << min(x, y) << endl;               // yes
  cout << min(x, z) << endl;               // error
  cout << min<const char*>(x, z) << endl;  // yes
}
The call min(x, z) is a compiler error: as discussed, there are no implicit parameter conversions for templates.
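The difference between the two approaches can be reproduced in a small sketch. The names smaller and pick are hypothetical (smaller again replaces min to stay clear of std::min). The sketch shows only the overload variant, in which the mixed char* / const char* call is accepted.

```cpp
#include <cstring>

// Generic version: compares the values with operator<.
template<typename T>
T smaller(T a, T b) { return a < b ? a : b; }

// Non-template overload for C strings: compares the characters.
inline const char *smaller(const char *a, const char *b) {
    return std::strcmp(a, b) < 0 ? a : b;
}

// Mixed call: deduction for the template fails (T = char* or const char*?),
// so only the overload is viable and x is converted to const char*.
inline const char *pick(char *x, const char *z) { return smaller(x, z); }
```

With a specialization instead of an overload, the same call would be rejected, because a specialization does not take part in overload resolution on its own.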
§ Class templates work exactly the same as function templates
§ Simply put template <typename T, typename U, …> in front of the declaration
§ It is even OK to introduce new template parameters for individual member functions
§ Before C++17, there is no template parameter deduction for constructors; the arguments have to be spelled out:
  pair<int, bool>(1, false)
§ Since C++17, template parameters can be deduced from the constructor arguments
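A minimal sketch of these points follows. The class duo and its member first_as are hypothetical names chosen for illustration. The C++17 part is guarded so that the block also compiles with older standards.

```cpp
#include <utility>

// Hypothetical class template with two type parameters.
template<typename T, typename U>
struct duo {
    T first;
    U second;
    duo(T f, U s) : first(f), second(s) {}

    // A member function may introduce its own template parameter.
    template<typename V>
    V first_as() const { return static_cast<V>(first); }
};

inline int demo_explicit() {
    duo<int, bool> d(1, false);   // arguments spelled out (required before C++17)
    return d.first;
}

inline double demo_member_template() {
    duo<int, bool> d(3, true);
    return d.first_as<double>();
}

#if __cplusplus >= 201703L
inline int demo_deduction() {
    duo d(7, false);              // C++17: deduced as duo<int, bool>
    std::pair p(1, false);        // likewise std::pair<int, bool>
    return d.first + p.first;
}
#endif
```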
§ We want to implement a persistent version of C++’s vector class
§ It reads all elements from a file in the constructor
§ It writes all elements back to the file in the destructor

template<typename T>
class pvector {
  string filename;
  vector<T> v;
  …
public:
  pvector(string fname) : filename(fname) { readvector(); }
  ~pvector() { writevector(); }
  void push_back(const T &el) { v.push_back(el); }
  void pop_back() { v.pop_back(); }
  …
template<typename T>
class pvector {
  string filename;
  vector<T> v;
  void readvector() {
    ifstream ifs(filename);
    for(;;) {
      T x; ifs >> x;
      if(!ifs.good()) break;
      v.push_back(x);
    }
  }
  void writevector() {
    ofstream ofs(filename);
    typename vector<T>::iterator fst=v.begin(), lst=v.end();
    while(fst!=lst) ofs << *fst++ << endl;
  }
  …

Or, starting with C++11, simply:
  for (const T &elem : v) ofs << elem << endl;
§ What happens if we pass the pvector around?
§ Hence, maybe we want to disable the copy constructor for pvector<T>

void foo(pvector<int> pv) {
  if(pv.size()>0) cout << pv[0] << endl;
  pv.push_back(17);
}

int main(int argc, char *argv[]) {
  pvector<int> pv("/tmp/pvector-int.txt");
  foo(pv);
}
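One way to disable copying, sketched under the assumption that C++11 is available, is to delete the copy operations. The class nocopy_vector is a hypothetical, simplified stand-in for pvector<T>; the file I/O is left out so that the block stays self-contained.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for pvector<T>; file I/O is omitted.
template<typename T>
class nocopy_vector {
    std::string filename;
    std::vector<T> v;
public:
    explicit nocopy_vector(std::string fname) : filename(std::move(fname)) {}

    // C++11: forbid copies so that two objects never write to the same file.
    nocopy_vector(const nocopy_vector &) = delete;
    nocopy_vector &operator=(const nocopy_vector &) = delete;

    void push_back(const T &el) { v.push_back(el); }
    std::size_t size() const { return v.size(); }
    const std::string &file() const { return filename; }
};

// Taking the parameter by reference avoids the copy altogether.
inline void append17(nocopy_vector<int> &pv) { pv.push_back(17); }

static_assert(!std::is_copy_constructible<nocopy_vector<int>>::value,
              "copying is disabled");
```

With the copy constructor deleted, a by-value parameter such as foo(pvector<int> pv) no longer compiles, which turns the accidental copy into a compile-time error.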
Separate Compilation
§ Compared to Java § Variables § Routines (Functions & Operators) § Types (Structures, Classes) § Makefiles
§ Why?
§ How?
§ Each Java source file is compiled into a class file
§ If a Java class invokes a method of another class, the compiler consults that other class file to check the method’s signature
§ That’s why the Java compiler needs the class path
§ Finally, all class files are loaded (and “linked”) by the Java Virtual Machine
§ Java source code can be reconstructed from a .class file (see Java Decompiler: jd, jad)
§ Let us write Hello World in Java
§ In order to simplify the reconfiguration (e.g., translation), the message is kept in a separate class
public class Const {
  public static final String msg = "Hello World!";
}

public class Cool {
  public static void main(String[] args) {
    System.out.println(Const.msg);
  }
}
§ Now compile both Java files and run Cool
§ Change the msg in Const.java, recompile Const.java, and run Cool
§ Cool still prints the old message! Why?
§ By convention, each source file is compiled into an object file § Object files provide the “minimum” necessary to execute the code § Object files do not provide enough information for the compiler to identify
§ Object files are not used during the compilation
§ In C/C++, we have the include path instead
Strictly speaking, that’s not true as we will learn
§ C++ uses so-called header files for separate compilation § The header file can be viewed as the object file’s interface § Hence, header files are another encapsulation mechanism
§ What goes into the header file?
§ Header files need to be included
§ Possibly, the string header also includes iostream
§ Need a mechanism to prevent headers from being included multiple times
#include <iostream>
#include <string>
…

§ Header files need protection from being included multiple times
§ Otherwise, this may cause compile errors
vars.h:
  #ifndef VARS_H_      // if VARS_H_ is not defined, process the following lines
  #define VARS_H_      // define VARS_H_
  extern int my_dumb_global_variable;
  extern const double e;
  const double pi=3.141593;
  #endif               // end the last open #if… section

vars.cc:
  #include "vars.h"    // process the file "vars.h"
  int my_dumb_global_variable=17;
  const double e=2.718282;

This pattern ensures that the contents of a header file are not processed multiple times.
§ An alternative to the guard statement is to use #pragma once in your header files
(hard links, symbolic links, multiple filesystem mounts, etc.)
§ My suggestion
§ Variables
(if variable is to be accessed elsewhere)
§ Constant Variables
(if variable is to be accessed elsewhere)
vars.h:
  #ifndef VARS_H_
  #define VARS_H_
  // Use extern to declare a variable that is defined elsewhere.
  // No memory is allocated for the variable.
  extern int my_dumb_global_variable;
  // Constants may be defined in the header ...
  const double pi=3.141593;
  // ... or declared like other variables.
  extern const int primes[];
  #endif

vars.cc:
  #include "vars.h"   // include the header (for consistency checking)
  // Variables are defined in the implementation file.
  // Do not repeat constants defined in the header.
  int my_dumb_global_variable=17;
  const int primes[]={2, 3, 5,…, 1234567891};
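The split between declaration and definition can be checked in a single translation unit. The sketch below marks in comments which part would live in which file. The shortened prime list and the function third_prime_plus_global are assumptions made for illustration.

```cpp
// --- would live in vars.h ---
extern int my_dumb_global_variable;   // declaration only, no storage
extern const int primes[];            // array defined elsewhere, size unknown here

// --- would live in vars.cc (after #include "vars.h") ---
int my_dumb_global_variable = 17;     // the one definition
const int primes[] = {2, 3, 5, 7};    // shortened list, for illustration only

// --- any other file that includes vars.h ---
inline int third_prime_plus_global() {
    return primes[2] + my_dumb_global_variable;
}
```

Because the extern declaration of primes precedes its definition, the const array gets external linkage and is visible to other compilation units.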
§ Functions
§ Inline Functions
treat them like functions
declaration and definition go into the header file
the implementation to be used instead of the function call
util.h:
  #ifndef UTIL_H_
  #define UTIL_H_
  // Inline functions are declared and defined in the header file.
  inline void swap(int &a, int &b) {
    int c=a; a=b; b=c;
  }
  // Use extern to declare a function. Extern is optional for function
  // declarations: without a body, it cannot be a definition.
  extern int gcf(int a, int b);
  inline int lcm(int a, int b) {
    return (a/gcf(a,b))*b;
  }
  #endif

util.cc:
  #include "util.h"
  // Functions are defined in the implementation file.
  // Do not repeat inline functions defined in the header.
  int gcf(int a, int b) {
    if (a<b) swap(a,b);
    while (b!=0) {
      a=a-b;
      if (a<b) swap(a,b);
    }
    return a;
  }
§ We can declare functions as inline
just as a hint (or ignore it altogether since they know better)
visible to the compiler which is why inline functions go into the header
§ However, newer compilers support link-time-optimization
§ For the time being, it is not a lot of effort to put small functions as inline functions into the header
§ Type declarations and definitions go into the header
in all compilation units that need to allocate the type
§ For member functions, the same rules as for functions apply
implementation file
compilation unit need to be declared as part of the type definition
fraction.h:
  #ifndef FRACTION_H_
  #define FRACTION_H_
  #include "util.h"   // for swap, used by operator/
  class fraction {
    int c; int d;
  public:
    fraction(int cntr=0, int denom=1) : c(cntr), d(denom) { /*void*/ }
    fraction operator*(const fraction &b);
    fraction operator/(fraction b) { swap(b.c, b.d); return (*this)*b; }
  };
  #endif

fraction.cc:
  #include "fraction.h"
  #include "util.h"
  fraction fraction::operator*(const fraction &b) {
    fraction r;
    int f1=gcf(this->c,b.d), f2=gcf(b.c,this->d);
    r.c=(this->c/f1)*(b.c/f2);
    r.d=(this->d/f2)*(b.d/f1);
    return r;
  }

Inline member functions go into the header file.
The complete layout of a type (public, protected, private members) goes into the header file.
§ Type declaration and definition go into the header
§ If the template is only to be parameterized with a small set of types
classes
full template definitions (for instance, template class pvector<string>;)
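The explicit instantiation mentioned above (template class pvector<string>;) can be sketched as follows. The class box is a hypothetical example, and the comments mark which part would go into which file. This is one possible arrangement, not necessarily the one the slide had in mind.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// --- would live in the header: class definition, members only declared ---
template<typename T>
class box {                       // hypothetical class, not from the slides
    std::vector<T> items;
public:
    void add(const T &x);
    std::size_t count() const;
};

// --- would live in the implementation file ---
template<typename T>
void box<T>::add(const T &x) { items.push_back(x); }

template<typename T>
std::size_t box<T>::count() const { return items.size(); }

// Explicit instantiations: with the member definitions hidden in the
// implementation file, these are the only types clients can link against.
template class box<int>;
template class box<std::string>;
```

The price of this layout is that using box with any other type leads to a link error rather than to a new instantiation.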
§ Include header files
find compile errors in header more easily
find missing definitions in header more easily
§ Compile each file § Put dependencies into Makefile
needs to be recompiled
files including it need to be recompiled
main.cc:
  #include <stdlib.h>
  #include <iostream>
  #include "fraction.h"
  #include "util.h"
  #include "vars.h"
  using namespace std;

  int main(int argc, char *argv[]) {
    int arg=atoi(argv[1]);
    cout << arg << "^2*pi=" << arg*arg*pi << endl;
    cout << "e=" << e << endl;
    cout << gcf(atoi(argv[1]), atoi(argv[2])) << endl;
    cout << lcm(atoi(argv[1]), atoi(argv[2])) << endl;
    … // use of fraction data type
  }
Makefile:

all: main

main: main.o fraction.o util.o vars.o
	g++ -o main main.o fraction.o util.o vars.o

main.o: main.cc fraction.h util.h vars.h
	g++ -c main.cc

fraction.o: fraction.cc fraction.h util.h
	g++ -c fraction.cc

util.o: util.cc util.h
	g++ -c util.cc

vars.o: vars.cc vars.h
	g++ -c vars.cc

[Figure: dependency graph in which main.cc, fraction.cc, util.cc, vars.cc and the headers fraction.h, util.h, vars.h are compiled into main.o, fraction.o, util.o, vars.o, which are then linked into main]

The main rule links the final executable (could also use ld, but that is tedious).
§ gcc: not only compiles and links files
(Check out the -M… options)
§ ld: link object files
§ nm: list symbols in an object file or program
§ ldd, otool -L: list libraries needed by an object file
§ As mentioned before, gcc can generate source dependencies
§ This Makefile can be used as a generic starter for your Makefile

CXXFLAGS=...
CXXFLAGS+=-Wall -Wextra -Werror
OBJS=main.o          # "main" file
OBJS+=fraction.o ... # others…

all: main

clean:
	rm -f main *.o

distclean: clean
	rm -f .depend/*.d
	rm -f *~

...

%.o: %.cc
	g++ $(CXXFLAGS) -c -o $@ $*.cc
	@g++ -MM $(CXXFLAGS) -c $*.cc >.depend/$*.d

main: $(OBJS)
	g++ $(LDFLAGS) -o $@ $(OBJS)
Memory Management
§ Allocation & Deallocation § Stack § Variables (Pointers, Arrays, References) § Heap § Memory and Classes
§ In Java
(only their references are stored on the stack)
§ In C++
built-in types both on the stack and the heap
(with delete and delete[])
§ Call by Value
§ Call by Reference/Pointer
§ C++/C
§ Java
§ C#
Memory needs to store:
§ The program
§ Global variables
§ Local variables (on the stack)
§ Data allocated dynamically with new (on the heap)

A typical layout looks like this:
[Figure: memory from 0xffff (top) down to 0x0000 (bottom): stack, heap, global variables, program]
§ The stack stores local variables, return addresses, etc.
§ The figure shows the (simplified) stack after fact(2) has been invoked

 1  int fact(int n) {
 2    int m=n-1;
 3    if (n<=2) return n;
 4    else return n*fact(m);
 5  }
 6
 7  int main() {
 8    cout << "fact(4)=" << fact(4) << endl;
 9    return 0;
10  }

[Figure: stack below main() starting at 0x1000, with one frame per call, each holding a slot for the result, the return location, and the local variables: fact(4) with n==4, m==3, returning to “line 8”; fact(3) with n==3, m==2, returning to “line 4”; fact(2) with n==2, m==1, returning to “line 4”]
§ Pointers are a fundamental concept of C++
§ Pointers should be used sparingly and carefully
§ A pointer points to a value or object stored anywhere in memory
§ Since pointers can point to different TYPEs of values, there are different types of pointers, denoted by TYPE*
§ A pointer is similar to an iterator iterating over a collection of elements of type TYPE
§ &-Operator
§ *-Operator
to)
An lvalue (left value) is an expression that can occur on the left or the right hand side of an assignment; an rvalue is an expression that can only occur on the right hand side.
§ Pointers store an address
§ This is visible to the developer
§ The type of the pointer indicates the type of the object stored at the address

int main() {
  int a=3;
  int b=5;
  int *pa;   // pa==?
  pa=&b;     // pa==0x0ff8
  pa=&a;     // ok, pa==0x0ffc
  *pa=9;     // ok, a==9
}

[Figure: address space of main() below 0x1000, with a at 0x0ffc, b at 0x0ff8, and pa at 0x0ff4]
§ Implement a routine that exchanges the values of two arguments
§ The arguments must be lvalues
§ Invoked with c_swap(&var1, &var2);
§ In Java, this is impossible; arguments need to be wrapped within an object

void c_swap(int *x, int *y) { int z=*x; *x=*y; *y=z; }

int a=3, b=5;
c_swap(&a, &b);
§ Similar to pointers, sometimes more elegant
§ However, like pointers, they should be used sparingly and carefully
§ A reference refers to a value or object stored in memory; it is another name (alias) for that value or object
§ Unlike a pointer, it cannot be changed to refer to a different location
§ References are similar to pointers
§ Except their implementation is “invisible”

int main() {
  int a=3;
  int b=5;
  int &pa=a;   // pa/a==3
  pa=7;        // ok, pa/a==7
  //&pa=b;     // error
  pa=b;        // ok, pa/a==5
}

[Figure: address space of main() with the variables a and b stored at 0x6f08 and 0x6f04; pa is just another name for a]
§ Implement a routine that exchanges the values of two arguments
§ The arguments must be lvalues
§ Invoked with swap(var1, var2);
§ Similar to a VAR parameter in Pascal
§ In Java, this is impossible; in C#, however, it is possible

void swap(int &a, int &b) { int c=a; a=b; b=c; }

int a=3, b=5;
swap(a, b); // a==5, b==3
§ Functions may also return references
§ This allows the function call to be used as an lvalue

class fraction {
public:
  … // conversion fraction to double

  // references as return value
  int &counter() { return c; }
  int &denominator() { return d; }
};

void normalize(fraction &a) {
  int f = gcf(a.counter(), a.denominator());
  a.counter() = a.counter() / f;
  a.denominator() /= f;
}

counter() and denominator() may now be used on the left side of operator= (i.e., as lvalues). However, this may make it harder to change the internal representation of fractions in the future.
§ References have the same characteristics as a pointer
§ References are taken implicitly
§ The Google Coding Style suggests to avoid references as return values and instead to stick to pointers
§ Function references elements passed to it by caller
(very common pattern)
§ Function returns reference to element passed to it
§ Member function returns reference to element of its class
T &vector::operator[](size_t index)
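The three usage patterns listed above can be sketched in a few lines. The names int_triple, reset, and larger are hypothetical and serve only as illustration.

```cpp
#include <cstddef>

// Hypothetical fixed-size container.
class int_triple {
    int data[3] = {0, 0, 0};
public:
    // Member function returning a reference to an element of its class.
    int &operator[](std::size_t index) { return data[index]; }
    const int &operator[](std::size_t index) const { return data[index]; }
};

// Function that modifies an element passed to it by the caller.
inline void reset(int &x) { x = 0; }

// Function that returns a reference to one of the elements passed to it.
inline int &larger(int &a, int &b) { return a < b ? b : a; }
```

In all three cases the referenced object outlives the call; returning a reference to a local variable would leave the caller with a dangling reference.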
§ Arrays provide memory for several values of the same type
§ In most expressions, a C++ array is converted to a pointer to the first element of the array

void foo() {
  int buf[4];
  int *bufp=buf;   // ok, bufp==0x0fec
  buf[0]=3;        // ok, buf[0]==3
  *buf=3;          // same
  *bufp=*buf;      // same
  ++bufp;          // bufp==0x0ff0
  *bufp=*buf+1;    // buf[1]==4
}

[Figure: stack frame of foo() below 0x1000 with the return location “line 8”, the elements buf[0] to buf[3] starting at 0x0fec, and the pointer bufp]
§ Arrays are not range checked (buffer-overflow) § C++ happily assigns a value to buf[4] § Arrays “cannot” be returned as the result of a function § You may want to use the following alternatives in C++
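The list of alternatives did not survive on this slide. Two commonly used candidates from the Standard Library, std::array and std::vector, are sketched below as an assumption about what is meant. The function names are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <vector>

// std::array can be returned by value, unlike a built-in array.
inline std::array<int, 4> make_squares() {
    std::array<int, 4> buf{};             // value-initialized to zeros
    for (std::size_t i = 0; i < buf.size(); ++i)
        buf[i] = static_cast<int>(i * i);
    return buf;
}

// at() is range checked and throws instead of writing out of bounds.
inline bool write_is_rejected(std::vector<int> &v, std::size_t index) {
    try {
        v.at(index) = 1;
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}
```

Note that operator[] on both containers is still unchecked; only at() performs the range check.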
§ The heap stores non-local and non-global data
§ Used if memory needs to be allocated at runtime (e.g., linked lists, large arrays, …)
§ Memory needs to be explicitly allocated (with new, as in Java)
§ For returning large user-defined types or arrays from routines (although consider alternatives)
§ When allocating data on the heap, think about its ownership and life-cycle
deallocate it
§ “No” garbage collection in C/C++
allocated (new, new[]) and freed (delete, delete[])
want uninitialized memory (there can be an advantage to this: realloc)
::operator new (size_t bytes)
§ Error handling
§ Initialization
“Unfortunately, overuse of new (and of pointers and references) seems to be an increasing problem.” Bjarne Stroustrup
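The pairing of new with delete and of new[] with delete[] can be sketched as follows. The details on error handling and initialization were lost from the slide, so the comments below describe standard behavior rather than the slide's wording. The function names are hypothetical.

```cpp
#include <new>

inline int heap_roundtrip() {
    int *p = new int(17);          // single object, initialized to 17
    int *buf = new int[4]();       // array, value-initialized to zeros
    int sum = *p + buf[0] + buf[3];
    delete p;                      // new   pairs with delete
    delete[] buf;                  // new[] pairs with delete[]
    return sum;
}

// Error handling: plain new throws std::bad_alloc on failure,
// whereas the nothrow form returns a null pointer instead.
inline bool nothrow_alloc_works() {
    int *q = new (std::nothrow) int[8];
    if (q == nullptr) return false;
    delete[] q;
    return true;
}
```

Without the trailing parentheses, new int[4] leaves the elements uninitialized.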
§ An array allocated with new[] is handed out as a pointer to its first element

int* foo2(int n) {
  int *buf=new int[n];
  for (int i=0; i<n; ++i) { buf[i]=i*i; }
  return buf;
}

int main() {
  int *buf=foo2(10);
  for (int i=0; i<10; ++i) { cout << buf[i] << endl; }
  delete[] buf;
}

[Figure: the pointer buf on the stack (below 0x1000) holds 0x2000; the elements buf[0]==0, buf[1]==1, buf[2]==4, buf[3]==9, … are stored on the heap at 0x2000, 0x2004, 0x2008, 0x200c, …]
§ The example has a style problem § Can you spot it?
allocated
§ What are the alternatives?
memory returned by foo2
§ Frequently, in C, when a function f allocates memory for us, there is a sister function f’ that releases the memory allocated by f (e.g., getaddrinfo, freeaddrinfo)

int* new_foo2(int n) {
  int *buf=new int[n];
  for(int i=0; i<n; ++i) { buf[i]=i*i; }
  return buf;
}

void delete_foo2(int *buf) {
  delete[] buf;
}

int main() {
  int *buf=new_foo2(10);
  for(int i=0; i<10; ++i) { cout << buf[i] << endl; }
  delete_foo2(buf);
}
§ C++ does not perform garbage collection but has wrappers that come close
  auto_ptr<T>: deletes the object pointed to when the pointer is destructed

void foo() {
  auto_ptr<int> pi(new int);
  *pi=17;
  cout << "*" << pi.get() << "=" << *pi << endl;
  auto_ptr<int> pj(pi);   // transfer ownership; pi points to NULL
  *pj=19;
  cout << "*" << pi.get() << endl;   // displays 0
  cout << "*" << pj.get() << "=" << *pj << endl;
}  // deallocate the integer

auto_ptr is deprecated since C++11 and was removed in C++17. The example works the same with the C++11 unique_ptr, except that pj(pi.release()) is used to transfer ownership.
§ More versatile pointer wrappers: unique_ptr<T> or shared_ptr<T>
§ The following “pointers” provided by the Standard Library help
  unique_ptr<T>: deletes the object pointed to when the unique_ptr is destructed
  shared_ptr<T>: deletes the object pointed to when this is the “last” shared_ptr pointing to the object; works similarly to the smart_ptr exercise
  weak_ptr<T>: useful in combination with shared_ptr if cyclic structures are used; allows cycles to be broken up; needs to be converted to a shared_ptr before the object may be accessed (see documentation for details)
§ Don’t use them as an all-round solution
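The ownership rules of the three wrappers can be observed in a short sketch. The function names are hypothetical; the library calls (make_shared, use_count, lock, expired) are standard C++11.

```cpp
#include <memory>
#include <utility>

// shared_ptr: copies share ownership and are counted.
inline long owners_after_copy() {
    std::shared_ptr<int> a = std::make_shared<int>(17);
    std::shared_ptr<int> b = a;              // a and b now own the same int
    return a.use_count();
}

// weak_ptr: observes without owning; must be locked before access.
inline bool weak_ptr_expires() {
    std::weak_ptr<int> w;
    {
        std::shared_ptr<int> s = std::make_shared<int>(19);
        w = s;
        if (auto locked = w.lock()) {        // temporary shared_ptr
            if (*locked != 19) return false;
        }
    }                                        // last shared_ptr gone, int deleted
    return w.expired();
}

// unique_ptr: ownership can only be moved, never copied.
inline int unique_ptr_transfer() {
    std::unique_ptr<int> p(new int(5));
    std::unique_ptr<int> q = std::move(p);   // p becomes null
    return p ? -1 : *q;
}
```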
§ unique_ptr and shared_ptr can be used to return objects
§ The pointers will destruct the object when “they” no longer point to it

unique_ptr<int[]> make_foo3(int n) {
  unique_ptr<int[]> p{new int[n]};   // the type cannot be deduced from a raw pointer
  for(int i=0; i<n; ++i) { p[i]=i*i; }
  return p;
}

int main() {
  auto buf = make_foo3(10);
  for(int i=0; i<10; ++i) { cout << buf[i] << endl; }
}
§ The previous code is great if you insist on C-style memory management
§ Why not simply return a vector<int>?

vector<int> foo2(int n) {
  vector<int> res(n);
  for(int i=0; i<n; ++i) { res[i] = i*i; }
  return res;
}

int main() {
  vector<int> buf=foo2(10);
  for(int i=0; i<10; ++i) { cout << buf[i] << endl; }
}
vector<int> res(n) creates n ints initialized to 0; if that is a problem, use:
  vector<int> res;
  res.reserve(n);
  for(int i=0; i<n; ++i) { res.emplace_back(i*i); }
  return res;
The reserve(…) member does not initialize the extra memory.
§ Templates
§ Separate Compilation § Memory Organization & Management
§ Implement the persistent vector data type. § Experiment with the persistent vector and use it in combination with different data types. What do you observe? Why do you
§ What happens if we pass the pvector<T> around?
§ Which lines belong in the header file and which in the implementation file? Why?

char ch;
string s;
extern int error_number;
static double sq(double);
int count=1;
const double pi=3.2; // according to Indiana Pi Bill
struct fraction { int c; int d; };
char *prog[]={"echo","hello","world!",NULL};
extern "C" void c_swap(int *a, int *b);
double sqrt(double);
void swap(int &a, int &b) { int c=a; a=b; b=c; }
template<typename T> T add(T a, T b) { return a+b; }
namespace { int a; }
struct user;
§ Upgrade your RPN calculator
your RPN calculator to work with type T; ensure you can use int, double, fraction as type T
the stack persistently (i.e., when you terminate your calculator with numbers on the stack that they reappear when you restart it)
§ Implement a template-based version of Connect 4. Connect 4 builds on a playing field composed out of 7 columns each having 6
stone towards the lowest unoccupied column. The player who first has 4 stones in a row (horizontally, vertically, diagonally) wins. § After each turn display the game field using simple ASCII graphics. Implement the game in such a way that players can be exchanged easily using templates. § The precise interfaces to follow will be published at the lecture’s
religiously.
§ Have a look at the following swap routines
§ Let the compiler compile the code but ask the compiler to stop at the assembly stage $ gcc -S -o source.s source.cc § Compare the assembly code, what do you observe? § How do you interpret the difference(s)?
§ Templates: Traits
§ Standard Library: Algorithms
§ Standard Library: Input and Output

Have a nice weekend, see you in two weeks