C++ is a very complex language. This is no doubt part of its appeal. If you need (or merely want) some language feature, C++ probably has it in one form or another.
This paper reviews some aspects of C++. I have concentrated on those aspects that will likely be used in this course (Engr 4892-Data Structures) and that are likely to need reinforcement. In particular I have tried to warn against the kinds of misapprehensions I have seen in the past.
For the most part, I have skipped over the statement syntax and most forms of expressions. These are fairly straightforward and most books on C++ cover these aspects quite well.
To really get to know the language, read a good book or two. Bjarne Stroustrup's The C++ Programming Language, third edition, is probably the best reference,1 while Scott Meyers's Effective C++ and More Effective C++ focus on specific techniques and areas of confusion. And write lots of programs.
A C++ object is a region of storage (memory). Each object will have a number of attributes: name, type, value, size, address, lifetime. Objects are often called variables, though sometimes the word variable refers to the name of an object. I will use the word variable to mean object.
Most variables have names. The declaration
|
It is important to understand that more than one variable can have the same name at the same time. For example, if we write
{
}
|
then two variables both named i will exist at the same time if the `then' clause of the if-statement is executed. This causes no problem. The two variables occupy different locations in the computer's memory, so the assignment to the inner i in no way affects the outer i. The code prints 1.
Each variable has a type. The type limits the values that a variable may hold. For example, an int variable can hold any integer between a certain minimum and maximum.
A class object is a variable whose type is a class.
At each point in time during the execution of a program, a variable will
have a value. One way to change this value is of course with an
assignment. The assignment
|
The following statement prints the value of the variable named i.
|
Each variable requires a certain amount of space in the computer's memory. An int variable, for example, may require 4 bytes of memory2. If so, 4 contiguous bytes of memory will be allocated to hold the variable value. The sizeof operator is used to compute the size of a variable in bytes.
The following statement prints the size of the variable named i.
|
Each variable has an address. This is a pointer value that locates the variable in memory. You can think of the computer's memory as a big array of bytes. The address of a variable is the index, into this array, of the first byte allocated to hold the variable's value.
For example, if our int variable is allocated bytes 100, 101, 102, and 103, then the address of the variable is 100.
The & operator is used to compute the address of a variable. For example
|
The following statement prints the address of the variable named i
in hexadecimal notation.
|
Variables are created, used, and then destroyed. After a variable is destroyed, the space it used may be allocated to another variable.
We can classify variables according to the circumstances under which they are created and destroyed. The classifications are: static variables, stack variables, and heap variables. Each variable fits into exactly one of these categories.
Static variables are:
Any variable that is declared outside of a function will be static. Any variable declared inside a function or class will be static if its declaration is preceded by the keyword static.
class C {
static float d ; // Static variable, but only usable in this file.
double e ; // Static variable, globally linked
void a_func() {
|
In this course, we hardly use static variables at all.
Stack variables are
Stack variables are also called auto variables, which is short for automatic3. Any variable declared within a function that is not preceded by the keyword static will be a stack variable.
For example, if the following code is executed:
{
}
|
first a stack variable i is created, then a stack variable j is created, finally a stack variable k is created. When the final } is met, all three are destroyed.
Here is a more complex example:
{
} /* i is destroyed here */
|
Note that, if the break statement is executed, then k will be destroyed, and that, if the return statement is executed, then both i and j will be destroyed, together with all other stack variables that were created within the function.
The next stack variable to be destroyed is always the most recently created. This is why they are called stack variables.
Function parameters are also stack variables.
There is one more kind of stack variable. Its existence can often be ignored, but I include it here for completeness. These are called `temporaries'. They hold values returned from functions. For example, suppose we have a function add with the following declaration:
Matrix add(Matrix a, Matrix b) ;
|
Assume x, y, and z are variables of
type Matrix. If the statement
|
Heap variables are
Heap variables are unusual in that they do not have names. They must be accessed using their addresses. Usually we use a pointer variable to hold the address of a heap variable.
For example:
|
Heap variables are only destroyed when their address is sent to the delete command.
You can think of data members and array elements as being variables too.
They are created when the class object or array that they are a part of is
created, and are destroyed when it is destroyed. Just like a variable, they
have types, sizes, and addresses. The name of a field F in a class
object o is
|
|
It may be helpful to look at an example of variables being created and destroyed as functions are called.
Consider the following code
int g( int &r ) {
int f( int i ) {
void main() {
|
Note that main calls f and f potentially calls g. All three subroutines have a variable called x.
The following table shows how the stack evolves as the program executes; time advances from top to bottom.
About to call f | x: 'A' | ||||||
Just called f | x: 'A' | temp: ?? | i: 1 | x: ?? | |||
Just called g | x: 'A' | temp: ?? | i:1 | x: ?? | temp: ?? | r: address of i | x: ?? |
While returning from g | x: 'A' | temp: ?? | i: 13 | x: ?? | temp: 42 | r: address of i | x: 99.0 |
While returning from f | x: 'A' | temp: 55 | i: 13 | x: 42 | |||
About to call f again | x: 'A' | y: 63 | |||||
Just called f again | x: 'A' | y: 63 | temp: ?? | i: 0 | x: ?? | ||
While returning from f | x: 'A' | y: 63 | temp: 185 | i: 99 | x: 86 | ||
End of main | x: 'A' | y: 63 |
Each box5 represents the variables belonging to a single function invocation. The temporary variables (labelled temp) are created to hold the result of each function call. Note that the variables belonging to the two invocations of f are entirely distinct; that is, the i and x of the second invocation are entirely different from the identically named variables of the first invocation.
For each type T there is another type `pointer to T'. The address of any variable of type T will have this type, as will a null pointer (T*)0. Pointer variables are simply variables that have a pointer type as their type.
If the last value assigned to a pointer variable p was the address
of a variable v, then we say that `p points to v'
and the expression *p acts just like the variable v. Thus
if p points to v,
|
|
Heap variables are not destroyed when the pointer that points to them
is destroyed. For example the code
|
The rule of equal deletion. Delete every heap variable you create.
|
It is possible to run out of room for heap variables if you create a lot of them, or you create some very big ones, or have space leaks. The good news is that you can detect this problem because the address returned by new will be a special invalid address called `the null pointer'. The null pointer of type `pointer to T' can be written as (T*)0, but usually you can simply write 0.
You must never assume that new will be successful. Think of it as a polite request, rather than a command. Thus you should always check to see if it was successful or unsuccessful. E.g. rather than
int *p = new(nothrow) int ; | // |
*p = 13 ; | // Wrong! p may be null. |
etc | // |
do this
int *p = new(nothrow) int ; | // |
if( p == 0 ) { | // Right! |
deal with the failure | // |
} else { | // |
*p = 13 ; | // |
etc | // |
} | // |
The rule of checking new. Check after each new(nothrow) to see if you got the null pointer back.
|
The International Organization for Standardization (ISO) recently changed a few things about C++. new is something they changed - though why, I'll never know. Prior to the ISO standard, if new failed to find space, it returned a null pointer. Now, with ISO C++, if new fails to find space, something called `throwing an exception' happens instead. See Section Exceptions, if you want to know what `throwing an exception' means - for now you don't have to. Luckily the ISO left a loophole. If you want the `old fashioned' behaviour of getting back a null pointer, all you have to do is put
#include <new> |
using namespace std ; |
at the top of your source file6, and write new(nothrow) wherever you would otherwise write new.
In this course we will use new(nothrow). The reason is simple; we want to keep up with the ISO's version of C++, but the compiler we will use does not support the new `throwing' behaviour of new. Furthermore, until all compilers support the new `throwing' behaviour of new, code written blithely assuming the new new behaviour will be less portable. I recommend using (nothrow) for at least the next few years.
The rule of nothrow. Include standard header new and use only new(nothrow).
|
You can allocate a whole array at once with statements like:
T *p ; |
p = new(nothrow) T[i] ; |
where i is an integer (greater than or equal to 0). This allocates an array with i elements and sets p to point to the first element of the array. Later, if p still points to the first element of the array, we can destroy the array with the statement
delete [] p ; |
The C++ system will have remembered how big the array is, so there is no need to tell it again.
What happens if p does not point to a variable? I can assure you nothing good will come of executing the expression *p in this circumstance. Unlike syntax errors, the compiler will likely not give you any warning. In fact errors of this sort are very hard to catch, even through debugging. There are (at least) four ways to create a pointer variable that does not point to a variable:
char *targ ; |
while( *source != 0 ) { |
*targ = *source ; // targ is not initialized! |
targ += 1 ; source += 1 ; } |
*targ = (char)0 ; |
The correct thing to do would be to create an array (either heap or stack - as appropriate) to hold the string and initialize targ to point to the first element of that array (see next point).
char a[10] ; // Make space |
char *targ = & a[0] ; //Initialize the pointer this time. |
while( *source != 0 ) { |
*targ = *source ; |
targ += 1 ; source += 1 ; } |
*targ = (char)0 ; |
If the length of the `string' pointed to by the initial value of source is longer than 10 characters (including the 0 byte that marks its end), then the dereference of targ will eventually be invalid. And if the array source points into does not have a 0 byte in it, then the dereference of source will eventually be invalid.
Here is an example where p pointed to a heap variable
Matrix * q = new(nothrow) Matrix[10] ; |
Matrix *p = & q[4] ; |
... |
delete [] q ; |
... |
*p = a ; // Bad pointer! |
The first line allocates an array of 10 Matrixes on the heap. The second line creates p and sets it to point to the 5th of them. I'll assume there are no assignments to p or q hidden in the ellipses. The delete line destroys all 10 Matrixes. The final line is in error because it makes reference to a variable that has been destroyed.
Here is an example where a stack allocated variable is referred to after it has been destroyed. Suppose there is a function that returns a pointer to a Matrix, looking something like this:
Matrix *f() { |
Matrix A ; |
... |
return &A ; /* Bad idea! We are about to destroy A */ |
} |
If we make a call
|
A pointer that points to a variable that has been previously destroyed is called a `dangling reference'.
We can summarize these cases with a rule
The rule of good pointers. Never dereference a pointer unless you can prove that it points to a live variable. Watch out especially for uninitialized pointers, array bounds, null pointers, and dangling references.
|
Violations of this rule lead to very hard to find errors. If you dereference a bad pointer, there is no guarantee your program will crash or do anything else to call attention to the bug. Often the program even behaves as you wish, and the bug only has an effect when an unrelated change is made to the program.
Pointer variables and arrays get mixed up in the minds of many C/C++ programmers. The reason for this is that arrays sometimes act like pointers and pointers sometimes act like arrays. It is important to keep in mind the differences.
|
|
|
|
So why do they get confused?
|
void f(int *p) { |
.... |
} |
You may call it with a call f(a). This is the same as f( & a[0] ). This is why arrays are sometimes said to be passed `by reference' in C and C++. Even if you declare f as
void f(int p[10]) { |
.... |
} |
p is still a pointer, not an array! There are no array parameters in C++ - only pointer parameters that are declared like arrays.
|
Pointers to characters are often mixed up with ``strings''. One common way to represent a sequence of characters is to set aside some room for them in an array and refer to them by a pointer to the first character in that array. For example
char a[4] ;
a[0] = 'a' ; a[1] = 'b' ; a[2] = 'c' ; a[3] = (char)0 ;
char *p = & a[0] ;
|
The end of the sequence is marked with a special character (char)0, whose ASCII code is zero. (This character should not be confused with the character '0' which represents the digit `zero'; its ASCII code is 48.) Character sequences represented in this way are commonly referred to as strings or C-style strings. Many library subroutines assume this representation is being used.
When the C++ compiler sees a quoted string as in
|
Don't mix up the empty string and the null pointer. The statement
|
|
Two things make references confusing: (a) They are so much like pointers, you can easily forget they are different. (b) They are so unlike all other kinds of variables.
A reference variable r, is another name for some other variable. For example if I write
int i = 10 ;
int &r = i ;
then that declares r to be a second name for the variable named i. We can now write
r = 13 ;
cout << i ;
and that will print 13.
Now the above code is pretty silly because there is no advantage to calling i by any other name. A better use of references is when you have a long complicated name that you want to use several times:
{
}
Of course you can achieve the same effect using a pointer variable:
{
}
In fact that is exactly how reference variables are typically implemented by the compiler. The following three code snippets mean the same thing:
{int &r = i ; | {int *p = &i ; | { |
x = r+2 ; | x = *p + 2 ; | x = i + 2 ; |
r = 2*y ; | *p = 2*y ; | i = 2*y ; |
} | } | } |
All three will likely be compiled to the same machine code. References are like pointers, but are more limited. You must initialize a reference, and you can never change which variable the reference refers to.
References are used almost exclusively for two purposes: parameter passing and result returning.
Reference parameters are simply reference variables that are parameters. They are initialized by the function call. The following two functions differ only in how they are called.
// Pass by reference | // Pass by pointer |
void r_double(int &r) { | void p_double(int *p) { |
r = r+r ; | *p = *p + *p ; |
} | } |
The first is called like this `r_double( i )' and the second like this `p_double( &i )'.
Putting the word const in the declaration of a reference variable is a promise not to assign to the referenced variable. Consider the function
// Pass by const reference |
float r_determinant(const Matrix &m) { |
Etc |
} |
We can tell from the declaration that it won't change the argument. The same is true if the Matrix is passed by value:
// Pass by value |
float v_determinant(Matrix m) { |
Same body as r_determinant. |
} |
So why is r_determinant better? The call v_determinant(a) requires calling the copy-constructor of Matrix to initialize m with a's value. But the call r_determinant(a) only requires the finding of a's address to initialize the reference m. For large objects, this can be a considerable run-time saving. For integers and other small objects, passing by value is more efficient.
References are also used in return values. In this course, we will only have to do this for defining assignment operators. The usual form of an assignment operator for a class C is
C & operator=(const C &r) { |
if( the object is not being assigned itself ) { |
Finalize old value of this object and |
copy the value of r to this object. } |
return *this ; |
} |
(See section 8.1 for a description of the this pointer.)
Use references to return values with great care. Consider:
Matrix &f() { |
Matrix A ; |
... |
return A ; /* Bad idea! */ |
} |
Yes, this saves a call to the copy constructor. But it creates a dangling reference. The stack variable, A, to which the result reference refers will be destroyed as part of the return. The only reason you can get away with it in the assignment operator is that, if *this is a stack variable, it will have a longer life than the temporary reference used to return it. My advice is to return by reference only when you are sure that the object the reference refers to will outlive the temporary reference used to return it.
Functions are either member functions or nonmember functions. Member functions are associated with one class when defined and with one object (called the recipient) when called.
You provide the compiler with information about functions in two ways: with declarations and with definitions. The declaration tells the compiler about the type of the function. The definition tells the compiler about the implementation.
Here are declarations of some nonmember functions:
int f(char p) ;
Matrix operator * (const Matrix &a, const Matrix &b) ; /* Binary operator*/
float operator ~(const Matrix &a) ; /* Unary operator*/
|
Declarations of member functions (a.k.a. methods) are placed within a class construct. E.g.7
class Matrix {
} ;
|
Function definitions supply the body of the function. Definitions of nonmember functions are straight-forward.
int f(char p) {
}
Matrix operator * (const Matrix &a, const Matrix &b) /* Binary operator*/ {
}
float operator ~(const Matrix &a) /* Unary operator*/ {
}
|
Definitions of member functions require an indication of what class the function is a member of. This is done by prefixing the function's name with the class's name and a ::.
void Matrix::assign(int i, int j, float val) {
}
Matrix Matrix::operator + (const Matrix &b) const /* Binary operator */ {
}
Matrix Matrix::operator ! () const /* Unary operator*/ {
}
|
People get confused by the C/C++ declaration syntax. This is not surprising as it is mad. But there is a method to its madness. The slogan is `declare it as you use it.'
For example, if you want the expressions i, *p, a[i], (*q)[3], and *((*z[4])(5)) to be of type int, then you declare the variables i, p, q, and z as
int i ; |
int *p ; |
int (*q)[10] ; |
int *((*z[10])(int)) ; |
What do these mean? Take q as an example. Since (*q)[i] will be an int, *q must be an array of ints, and thus q is a pointer to an array of 10 ints.
How about z? We can reason
*((*z[i])(i)) | is an int, so |
(*z[i])(i) | is a pointer to an int, so |
*z[i] | is a function taking an integer and returning a pointer to an int, so |
z[i] | is a pointer to a function taking an integer and returning a pointer to an int, so |
z | is an array of 10 pointers to functions taking integers and returning pointers to ints. |
The same goes for function declarations and definitions:
|
References are an exception to the `declare it as you use it' rule:
int &r ; |
int *&rp ; |
declare r to be a reference to an int and rp to be a reference to a pointer to an int, but this does not imply that either &r or *&rp is an int.
It is annoying to write crazy things, like the definitions of q,
f, and z above, more than you have to. This is why the
typedef declaration is nice. The declaration
|
|
In this course we won't use typedefs much because they don't seem to mix well with templates.
The keyword const can be confusing. At least it always confuses me.
It declares that a variable is constant and hence will not change. Thus
|
|
|
|
Most C++ coders write
|
|
|
You also get some choice about where you write const. You can write
|
|
Classes give programmers the opportunity to define their own types. Each class object8 consists of a number of fields (a.k.a. data members) and methods (a.k.a. function members). The fields contain data, and the methods access and change the fields.
Each member (whether data or function) is categorized as either public, protected, or private. In this course, we won't need protected.
The public members of an object named o may be accessed in any code that o can be accessed in. The name of a member m of an object o is o.m. Sometimes the class object will not have a name (e.g. if it is a heap variable). As long as there is a pointer p to it, we can still write (*p).m. But this is awkward, so C++ provides an alternate syntax: p->m.
The private members of a class object can only be accessed from the code that is in the bodies of member functions of the class.
When a function member f of an object o is called (e.g. with a call like o.f() ), o is termed the recipient. Within the body of the function f you will likely want to refer to the members of the recipient. Any mention of a member name m will refer to a member of the recipient, o. Occasionally you will want to refer to the recipient, o, itself. The keyword this is always a pointer to the recipient, so you can refer to the recipient as *this.9
Generally speaking you should declare all fields as private or protected. Those methods you declare as public will constitute the public interface of the class.
Structs are just classes where members are public unless declared otherwise10. The following declarations are the same
struct listElem { | class listElem { |
int data ; | public : |
listElem *next ; | int data ; |
} ; | listElem *next ; |
} ; |
When a class object is created, one of its constructor methods will be used to initialize its fields. The name of a constructor method is the same as the class and it is declared with no return type at all (not even void!). For example:
class point {
} ;
|
declares a class with three constructors. Which one gets called depends on how the object is created:
point w ; /* Default constructor */ |
point x = w ; /* Copy constructor. Not the assignment operator. */ |
point y( x ) ; /* Copy constructor*/ |
point z(1.0, 2.7) ; /* The other constructor */ |
By the way, the declaration
|
One problem with constructors is what value the fields will have when they begin. Normally a default constructor is first used on each field whose type is a class. For example, if we define:
class line {
} ;
line::line( const point &a, const point &b ) {
}
|
Point's default constructor will be invoked for both p0 and p1 before the beginning of line's constructor, which promptly overwrites the values just constructed. It would be slightly more efficient to use point's copy constructor. Indeed, C++ provides a way to do exactly that. We define
line::line( const point &a, const point &b ) :
}
|
The stuff between the colon and the left brace is a list of fields to be constructed and argument lists for their constructors. Any fields left out of the list will still be constructed by their default constructor.
Each class has exactly one destructor. This method is called when the object is about to be destroyed (e.g. when a stack object goes out of scope, or a heap object is deleted). Immediately after the destructor is finished, each field of the object is destroyed. The destructor's name is ~C where C is the class name. Like a constructor, the destructor should be declared with no return type.
There are certain methods that every class will have, whether or not you define them. If you don't define them, the compiler will supply definitions.
Matrix a ; |
p = new(nothrow) Matrix; |
q = new(nothrow) Matrix[10] ; |
The default constructor is called to construct the stack variable a, the heap variable *p, and the ten heap variables q[0],.., q[9].
Matrix a = x ; |
Matrix b(x) ; |
p = new(nothrow) Matrix(x); |
Assuming x has type Matrix, the copy constructor is called in each of these cases.
The copy constructor is also used for parameter passing and returning a
value from a function. Consider a function declared as
|
|
If you do not declare these methods, the compiler will create them for you.11 It is good to know what the compiler generated methods will do, since they are often not appropriate.
Here is an example class
class example {
|
If I were to be explicit and not leave the generation of any methods up to the compiler, the class would look like this:
class example {
} ;
|
example::example() // Default Constructor
|
example::example( const example &r ) // Copy constructor
|
example& example::operator= ( const example &r ) // Assignment operator
|
example:: ~example() // Destructor
|
When the implementation of an object involves heap-allocated storage, these compiler-generated methods are usually wrong and harmful, as is shown in the following case study.
Consider this class for representing character strings
class String {
} ;
|
The idea is to allocate just as many bytes from the heap as are needed to represent the string. The number of characters in the string will be len and c will point to the first in an array of len bytes on the heap.
Let's start with the destructor. The compiler generated destructor will do nothing. When a String is destroyed, the last pointer to the heap allocated array (i.e. the c field) will be destroyed. This is a space leak. The solution is to declare the destructor by adding the line
|
to the public part of the class and defining the destructor function as
String:: ~String() {
}
|
But there is a problem. Consider
b already exists |
{ String a = b ; |
use a for a while |
} /* a is destroyed here */ |
use b here |
a will be created using the copy constructor. The compiler generated copy constructor will simply copy over each field.
a.len = b.len ; |
a.c = b.c ; |
Thus after the constructor is done, a.c and b.c will both point to the (first element of the) same array. This may seem like a big space saving, but there is a problem. When a is destroyed, according to the destructor we just wrote, the array that a.c points to will be deleted. But this is also the array that b.c points to! That is, b.c, will become a dangling reference! We can not accept the compiler generated copy constructor and must write our own. Add
|
to the public part of the class and define the copy constructor
String:: String(const String &b) {
}
|
Suppose we assign
|
a.len = b.len ; |
a.c = b.c ; |
We have the same problem as with the copy constructor. The solution is to declare and define an assignment operator for the class. Add
|
to the public part of the class and define the assignment operator
String & String::operator = (const String &b) {
}
|
Why is there a delete? Well, if c already has a value, then it would be bad to simply overwrite it; that would again be a space leak. (Note that the assignment operator does nothing when an object is assigned to itself. Why did I do this?)
Finally consider the sequence
String a ; |
a = b ; |
It looks innocuous, but look at what happens in the assignment.
The first thing that happens is equivalent to
|
|
to the public part of the class and define
String:: String() {
}
|
We should ask whether delete [] ((char*)0) is allowed. It turns out that it is, and is guaranteed to cause no harm.
The rule of compiler-generated methods. If your class contains pointers to heap variables, you probably need to declare and define all four compiler-generated methods.
|
(The alert reader will have noticed that in this example, I repeatedly broke ``the rule of checking new''. If I were writing code to use in a product, rather than to illustrate the above points, I'd have been careful to check for null pointers.)
There is a sneaky trick you can play on the compiler, if you want to avoid writing a special method and you don't want the compiler to generate one for you either: Declare the method, but don't define it! There is no obligation to define any method, if it is never called. But what if someone does call the method? Well they will get a nearly incomprehensible error message when their program is linked. To give them a better error message and to make it come at compile time, simply declare the method as private.
Here is an example:
class Matrix {
} ;
|
In this case, I did not want to write a default constructor for my class because there is no obvious default size for a matrix.
In this course we will not be using inheritance very much, so I'll only cover as much as we need. There is much more that I won't mention.
Inheritance is used to create new classes (derived classes) from old ones (base classes). The derived class will have all the fields and methods of the base class and also whatever new fields and methods it adds.
For example consider
class Shape {
} ;
class Circle : public Shape {
} ;
|
This declares two classes. The Circle class has private fields xcoord, ycoord, and radius. It also has public methods move, getX, getY, and expand, as well as a default constructor, copy constructor, destructor, and assignment operator. Similarly, we could create classes for rectangles, triangles, text shapes, etc. The move, getX, and getY functions need only be written once.
Inheritance is more than just a way to create classes quickly; it is also a way to write ``generic code'', i.e. code that can work on data of many types. Consider a function
void up( Shape &sh ) {
}
|
We can call this function with any object whose type is Shape or is derived from Shape. E.g.
|
One error that it is possible to make with inheritance is called slicing. When a variable of type Shape is created there is only enough space set aside for the fields of the Shape class. Suppose you try to store a Circle in a Shape variable using either Shape's copy constructor or its assignment operator:
Circle circ ;
...
Shape sh( circ ) ; // Copy constructor.
...
sh = circ ; // Assignment operator.
|
What will happen? The radius field will be completely lost. Furthermore, the compiler will give no warning that information is being lost in the copy.
The implementation of the move function is the same regardless of the shape. However, sometimes you need to declare a function in the base class that will be implemented differently for each derived class. In our example, a function that draws the shape on a computer screen is such a function. The algorithm for drawing a circle is quite different from the algorithm for drawing a rectangle. We add declarations of a method called draw:
class Shape {
} ;
class Circle : public Shape {
} ;
|
The draw method is defined for the Circle class:
void Circle::draw() {
}
|
and for rectangle, triangle, etc., but not for the base class Shape.
You can now call the draw routine of any Circle, Rectangle, Triangle etc., without even knowing what the exact type of the object is. Consider this routine
void drawAllShapes( Shape* shapeArray[], int arraySize ) {
}
|
Now you can fill up an array with pointers to Circles, Rectangles, Triangles, etc., and call the above routine to draw them all. For each object, the appropriate definition of draw will be executed. This is the beauty of virtual functions.
The declaration of draw in the Shape class looks a bit
odd. The =0 at its end indicates that you do not intend to define
draw for the Shape class; the compiler will complain if
you try. Such a function declaration is called a pure virtual function
declaration. The presence of at least one pure virtual function declaration
marks a class as an abstract class. You can not create objects that
belong to abstract classes. This is what I want, in this case, because it
does not make sense to have an object that is simply a shape and not some
particular kind of shape. You can not write
|
|
|
I recommend using virtual functions unless you are 100% sure that all derived classes will share the same implementation of a function13. Use pure virtual function declarations when there is no sensible implementation of a function for the base class.
When a group of people come together there is sometimes a name-clash: does the name `John' refer to John Quaicoe or John Robinson? We can resolve the clash by using family names. Likewise in a large program the same name may be used for several different purposes. Namespaces provide a way to group related things (classes, functions, static variables, typedefs) under a single name -the namespace name- that serves as a sort of family name. Here is an example:
namespace Bird {
}
namespace Action {
}
|
After these declarations we can write either
|
|
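As an illustration, here is a sketch in which both namespaces contain a function with the same name. The function name duck and the strings it returns are hypothetical; only the namespace names Bird and Action come from the example above.

```cpp
#include <cassert>
#include <string>

namespace Bird {
    // Hypothetical member: "duck" as a kind of bird.
    std::string duck() { return "a bird" ; }
}

namespace Action {
    // Hypothetical member: "duck" as something you do.
    std::string duck() { return "lower your head" ; }
}

// The namespace name acts as a family name and resolves the clash.
std::string describeBoth() {
    return Bird::duck() + " / " + Action::duck() ;
}
```

The two functions can coexist because their full names, Bird::duck and Action::duck, are different.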
The ISO has defined a standard library full of useful definitions. All functions and classes defined in the ISO standard library are defined in a namespace called std. So the classic hello-world program that used to be written:
#include <iostream.h>
int main( ) {
|
is now written:
#include <iostream>
int main( ) {
|
There are two differences. First, where input/output used to be declared in "iostream.h", the ISO header name does not have the ".h" part. Second is the use of namespace std in std::cout. These are minor differences, but in this course we will use the ISO library because it is the standard and provides more facilities.
The dropping of the ``.h'' is consistent throughout the standard library, as is the use of namespace std. Some of the headers available are
<iostream> | Input/output |
<string> | A string class |
<complex> | Complex numbers |
<new> | Declares nothrow |
<cmath> | The C math library |
<cstring> | Functions supporting C-style strings. |
Writing std:: in front of all these standard names gets to be a
pain. There is a solution. We can declare that we want to be able to use
all the names in a namespace without mentioning the namespace. The
declaration is
|
#include <iostream>
using namespace std ;
int main( ) {
|
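The two styles can be compared side by side in a short sketch. The function names helloQualified and helloUsing and the stream parameter are assumptions made for illustration; here the using directive is placed inside one function so that it affects only that function.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>   // std::ostringstream, handy for checking the output

// Fully qualified style: every standard name carries the std:: prefix.
void helloQualified( std::ostream &out ) {
    out << "Hello world" << std::endl ;
}

// With a using directive the std:: prefix can be dropped.
void helloUsing( std::ostream &out ) {
    using namespace std ;
    out << "Hello world" << endl ;
}
```

Calling either function with std::cout as the argument prints the greeting; both produce the same output.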
Sometimes you want a number of classes that differ only in the types of certain fields. It is tedious to write the same definitions over and over again with only minor variations. It is better to write one definition that works for all types.14 Such a definition is called generic. This is what C++'s template construct is for.
A template class definition is like a function from types to classes.
The syntax is
|
|
For example, consider this template class definition:
template <class L, class R> class Pair {
} ;
|
We must define the four member functions. Notice that we must instantiate Pair everywhere except when it is being used as the name of a constructor function.
template <class L, class R> Pair<L, R>::Pair( const Pair<L,R> &p ) :
{ }
|
template <class L, class R> Pair<L, R>::Pair( L leftInit, R rightInit ) :
{ }
|
template <class L, class R> L Pair<L,R>::getLeft() {
}
|
template <class L, class R> R Pair<L,R>::getRight() {
}
|
Now if there is a declaration
|
class PairIF {
} ;
|
PairIF::PairIF( const PairIF &p ) :
}
|
PairIF::PairIF( int leftInit, float rightInit ) :
}
|
int PairIF::getLeft() {
}
|
float PairIF::getRight() {
}
|
Since instantiations of a template class can be used anywhere types can be
used, they can be used as arguments to instantiate another template class,
or even the same one:
|
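A complete sketch of the Pair template is given here for reference. The constructor and accessor signatures follow the definitions above; the field names left and right and the typedef name Nested are assumptions.

```cpp
#include <cassert>

template <class L, class R>
class Pair {
public:
    Pair( const Pair<L,R> &p ) ;
    Pair( L leftInit, R rightInit ) ;
    L getLeft() ;
    R getRight() ;
private:
    L left ;    // field names are assumed
    R right ;
} ;

template <class L, class R>
Pair<L,R>::Pair( const Pair<L,R> &p ) :
    left( p.left ), right( p.right )
{ }

template <class L, class R>
Pair<L,R>::Pair( L leftInit, R rightInit ) :
    left( leftInit ), right( rightInit )
{ }

template <class L, class R>
L Pair<L,R>::getLeft() {
    return left ;
}

template <class L, class R>
R Pair<L,R>::getRight() {
    return right ;
}

// An instantiation used as an argument to another instantiation.
typedef Pair< Pair<int,float>, char > Nested ;
```

A variable declared as Pair<int,float> then behaves like an object of the hand-written PairIF class, and a Nested object holds a Pair<int,float> as its left component.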
One can also write generic functions using a similar syntax. In fact, we've already seen that the member functions of a template class must be written as template functions.
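For instance, a generic function might look like the following sketch. The function larger is hypothetical; it works for any type T whose values can be copied and compared with the < operator.

```cpp
#include <cassert>

// Hypothetical generic function: returns the larger of two values.
// The compiler generates a separate version for each type T used.
template <class T>
T larger( T a, T b ) {
    return ( a < b ) ? b : a ;
}
```

In a call such as larger( 3, 7 ) the compiler deduces T from the arguments, so the type does not have to be written out.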
Sometimes something happens that requires the program to radically alter what it is doing. Two examples that we will encounter in this course are: running out of memory, and the detection of programmer error.
There are various things you can do when such an undesired event occurs. Which one is chosen depends a lot on what the undesired event was and on the nature of the application. In some cases it is reasonable to stop and print an informative message; for other programs stopping should never be an option.15
In this course we will use two different mechanisms for dealing with undesired events. For programmer error, we will use the ``assert'' macro, and for running out of memory, we will use exceptions.
The assert macro is called with a boolean argument. A call assert( x ) behaves roughly like
if( x ) {
    /* do nothing */ }
else {
    Print a message and stop the program. }
It is usually used to document the beliefs of the programmer. For example, if a programmer is writing a function to compute the integer part of the square root of a number, she/he might write:
#include <assert.h>
:
/* nat_root( i )
   The integer part of the square root of i.
   Precondition: Should only be called with a nonnegative argument.
   Postcondition: Returns the integer part of the square root of i */
int nat_root( int i) {
}
|
The first assert detects the error of calling the routine with a negative argument. The second detects any errors in computing the square root. Both these potential errors are programmer errors - either on the part of the programmer who called the function, or the programmer who wrote it.
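One way such a function could be written is sketched here. The algorithm (a simple linear search) and the exact form of the two assertions are assumptions; only the name, the precondition, and the postcondition come from the documentation comment above.

```cpp
#include <assert.h>

/* nat_root( i )
   The integer part of the square root of i.
   Precondition: Should only be called with a nonnegative argument.
   Postcondition: Returns the integer part of the square root of i */
int nat_root( int i ) {
    assert( i >= 0 ) ;   // Detects a bad call (caller's error).
    int r = 0 ;
    // Linear search for the largest r with r*r <= i.
    // long long is used so the squares cannot overflow.
    while( (long long)( r + 1 ) * ( r + 1 ) <= i ) {
        ++r ;
    }
    // Detects a mistake in the computation (implementer's error).
    assert( (long long)r * r <= i ) ;
    assert( i < (long long)( r + 1 ) * ( r + 1 ) ) ;
    return r ;
}
```

Note that if the macro NDEBUG is defined before assert.h is included, the assert calls expand to nothing and the checks are skipped.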
Asserts are best used as an executable complement to the documentation of preconditions, postconditions, and invariants.
Asserts should not be used to detect undesired events that can be anticipated to occur in production use of a program (e.g. after it is sold). Examples are running out of memory and incorrect user input.
Throwing an exception causes the program to stop what it is doing and jump to a piece of code called an `exception handler'. Exception handlers are declared and used like this:
void g() {
}
void f() {
}
|
Assuming no throws are executed, the following sequence can be
expected to happen during the execution of f:
|
|
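The flow of control can be traced with a small sketch. The exception class Failure, the fail flag, and the log parameter are assumptions added so that the order of events can be recorded; they are not part of the f and g shown above.

```cpp
#include <cassert>
#include <string>

// Hypothetical exception type.
class Failure { } ;

void g( bool fail, std::string &log ) {
    log += "g-start " ;
    if( fail ) throw Failure() ;   // Abandons the rest of g.
    log += "g-end " ;
}

void f( bool fail, std::string &log ) {
    try {
        log += "try-start " ;
        g( fail, log ) ;
        log += "try-end " ;        // Skipped if g throws.
    }
    catch( Failure & ) {
        log += "handler " ;        // Runs only if a Failure was thrown.
    }
    log += "after " ;              // Runs in both cases.
}
```

When nothing is thrown the log reads "try-start g-start g-end try-end after "; when g throws, the remainder of g and of the try block is skipped and the log reads "try-start g-start handler after ".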
Exceptions are a form of `goto' statement. As such, they are very tricky and should be used as a programming technique of last resort. I am only using them in this course because they seem to solve a problem I haven't solved to my satisfaction in any other way.
If not used with extreme care, they will cause more problems than they solve. Consider the following subroutine (using the C subroutines for file opening and closing).
void bad() {
}
|
If an exception is thrown from g(), the file will not be closed and the large array will never be deleted from the heap. There are ways to fix the subroutine, but the point is that if exceptions were not a possibility, there would be no problem to fix.
If we were to fix the above code, we might write:
void good() {
}
|
The documentation of good should explain that it may throw any exception that g may throw.
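One possible shape for such a fix is sketched below, under several assumptions: g is replaced by a stand-in that throws on request, std::tmpfile is used so that no file name is needed, and the counter cleanups exists only to make the clean-up observable.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for g: throws when asked to.
void g( bool fail ) {
    if( fail ) throw 42 ;
}

int cleanups = 0 ;   // Counts clean-ups, for demonstration only.

// May throw any exception that g may throw.
void good( bool fail ) {
    int *big = new int[ 100000 ] ;
    std::FILE *file = std::tmpfile() ;   // Null pointer if it cannot be created.
    try {
        g( fail ) ;
    }
    catch( ... ) {
        // Release the resources, then pass the same exception on.
        if( file != 0 ) std::fclose( file ) ;
        delete [] big ;
        ++cleanups ;
        throw ;
    }
    if( file != 0 ) std::fclose( file ) ;
    delete [] big ;
    ++cleanups ;
}
```

The clean-up code appears twice, once for each way of leaving the function, which illustrates the extra care that exceptions demand.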
1Stroustrup is the creator of C++.
2The actual number varies depending on the type of computer and sometimes the compiler used. Two, four and eight are the usual sizes for int variables.
3Although, I prefer to think it is because they have planned obsolescence.
4I say `tries to' because there may not be enough space in the heap to fit the new variable.
You may be wondering what this `(nothrow)' thing is after new. Bear with me; it will be explained in a later section. Until then you can just ignore the `(nothrow)'.
5If you are reading the HTML version of this file, the boxes won't be visible, so you'll have to imagine where they are.
6The using namespace std; declaration is explained in a later section.
7The word const at the end of the declaration of a member function indicates that the recipient object will not be changed.
8A class object is just an object that has a class as its type.
9As a result you can refer to a member m as this->m. This is long-winded, but it emphasizes that m is a member of the recipient, rather than a local variable of the function.
10Structs are a holdover from C, which does not have classes. C also does not have private members, inheritance, or function members.
11Here is the whole truth: The compiler will generate definitions for these methods only if your code actually uses the methods. And the compiler will generate the default constructor only if you did not declare any constructors at all. Finally, there are two more methods the compiler will generate for you: these are the const and non-const versions of the address-of operator &.
12But remember after the destructor is done, each field that has a destructor is destructed.
13A special case is when you are 100% sure that a class will not be used as a base class.
14Or at least for several types.
15Consider a rocket guidance system. There are two computers -a primary and
a backup. In the primary, shutting down might be a reasonable response to
some undesired events. This is because the backup should take over when the
primary shuts down. In the backup, shutting down should never be an option.
The first launch of an Ariane 5 rocket ended in a crash, costing over 650
million $, when both the primary and backup guidance computers shut down in
response to a detected programmer error.