Module 8: Arrays

  • what are arrays used for?

  • how do we declare, use and pass arrays?

  • how do we use for loops?

Figure 1. An array of integers

Up to now, variables have only held a single value. Now we introduce one of the two kinds of compound variables that can hold multiple values.

An array is designed to hold homogeneous values, i.e., they have the same type (int, double, char, etc.) and semantics (grade, temperature, colour, etc.). These values are stored next to each other in the computer’s memory, as shown in Figure 1. The figure depicts an array of three integers (42, 17 and 54), each of which requires four bytes of storage. This array is found at location 0x1000 in the computer’s memory (4096 in base 10), meaning that the first element can be found at address 0x1000, the second element immediately after at 0x1004, etc. The difference between the elements' addresses depends on their sizes: this example features integers on a fairly standard notebook computer, so each element is 4 B long.
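
The address arithmetic described above can be sketched as a small function. The name elementAddress and its parameters are made up for illustration, and plain int values stand in for real memory addresses (which are usually too large for an int), so this is a simplified model rather than something we would use in a real program:

```cpp
// Hypothetical helper: where does element `index` start, if the array
// begins at address `base` and every element is `elementSize` bytes long?
// (Simplified model: real addresses are not stored in an int.)
int elementAddress(int base, int index, int elementSize)
{
  return base + index * elementSize;
}
```

With the numbers from Figure 1, elementAddress(0x1000, 1, 4) gives 0x1004 and elementAddress(0x1000, 2, 4) gives 0x1008.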

Syntax

The C++ syntax for creating arrays adds a lot of square brackets ([ and ]) to the sorts of declarations that we’ve been using thus far. We will use square brackets when declaring array variables, when declaring array parameters and when accessing array elements.

Declaring array variables

When we declare an array variable, the compiler must be able to tell how many elements are in the array. As with any variable, the compiler needs to know how much space is required to store the information. Unlike most variables, however, an array’s size can be determined in one of two ways: implicitly or explicitly.

An array declaration that includes an initializer doesn’t need to specify how many elements it contains. That information is available implicitly (without being expressed directly) because the compiler can count the number of elements in the initializer. Such a declaration-with-initialization looks like:

int grades[] = { 95, 96, 87, 92 };
double temperatures[] = { -2.1, -1, 2.9, 14.6, 25.6 };

The first array in this example has four elements and the second has five. If we wanted to, we could also specify the number of elements in each array explicitly:

int grades[4] = { 95, 96, 87, 92 };
double temperatures[5] = { -2.1, -1, 2.9, 14.6, 25.6 };

This can help us to make sure that we use the correct number of elements in the initializer. For example, if we wanted to record the average temperature for every month in a year, we might explicitly specify that an array should have twelve elements. Then, if we accidentally typed 13 elements, the compiler would report an error. (If we typed only 11, the compiler would accept the code and set the missing element to zero, so an explicit size only catches initializers that are too long.) However, it is not necessary to be explicit about size when we have an initializer. When we do not have an array initializer, it is necessary to specify the number of elements in the array, e.g.:

double monthlyTemperatures[12];
int studentsGrades[150];
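
One detail worth checking for yourself is what happens when an explicit size is larger than the initializer. In standard C++, the elements that the initializer does not mention are set to zero. A minimal sketch, using a made-up function name and made-up values:

```cpp
// Made-up example: five elements requested, only three values listed.
int lastElementOfShortInitializer()
{
  int counts[5] = { 1, 2, 3 };  // counts[3] and counts[4] are set to 0
  return counts[4];
}
```

By contrast, an array declared inside a function with no initializer at all (like the two declarations above) starts with unspecified contents, so we should store values in it before reading them.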

The size of an array

Once we know how many elements are in an array (either because we stated an explicit size or because the compiler counted the elements in an initializer), we can calculate its total size. The size of an array is, quite simply, the number of elements times the size of each element. In the examples above, monthlyTemperatures would have a total size of 96 B ($12 \textrm{ elements} \times 8 \textrm{ B}/\textrm{element}$) and studentsGrades would have a size of 600 B on a notebook or desktop computer ($150 \textrm{ elements} \times 4 \textrm{ B}/\textrm{element}$).

We can also ask the compiler how large a type or expression result will be using the sizeof keyword. This lets us determine how many elements are in an array by asking the compiler 1) how large the whole array is and 2) how large an element in that array is. For example, we can print the number of bytes required to store the studentsGrades array, as well as the number of elements in that array, with the following code:

cout << "total size: " << sizeof(studentsGrades) << " B\n";
cout << "elements: " << (sizeof(studentsGrades) / sizeof(studentsGrades[0])) << endl;
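
The same technique is especially handy when the compiler counted the elements for us. The sketch below wraps it in a function with a made-up name, reportSizes, and reuses the temperatures array from earlier; the byte counts it prints depend on the computer, so none are shown here:

```cpp
#include <iostream>
using namespace std;

// Prints the sizes of an array whose length the compiler counted,
// then returns the number of elements.
int reportSizes()
{
  double temperatures[] = { -2.1, -1, 2.9, 14.6, 25.6 };

  // whole array divided by one element = number of elements
  int count = sizeof(temperatures) / sizeof(temperatures[0]);

  cout << "total size: " << sizeof(temperatures) << " B\n";
  cout << "element size: " << sizeof(temperatures[0]) << " B\n";
  cout << "elements: " << count << endl;

  return count;
}
```

Note that this only works where the array itself was declared. Inside a function that received the array as a parameter, sizeof reports the size of an address rather than the size of the whole array.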

Array parameters and arguments

Array parameters are also declared using square brackets, but the number of elements is entirely optional. For example, the declaration for a function that averages some values could be:

double average(double values[], int size);

To call this function, we can pass any array of double values as an argument, e.g.:

int main()
{
  double temperatures[] = { -2.1, -1, 2.9, 14.6, 25.6 };
  double averageTemperature = average(temperatures, 5);

  /* ... */
}
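
Only the declaration of average appears above. One possible definition is sketched here; it is an assumption about how such a function might be written, it takes an array of double values so that it can be called with temperatures as in main, and it assumes that size is at least 1:

```cpp
// One possible definition of average (an illustration only).
// Assumes size >= 1, otherwise we would divide by zero.
double average(double values[], int size)
{
  double sum = 0;
  int i = 0;
  while (i < size)
  {
    sum += values[i];   // add element i to the running total
    i++;
  }
  return sum / size;
}
```

For the five temperatures in main, the sum is 40, so the average is about 8.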

Accessing array elements

We can access element $i$ within an array (for any value of $i$) using square brackets as an index operator. We can access array elements like any variable, to use them in expressions or to change their values. The C++ code looks like this:

double elementZero = grades[0];
temperature[7] = 21.9;
cout << name[0] << ", " << name[1] << endl;
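
To see reading and writing side by side, here is a small sketch built on the grades array from earlier. The function name adjustGrades and the new values are made up for illustration:

```cpp
// Made-up example: read, write and combine elements of one small array.
int adjustGrades()
{
  int grades[] = { 95, 96, 87, 92 };

  int first = grades[0];        // read element zero (95)
  grades[2] = 90;               // overwrite the third element (was 87)
  grades[3] = grades[3] + 1;    // read and write the same element (now 93)

  return first + grades[2] + grades[3];   // 95 + 90 + 93
}
```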

The for loop

When working with arrays, the following pattern is very common:

int i = 0;
while (i < length)
{
  // do some work
  i++;
}

In this example, we did three things that we need to do in almost every loop:

  1. initialize some kind of loop index (often int i = 0),

  2. check a condition every time we go through the loop (often i < length) and

  3. update some variable at the end of each loop iteration (often i++).

These three elements are so common in loops that we have another kind of loop that combines them: the for loop. The syntax of the for loop puts these three elements (initialization, condition and update) inside parentheses right at the top of the loop, next to a for keyword and separated by semicolons. The loop body looks the same as any other loop, except for the fact that the update expression moves from the body to the parentheses at the top:

for (int i = 0; i < length; i++)
{
  // do some work
}
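
As a concrete instance of this pattern, the sketch below replaces "do some work" with adding up the elements of an array. The function name total is made up for illustration:

```cpp
// Hypothetical helper: adds up the first `length` elements of `values`.
int total(int values[], int length)
{
  int sum = 0;
  for (int i = 0; i < length; i++)
  {
    sum += values[i];   // visit elements 0, 1, ..., length-1 in order
  }
  return sum;
}
```

Calling total on the four grades 95, 96, 87 and 92 with a length of 4 gives 370.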

Semantics

Arrays as parameters

Array parameters behave a bit differently from ordinary parameters. When we pass an int or double argument to an int or double parameter, its value gets copied into the parameter (pass-by-value). When we pass any array into a function, however, we don’t copy all of its values. Instead, the thing that is actually passed into the function is the address of the array. We effectively tell the function, "if you look at this place in memory, you will find the beginning of an array of elements". So, if the function modifies the array, it is modifying the original array and not a copy. This means that arrays are always effectively pass-by-reference. This is also why we can’t return an array from a function: by the time the function is finished, the address of any local variable will no longer refer to that local variable, because it will have gone out of scope. Those who go on to ENGI 3891 will see a lot more of these kinds of issues.

Passing the address of an array rather than a copy of the array itself allows functions to work with arrays of any size, but it also means that the function doesn’t automatically "know" how long the array is. So, when we pass an array into a function, we usually have to pass in the array’s size as a separate parameter too.
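
Both points can be seen in a short sketch: the function changes the caller’s array, and it needs a separate size parameter to know where to stop. The name addBonus and its parameters are made up for illustration:

```cpp
// Hypothetical helper: adds `bonus` to every element of the caller's array.
// Nothing is returned: the changes are made directly to the original array.
void addBonus(int grades[], int size, int bonus)
{
  for (int i = 0; i < size; i++)
  {
    grades[i] += bonus;
  }
}
```

After calling addBonus(marks, 3, 5) on an array marks holding 50, 60 and 70, the array in the calling function holds 55, 65 and 75.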

Array indices

An array contains a number of elements, each of which can be treated like an independent variable. We access individual elements within the array according to an index, which is a position within the array. It is conventional to count array elements from zero. The reason for this has to do with the addresses of the elements: element zero starts at zero bytes past the beginning of the array, element one starts at $1 \times S$ bytes past the beginning of the array (where $S$ is the size of each element), element two starts at $2 \times S$, etc.

Figure 2. An array of eight elements, numbered zero through seven

When we access array elements, we must be careful not to access memory outside of the array. Unlike some languages, C++ will not stop your code from accessing an invalid array index. For example, the image below shows a simplified view of some memory with space allocated for several variables with different types. We could write some code to assign new values into the array elements coordinate[0], coordinate[1] or coordinate[2], but what would happen if we tried to assign a value to element 6 of this three-element array or even element -1?

Figure 3. A view of memory with several variables, including a three-element array

The answer is that the computer won’t stop us from accessing these invalid array elements. If we write a value to coordinate[-1], the result will be a change in the count variable! If we assign something to coordinate[6], we will actually modify the index variable. Whenever we use an array in our code, it is essential to be able to say how we know the length of that array: a constant, a parameter passed into a function, a precondition, the sizeof operator, etc.
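
Since C++ will not check indices for us, one defensive habit is to check them ourselves whenever an index comes from somewhere we don’t control. The sketch below shows one way this might look; isValidIndex and setIfValid are made-up names, not standard functions:

```cpp
// Hypothetical helper: is `index` a valid position in an array
// of `length` elements?
bool isValidIndex(int index, int length)
{
  return index >= 0 && index < length;
}

// Stores `value` only if `index` is inside the array.
// Returns true if the value was stored, false if the index was rejected.
bool setIfValid(double values[], int length, int index, double value)
{
  if (!isValidIndex(index, length))
  {
    return false;
  }
  values[index] = value;
  return true;
}
```

With a three-element coordinate array, setIfValid(coordinate, 3, 6, 0) and setIfValid(coordinate, 3, -1, 0) both return false and leave neighbouring variables untouched.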

Things you can’t do with arrays

You can’t return an array from a function

As mentioned in the "arrays as parameters" section, you cannot return an array from a function. This is because arrays are passed (and would be returned) as addresses denoting the beginning of the array. If an array is a local variable within a function, that memory will stop being used for the array when we return from the function, because the array will have gone out of scope.

double[] foo()
{
	double values[] = { 1, 2, 3 };
	return values;    // values would go out of scope here!
}

So, "returning" an array would actually cause us to return an address for memory that doesn’t hold that variable any more!
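
A common way around this restriction is to have the caller declare the array and pass it in for the function to fill. The sketch below reworks the example this way; the name fillValues is made up for illustration:

```cpp
// Instead of returning an array, fill one that belongs to the caller.
// The array outlives this function, so nothing goes out of scope.
void fillValues(double values[], int size)
{
  for (int i = 0; i < size; i++)
  {
    values[i] = i + 1;   // stores 1, 2, 3, ...
  }
}
```

The caller would declare double values[3]; and then call fillValues(values, 3); to end up with the same contents as the array in foo.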

You can’t assign to an array

Because arrays are a bit "weird" (they are interchangeable with the addresses that they start at), assignment operations may not work, or at least may not work in the way that you’d expect! For example, rather than writing the following to try and copy the array A:

double A[] = { 4.2, 3.1, 12.6 };
double B[3];
B = A;    // does not compile: arrays cannot be assigned

you should instead copy the values of A into B using a loop:

for (int i = 0; i < 3; i++)
	B[i] = A[i];
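
If we need to copy arrays in more than one place, the loop can be wrapped in a function that takes the length as a parameter. This is only a sketch with a made-up name, copyArray, and it assumes that both arrays have at least size elements:

```cpp
// Hypothetical helper: copies the first `size` elements of `from` into `to`.
// Assumes both arrays have room for at least `size` elements.
void copyArray(double from[], double to[], int size)
{
  for (int i = 0; i < size; i++)
  {
    to[i] = from[i];
  }
}
```

With the arrays above, copyArray(A, B, 3) leaves B holding 4.2, 3.1 and 12.6, while A is unchanged.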

Exercises

Array parameters

  1. Write a function that will divide every element in an array by two.

  2. Write a function that will calculate the first 20 Fibonacci numbers and store them in an array that is passed in to it.

  3. Write a function that will input up to 20 numbers from the user (using the getNumberFromUser function on the Examples page) and store them in an array.

For loops

  1. Write an algorithm to calculate the variance of data in an array:

    • first using a while loop and

    • then using a for loop.

Algorithm translation

Integration

Translate the following algorithm into a C++ function:

sum = 0

for each value of i in the range 0 (inclusive) to length-1 (exclusive):
  sum += deltaT * (data[i] + data[i+1]) / 2

return sum

Design problems

  1. Write a function to calculate how many students in a class achieved a mark that will count towards their Engineering promotion average (55 or better).

  2. Heating degree-days is a metric used to calculate heating requirements for buildings. HDD can be calculated by summing each day’s number of degrees below some target temperature (e.g., 20 degrees C) over an entire year. Environment Canada has data for the average daily temperature every day for any year and any city in Canada. Write a function that uses that data to estimate the heating degree-days for a particular city in a particular year.

License: CC BY-NC-SA

(c) 2009–2018 Michael Bruce-Lockhart, Theo Norvell, Dennis Peters and Jonathan Anderson. Licensed under a Creative Commons Attribution–Noncommercial–Share-Alike 2.5 Canada License. Permissions beyond the scope of this license may be available at theteachingmachine.org.