- how can we *search* in arrays?
- how can we *sort* arrays?
- what is *algorithmic complexity*?
- what makes a good algorithm?

Arrays can store large amounts of homogeneous data, for example, telephone directories, corporate sales figures or meteorological readings for all of Canada. Thus, a common task is to search an array for a particular piece of data or for data with a particular characteristic.

**Problem:** Find the position (*index*) of the largest value in an array.

Clearly a loop is involved since we’ll have to search the entire array. Let’s start by concentrating on the body of the loop.

For each element in the array, we need to compare that value
(let’s call it `data[i]`) with the previously-largest element.
Let’s assume that we’ve stored the position (*index*) of that previously-largest
element in a variable called `largest`.
In that case, every time we go through the loop, we need to check whether
`data[i] > data[largest]`.
If it is, `i` is the position of the newly-found-to-be-largest value,
so we can set the value of `largest` to `i`.
This means that our loop body will look like:

```
if data[i] > data[largest]:
    largest = i
```

Now we need to think about how our loop should **start** and **finish**.
When we start the loop, what should the value of `largest` be?
We haven’t examined any values yet, but we can start out by setting `largest`
to 0, making element 0 the largest-so-far,
**assuming that the array’s length isn’t zero**.
We need to address that assumption by either:

- defining a precondition that `length > 0`, or
- defining what the function will return if `length == 0`.
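As a rough sketch, the two options might look like this in C++. The function names are hypothetical, and returning -1 for an empty array is just one possible convention (chosen here because -1 is never a valid index), not something these notes prescribe.

```cpp
#include <cassert>

// Option 1: state the precondition and check it while debugging.
// (Hypothetical name. assert stops the program if length <= 0,
// unless assertions have been disabled with NDEBUG.)
int findLargestWithPrecondition(const double data[], int length)
{
    assert(length > 0);

    int largest = 0;
    for (int i = 1; i < length; i++)
    {
        if (data[i] > data[largest])
        {
            largest = i;
        }
    }
    return largest;
}

// Option 2: define a result for the empty case.
// (Hypothetical name. Returning -1 is an assumed convention.)
int findLargestOrMinusOne(const double data[], int length)
{
    if (length <= 0)
    {
        return -1;
    }
    return findLargestWithPrecondition(data, length);
}
```

With the second version, the caller has to check for -1 before using the result as an index.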

So, if we set `largest` to 0 at the beginning and then loop through all of the
values in the array with the body from above, we get:

```
largest = 0
for each value of i in the range 0 (inclusive) through length (exclusive):
    if data[i] > data[largest]:
        largest = i
```

**Question:** if there are $N$ elements in the array, how many comparisons
does our search algorithm require?
How many assignments to `largest` will be required in the worst case?
The best case?
What are those cases?

**Question:** What happens if there are two or more largest pieces of data?
Where will `largest` end up? Is this sensible?

Since this algorithm only needs to return one value (the index), we can
write a function with an `int` return type.
A sensible name might be something like `findLargest`, though other names are
possible too.
For parameters, we need an array of values to search through and the length
of that array.
Remember, “passing an array” actually means “passing the address in memory
where the array begins”; a separate parameter is required to say how long
the array is.
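A small sketch of what this means in practice, using a hypothetical helper called `doubleAll` (not part of these notes): because the function receives the address of the caller's array rather than a copy, its changes are visible to the caller, and it only knows how many elements there are because we tell it.

```cpp
// Hypothetical helper: doubles every element of the caller's array.
// `data` is really the address of the first element, so the function
// cannot discover the array's length by itself; the caller supplies it.
void doubleAll(double data[], int length)
{
    for (int i = 0; i < length; i++)
    {
        data[i] *= 2;   // modifies the caller's array, not a copy
    }
}
```

If we declare `double readings[] = {1.5, 2.0, -3.0};` and call `doubleAll(readings, 3)`, the caller's `readings` array itself is changed.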

**Prototype:** `int findLargest(double data[], int length)`

**Declaration:**

```
int findLargest(double data[], int length);
```

**Contract:**

```
/**
 * Find the position of the largest value in an array.
 *
 * @param[in] data    values to search through
 * @param[in] length  the number of elements in `data`
 * @pre length > 0
 *
 * @returns the index of the largest value in the array
 */
int findLargest(double data[], int length);
```

**Definition:**

```
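int findLargest(double data[], int length)
{
    // Element 0 is the largest-so-far until we find something larger.
    int largest = 0;

    // Start at 1: there is no need to compare element 0 with itself.
    for (int i = 1; i < length; i++)
    {
        if (data[i] > data[largest])
        {
            largest = i;
        }
    }
    return largest;
}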
```

If we have $N$ items in an unsorted array, searching through the array to find
the largest value (or the smallest, or any particular value, etc.) will require
us to examine all $N$ values and compare them against some criterion
(larger than what we’ve seen before, equal to the value we’re looking for,
etc.).
This means that, if the array were twice as large, it would take twice as long
to search for the value we’re interested in.
An algorithm that requires approximately $N$ operations on an $N$-element array
is called a *linear* algorithm: the number of operations is linear with respect
to the number of things we’re operating on.
Searching is just such a linear algorithm, but as we’ll see now, many other
algorithms are **not** linear: adding a bit more data to the computation can
cause the algorithm to take wildly more time to compute an answer!
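To make the linear behaviour concrete, here is a minimal sketch of a search for one particular value that also counts its comparisons. The name `findValue`, the `comparisons` parameter and the "-1 means not found" convention are all assumptions made for this illustration.

```cpp
// Hypothetical linear search: returns the index of the first element
// equal to `target`, or -1 (an assumed convention) if it is absent.
// `comparisons` is passed by reference so that the caller can see
// how many elements were examined.
int findValue(const int data[], int length, int target, int& comparisons)
{
    comparisons = 0;
    for (int i = 0; i < length; i++)
    {
        comparisons++;
        if (data[i] == target)
        {
            return i;
        }
    }
    return -1;
}
```

Searching for a value that isn't in the array is the worst case: every one of the $N$ elements is compared, so doubling `length` doubles the count.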

**Question:** how many operations are required to find the largest value
in a **sorted** array?

In this module we look at the problem of sorting: in particular, sorting an array of numbers so that the values are in ascending order. We’ll look at a few standard sorting algorithms. At bottom, however, they all work by examining a single pair of numbers and switching them if they are out of order. The trick is deciding which pair to examine next.

A bubble sort is fairly easy to understand but it’s pretty slow. It works by comparing values in an array and swapping them if they’re in the wrong order. Every swap makes the array a little bit more sorted, but to sort the entire array we need to go through it more than once!

```
for i in range [0, length-1):
    for j in range [0, length-i-1):
        if data[j] > data[j + 1]:
            swap data[j] with data[j + 1]
```

This algorithm has a couple of interesting properties worth noting:

- it passes through the array multiple times, requiring a **loop within a loop**:
  - the outer loop controls our passes through the array: `i` is the number of times we’ve passed through the array thus far
  - the inner loop controls the details of each pass through the array: `j` is the number of elements of the array that we’ve looked at **on this pass**, and
- it relies on the ability to swap two array elements in place; in C++ we would implement this using **pass-by-reference**: `void swap(double& x, double& y) { double temp = x; x = y; y = temp; }` (see the lecture capture for how we derived this in class)
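Putting the pseudocode and the swap together, one possible C++ translation is sketched below. The swap helper is called `swapValues` here (a hypothetical name, chosen only to avoid any clash with the standard library's `std::swap`), and `bubbleSort` is likewise just a suggested name.

```cpp
// Exchange two values in place using pass-by-reference.
// (Hypothetical name, to avoid clashing with std::swap.)
void swapValues(double& x, double& y)
{
    double temp = x;
    x = y;
    y = temp;
}

// Sort data[0] .. data[length - 1] into ascending order.
void bubbleSort(double data[], int length)
{
    // i counts the passes completed so far; after each pass the largest
    // remaining value has "bubbled" to the end of the unsorted part.
    for (int i = 0; i < length - 1; i++)
    {
        // j walks through the unsorted part, comparing neighbours.
        for (int j = 0; j < length - i - 1; j++)
        {
            if (data[j] > data[j + 1])
            {
                swapValues(data[j], data[j + 1]);
            }
        }
    }
}
```

Note that the inner loop gets shorter on every pass: the last `i` elements are already in their final positions, so there is no need to look at them again.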

It can be shown that the number of comparisons that a bubble sort needs to perform is:

$$ \frac{n(n-1)}{2} = \frac{n^2-n}{2} $$
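One way to see where this comes from, assuming the loop bounds in the pseudocode above: on pass $i$ the inner loop performs $n - i - 1$ comparisons, and $i$ runs from $0$ to $n - 2$, so the total is the sum of the first $n - 1$ positive integers:

$$ \sum_{i=0}^{n-2} (n - i - 1) = (n-1) + (n-2) + \cdots + 1 = \frac{n(n-1)}{2} $$

For example, with $n = 4$ the passes make $3 + 2 + 1 = 6$ comparisons, which matches $\frac{4 \times 3}{2} = 6$.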

The larger the value of $n$ we deal with, the less significant the linear
term and constant divisor become; we call the bubble sort a *quadratic*
algorithm or, alternatively, an *order $n^2$* algorithm.
We are more interested in characterizing
**how a change in input size will affect the algorithm’s running time**
than the precise number of operations required for any given $n$.
We can say that a linear (order $n$) algorithm like searching
*scales linearly* with its input data: if we have to process ten times
as much data, it will take approximately ten times longer to run.
With a quadratic algorithm like bubble sort, however, ten times as much data
requires 100 times as much processing time or power to work through.
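A quick way to check this claim is to count the comparisons directly. The sketch below uses a hypothetical function, `countBubbleComparisons`, that runs the same two loops as the bubble sort pseudocode without touching any data.

```cpp
// Count the comparisons a bubble sort would make on n elements by
// running the same two loops as the pseudocode, with no actual data.
// (Hypothetical helper, for illustration only.)
long long countBubbleComparisons(int n)
{
    long long count = 0;
    for (int i = 0; i < n - 1; i++)
    {
        for (int j = 0; j < n - i - 1; j++)
        {
            count++;   // one comparison of data[j] with data[j + 1]
        }
    }
    return count;
}
```

With these loops, 100 elements need 4,950 comparisons and 1,000 elements need 499,500: ten times the data, roughly 100 times the comparisons.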

There are other sorting algorithms, like the merge sort that we saw briefly in lecture or the quicksort algorithm that is often a bit quicker but also more complex to understand. You won’t be expected to remember the details of those sorting algorithms, but you should know that their time complexity is lower: $n \log(n)$ instead of $n^2$ (on average, in quicksort’s case). Here’s how time complexity works itself out with a few of these algorithms:

Algorithm | Time complexity | 2 $\times$ data | 10 $\times$ data | 100 $\times$ data |
---|---|---|---|---|
Find largest/smallest/median of sorted array | O(1), constant | 1 $\times$ time | 1 $\times$ | 1 $\times$ |
General search in sorted array | O($\log_2 n$), logarithmic | 1 $\times$ time | 3.3 $\times$ | 6.6 $\times$ |
General search in unsorted array | O($n$), linear | 2 $\times$ time | 10 $\times$ | 100 $\times$ |
Merge sort, quicksort | O($n \log_2 n$) | 2 $\times$ time | 33 $\times$ | 660 $\times$ |
Bubble sort | O($n^2$), quadratic | 4 $\times$ time | 100 $\times$ | 10,000 $\times$ |
Matrix multiplication | O($n^3$), cubic | 8 $\times$ time | 1,000 $\times$ | 1,000,000 $\times$ |
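The logarithmic entry relies on the array already being sorted. One common technique for such a search is binary search, which these notes don't describe in detail, so treat the following as an illustrative sketch only: the name `binarySearch` and the "-1 means not found" convention are assumptions.

```cpp
// Binary search in an array that is sorted in ascending order.
// Returns the index of `target`, or -1 (an assumed convention) if
// it is absent. Each step halves the range that is left to search.
int binarySearch(const double data[], int length, double target)
{
    int low = 0;
    int high = length - 1;

    while (low <= high)
    {
        // Written this way to avoid overflow in low + high.
        int middle = low + (high - low) / 2;

        if (data[middle] == target)
        {
            return middle;
        }
        else if (data[middle] < target)
        {
            low = middle + 1;    // target can only be in the upper half
        }
        else
        {
            high = middle - 1;   // target can only be in the lower half
        }
    }
    return -1;
}
```

Because the remaining range is halved on every step, doubling the size of the array adds only about one more step to the search.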

- Write a function to **count how many times** the values in an array go above a given value.
- Write a function to find the **position** of the largest number in an array of floating-point numbers.
- Write a function to count **how many times** a given integer occurs in an array of integers. **Extra challenge:** also “return” the index of the first and last occurrences of this number in the array using pass-by-reference.
- Write a function to count **how many times** the **largest number** occurs in an array of integers. Your function should report **both** the number itself and the count of how many times it occurred.
- Implement the algorithm described above for a **bubble sort**.

(c) 2009–2016 Michael Bruce-Lockhart, Theo Norvell, Dennis Peters and Jonathan Anderson. Licensed under a Creative Commons Attribution–Noncommercial–Share-Alike 2.5 Canada License. Permissions beyond the scope of this license may be available at theteachingmachine.org.