Recursion and iteration – an introduction to iteration

This is the first of a series of articles about recursion and iteration

  1. Introduction to iteration
  2. Introduction to recursion
  3. Mutual recursion and tail recursion
  4. Blurring the lines
  5. Loop unrolling

Recursion and iteration are both techniques for doing some work (or similar work) repeatedly, so that you can solve a bigger problem. This could be things like summing all the integers between 1 and 2048, finding how many times the word aspidistra appears in a file, etc.

In this article I’ll introduce iteration, and in later articles will go into recursion and then more advanced things.

Introduction

Iteration is a bit like a washing machine. Just as a washing machine has a drum that goes round repeatedly doing the work of washing the clothes, and a fixed bit that drives and controls it, iteration comes in two parts: some code that’s executed repeatedly, and other code that controls this repeated execution.

There are often three different basic structures that languages give to let you use iteration – while, for and foreach.

A bank of washing machines in a launderette
We build washing machines out of code

While

The simplest form of iteration in most languages is the while loop:

while (/* some condition */)
{
    /* code that's executed potentially many times */
}

For instance:

bool stillHaventFoundWhatImLookingFor = true;

while(stillHaventFoundWhatImLookingFor)
{
    stillHaventFoundWhatImLookingFor = tryToFindImportantThing(arg1, arg2);
}

The condition inside the while (line 1 or 3) is tested to see if it’s true. If it is, the code inside the curly brackets is executed. Then control goes back to the top and the condition is tested again. This process repeats until the condition is false, at which point the code after the closing curly bracket is executed. The code inside the curly brackets is the equivalent of the washing machine’s drum, and the while plus condition is the fixed bit that controls things.

There’s no need, and nothing forcing you, to link the condition and the code inside the curly brackets. However, if often makes sense to link them as in the example above – the code called inside the curly brackets sets the variable that’s used as all of the condition. This is the way to get the while loop to stop eventually – the code inside the curly brackets eventually makes the condition evaluate to false. You could have the constant true as the condition, in which case the loop will run forever. If the condition evaluates to false when execution first reaches the while loop, the code inside the curly brackets will never be executed.

Languages often have a statement such as break, that lets the code inside the curly brackets immediately break out of the loop, rather than continue through the remaining code until the reaching the closing curly bracket and testing the condition again. (break makes the code act as if the condition is set to false, and then execution jumps back up to evaluating the condition.)

There is also often a statement such as continue, which is like break in that it immediately sends execution back up to evaluating the condition, but unlike break it doesn’t set the condition to false.

For

The next common way of doing iteration is the for loop. It looks like this:

for(/* code that sets up initial conditions */;
    /* condition that's tested at the start of each time around the loop */;
    /* code that's executed at the end of each loop */)
{
    /* code that's executed potentially many times */
}

For instance

for(int total = 0, loopCounter = 1; loopCounter <= 20; loopCounter++)
{
    total += 3;
    Console.WriteLine("Hello");
}

This is a poor way of calculating the value of 20 * 3 while also printing out Hello 20 times. The simplest way of explaining what’s going on is to show you how you could write the same thing as a while loop.

int total = 0, loopCounter = 1;
while (loopCounter <= 20)
{
   total += 3;
   Console.WriteLine("Hello");

   loopCounter++;
}

continue and break are available to use inside the curly brackets, as with the while loop. As with the while loop, there’s nothing stopping you from writing a for loop that will execute 0 times, or a loop that runs forever. This isn’t always what you want. For instance, if instead of loopCounter++ I had written loopCounter–, then loopCounter would start at 1, then decrease to 0, -1, -2 etc. It would never get to the value 21, which is when the loop is intended to stop, so it will never stop.

When does it make sense to use a for loop rather than a while loop? My feeling is that you should use a for loop whenever you can separate out the code into four component parts:

  1. Initial conditions that the loop needs to be set up, that other code doesn’t need and won’t have set up;
  2. A condition that is true for each iteration and not true immediately after the last one (e.g. loopCounter <= 20);
  3. Code that needs to be executed to mark the end of one iteration as different from the next (e.g. increase the loop counter in the example above);
  4. Code that needs to be executed to do the main work (that’s separate from the code that marks the boundary between one iteration and the next).

This is only because it makes your code easier to understand. The for loop has 4 slots (the 3 bits inside the round brackets, plus the curly brackets), and each slot has a specific meaning. So it bundles the four components together in a way that makes it obvious which is which.

Foreach

foreach is useful when there is a collection or group of things and you want to loop through each element in the collection. It looks something like this

foreach(loopVariable in collection)
{
    /* code that's executed for each element in the collection */
}

For instance

int totalCharactersInFruit = 0;

List<string> allFruit = {"apples", "bananas", "cherries"};
foreach(string fruit in allFruit)
{
    Console.WriteLine("How do you like them " + fruit + "?");
    totalCharactersInFruit += fruit.Length;
}

Console.WriteLine("total characters of fruitiness = " + totalCharactersInFruit);

Some languages use for in … rather than foreach in …, but usually it’s just different ways of saying the same thing. You can write a foreach loop as a for loop – the previous example could be re-written like this:

int totalCharactersInFruit = 0;

List<string> allFruit = {"apples", "bananas", "cherries"};
for(int fruitIndex = 0; fruitIndex < allFruit.Length; fruitIndex++)
{
    string fruit = allFruit[fruitIndex];
    Console.WriteLine("How do you like them " + fruit + "?");

    totalCharactersInFruit += fruit.Length;
}

Console.WriteLine("total characters of fruitiness = " + totalCharactersInFruit);

I use foreach whenever the loop is based on a collection of things, rather than until an arbitrary condition is true. Again, this is to make the code clearer – this collection is the driver of this bit of activity.

A case to watch out for is when the code inside the curly brackets wants to update the collection it’s looping through. Depending on the language and other bits of context, this can often end badly. In this case you might need to re-write the loop using for rather than foreach, or create a new output collection so that you can leave the collection that’s controlling the loop unchanged.

Language-specific details

I won’t go into every language, but here are details of iterating through collections in some common languages.

In C# there are some details about foreach that I’m glossing over for now, but will go into in a later article. Everything should work fine without knowing them, but once you know them you have extra tools that can help you in certain circumstances. LINQ does a lot of iterating through collections of things – for instance, select will create a new output thing for each member of the collection, based on a function that you have supplied. I won’t go into the details of LINQ here as that’s a large topic in its own right, but understanding iteration will help you understand LINQ.

In JavaScript there are two related functions – map and forEach. They do similar but slightly different things, based on looping through an array. JavaScript also lets you iterate over the properties inside an object.

Python has something similar to LINQ called list comprehensions, which is also a way of producing an output list based on an input list. Similar to LINQ, it’s too detailed for me to go into here.

SQL and R are similar to each other, and different from the previous languages. They are both optimised to work on sets / lists / arrays / etc. If you want to e.g. find all the rows in a database table / data frame that have a lastLoggedInDate of 3 years ago and set their isActive to 0, you could do something similar to the foreach loop. In SQL this is usually referred to as a cursor.

However, it’s usually much (tens, hundreds etc. of times) faster to express it as a single update statement that operates on the set of rows as a whole that you want to update. The looping through individual rows in the set will still happen, but behind the scenes in a way that allows the database / R engine play to its strengths that were designed into it deliberately. So seeing explicit for or foreach loops in SQL or R code, rather than implicit ones inside e.g. update statements, can be a code smell and is sometimes referred to as row by agonising row.

Nesting loops

So far, all the examples have been one-dimensional. By that I mean that you could probably see how you could draw a diagram based on a straight line showing the different iterations of a loop. However, what if you wanted to do something that was two-dimensional (or even more)? For instance, what if you wanted to generate ideas for a band name, by pairing up each word in a list of adjectives with each word in a list of nouns?

That’s fine – you can have a loop inside a loop, also known as nested loops. The outer loop will behave as you would expect from the previous examples. The inner loop will be run from start to finish for each iteration of the outer loop.

foreach(string adjective in {"Happy", "Smashing", "Traveling", "Manic Street"})
{
    foreach(string noun in {"Mondays", "Pumpkins", "Wilburys"})
    {
        Console.WriteLine("The " + adjective + " " + noun);
    }
}

This will produce the following output

  1. The Happy Mondays
  2. The Happy Pumpkins
  3. The Happy Wilburys
  4. The Smashing Mondays
  5. The Smashing Pumpkins
  6. The Smashing Wilburys
  7. The Traveling Mondays
  8. The Traveling Pumpkins
  9. The Traveling Wilburys
  10. The Manic Street Mondays
  11. The Manic Street Pumpkins
  12. The Manic Street Wilburys

You can see that the inner loop goes from start to finish for each iteration of the outer loop, a bit like a minute hand of a clock going 0-60 for each position of the hour hand. You can have as many levels of nesting as you like, and each level choose its own kind of loop (e.g. you could have a while loop inside a for loop).

Alternatives to iteration

There are two main alternatives to iteration I can think of. The first is recursion, which I will cover in the next article in this series. The other is deleting the control part of the iterative code and simply copy/pasting out the inner part as many times as is necessary.

This has several drawbacks:

  • You’re limited to how many times you will do the repeated action. If you paste out the code 4 times, then it will be executed exactly 4 times each time the code is run – you can’t change it to be 3 sometimes and 5 others, without adding extra complications like wrapping each copy up with an if statement.
  • It’s less clear, and so harder to understand. Instead of seeing one copy of the code and noting it will be executed e.g. 7 times, you would have to examine the 7 copies of the code and notice that they’re identical. As well as having to check the copies, there are simply more lines of code to read.
  • If you want to change the repeated action, you will have to change it in each of the copies.

In my experience it’s always a bad idea to do copy/paste, and I have only ever gone the other way – converting code that I’ve noticed is repeated into explicit iteration.

Summary

Iteration is a way of building a washing machine out of code – a bit of code that does something repeatedly, based on a controller that sits outside it. There are a few different flavours of it available in most programming languages, that let you express your intent clearly for slightly different problems.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s