Recursion and iteration – blurring the lines

This article is part of a series on recursion and iteration

  1. Introduction to iteration
  2. Introduction to recursion
  3. Mutual recursion and tail recursion
  4. Blurring the lines
  5. Loop unrolling

In this article I will look at how the lines between iteration and recursion can get a bit blurry. This breaks down into three areas:

  1. Iteration = fast, recursion = slow isn’t always true
  2. Using tail recursion elimination to make recursion more like iteration
  3. Using iteration to traverse a recursive data structure
Blurred coloured lines
hdwallpapers.org, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

Iteration = fast, recursion = slow isn’t always true

This is some truth to this, but it’s not always true. The problem with recursion is the frequent creation and destruction of stack frames, as you can see in the first article on recursion. This adds overhead to the code and slows it down. So, all other things being equal, it’s often a good idea to go for an iterative solution over a recursive one.

However, if you look at sorting as an example, a recursive approach is faster than many iterative ones. The sorting algorithm that many people learn first is Bubble Sort. It’s an iterative algorithm. Without going into the details, it’s enough to know that at the end of each iteration you can be confident that one element in the list being sorted is in the right place. So you will need n iterations to get all elements in the right place. Each iteration involves comparing all (unsorted) elements in the list, and so this costs about n operations, leading to an overall complexity of O(n2).

The standard sorting algorithm is Quicksort – if you have a library function called sort, chances are it’s doing Quicksort. It’s a recursive algorithm, and on average has a complexity of O(n*log(n)) i.e. better than O(n2). Again, I won’t go into the details, but at a hand-waving high level Quicksort for a list of n elements is defined as:

quicksort(1, n) =
ensure: 1. the element at position x is the correct one
2. elements in positions <x are less than the element at position x
3. elements in positions >x are greater than the element at position x
quicksort(1, x-1)
quicksort(x+1, n)

So, please don’t take the mental shortcut of iteration = fast, recursion = slow. These are often true, but not always.

Tail recursion elimination

In the previous article I described tail recursion, but didn’t say why it was useful. If a compiler spots that tail recursion is happening, it can choose to apply an optimisation that will make the code run faster and take less memory, known as tail recursion elimination.

There are two important aspects to the recursive call in tail recursion:

  1. The method being called is the same as the method currently being executed (because this is non-mutual recursion).
  2. Once the recursive call has returned, there’s nothing we’ll need from the current stack frame as we’re doing no further work at this layer in the call stack before we return.

These two together mean that the current stack frame can be recycled for the recursive call. All that has to happen is that the current values in the stack frame need to be overwritten by the values for the call about to be made. This is useful in the pure version of functional programming in e.g. ML, where iteration is frowned on or even banned. With tail recursion elimination, you can work around the lack of iteration by using recursion, but not have to pay the speed and memory penalty you would normally get with recursion.

The lack of iteration might strike you as odd if you’ve not done functional programming. One of the characteristics of (the pure version of) functional programming is that you don’t have variables. Instead you have constants (that might not have their value yet). Iteration (for loops in particular) depends on something that keeps track of where you are in the iteration – something whose value changes over the course of the iteration.

Recursion in functional programming works around this because each stack frame is effectively a new world unto itself. The fact that there are stack frames above and below this one that have different values of the same constants is fine. These alternative values aren’t accessible to this frame, so they may as well not exist. In the context of this stack frame, a constant e.g. an input parameter will only ever have one value.

An advantage of this restriction is that it’s easier to reason about the code and what it will do. This is the realm of formal methods, which is far beyond the scope of even this long series of articles.

Iterating over a recursive data structure

In the article on iteration, I mentioned that in C# there were some details to the foreach type of iteration that I was skipping over, but that would let you do some interesting things. This is where I go into some of those details.

Behind the scenes, the compiler is doing some magic on your behalf when you use foreach in C#. This simple loop:

List<string> fruits = {"apple", "banana", "cherry"};
foreach(string fruit in fruits)
{
   Console.WriteLine(fruit);
}

is turned into something more like this:

List<string> fruits = {"apple", "banana", "cherry"};
IEnumerator<string> fruitEnumerator = fruits.GetEnumerator<string>();
for(bool isValue = fruitEnumerator.MoveNext(); isValue; isValue = fruitEnumerator.MoveNext())
{
   Console.WriteLine(fruitEnumerator.Current);
}

GetEnumerator is something that the List provides (by implementing the interface IEnumerable<T>). GetEnumerator returns an IEnumerator<T> – in this case an IEnumerator<string>. IEnumerator is the thing that will repeatedly deliver the next thing for the loop to process. If you imagine all the things for the loop (the list elements in this case) as being along a line, the IEnumerator starts off before the first one. MoveNext() tries to move the IEnumerator to the next thing for the loop, and returns true if it’s successful (false means there’s no more stuff to process). The next thing to process is held in Current.

If any collection of data implements IEnumerable<T>, it’s possible for that collection to be processed using foreach, i.e. to be iterated over. I won’t go into the details here, but if you defined a type such as BinaryTree<T> that was recursive, you could define an IEnumerator for it too, that would enable iteration over the tree. The IEnumerator might need a stack to keep track of the path down the tree from the root node to the current node, so that it could easily go back up the tree where necessary.

Once you had done this, a foreach loop could just iterate over the nodes in the tree, using whichever path over the tree that IEnumerator had chosen to take. E.g. for a given node, down the left child, then my value, then down the right child, which would return the values smallest to biggest.

Summary

Iteration and recursion are similar but different. Sometimes the differences can appear larger than they really are, due to things like tail recursion elimination and iterators over recursive data structures.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s