As well as stacks as mentioned in the title, this also touches on sorting and fragmentation. Yes, I’m one of those people who think that their way of loading the dishwasher is the best way. This is part 2 of a probably 2-part series of articles on how washing up shines a light on computer science – you can read about queues in part 1. Like part 1, this is a statement of the bleeding obvious that I hope is nonetheless helpful.
We use a dishwasher to do as much of the washing up as possible. For this article I’m going to ignore everything other than cutlery. Our dishwasher has a wide and shallow top drawer for the cutlery, rather than a basket that goes in the bottom drawer. The main part of the drawer is two rows of slots that hold the knives etc. lying on their side, plus a bit in the middle for e.g. things too long to fit into the slots.
You load the cutlery into the cutlery drawer in the dishwasher, it cleans them, then you take the cutlery from the dishwasher and put it in the cutlery drawer of one of the kitchen units. Like most kitchens, the cutlery drawer in the kitchen unit has sections that you can use to organise things in a helpful way – teaspoons in one place, forks in another etc. So far, so obvious, but also not obviously linked to computer science.
There’s a small wrinkle in this, which is to do with putting the clean cutlery away. Somewhere in the process the cutlery needs grouping i.e. roughly sorting, so that it can go into the correct slot in the destination drawer.
One way to do this is to sort as you put things away. You grab a handful of random cutlery and then put each thing into the correct slot, a bit like a croupier dealing cards out to card players. That’s perfectly fine, but I prefer to sort earlier in the process – as things are put into the dishwasher.
Not only does this mean that the final putting away is easier, it often means the cutlery fits together better in the dishwasher. For instance, forks have a curve to them but knives don’t, so lining up a row of forks curving all the same way means they don’t touch each other (which would increase their chance of not being washed properly). Compare this to alternating e.g. knives and forks.
OK, so we want to put cutlery into the dishwasher grouped by kind of thing – how do we do that?
The fly in the ointment is that for each load of the dishwasher, we don’t know how many knives, forks etc. there will be. Therefore, we don’t know how much space to allocate for each kind of thing.
The approach I take, which is not perfect but good enough, is as follows. It’s based on the fact that we divide cutlery into four slots in the destination drawer – forks, knives, big spoons, teaspoons. I put the first one of each kind of cutlery into a slot at one of the corners of the dishwasher drawer. For instance, the first dirty teaspoon in the top left corner, big spoon in the bottom left, fork in the top right and knife in the bottom right. As more cutlery is loaded into the dishwasher, it goes next to the last thing of its kind – forks together, knives together etc.
In this way, the groups start at the edges and grow towards each other. It’s a bit like a cave where stalactites grow down and stalagmites grow up. Or, to be more computer science-y four stacks that grow towards each other.
Note that while physically some stacks grow up and some stacks grow down, logically they’re still all stacks that grow up. The most recently added thing is always considered to be the stack top, and you put the next thing ‘on top’ of the previous thing. In a computer, it doesn’t really matter which way your stack grows (up or down), just that it always grows the same way.
Limitations of the metaphor
The main way in which this metaphor breaks down is taking stuff out of the dishwasher. In a computer stack, entries are removed or popped off the stack in reverse order. I.e. the most recently added thing is removed first, then the next most recently added and so on. This First-In-Last-Out behaviour (FILO) contrasts with the First-In-First-Out (FIFO) behaviour of a queue. There’s nothing forcing me to take clean cutlery from anywhere in particular, so I don’t have to take it from the stack top.
Also, in general a computer’s stack will alternate between growing and shrinking. In the dishwasher, the stacks do nothing but grow as the dishwasher’s loaded, and then do nothing but shrink as it’s emptied.
Finally, stacks can have the concept of a stack frame, which isn’t something you see in the dishwasher. When control passes to a new method / function / etc. in some code, space needs to be created to hold the local variables for that method. When control leaves the method, the space for its local variables can be thrown away, i.e. re-used for something else.
The way the computer handles that is to have a stack for storing local variables*. Imagine that methods A, B and C each have 5 local variables that are all integers. A calls B, and B calls C. When A is called, enough space for its 5 local variables is reserved on the stack in one go (rather than the 5 variables separately). This chunk of space for the 5 variables is the stack frame for A.
When A calls B, the stack frame for A remains in place, and a stack frame for B is pushed onto the stack. Similarly, when B calls C, a stack frame for C is pushed onto the stack too. At this point there are three stack frames – one each for A, B and C. When C returns, its stack frame is popped off the stack to leave the stack frames for A and B. If B were to call C again, a fresh stack frame for C would be added. If it called method D instead of C, then the stack frame for D would be added instead.
* Note that there’s an implementation detail here. There are some variables, such as integers, that will always occupy the same amount of space. These can easily go into a stack frame, because the size they need is trivial to work out. Other variables, like lists, could need a variable amount of space. In this case a pointer or reference for the variable is added to the stack frame, and the reference points to space that is allocated on a heap where the variable’s value will be stored. These two types of thing are often referred to as value types (live on the stack) or reference types (live on the heap, with a pointer on the stack). The heap is beyond the scope of this article.
Brief historical aside
I was lectured by David Wheeler, just before he retired. He invented the Wheeler Jump, which uses a stack. You might know this better as calling a method, function, procedure or sub-routine. Yes, someone had to invent that concept, and I was lectured by him.
Before the Wheeler Jump, there was the Goto command. The Goto meant that the flow of execution just went to the address you told it to go to (as the name implies). If you wanted to come back later, that would be a separate Goto. There would be little to no help for getting the two legs of the journey to be built from the separate Gotos.
The Wheeler Jump saved the current location of the flow of execution on a stack, and then jumped to the specified location. When that code wanted to return it said so via a new Return command. This popped the previously saved location off the stack, incremented it to point to the next command, and then jumped to that location.
I hope that you can see how the A → B → C example above would result in pushing and popping return addresses on this address stack.
If you think back to the dishwasher example, where stacks grow towards each from opposite ends of the dishwasher cutlery drawer, much (but not all) of the time the space just sorts itself out. If there happen to be e.g. lots of knives and few forks, then the knife stack and the fork stack will just end up meeting closer to where the fork stack started and further from where the knife stack started.
The problem comes when e.g. there are more knives and forks combined than will fit in their side of the drawer. This means the next knife or fork must go into the other side of the drawer (if there’s space). I.e. the run of knives or forks isn’t contiguous any more, but is fragmented into at least two bits. This is what I meant by the approach being good enough rather than perfect.
Fragmentation is a problem on your computer’s hard disk too, and so occasionally it will benefit from being de-fragmented. The bytes that make up a given file each need to live in a physical location on the disk, like the knives and forks sitting in particular slots in the drawer. Life is easier if all the bytes for a file live next to each other, and then the next file starts etc. However, as files grow and shrink over time it isn’t always possible to keep things this simple and tidy.
Instead, a file becomes a series of separate fragments, with some way of joining the fragments together into one file (and joining another set of fragments, that might be interleaved with the first set of fragments, into another file and so on). De-fragmentation shuffles all the bits of each file together so that they turn back into one contiguous run. (Note that it’s actually a bit more complicated than this, because space on the disk is allocated in chunks or blocks rather than individual bytes, but the same basic problem exists.)
A cutlery drawer in a dishwasher can be a metaphor for stacks in a computer. Stacks can physically grow up or down, but logically they always grow up. Stacks are used behind the scenes a lot to help code function, keeping track of which bit of code should be executed next and the local variables for each bit. Fragmentation is an unfortunate fact of life, that needs to be lived with or worked around in some way.
As in the previous article in this series, I encourage you to look at life on as many levels at once as you find helpful.