Recursion…a slow-learner approach…



I think everyone has topics they pick up quickly. In high school, trigonometry just made sense. I attribute most of that to an amazing teacher, but sometimes you can just “see” the answer as you’re looking at a problem. It’s intuitive and very hard to explain why you know that your approach is going to work. It’s frustrating to your friends who ask “How did you do that?” The token answer is “I’m not really sure. I just followed the steps I was taught.”

I feel that way sometimes when I read others explain certain approaches to algorithms. I was asked once to write an algorithm that finds a sequence of Fibonacci numbers. I had memorized a recursive formula and reproduced it quickly…feeling proud of my mad memorization skills (note: I’m actually horrible at memorizing things. I had a biology teacher in high school who can attest to it…as well as a lit teacher who saw me struggle mightily to recite a monologue from Romeo and Juliet).

Then he asked me to explain what the algorithm was doing. I balked. “uh….well, the program keeps calling itself until it finds the answer.” “How does it know what the answer is?” “Well…..”

To his credit, he could have really lit into me for using a procedure I didn’t fully understand, but he didn’t. He also recommended me for the next round of interviews, so it wasn’t a deal-breaker, but it taught me an important lesson. Memorizing to “fake it till you make it” only works for the short term. If I’d been asked to solve another problem using recursion, I’d have had to end the interview. I had no clue.

So here I am, a year and some months later, and recursion is still one of those magical concepts that looks like voodoo. With some extra understanding of memory usage, though, it’s not a particularly difficult concept to comprehend, even if implementation can still feel like a mystery.

Borrowing some explanations from Aditya Bhargava’s “Grokking Algorithms”, I’m going to attempt to explain the mad science of recursion.

My daughter is constantly losing her clothes. She’ll always have items scattered all over the house, but has no idea where anything is. As a seven-year-old, her sleuthing skills are lacking. She gives a cursory glance around her bedroom, immediately gives up, and declares her clothes lost. So I ask her to make a list of all the rooms she’s been in since taking her clothes off yesterday and putting on her pajamas.

  • kitchen
  • bedroom
  • closet
  • bathroom
  • my office

So I ask her, well, you know the clothes have to be in one or more of the rooms, so how are you going to find everything you need?

  1. Go into room
  2. Look in all nooks and crannies of that room
  3. If all items found, I can go play
  4. If not, go to the next room on my list and repeat

Note: this still only works half the time, because life is super stressful for a seven-year-old who doesn’t want to look and only wants to play, but still, the process remains true.
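Her checklist maps naturally onto a recursive function. Here’s a minimal Swift sketch of the idea; the `itemsIn` lookup and which item hides in which room are invented for illustration:

```swift
// A hypothetical lookup: which clothing item is hiding in which room.
let itemsIn: [String: Set<String>] = [
    "kitchen": ["socks"],
    "bedroom": ["shoes"],
    "closet": ["shirt"],
    "bathroom": ["pants"],
    "office": ["jacket"],
]

// Search the rooms one at a time, recursively, until nothing is missing.
func findClothes(rooms: [String], missing: Set<String>) -> Set<String> {
    if missing.isEmpty || rooms.isEmpty {
        return missing  // base case: everything found (or nowhere left to look)
    }
    let found = itemsIn[rooms[0]] ?? []
    // Recursive case: search the remaining rooms for whatever is still missing.
    return findClothes(rooms: Array(rooms.dropFirst()),
                       missing: missing.subtracting(found))
}
```

Calling `findClothes` with the full room list and the full set of missing clothes returns an empty set once everything has turned up. The base case is exactly her rule 3: if all items are found, stop.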

So she goes into the kitchen, checks the pantry, checks under the table, checks the chairs, and checks the recycling bins (don’t ask, but seven year olds are weird creatures) and finds her socks.

“I found socks.” “Okay, go to the bedroom”

She starts off in the closet, searches under the bed, searches the bathroom she shares with her sister, and then her chest of drawers. Lo and behold, she’s found her shoes. The logic behind this eludes me.

Now that she’s properly shod, she proceeds through additional rooms until she has eventually been able to clothe herself. Finding one item is useful, but she’s not leaving the house with some of her items and not others. Satisfied that she can face her friends without ridicule, she goes off to enjoy her afternoon fun time, where she solves big problems and plays in her make-believe world.

What she’s really done is a recursive search for her clothes. She has a list of rooms to search, goes through the list one at a time, and stops when she’s found what she needs. Having found every item on her list is the base case. It tells her when the search can stop.

So recursive functions have base cases. In that regard, they aren’t much different from a while loop. While some condition is false, keep searching. When it’s true, stop the search. The recursive flavor comes from each item being needed in conjunction with every other item. A shirt is helpful, but a shirt along with pants, shoes, socks, and a jacket are all necessary before she’s ready to venture out the door.
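To make the parallel concrete, here’s the same “keep going until done” idea written both ways in Swift. This is my own sketch, not code from the interview:

```swift
// Iterative: a while loop counts down until its condition says stop.
func countdownLoop(from n: Int) -> [Int] {
    var result: [Int] = []
    var i = n
    while i > 0 {          // stop condition, checked up front
        result.append(i)
        i -= 1
    }
    return result
}

// Recursive: the base case plays the role of the loop's stop condition.
func countdownRecursive(from n: Int) -> [Int] {
    if n <= 0 {            // base case: nothing left to count
        return []
    }
    return [n] + countdownRecursive(from: n - 1)  // recursive case
}
```

Both return [3, 2, 1] for an input of 3; the base case `n <= 0` does exactly the job of the loop’s `i > 0` check.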

In programming, something similar happens. But what happens with all those continued recursive calls? Are the numbers magically lost? Do they somehow keep track of themselves in the background? Is it part of the voodoo? Not exactly.

What I didn’t know is that the function is keeping track of its calls in something called a stack. The stack is a region of short-term memory. When a function is done, the memory that was allocated for it is released and gone. However, while the function is still running (not yet completed), it still occupies space on the stack. With recursive functions, each recursive call puts a new function call on the stack. The stack behaves in a “last in, first out” order: whatever allocation was made last is the first one back out when its function finally completes and the memory is no longer needed. To put it another way, the function on top has to finish first before any of the calls below it in the stack can finish processing. I like to think of a can of Pringles in this case. The last chip put into the can will be the first one back out when the can is opened.
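A Swift array makes a serviceable toy model of this last-in, first-out behavior (strings standing in for the real stack frames):

```swift
// A Swift array as a toy call stack: append = push, removeLast = pop.
var stack: [String] = []

stack.append("fib(4)")   // the first call goes on the bottom
stack.append("fib(3)")   // pushed on top of fib(4)
stack.append("fib(2)")   // pushed on top of fib(3)

// Last in, first out: the most recent call comes off first,
// like the top chip in a can of Pringles.
let top = stack.removeLast()    // "fib(2)"
let next = stack.removeLast()   // "fib(3)"
```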

In this case, every time you call a function, the computer saves the values of its variables in memory on the stack. When you call a function from inside another function, the first call sits on the bottom of the stack, with its values still being held. The next call goes on top of it, holding its values too. Here is how it plays out for the fibonacci function.

As an example, the first few Fibonacci numbers (starting from fib(1)) are: 1, 1, 2, 3, 5, 8, 13, 21

First: the function…

func fib(n: Int) -> Int {
    if n <= 2 {
        return 1
    } else {
        return fib(n: n - 1) + fib(n: n - 2)
    }
}

Note: my code above will not give an accurate answer for fib(0), which should be 0. I purposely did this to simplify my demonstration below.
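If you did want fib(0) to come back as 0, one small tweak handles it. This is my own variation, not the version used in the walkthrough below:

```swift
func fib(n: Int) -> Int {
    if n == 0 {
        return 0            // handle fib(0) explicitly
    }
    if n <= 2 {
        return 1            // base case: fib(1) = fib(2) = 1
    }
    return fib(n: n - 1) + fib(n: n - 2)
}
```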

So if I call fib(4), the resultant answer will be the fourth number in the fibonacci sequence, which is 3. Here is how it works…

This is the first call. fib(4) is typed in and n is assigned the value of 4. Looking through the function, n is not less than or equal to 2, so we go straight to the else statement. We need fib(4 - 1) + fib(4 - 2)…which we don’t yet have.

Our stack looks like this: fib(4) sits on the stack. In fact, it’s the only thing on the stack, and it’s holding the parameter value we called the function with, 4. We don’t have any sort of answer handy for fib(4 - 1), and the same holds true for fib(4 - 2), so we start with fib(4 - 1), or fib(3), and go through the same process. We’ve kept fib(4) on the stack to return to later so we can finish out that original problem, so now we evaluate fib(3).

fib(3) is now also placed on the stack, on top of fib(4) in fact, and it goes through the same evaluation process. Our stack now holds fib(3) sitting on top of fib(4).

fib(3) keeps hold of its parameter variable on the stack, which may not seem important at the moment, but we’ll get there. The parameter 3 is not less than or equal to 2, so we move to the else statement, where we’ve instructed it to return fib(3 - 1) + fib(3 - 2), or rather fib(2) + fib(1). Since we don’t yet know the value of fib(2) or fib(1), we evaluate fib(2) and push another function onto the stack.

As you can see, our stack is starting to grow. If these recursive calls continued unchecked, we’d eventually run out of memory and cause a “stack overflow”, meaning there is no more room to put our function calls on the stack. However, every time a function completes, in our case by hitting the base case, it gets popped off the stack. Here, fib(2) satisfies the base case of being less than or equal to 2, so it returns the value of 1. That value is substituted in place of fib(2) in the return statement of the fib(3) call made earlier, and fib(2) gets popped off the stack.

So now our stack contains only fib(3) and fib(4). fib(3) asked us to return fib(2), which we found to be 1. Anytime a function completes, it is removed from the call stack, and the values other functions required from it are handed back so they can finish their own work. In the case of fib(3), we’ve returned half of what it needs by finding fib(2), but it also requires fib(1). That means another function call, fib(1), and another addition to our call stack. So fib(1) goes onto the stack and awaits evaluation.

fib(1)’s evaluation meets the base case, so it returns 1 as well. With values for fib(2) and fib(1) in hand, fib(3) can finally complete and return an answer of 2. fib(3) returns 2 and leaves the call stack, so we’re now left with fib(4).

At this point, you think, “Great! I have everything I need.” Not quite. fib(4) requires a return of fib(3) + fib(2). We’ve only evaluated the fib(3) part. fib(2), although evaluated earlier as we travelled down the fib(3) trail, is no longer available to us: it returned its value to fib(3) and was removed from the stack. So we must evaluate fib(2) again, obtain the returned value of 1, and add it into the fib(3) + fib(2) answer, 2 + 1, which gives us 3. The fourth fibonacci number is indeed 3.
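The whole push-and-pop sequence is easier to see if the function announces itself on the way in and out. Here’s an instrumented copy of the function above (the extra `depth` parameter exists only to indent the output):

```swift
// Same fib as before, but printing a line on every push and pop.
func fibTraced(n: Int, depth: Int = 0) -> Int {
    let pad = String(repeating: "  ", count: depth)
    print("\(pad)push fib(\(n))")
    let result: Int
    if n <= 2 {
        result = 1                                   // base case
    } else {
        result = fibTraced(n: n - 1, depth: depth + 1)
               + fibTraced(n: n - 2, depth: depth + 1)
    }
    print("\(pad)pop  fib(\(n)) -> \(result)")
    return result
}
```

Calling `fibTraced(n: 4)` prints the pushes and pops in order, and you can watch fib(3) finish completely before fib(2) is evaluated a second time.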

Here is a broader picture of what is going on, step-by-step:

With recursion, small recursive calls aren’t that big of a deal. Yes, it takes up more memory and repeats calls already made, but with small numbers, the runtime differences aren’t even noticeable. However, if you tried to compute fib(100) this way, you’d be waiting practically forever. The call stack itself only ever gets about a hundred calls deep, but the total number of calls grows exponentially. The reason is this repeated having to find values you found before. Just to calculate fib(100), you’d have to find fib(99) + fib(98). fib(99) would have to work all the way down to fib(1), and then you could start working on fib(98) and repeat the whole process.

This is a better picture of how many calculations recursion actually has to perform.

Notice how many times fib(4) has to be computed just to solve fib(7).
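You can count the repetition yourself with a couple of counters bolted onto the function (again my own instrumented copy, nothing canonical):

```swift
var callsToFib4 = 0   // how many times fib(4) gets evaluated
var totalCalls = 0    // how many calls are made in total

func fibCounted(n: Int) -> Int {
    totalCalls += 1
    if n == 4 { callsToFib4 += 1 }
    if n <= 2 { return 1 }                              // base case
    return fibCounted(n: n - 1) + fibCounted(n: n - 2)  // recursive case
}
```

Running `fibCounted(n: 7)` evaluates fib(4) three times and makes 25 calls in total, all to produce a single number, 13.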

There is a workaround called a memoization table, which stores each computed fib value and checks the table first to see whether a value already exists before trying to compute it. If it does exist, it uses the stored value. If not, it computes the value and stores it in case it’s needed later. It dramatically reduces the number of computations needed and is very quick. I won’t be demonstrating it today, but will in a future post.

I blogged about this topic to get to a topic I’ve really been wanting to discuss: sorting algorithms. Merge sort was the first sorting algorithm introduced to me that used recursion, and it completely baffled me. Quick sort actually made more sense, but it was introduced later. I wanted to blog about both, more to make sure I could explain them to myself than anything, but also to help those poor souls who suffered as I suffered when trying to wrap my head around what was going on. In order to do that, I needed a better understanding of recursion. It wasn’t until the stack was explained that recursion made any sort of sense. Up until then, I considered recursion to be magic. It still is magic, just not as baffling as before!