Understanding how recursive functions work

31,645

Solution 1

I think the confusion is stemming from thinking of it as "the same function" being called many times. If you think of it as "many copies of the same function being called", then it may be clearer:

Only one copy of the function ever returns 0, and it's not the first one (it's the last one). So the result of calling the first one is not 0.

For the second bit of confusion, I think it will be easier to spell out the recursion in English. Read this line:

return a + sumInts(a + 1, b: b)

as "return the value of 'a' plus (the return value of another copy of the function, which is the copy's value of 'a' plus (the return value of another copy of the function, which is the second copy's value of 'a' plus (...", with each copy of the function spawning a new copy of itself with a increased by 1, until the a > b condition is met.

By the time you reach the the a > b condition being true, you have a (potentially arbitrarily) long stack of copies of the function all in the middle of being run, all waiting on the result of the next copy to find out what they should add to 'a'.

(edit: also, something to be aware of is that the stack of copies of the function I mention is a real thing that takes up real memory, and will crash your program if it gets too large. The compiler can optimize it out in some cases, but exhausting stack space is a significant and unfortunate limitation of recursive functions in many languages)

Solution 2

1.The function is called recursively until a condition is met. That condition is a > b. When this condition is met, return 0. At first glance, I would expect the return value to be 0 which is obviously incorrect.

Here is what the computer computing sumInts(2,5) would think if it were able to:

I want to compute sumInts(2, 5)
for this, I need to compute sumInts(3, 5)
and add 2 to the result.
  I want to compute sumInts(3, 5)
  for this, I need to compute sumInts(4, 5)
  and add 3 to the result.
    I want to compute sumInts(4, 5)
    for this, I need to compute sumInts(5, 5)
    and add 4 to the result.
      I want to compute sumInts(5, 5)
      for this, I need to compute sumInts(6, 5)
      and add 5 to the result.
        I want to compute sumInts(6, 5)
        since 6 > 5, this is zero.
      The computation yielded 0, therefore I shall return 5 = 5 + 0.
    The computation yielded 5, therefore I shall return 9 = 4 + 5.
  The computation yielded 9, therefore I shall return 12 = 3 + 9.
The computation yielded 12, therefore I shall return 14 = 2 + 12.

As you see, some call to the function sumInts actually returns 0 however this not the final value because the computer still has to add 5 to that 0, then 4 to the result, then 3, then 2, as described by the four last sentences of the thoughts of our computer. Note that in the recursion, the computer does not only have to compute the recursive call, it also has to remember what to do with the value returned by the recursive call. There is a special area of computer's memory called the stack where this kind of information is saved, this space is limited and functions that are too recursive can exhaust the stack: this is the stack overflow giving its name to our most loved website.

Your statement seems to make the implicit assumption that the computer forgets what it were at when doing a recursive call, but it does not, this is why your conclusion does not match your observation.

2.Printing out the value of 'a' on each iteration yields a value which I would expect: 2, 3, 4, 5 (at which point 5+1 > b which meets the first condition: a > b) but I still don't see how the value of 14 is achieved.

This is because the return value is not an a itself but the sum of the value of a and the value returned by the recursive call.

Solution 3

To understand recursion you must think of the problem in a different way. Instead of a large logical sequence of steps that makes sense as a whole you instead take a large problem and break up into smaller problems and solve those, once you have an answer for the sub problems you combine the results of the sub problems to make the solution to the bigger problem. Think of you and your friends needing to count the number of marbles in a huge bucket. You do each take a smaller bucket and go count those individually and when you are done you add the totals together.. Well now if each of you find some friend and split the buckets further, then you just need to wait for these other friends to figure out their totals, bring it back to each of you, you add it up. And so on. The special case is when you only get 1 marble to count then you just return it back and say 1. let the other people above you do the adding you are done.

You must remember every time the function calls itself recursively it creates a new context with a subset of the problem, once that part is resolved it gets returned so that the previous iteration can complete.

Let me show you the steps:

sumInts(a: 2, b: 5) will return: 2 + sumInts(a: 3, b: 5)
sumInts(a: 3, b: 5) will return: 3 + sumInts(a: 4, b: 5)
sumInts(a: 4, b: 5) will return: 4 + sumInts(a: 5, b: 5)
sumInts(a: 5, b: 5) will return: 5 + sumInts(a: 6, b: 5)
sumInts(a: 6, b: 5) will return: 0

once sumInts(a: 6, b: 5) has executed, the results can be computed so going back up the chain with the results you get:

 sumInts(a: 6, b: 5) = 0
 sumInts(a: 5, b: 5) = 5 + 0 = 5
 sumInts(a: 4, b: 5) = 4 + 5 = 9
 sumInts(a: 3, b: 5) = 3 + 9 = 12
 sumInts(a: 2, b: 5) = 2 + 12 = 14.

Another way to represent the structure of the recursion:

 sumInts(a: 2, b: 5) = 2 + sumInts(a: 3, b: 5)
 sumInts(a: 2, b: 5) = 2 + 3 + sumInts(a: 4, b: 5)  
 sumInts(a: 2, b: 5) = 2 + 3 + 4 + sumInts(a: 5, b: 5)  
 sumInts(a: 2, b: 5) = 2 + 3 + 4 + 5 + sumInts(a: 6, b: 5)
 sumInts(a: 2, b: 5) = 2 + 3 + 4 + 5 + 0
 sumInts(a: 2, b: 5) = 14 

Solution 4

Recursion is a tricky topic to understand and I don't think I can fully do it justice here. Instead, I'll try to focus on the particular piece of code you have here and try to describe both the intuition for why the solution works and the mechanics of how the code computes its result.

The code you've given here solves the following problem: you want to know the sum of all the integers from a to b, inclusive. For your example, you want the sum of the numbers from 2 to 5, inclusive, which is

2 + 3 + 4 + 5

When trying to solve a problem recursively, one of the first steps should be to figure out how to break the problem down into a smaller problem with the same structure. So suppose that you wanted to sum up the numbers from 2 to 5, inclusive. One way to simplify this is to notice that the above sum can be rewritten as

2 + (3 + 4 + 5)

Here, (3 + 4 + 5) happens to be the sum of all the integers between 3 and 5, inclusive. In other words, if you want to know the sum of all the integers between 2 and 5, start by computing the sum of all the integers between 3 and 5, then add 2.

So how do you compute the sum of all the integers between 3 and 5, inclusive? Well, that sum is

3 + 4 + 5

which can be thought of instead as

3 + (4 + 5)

Here, (4 + 5) is the sum of all the integers between 4 and 5, inclusive. So, if you wanted to compute the sum of all the numbers between 3 and 5, inclusive, you'd compute the sum of all the integers between 4 and 5, then add 3.

There's a pattern here! If you want to compute the sum of the integers between a and b, inclusive, you can do the following. First, compute the sum of the integers between a + 1 and b, inclusive. Next, add a to that total. You'll notice that "compute the sum of the integers between a + 1 and b, inclusive" happens to be pretty much the same sort of problem we're already trying to solve, but with slightly different parameters. Rather than computing from a to b, inclusive, we're computing from a + 1 to b, inclusive. That's the recursive step - to solve the bigger problem ("sum from a to b, inclusive"), we reduce the problem to a smaller version of itself ("sum from a + 1 to b, inclusive.").

If you take a look at the code you have above, you'll notice that there's this step in it:

return a + sumInts(a + 1, b: b)

This code is simply a translation of the above logic - if you want to sum from a to b, inclusive, start by summing a + 1 to b, inclusive (that's the recursive call to sumInts), then add a.

Of course, by itself this approach won't actually work. For example, how would you compute the sum of all the integers between 5 and 5 inclusive? Well, using our current logic, you'd compute the sum of all the integers between 6 and 5, inclusive, then add 5. So how do you compute the sum of all the integers between 6 and 5, inclusive? Well, using our current logic, you'd compute the sum of all the integers between 7 and 5, inclusive, then add 6. You'll notice a problem here - this just keeps on going and going!

In recursive problem solving, there needs to be some way to stop simplifying the problem and instead just go solve it directly. Typically, you'd find a simple case where the answer can be determined immediately, then structure your solution to solve simple cases directly when they arise. This is typically called a base case or a recursive basis.

So what's the base case in this particular problem? When you're summing up integers from a to b, inclusive, if a happens to be bigger than b, then the answer is 0 - there aren't any numbers in the range! Therefore, we'll structure our solution as follows:

  1. If a > b, then the answer is 0.
  2. Otherwise (a ≤ b), get the answer as follows:
    1. Compute the sum of the integers between a + 1 and b.
    2. Add a to get the answer.

Now, compare this pseudocode to your actual code:

func sumInts(a: Int, b: Int) -> Int {
    if (a > b) {
        return 0
    } else {
        return a + sumInts(a + 1, b: b)
    }
}

Notice that there's almost exactly a one-to-one map between the solution outlined in pseudocode and this actual code. The first step is the base case - in the event that you ask for the sum of an empty range of numbers, you get 0. Otherwise, compute the sum between a + 1 and b, then go add a.

So far, I've given just a high-level idea behind the code. But you had two other, very good questions. First, why doesn't this always return 0, given that the function says to return 0 if a > b? Second, where does the 14 actually come from? Let's look at these in turn.

Let's try a very, very simple case. What happens if you call sumInts(6, 5)? In this case, tracing through the code, you see that the function just returns 0. That's the right thing to do, to - there aren't any numbers in the range. Now, try something harder. What happens when you call sumInts(5, 5)? Well, here's what happens:

  1. You call sumInts(5, 5). We fall into the else branch, which return the value of `a + sumInts(6, 5).
  2. In order for sumInts(5, 5) to determine what sumInts(6, 5) is, we need to pause what we're doing and make a call to sumInts(6, 5).
  3. sumInts(6, 5) gets called. It enters the if branch and returns 0. However, this instance of sumInts was called by sumInts(5, 5), so the return value is communicated back to sumInts(5, 5), not to the top-level caller.
  4. sumInts(5, 5) now can compute 5 + sumInts(6, 5) to get back 5. It then returns it to the top-level caller.

Notice how the value 5 was formed here. We started off with one active call to sumInts. That fired off another recursive call, and the value returned by that call communicated the information back to sumInts(5, 5). The call to sumInts(5, 5) then in turn did some computation and returned a value back to the caller.

If you try this with sumInts(4, 5), here's what will happen:

  • sumInts(4, 5) tries to return 4 + sumInts(5, 5). To do that, it calls sumInts(5, 5).
    • sumInts(5, 5) tries to return 5 + sumInts(6, 5). To do that, it calls sumInts(6, 5).
    • sumInts(6, 5) returns 0 back to sumInts(5, 5).</li> <li>sumInts(5, 5)now has a value forsumInts(6, 5), namely 0. It then returns5 + 0 = 5`.
  • sumInts(4, 5) now has a value for sumInts(5, 5), namely 5. It then returns 4 + 5 = 9.

In other words, the value that's returned is formed by summing up values one at a time, each time taking one value returned by a particular recursive call to sumInts and adding on the current value of a. When the recursion bottoms out, the deepest call returns 0. However, that value doesn't immediately exit the recursive call chain; instead, it just hands the value back to the recursive call one layer above it. In that way, each recursive call just adds in one more number and returns it higher up in the chain, culminating with the overall summation. As an exercise, try tracing this out for sumInts(2, 5), which is what you wanted to begin with.

Hope this helps!

Solution 5

You've got some good answers here so far, but I'll add one more that takes a different tack.

First off, I have written many articles on simple recursive algorithms that you might find interesting; see

http://ericlippert.com/tag/recursion/

http://blogs.msdn.com/b/ericlippert/archive/tags/recursion/

Those are in newest-on-top order, so start from the bottom.

Second, so far all of the answers have described recursive semantics by considering function activation. That each, each call makes a new activation, and the recursive call executes in the context of this activation. That is a good way to think of it, but there is another, equivalent way: smart text seach-and-replace.

Let me rewrite your function into a slightly more compact form; don't think of this as being in any particular language.

s = (a, b) => a > b ? 0 : a + s(a + 1, b)

I hope that makes sense. If you're not familiar with the conditional operator, it is of the form condition ? consequence : alternative and its meaning will become clear.

Now we wish to evaluate s(2,5) We do so by doing a textual replacing of the call with the function body, then replace a with 2 and b with 5:

s(2, 5) 
---> 2 > 5 ? 0 : 2 + s(2 + 1, 5)

Now evaluate the conditional. We textually replace 2 > 5 with false.

---> false ? 0 : 2 + s(2 + 1, 5)

Now textually replace all false conditionals with the alternative and all true conditionals with the consequence. We have only false conditionals, so we textually replace that expression with the alternative:

---> 2 + s(2 + 1, 5)

Now, to save me having to type all those + signs, textually replace constant arithmetic with its value. (This is a bit of a cheat, but I don't want to have to keep track of all the parentheses!)

---> 2 + s(3, 5)

Now search-and-replace, this time with the body for the call, 3 for a and 5 for b. We'll put the replacement for the call in parentheses:

---> 2 + (3 > 5 ? 0 : 3 + s(3 + 1, 5))

And now we just keep on doing those same textual substitution steps:

---> 2 + (false ? 0 : 3 + s(3 + 1, 5))  
---> 2 + (3 + s(3 + 1, 5))                
---> 2 + (3 + s(4, 5))                     
---> 2 + (3 + (4 > 5 ? 0 : 4 + s(4 + 1, 5)))
---> 2 + (3 + (false ? 0 : 4 + s(4 + 1, 5)))
---> 2 + (3 + (4 + s(4 + 1, 5)))
---> 2 + (3 + (4 + s(5, 5)))
---> 2 + (3 + (4 + (5 > 5 ? 0 : 5 + s(5 + 1, 5))))
---> 2 + (3 + (4 + (false ? 0 : 5 + s(5 + 1, 5))))
---> 2 + (3 + (4 + (5 + s(5 + 1, 5))))
---> 2 + (3 + (4 + (5 + s(6, 5))))
---> 2 + (3 + (4 + (5 + (6 > 5 ? 0 : s(6 + 1, 5)))))
---> 2 + (3 + (4 + (5 + (true ? 0 : s(6 + 1, 5)))))
---> 2 + (3 + (4 + (5 + 0)))
---> 2 + (3 + (4 + 5))
---> 2 + (3 + 9)
---> 2 + 12
---> 14

All we did here was just straightforward textual substitution. Really I shouldn't have substituted "3" for "2+1" and so on until I had to, but pedagogically it would have gotten hard to read.

Function activation is nothing more than replacing the function call with the body of the call, and replacing the formal parameters with their corresponding arguments. You have to be careful about introducing parentheses intelligently, but aside from that, it's just text replacement.

Of course, most languages do not actually implement activation as text replacement, but logically that's what it is.

So what then is an unbounded recursion? A recursion where the textual substitution doesn't stop! Notice how eventually we got to a step where there was no more s to replace, and we could then just apply the rules for arithmetic.

Share:
31,645

Related videos on Youtube

Jason Elwood
Author by

Jason Elwood

Updated on April 02, 2021

Comments

  • Jason Elwood
    Jason Elwood about 3 years

    As the title explains I have a very fundamental programming question which I have just not been able to grok yet. Filtering out all of the (extremely clever) "In order to understand recursion, you must first understand recursion." replies from various online threads I still am not quite getting it.

    Understanding that when faced with not knowing what we don't know, we can tend to ask the wrong questions or ask the right questions incorrectly I will share what I "think" my question is in hopes that someone with a similar outlook can share some bit of knowledge that will help turn on the recursive light bulb for me!

    Here is the function (the syntax is written in Swift):

    func sumInts(a: Int, b: Int) -> Int {
        if (a > b) {
            return 0
        } else {
            return a + sumInts(a: a + 1, b: b)
        }
    }
    

    We'll use 2 and 5 as our arguments:

    println(sumInts(a: 2, b: 5))
    

    Obviously the answer is 14. But I'm not clear on how that value is achieved.

    These are my 2 hangups:

    1. The function is called recursively until a condition is met. That condition is a > b. When this condition is met, return 0. At first glance, I would expect the return value to be 0 which is obviously incorrect.

    2. Printing out the value of 'a' on each iteration yields a value which I would expect: 2, 3, 4, 5 (at which point 5+1 > b which meets the first condition: a > b) but I still don't see how the value of 14 is achieved.

    My first thought is that something similar to the following is happening magically:

    var answer = a;
    answer += a+1 until a > b;
    return answer;   
    

    So ruling out magic, I'm just not getting something. I would love to understand what's happening more than just implicitly.

    If someone could kindly explain what technically happens during this kind of function and why the result isn't 0 and how, eventually, a + sumInts(a: a + 1, b: b) = 14, I would be forever in your debt.

    • blgt
      blgt over 9 years
      Recursion is one of those programming concepts which are much easier to understand in mathematical terms than in code; there's a good definition here
    • Clockwork-Muse
      Clockwork-Muse over 9 years
      Side note: Recursion isn't required for this calculation. Unless I miss my guess, you can use a + (a + 1) * (b - a + 1).
    • Musicode
      Musicode over 9 years
      I strongly recommend finding 2-3 simple recursive functions online, and later writing 2-3 of your own. Trace through each function, writing out the values of each variable at each step, and draw out the recursion tree by hand. I used to have trouble with recursion too, but now it is like second nature and I use it in real world problem solving. My success came from using that strategy, and I still do it sometimes when I encounter recursion. You have to get to the point where you not only "grok" it, but also simply just brute force memorize the way that such programs move about and work.
    • Musicode
      Musicode over 9 years
      To expand on the above, the last sentence is really quite important. For me, at least, it helped very very much to get to the point where, when reading or developing, I no longer have to "grok" a recursive program, I simply know how it will move around and work by putting the lines of code in the right order and structure. Many recursive programs have a similar structure. Try to memorize the structure. That being said, tracing through as I described above should obviously heavily come first, it takes a lot of practice for some people (myself included). I hope this all makes sense.
    • recursion.ninja
      recursion.ninja over 9 years
      LearnYouARecursion, complete problem sets from world-class professor!
    • Floris
      Floris over 9 years
      I just have to urge you to type "Recursion" in the Google search box. One of those Easter eggs. I won't spoil the surprise for you.
    • keshlam
      keshlam over 9 years
      If it helps at all, recursion is essentially calculation-by-induction -- the same concepts as proof-by-induction apply.
    • Neil McGuigan
      Neil McGuigan over 9 years
    • specialscope
      specialscope over 9 years
      Thinking of recursion as a tree really helps. In this case it is tree with single branch (a->a+1->a+2->a+3......a+n->0). Each of the arrow means (add +). When you get more complex recursion like double recursion (most sorting algos, fibonacci etc) visualizing in terms of recursion tree is very useful.
    • fredoverflow
      fredoverflow over 9 years
      What on Earth is Recursion? explains it very nicely, imho.
    • dirkk
      dirkk over 9 years
    • dirkk
      dirkk over 9 years
      I am seriously confused that this question has gotten so much attention. SO is a Q&A site and I thought we try to avoid answering the same question over and over again. I see nothing more in this question than basically "How does recursion work?" and this was certainly asked before. Btw, this is not meant as criticism of the OP, as I understand that recursion is difficult to grasp as a beginner. Imho, this question should be closed.
    • Daniel Viglione
      Daniel Viglione about 7 years
      These are seriously some good answers to this question, preferably the second answer but the first answer is also fruitful.
  • Jason Elwood
    Jason Elwood over 9 years
    Catfish_Man: I think you nailed it! Thinking of it as several "copies" of the same function makes total sense. I'm still wrapping my head around it but I think you've sent me off on the right path! Thanks for taking time out of your busy day to help out a fellow programmer! I will mark your answer as the correct answer. Have a great day!
  • Catfish_Man
    Catfish_Man over 9 years
    Happy I could help :)
  • Jason Elwood
    Jason Elwood over 9 years
    Thanks for taking time out of your busy day to share such a comprehensive answer! There is a ton of great information here which is helping me get my head around recursive functions and will surely help others that stumble upon this post in the future. Thanks again and have a great day!
  • Jason Elwood
    Jason Elwood over 9 years
    Very well put, Rob. You've put it in a way that's very clear and easy to understand. Thanks for taking the time!
  • Bryan
    Bryan over 9 years
    This is the clearest representation of what is going on, without going into the theory and technical details of it, shows each step of the execution clearly.
  • Rob
    Rob over 9 years
    I am glad. :) it is not always easy to explain these things. Thank you for the compliment.
  • Jason Elwood
    Jason Elwood over 9 years
    Thanks for taking time to write up this great answer Michael! +1!
  • Michaël Le Barbier
    Michaël Le Barbier over 9 years
    @JasonElwood Maybe it is helpful if you modify sumInts so that it actually writes down the “computer thoughts”. Once you have written a hand of such functions, you will probably have “got it”!
  • Tim B
    Tim B over 9 years
    This is a good analogy - although be careful not to take it too literally as each "copy" is actually the exact same code. What is different for each copy is all the data it is working on.
  • Thomas
    Thomas over 9 years
    I'm not too happy with thinking of it as a copy. I find that a more intuitive explanation is to differentiate the function itself (the code, what it does) and a function invocation (instantiation of that function) to which a stack frame/execution context is associated. The function doesn't own its local variables, they are instantiated as the function is called (invoked). But I guess this will do as an introduction to recursion
  • cmv
    cmv over 9 years
    Great explanation. @JasonElwood: It may be helpful to write it out on paper as well - on the first line, write sumInts(2,5), and leave space for the expected result next to it. To get that result, you must step into sumInts, so write sumInts(3,5) on the next line, indented a little. Leave some space for the result, and so on and so on until you reach a state where you do not need to write sumInts on the next line. Then, you can finally write 0 for the result for that call, and you can fill in a result for everything above it.
  • Theodore Norvell
    Theodore Norvell over 9 years
    The correct terminology is that there are several invocations of the function. Each invocation has its own instances of variables a and b.
  • Catfish_Man
    Catfish_Man over 9 years
    Yes, there's a significant amount of precision that could be added to this answer. I intentionally left out the distinction between "instances of a function" and "activation records of invocations of a single function", because it was extra conceptual load that doesn't really assist in understanding the problem. It helps in understanding other problems, so it's still useful info, just elsewhere. These comments seem like a fine place for it :)
  • KChaloux
    KChaloux over 9 years
    +1. This is how I would describe it, specifically with your last example of the structure. It's helpful to visually unroll what's happening.
  • Floris
    Floris over 9 years
    Now it's even closer... 55:28
  • Eric Lippert
    Eric Lippert over 9 years
    @TheodoreNorvell: "Invocation" is good; "activation" is also in common usage.
  • Eric Lippert
    Eric Lippert over 9 years
    @Thomas: do think of it as a copy. See my answer for a detailed explanation of why this is reasonable.
  • Eric Lippert
    Eric Lippert over 9 years
    This is a good answer, though I note that there is no requirement that function activation take place on a data structure called "the stack". Recursion could be implemented by continuation passing style, in which case there is no stack at all. The stack is just one -- particularly efficient, and therefore in common usage -- reification of the notion of continuation.
  • chi
    chi over 9 years
    I am not sure that finite state automata or context-free grammars are the best examples which can help one building some first intuition about recursion. They are nice examples, but perhaps a bit unfamiliar to programmers having no previous CS background.
  • fredoverflow
    fredoverflow over 9 years
    @EricLippert You don't have much love for the stack, do you? ;)
  • Michaël Le Barbier
    Michaël Le Barbier over 9 years
    @EricLippert While techniques used to implement recursivity is an interesting topic per se, I am not sure if it would be useful for the OP—who wants to understand “how it works”—to be exposed to the variety of mechanisms used. While continuation passing style or expansion based languages (e.g. TeX and m4) are not intrinsically harder than more common programming paradigms, I will not offence anybody by labelling these “exotic” and a little white lie like “it always happens on the stack” should help the OP to understand the concept. (And a kind of stack is always involved.)
  • Theodore Norvell
    Theodore Norvell over 9 years
    @EricLippert Yes, activation is also common. Yes, you can think of invocation in terms of copying the function body. This makes the most sense for eager functional languages. For imperative languages, you need to be sure that each copy has its own local variables, including parameters.
  • Barmar
    Barmar over 9 years
    There has to be some way for the software to remember what it was doing, call the function recursively, and then return to that original state when it returns. This mechanism acts stack-like, so it's convenient to call it a stack, even if some other data structure is used.
  • Khaled.K
    Khaled.K over 9 years
    sum(a,b) = a + sum(a+1,b) if a <= b otherwise 0
  • CodeYogi
    CodeYogi over 8 years
    Good example but it breaks your heart when you proceed to do more complex computations. Eg. Finding the common ancestor in a Binary Tree.
  • Will Ness
    Will Ness about 3 years
    yep. :) copying is key to understanding function calls in general, too (each invocation uses the same function recipe with its new fresh invocation frame / sand-box to play in).