How can a recursive function operate if it returns to the beginning?

Question

I am a novice programmer.

I am looking into a problem that uses a recursive function. I understand the main idea, but one point remained unclear even after stepping through the program in a debugger. I would appreciate your help.

The concept behind the problem (merge sort) is pretty straightforward, but I am confused about how a recursive function works in general. Below is the program I am dealing with (from a Georgia Tech course on Python):

def mergesort(lst):

    if len(lst) <= 1:
        return lst

    else:

        midpoint = len(lst) // 2
        left = mergesort(lst[:midpoint])

        right = mergesort(lst[midpoint:])

        newlist = []
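        # Repeatedly move the smaller front element into newlist
        while len(left) > 0 and len(right) > 0:
            if left[0] < right[0]:
                newlist.append(left[0])
                del left[0]   # remove the element just copied, or the loop never ends
            else:
                newlist.append(right[0])
                del right[0]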

        newlist.extend(left)
        newlist.extend(right)

        return newlist

print(mergesort([2, 5, 3, 8, 6, 9, 1, 4, 7]))

QUESTION: What happens when the program reaches the line left = mergesort(lst[:midpoint])?

Based on my understanding, it returns to the first line of the program and comes down again to reach the same line (just like a for loop does).

So it keeps returning! That makes the program unreadable to me. My main question is how the program handles a recursive function call in general; I cannot follow how it works.

Answer 1:

What would happen when the program reaches this line left = mergesort(lst[:midpoint])? Based on my understanding, it returns to the first line of the program and comes down again to reach the same line...

Each time the program recurs, it calls mergesort with a smaller list. We call this a "sub-problem" -

def mergesort(lst):
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # find midpoint
        left = mergesort(lst[:midpoint])   # solve sub-problem one
        right = mergesort(lst[midpoint:])  # solve sub-problem two
        # ...

For example, if we first call mergesort with a 4-element list -

mergesort([5,2,4,7])

The input list, lst, does not meet the base case, so we move on to the else branch -

def mergesort(lst):                       # lst = [5,2,4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2          # midpoint = 2
        left = mergesort(lst[:midpoint])  # left = mergesort([5,2])
        right = mergesort(lst[midpoint:]) # right = mergesort([4,7])
        # ...

Notice mergesort is called with [5,2] and [4,7] sub-problems. Let's repeat these steps for the first sub-problem -

left = mergesort([5,2])

def mergesort(lst):                       # lst = [5,2]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2          # midpoint = 1
        left = mergesort(lst[:midpoint])  # left = mergesort([5])
        right = mergesort(lst[midpoint:]) # right = mergesort([2])
        # ...

So it keeps returning!!!

Not exactly. When we solve the sub-problems in this step, things look different. When the input has one element or fewer, the base case is satisfied and the function returns without calling itself again -

left = mergesort([5])

def mergesort(lst):     # lst = [5]
    if len(lst) <= 1:   # base case condition satisfied
        return lst      # return [5]
    else:
        ...             # no more recursion

Recursion stops for the left sub-problem and the answer of [5] is returned. The same applies for the right sub-problem -

right = mergesort([2])

def mergesort(lst):     # lst = [2]
    if len(lst) <= 1:   # base case condition satisfied
        return lst      # return [2]
    else:
        ...             # no more recursion

Next, the call for the first left sub-problem can return, because both of its own sub-problems now have answers -

left = mergesort([5,2])

def mergesort(lst):                       # lst = [5,2]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2          # midpoint = 1
        left = mergesort(lst[:midpoint])  # left = [5]        <-
        right = mergesort(lst[midpoint:]) # right = [2]       <-
        # ...
        return newlist                    # newlist = [2,5]

You would now repeat these steps for the first right sub-problem -

right = mergesort([4,7])

def mergesort(lst):                       # lst = [4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2          # midpoint = 1
        left = mergesort(lst[:midpoint])  # left = mergesort([4])
        right = mergesort(lst[midpoint:]) # right = mergesort([7])
        # ...

Again, recursion stops because the new left and right sub-problems are single-element lists, which satisfy the base case -

right = mergesort([4,7])

def mergesort(lst):                       # lst = [4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2          # midpoint = 1
        left = mergesort(lst[:midpoint])  # left = [4]       <-
        right = mergesort(lst[midpoint:]) # right = [7]      <-
        # ...
        return newlist                    # newlist = [4,7]

And finally the outermost mergesort call can return -

mergesort([5,2,4,7])

def mergesort(lst):                       # lst = [5,2,4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2          # midpoint = 2
        left = mergesort(lst[:midpoint])  # left = [2,5]
        right = mergesort(lst[midpoint:]) # right = [4,7]
        # ...
        return newlist                    # newlist = [2,4,5,7]

# => [2,4,5,7]
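If you want to watch this call-and-return order yourself, one option is to record a line on every call and every return. What follows is only a minimal sketch for illustration: the name traced_mergesort and the trace and depth parameters are my own hypothetical additions, not part of the course program, and the merge step uses pop(0) in place of the original append/del pair.

```python
def traced_mergesort(lst, trace, depth=0):
    """Merge sort that logs each call and return, indented by call depth.

    `trace` (a list of strings) and `depth` are illustrative additions.
    """
    indent = "  " * depth
    trace.append(f"{indent}call   mergesort({lst})")

    if len(lst) <= 1:
        result = lst                      # base case: nothing to sort
    else:
        midpoint = len(lst) // 2
        left = traced_mergesort(lst[:midpoint], trace, depth + 1)
        right = traced_mergesort(lst[midpoint:], trace, depth + 1)

        result = []
        while left and right:             # merge the two sorted halves
            if left[0] < right[0]:
                result.append(left.pop(0))
            else:
                result.append(right.pop(0))
        result.extend(left)
        result.extend(right)

    trace.append(f"{indent}return {result}")
    return result


trace = []
answer = traced_mergesort([5, 2, 4, 7], trace)
print("\n".join(trace))
# call   mergesort([5, 2, 4, 7])
#   call   mergesort([5, 2])
#     call   mergesort([5])
#     return [5]
#     call   mergesort([2])
#     return [2]
#   return [2, 5]
#   call   mergesort([4, 7])
#     call   mergesort([4])
#     return [4]
#     call   mergesort([7])
#     return [7]
#   return [4, 7]
# return [2, 4, 5, 7]
```

Every "call" line has a matching "return" line at the same indentation, and the outer call only resumes after the inner ones have finished. Nothing jumps back to the top of the same call.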

All of that said, recursion has its heritage in functional programming, so using it in a functional style tends to yield the best results. This means avoiding things like mutations, variable reassignments, and other side effects. Consider this alternative, which lowers the conceptual overhead by clearly separating the program's concerns -

def mergesort(lst):
  def split(lst):
    m = len(lst) // 2
    return (lst[:m], lst[m:])

  def merge(l, r):
    if not l:
      return r
    elif not r:
      return l
    elif l[0] < r[0]:
      return [l[0]] + merge(l[1:], r)
    else:
      return [r[0]] + merge(l, r[1:])

  if len(lst) <= 1:
    return lst
  else:
    (left, right) = split(lst)
    return merge(mergesort(left), mergesort(right))

mergesort([5,2,4,7])

# => [2,4,5,7]

Answer 2:

the answer to your question is: copies.

each function is a recipe for calculation.

when a function is called, a copy of the recipe is created. each invocation gets its own separate copy (in Python terms, a new stack frame with its own local variables). that's how each can operate on its own, without getting jumbled up with the others.
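here is a small sketch of the "copies" idea, under my own assumptions: halve and log are hypothetical names made up for this illustration and are not part of mergesort. each call logs its own lst when it starts and again when control comes back to it after the inner call.

```python
def halve(lst, log):
    """Recur on the first half of `lst`, logging this call's own `lst`."""
    log.append(("enter", list(lst)))
    if len(lst) > 1:
        halve(lst[: len(lst) // 2], log)
    # Execution resumes here once the inner call has returned.
    # This call's `lst` still holds the value it was called with.
    log.append(("resume", list(lst)))


log = []
halve([5, 2, 4, 7], log)
for event, value in log:
    print(event, value)
# enter [5, 2, 4, 7]
# enter [5, 2]
# enter [5]
# resume [5]
# resume [5, 2]
# resume [5, 2, 4, 7]
```

the inner calls never overwrite the outer call's lst. each copy picks up right after its own call site, with its own variables intact.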

in general, there is nothing special about a recursive function call. a function call is a function call, no matter which function is called. a function is called, does what it does, and its result is returned to the caller. as for recursion, you're not supposed to track it. it does its work on its own. you're supposed to prove to yourself that the base case is correct and that the recursive case is correct. that is all.

then it is guaranteed to work in however convoluted way it does, and the whole point of it is for us to not care about how it does it exactly, i.e. about its exact sequence of steps.

so specifically in your case, assuming mergesort indeed does work correctly (wait, what? never mind, suspend your disbelief for a moment),

        left = mergesort(lst[:midpoint])

calls the function mergesort with the first half of lst, from its start to its midpoint, and stores the result - which is the sorted first half, by assumption, - in the variable left; then

        right = mergesort(lst[midpoint:])

calls the function mergesort with the second half of lst, from its midpoint to its end, and stores the result - which is the sorted second half, by assumption, - in the variable right;

and then you need to convince yourself that the rest of the code creates newlist from those two sorted halves such that newlist is also sorted in the correct order.

And then by the principle of mathematical induction this proves the correctness of mergesort.

By assuming it works, we prove that it indeed works! Where's the catch? There's no catch, because the two cases that worked by assumption are for two smaller inputs (and that is our recursive case).

And when we divide a thing into two parts, over and over, eventually we're left with either a singleton, or an empty thing. And those two are naturally sorted (and that is our base case).

Recursion is a leap of faith. Assume the thing is working, then you get to use it. And if you use it correctly, you will have thus built the very thing you were using in the first place!
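The two things to convince yourself of can also be spot-checked in code. This is a hedged sketch, not a proof and not the course's implementation: merge is a hypothetical helper that uses indices instead of deleting from the lists, and the random test sizes and seed are arbitrary choices of mine.

```python
import random


def merge(left, right):
    """Merge two already-sorted lists into one sorted list (inputs untouched)."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two still has elements
    merged.extend(right[j:])
    return merged


def mergesort(lst):
    if len(lst) <= 1:         # base case: empty or singleton is already sorted
        return lst
    midpoint = len(lst) // 2
    # recursive case: assume both halves come back sorted, then merge them
    return merge(mergesort(lst[:midpoint]), mergesort(lst[midpoint:]))


# Base case: empty and one-element lists come back unchanged.
assert mergesort([]) == []
assert mergesort([7]) == [7]

# Recursive case: two sorted halves merge into one sorted list.
assert merge([2, 5], [4, 7]) == [2, 4, 5, 7]

# Together: compare against the built-in sorted() on random inputs.
random.seed(0)
for _ in range(100):
    data = [random.randint(-50, 50) for _ in range(random.randint(0, 20))]
    assert mergesort(data) == sorted(data)

print("all checks passed")
```

The checks mirror the argument above: one for the base case, one for the merge step given sorted halves, and a final comparison that exercises both together.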

Source: https://stackoverflow.com/questions/65454744/how-can-a-recursive-function-operate-if-it-returns-to-the-beginning

Tags: function, sorting, recursion, mergesort