Question
I am a novice programmer.
I am looking into a problem that uses a recursive function. I understand the main idea, but there is one point I could not work out while stepping through the program in the debugger. I would appreciate your help with my question.
The concept behind the problem (merge sort) is pretty straightforward, but I am confused about how a recursive function works in general. Below is the program I am dealing with (from a Georgia Tech course on Python):
def mergesort(lst):
    if len(lst) <= 1:
        # Base case: a list of 0 or 1 items is already sorted
        return lst
    else:
        midpoint = len(lst) // 2
        left = mergesort(lst[:midpoint])
        right = mergesort(lst[midpoint:])
        newlist = []
        # Merge: repeatedly move the smaller front item into newlist
        while len(left) > 0 and len(right) > 0:
            if left[0] < right[0]:
                newlist.append(left[0])
                del left[0]
            else:
                newlist.append(right[0])
                del right[0]
        # One side is now empty; append whatever remains of the other
        newlist.extend(left)
        newlist.extend(right)
        return newlist

print(mergesort([2, 5, 3, 8, 6, 9, 1, 4, 7]))
QUESTION: What happens when the program reaches the line left = mergesort(lst[:midpoint])?
Based on my understanding, it returns to the first line of the program and comes down again to reach the same line (just like a for loop does).
So it keeps returning! This, however, makes the program unreadable to me. My main question is how the program handles a recursive function in general; I cannot understand how it works.
Answer 1:
"What would happen when the program reaches this line left = mergesort(lst[:midpoint])? Based on my understanding, it returns to the first line of the program and comes down again to reach the same line..."
Each time the program recurs, it calls mergesort with a smaller list. We call this a "sub-problem" -
def mergesort(lst):
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # find midpoint
        left = mergesort(lst[:midpoint])   # solve sub-problem one
        right = mergesort(lst[midpoint:])  # solve sub-problem two
        # ...
For example, if we first call mergesort with a 4-element list -
mergesort([5,2,4,7])
The input list, lst, does not meet the base case, so we move on to the else branch -
def mergesort(lst):                        # lst = [5,2,4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # midpoint = 2
        left = mergesort(lst[:midpoint])   # left = mergesort([5,2])
        right = mergesort(lst[midpoint:])  # right = mergesort([4,7])
        # ...
Notice mergesort is called with [5,2] and [4,7] sub-problems. Let's repeat these steps for the first sub-problem -
left = mergesort([5,2])
def mergesort(lst):                        # lst = [5,2]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # midpoint = 1
        left = mergesort(lst[:midpoint])   # left = mergesort([5])
        right = mergesort(lst[midpoint:])  # right = mergesort([2])
        # ...
"So it keeps returning!!!"
Not exactly. When we solve the sub-problems in this step, things look different. When the input has one element or fewer, the base case is satisfied and the function exits -
left = mergesort([5])
def mergesort(lst):      # lst = [5]
    if len(lst) <= 1:    # base case condition satisfied
        return lst       # return [5]
    else:
        ...              # no more recursion
Recursion stops for the left sub-problem and the answer of [5] is returned. The same applies for the right sub-problem -
right = mergesort([2])
def mergesort(lst):      # lst = [2]
    if len(lst) <= 1:    # base case condition satisfied
        return lst       # return [2]
    else:
        ...              # no more recursion
Next, our first left sub-problem can return -
left = mergesort([5,2])
def mergesort(lst):                        # lst = [5,2]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # midpoint = 1
        left = mergesort(lst[:midpoint])   # left = [5]  <-
        right = mergesort(lst[midpoint:])  # right = [2] <-
        # ...
        return newlist                     # newlist = [2,5]
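The elided `# ...` part above is the merge step from the question's program. As a minimal sketch, here it is pulled out into a standalone helper so the [5] and [2] case can be run on its own. The name `merge_step` is hypothetical and does not appear in the original program; the helper also works on copies, which the original does not do.

```python
def merge_step(left, right):
    # Hypothetical helper: the same loop as the elided "# ..." part,
    # run on copies so the caller's lists are left untouched.
    left, right = list(left), list(right)
    newlist = []
    while left and right:
        # Move whichever front item is smaller into newlist
        if left[0] < right[0]:
            newlist.append(left.pop(0))
        else:
            newlist.append(right.pop(0))
    # At most one of the two still holds items at this point
    newlist.extend(left)
    newlist.extend(right)
    return newlist

print(merge_step([5], [2]))        # [2, 5]
print(merge_step([2, 5], [4, 7]))  # [2, 4, 5, 7]
```

The second call is the merge that the outermost mergesort([5,2,4,7]) performs once both of its sub-problems have returned.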
You would now repeat these steps for the first right sub-problem -
right = mergesort([4,7])
def mergesort(lst):                        # lst = [4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # midpoint = 1
        left = mergesort(lst[:midpoint])   # left = mergesort([4])
        right = mergesort(lst[midpoint:])  # right = mergesort([7])
        # ...
Again, recursion stops because the new left and right sub-problems are single-element lists, which satisfy the base case -
right = mergesort([4,7])
def mergesort(lst):                        # lst = [4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # midpoint = 1
        left = mergesort(lst[:midpoint])   # left = [4]  <-
        right = mergesort(lst[midpoint:])  # right = [7] <-
        # ...
        return newlist                     # newlist = [4,7]
And finally the outermost mergesort call can return -
mergesort([5,2,4,7])
def mergesort(lst):                        # lst = [5,2,4,7]
    if len(lst) <= 1:
        # ...
    else:
        midpoint = len(lst) // 2           # midpoint = 2
        left = mergesort(lst[:midpoint])   # left = [2,5]
        right = mergesort(lst[midpoint:])  # right = [4,7]
        # ...
        return newlist                     # newlist = [2,4,5,7]
# => [2,4,5,7]
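The whole walkthrough can also be observed directly. The sketch below is an instrumented variant of the question's function; the name `traced_mergesort` and the `depth` parameter are additions for illustration only and are not part of the original program.

```python
def traced_mergesort(lst, depth=0):
    # `depth` is a hypothetical extra parameter, used only to indent the trace
    indent = "    " * depth
    print(f"{indent}call   mergesort({lst})")
    if len(lst) <= 1:
        result = lst
    else:
        midpoint = len(lst) // 2
        left = traced_mergesort(lst[:midpoint], depth + 1)
        right = traced_mergesort(lst[midpoint:], depth + 1)
        # Same merge loop as in the question
        result = []
        while left and right:
            if left[0] < right[0]:
                result.append(left.pop(0))
            else:
                result.append(right.pop(0))
        result.extend(left)
        result.extend(right)
    print(f"{indent}return {result}")
    return result

traced_mergesort([5, 2, 4, 7])
```

Each `return` line is printed at the same indentation as the `call` line it belongs to, so the trace shows every call handing its result back to the caller one level up rather than jumping back to the top of the program.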
All of that said, recursion has its heritage in functional programming, so using it in a functional style yields the best results. This means avoiding things like mutation, variable reassignment, and other side effects. Consider this alternative, which lowers the conceptual overhead by clearly separating the program's concerns -
def mergesort(lst):
    def split(lst):
        m = len(lst) // 2
        return (lst[:m], lst[m:])

    def merge(l, r):
        if not l:
            return r
        elif not r:
            return l
        elif l[0] < r[0]:
            return [l[0]] + merge(l[1:], r)
        else:
            return [r[0]] + merge(l, r[1:])

    if len(lst) <= 1:
        return lst
    else:
        (left, right) = split(lst)
        return merge(mergesort(left), mergesort(right))

mergesort([5,2,4,7])
# => [2,4,5,7]
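One practical point about the recursive merge above: it makes one nested call per element merged, so for long inputs it can run into Python's recursion limit (reported by `sys.getrecursionlimit()`, typically 1000 in CPython). The following is a sketch of one possible alternative, not the answer's own code. It keeps the same no-mutation approach to the inputs but merges with a loop; the index variables `i` and `j` and the test list `big` are assumptions made for illustration.

```python
import sys

def merge(l, r):
    # Index-based merge: reads l and r but never mutates them
    out = []
    i = j = 0
    while i < len(l) and j < len(r):
        if l[i] < r[j]:
            out.append(l[i])
            i += 1
        else:
            out.append(r[j])
            j += 1
    # One side is exhausted; the rest of the other is already sorted
    return out + l[i:] + r[j:]

def mergesort(lst):
    if len(lst) <= 1:
        return lst
    m = len(lst) // 2
    return merge(mergesort(lst[:m]), mergesort(lst[m:]))

big = list(range(5000, 0, -1))        # 5000 items in descending order
print(sys.getrecursionlimit())
print(mergesort(big) == sorted(big))  # True
```

Here only mergesort itself recurses, and its depth grows with the logarithm of the input length, since each level halves the list.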
Answer 2:
The answer to your question is: copies.
Each function is a recipe for a calculation.
When a function is called, a copy of the recipe is created. Each invocation creates its own separate copy. That is how each can operate on its own without all of them getting jumbled up together.
In general, there is nothing special about a recursive function call. A function call is a function call, no matter which function is called. A function is called, does what it does, and its result is returned to the caller. As for recursion, you're not supposed to trace it. It does its work on its own. You're supposed to prove to yourself that the base case is correct and that the recursive case is correct. That is all.
Then it is guaranteed to work, in however convoluted a way it does, and the whole point is for us not to care how exactly it does it, i.e. about its exact sequence of steps.
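In Python terms, the "copy" is a fresh set of local variables for each invocation. A small sketch can make that visible; the function `show_frames` and its `depth` and `log` parameters are hypothetical and exist only for this demonstration. It splits the list the same way mergesort does and records what each invocation still sees in its own `lst` after its inner calls have finished.

```python
def show_frames(lst, depth=0, log=None):
    # Hypothetical demo: `depth` and `log` are bookkeeping for the illustration
    if log is None:
        log = []
    if len(lst) > 1:
        midpoint = len(lst) // 2
        show_frames(lst[:midpoint], depth + 1, log)
        show_frames(lst[midpoint:], depth + 1, log)
    # The inner calls had their own `lst` and `midpoint`.
    # This invocation's `lst` is still exactly what it was given.
    log.append((depth, lst))
    return log

log = show_frames([5, 2, 4, 7])
for depth, seen in log:
    print(depth, seen)
print(log[-1])  # (0, [5, 2, 4, 7])
```

If the calls shared one set of variables, the outermost `lst` would have been overwritten by the smaller lists. The last log entry shows it was not.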
So specifically in your case, assuming mergesort indeed does work correctly (wait, what? never mind, suspend your disbelief for a moment),
left = mergesort(lst[:midpoint])
calls the function mergesort with the first half of lst, from its start to its midpoint, and stores the result - which is the sorted first half, by assumption - in the variable left; then
right = mergesort(lst[midpoint:])
calls the function mergesort with the second half of lst, from its midpoint to its end, and stores the result - which is the sorted second half, by assumption - in the variable right;
and then you need to convince yourself that the rest of the code creates newlist from those two sorted halves such that newlist is also sorted in the correct order.
And then by the principle of mathematical induction this proves the correctness of mergesort.
By assuming it works, we prove that it indeed works! Where's the catch? There's no catch, because the two cases that worked by assumption are for two smaller inputs (and that is our recursive case).
And when we divide a thing into two parts, over and over, eventually we're left with either a singleton, or an empty thing. And those two are naturally sorted (and that is our base case).
Recursion is a leap of faith. Assume the thing is working, then you get to use it. And if you use it correctly, you will have thus built the very thing you were using in the first place!
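The two obligations named above, the base case and the recursive case, can at least be spot-checked separately. The sketch below is a check on sampled inputs, not a proof. It assumes the merge loop from the question is factored into a helper named `merge`, which is a hypothetical name, and uses Python's built-in `sorted` as the reference.

```python
import random

def merge(left, right):
    # The question's merge loop, run on copies of its inputs
    left, right = list(left), list(right)
    newlist = []
    while left and right:
        if left[0] < right[0]:
            newlist.append(left.pop(0))
        else:
            newlist.append(right.pop(0))
    return newlist + left + right

def mergesort(lst):
    if len(lst) <= 1:
        return lst
    midpoint = len(lst) // 2
    return merge(mergesort(lst[:midpoint]), mergesort(lst[midpoint:]))

# Base case: an empty list or a singleton comes back unchanged
for base in ([], [7]):
    assert mergesort(base) == base

# Recursive case: GIVEN two sorted halves, merge must produce a sorted whole
for _ in range(200):
    a = sorted(random.choices(range(50), k=random.randint(0, 15)))
    b = sorted(random.choices(range(50), k=random.randint(0, 15)))
    assert merge(a, b) == sorted(a + b)

print(mergesort([2, 5, 3, 8, 6, 9, 1, 4, 7]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Note that the recursive-case check never looks inside mergesort at all. It only asks whether merge is correct when its inputs are already sorted, which is exactly the assumption the induction argument grants.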
Source: https://stackoverflow.com/questions/65454744/how-can-a-recursive-function-operate-if-it-returns-to-the-beginning