Question
I have a question about finding all paths for a sum. The problem is:
Given a binary tree and a number ‘S’, find all paths from root-to-leaf such that the sum of all the node values of each path equals ‘S’.
My approach with recursion is:
def all_sum_path(root, target):
    result = []
    find_sum_path(root, target, result, [])
    return result

def find_sum_path(root, target, result, new_path):
    if not root:
        return None
    new_path.append(root.value)
    diff = target - root.value
    if not root.left and not root.right and diff == 0:
        # copy the value of the list rather than a reference
        result.append(list(new_path))
    if root.left:
        return find_sum_path(root.left, diff, result, new_path)
    if root.right:
        return find_sum_path(root.right, diff, result, new_path)
    del new_path[-1]

class TreeNode():
    def __init__(self, _value):
        self.value = _value
        self.left, self.right, self.next = None, None, None

def main():
    root = TreeNode(1)
    root.left = TreeNode(7)
    root.right = TreeNode(9)
    root.left.left = TreeNode(4)
    root.left.right = TreeNode(5)
    root.right.left = TreeNode(2)
    root.right.right = TreeNode(7)
    print(all_sum_path(root, 12))
    root = TreeNode(12)
    root.left = TreeNode(7)
    root.right = TreeNode(1)
    root.left.left = TreeNode(4)
    root.right.left = TreeNode(10)
    root.right.right = TreeNode(5)
    print(all_sum_path(root, 23))

main()
and the output is:
[[1, 7, 4]]
[[12, 7, 4]]
Process finished with exit code 0
However, the correct approach should be:
def all_sum_path(root, target):
    result = []
    find_sum_path(root, target, result, [])
    return result

def find_sum_path(root, target, result, new_path):
    if not root:
        return None
    new_path.append(root.value)
    diff = target - root.value
    if not root.left and not root.right and diff == 0:
        # copy the value of the list rather than a reference
        result.append(list(new_path))
    if root.left:
        find_sum_path(root.left, diff, result, new_path)
    if root.right:
        find_sum_path(root.right, diff, result, new_path)
    del new_path[-1]

class TreeNode():
    def __init__(self, _value):
        self.value = _value
        self.left, self.right, self.next = None, None, None

def main():
    root = TreeNode(1)
    root.left = TreeNode(7)
    root.right = TreeNode(9)
    root.left.left = TreeNode(4)
    root.left.right = TreeNode(5)
    root.right.left = TreeNode(2)
    root.right.right = TreeNode(7)
    print(all_sum_path(root, 12))
    root = TreeNode(12)
    root.left = TreeNode(7)
    root.right = TreeNode(1)
    root.left.left = TreeNode(4)
    root.right.left = TreeNode(10)
    root.right.right = TreeNode(5)
    print(all_sum_path(root, 23))

main()
With output:
[[1, 7, 4], [1, 9, 2]]
[[12, 7, 4], [12, 1, 10]]
Process finished with exit code 0
I have some questions here:
- Why don't we need a return in the recursion statement? I am also interested in how the return statement reduced the output to only one path.
- Why don't we need result = find_sum_path(root, target, result, [])? Then what is the logic behind updating the results?
- I am not sure why the time complexity is O(N^2). The explanation I was given is:
The time complexity of the above algorithm is O(N^2), where ‘N’ is the total number of nodes in the tree. This is due to the fact that we traverse each node once (which will take O(N)), and for every leaf node we might have to store its path which will take O(N).
Thanks for your help in advance.
Answer 1:
First off, I want to say you're very close to solving the problem and you've done a terrific job. Recursion is a functional heritage and so using it with functional style yields the best results. This means avoiding things like mutations, variable reassignments, and other side effects. This can eliminate many sources of bugs (and headaches)!
To spruce up your program, we can start by modifying TreeNode to accept both left and right parameters at time of construction -
class TreeNode():
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right
Now we can define two trees, t1 and t2. Notice we don't reassign root -
def main():
    t1 = TreeNode \
        ( 1
        , TreeNode(7, TreeNode(4), TreeNode(5))
        , TreeNode(9, TreeNode(2), TreeNode(7))
        )
    t2 = TreeNode \
        ( 12
        , TreeNode(7, TreeNode(4), None)
        , TreeNode(1, TreeNode(10), TreeNode(5))
        )
    print(all_sum_path(t1, 12))
    print(all_sum_path(t2, 23))

main()
The expected output is -
[[1, 7, 4], [1, 9, 2]]
[[12, 7, 4], [12, 1, 10]]
Finally we implement find_sum. We can use mathematical induction to write a simple case analysis for our function -
1. if the input tree t is empty, return the empty result
2. (inductive) t is not empty. if t.value matches the target, q, a solution has been found; add t.value to the current path and yield
3. (inductive) t is not empty and t.value does not match the target q. add t.value to the current path and new sub-problem, next_q; solve the sub-problem on t.left and t.right branches -
def find_sum (t, q, path = []):
    if not t:
        return                        # (1)
    elif t.value == q:
        yield [*path, t.value]        # (2)
    else:
        next_q = q - t.value          # (3)
        next_path = [*path, t.value]
        yield from find_sum(t.left, next_q, next_path)
        yield from find_sum(t.right, next_q, next_path)
Notice how we don't use mutations like .append above. To compute all paths, we can write all_sum_path as the expansion of find_sum -
def all_sum_path (t, q):
    return list(find_sum(t, q))
And that's it, your program is done :D
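One caveat, given that the question asks for root-to-leaf paths: the case analysis above yields as soon as t.value matches q, even when t still has children. A minimal sketch of a stricter variant follows. The names find_leaf_sum, Node and chain are illustrative assumptions, not part of the code above, and Node is only a stand-in for TreeNode so the sketch runs on its own.

```python
class Node:
    # Hypothetical stand-in for TreeNode, so the sketch is self-contained.
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def find_leaf_sum(t, q, path=()):
    if not t:
        return
    next_path = [*path, t.value]
    if not t.left and not t.right:
        # Leaf: the path counts only if the remaining target is used up here.
        if t.value == q:
            yield next_path
        return
    # Internal node: never a solution by itself, keep descending.
    next_q = q - t.value
    yield from find_leaf_sum(t.left, next_q, next_path)
    yield from find_leaf_sum(t.right, next_q, next_path)

# 1 -> 2 -> 3: the prefix 1 -> 2 sums to 3, but node 2 is not a leaf.
chain = Node(1, Node(2, Node(3)))
```

With this variant, list(find_leaf_sum(chain, 3)) is empty and list(find_leaf_sum(chain, 6)) is [[1, 2, 3]]. On the two example trees the results are the same as before, because every match there happens at a leaf.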
If you don't want to use a separate generator find_sum, we can expand the generator in place -
def all_sum_path (t, q, path = []):
    if not t:
        return []
    elif t.value == q:
        return [[*path, t.value]]
    else:
        return \
            [ *all_sum_path(t.left, q - t.value, [*path, t.value])
            , *all_sum_path(t.right, q - t.value, [*path, t.value])
            ]
Notice the distinct similarity between the two variations. Any well-written program can easily be converted between either style.
Answer 2:
Why don't we need a return in the recursion statement?
Why don't we need the result = find_sum_path(root, target, result, [])? Then what is the logic behind to update the results?
The result list (and also the new_path list) is being passed through the recursion stacks by reference (or rather by assignment; see "what does it mean by 'passed by assignment'?"), which means the result variable always points to the same location in your memory as it was initialized to in all_sum_path (as long as it is not re-assigned), and you are able to mutate it in place as needed.
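A small sketch of that distinction, assuming nothing beyond built-in lists. The function and variable names here are made up for illustration.

```python
def mutate_in_place(items):
    # append changes the very list object the caller also holds
    items.append(1)

def rebind_locally(items):
    # assignment builds a new list and rebinds only the local name
    items = items + [1]

shared = []
mutate_in_place(shared)
after_mutation = list(shared)    # snapshot: the caller sees the append
rebind_locally(shared)
after_rebinding = list(shared)   # snapshot: unchanged by the rebinding
```

Both snapshots hold [1]: the append inside mutate_in_place is visible to the caller, while the rebinding inside rebind_locally is not. This is why find_sum_path can fill result without returning it.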
I am also interested in how the return statement reduced the output to only one?
When you use return in your solution, you are completely giving up on exploring right subtrees of a node when the left subtree is done.
if root.left:
    return find_sum_path(root.left, diff, result, new_path)
# -- unreachable code if `root.left` is not `None` --
if root.right:
    return find_sum_path(root.right, diff, result, new_path)
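The effect can be seen in isolation with a stripped-down traversal that only records which nodes it visits. This is a sketch under assumptions: Node, visit_with_return and visit_without_return are hypothetical names, and the tree is the first example tree from the question.

```python
class Node:
    # Hypothetical stand-in for TreeNode, so the sketch is self-contained.
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def visit_with_return(t, seen):
    if not t:
        return
    seen.append(t.value)
    if t.left:
        return visit_with_return(t.left, seen)   # the function exits here
    if t.right:
        return visit_with_return(t.right, seen)

def visit_without_return(t, seen):
    if not t:
        return
    seen.append(t.value)
    if t.left:
        visit_without_return(t.left, seen)       # comes back, then continues
    if t.right:
        visit_without_return(t.right, seen)

tree = Node(1, Node(7, Node(4), Node(5)), Node(9, Node(2), Node(7)))
with_return, without_return = [], []
visit_with_return(tree, with_return)
visit_without_return(tree, without_return)
```

Here with_return ends up as [1, 7, 4], while without_return is [1, 7, 4, 5, 9, 2, 7]. The returning version only ever walks the leftmost path, which is why only one result was printed.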
I am not sure why the time complexity is O(N^2)?
if not root.left and not root.right and diff == 0:
    # copy the value of the list rather than a reference
    result.append(list(new_path))
This part of the code makes a full copy of new_path to append it to result. Take the case of a binary tree which is somewhere between highly imbalanced and completely balanced, where all nodes have value 0 and S is also 0. In such a case, you'll make L (number of leaf nodes) copies of new_path, each containing up to H elements (height of the tree), so the copying costs O(L * H), which is at most O(N^2).
So the worst-case time complexity is certainly not linear O(N), but O(N^2) is an upper bound rather than the cost on every tree.
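To make the L * H estimate concrete, the sketch below counts how many list elements would be copied if every leaf matched (all values 0 and S = 0), for two tree shapes. The helpers copied_elements, caterpillar and perfect, and the Node class, are assumptions introduced for illustration only.

```python
class Node:
    # Hypothetical stand-in for TreeNode, so the sketch is self-contained.
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def copied_elements(t, depth=1):
    # Sum of root-to-leaf path lengths: the number of elements copied
    # by list(new_path) when every leaf is a match.
    if not t:
        return 0
    if not t.left and not t.right:
        return depth
    return (copied_elements(t.left, depth + 1)
            + copied_elements(t.right, depth + 1))

def caterpillar(k):
    # A spine of k nodes going right; each spine node has a leaf on its left.
    node = None
    for _ in range(k):
        node = Node(0, Node(0), node)
    return node

def perfect(h):
    # A perfect binary tree with h levels.
    if h == 0:
        return None
    return Node(0, perfect(h - 1), perfect(h - 1))
```

For caterpillar(k), which has N = 2k nodes, the count is k(k + 3)/2, for example 5150 for k = 100, so it grows quadratically in N. For perfect(h) it is h * 2^(h - 1), roughly (N/2) * log2(N). A tree that is a single chain has one leaf, so its count is just N.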
Source: https://stackoverflow.com/questions/65670647/all-paths-for-a-sum-with-return-issues