Understanding the Recursion of mergesort

感动是毒 2020-12-13 07:01

Most of the mergesort implementations I see are similar to this, both in the intro to algorithms book and in the online implementations I search for. My recursion chops don't go much fur

9 Answers
  • 2020-12-13 07:37

    I know this is an old question, but I wanted to share what helped me understand merge sort.

    There are two big parts to merge sort

    1. Splitting of the array into smaller chunks (dividing)
    2. Merging the array together (conquering)

    The role of the recursion is simply the dividing portion.

    I think what confuses most people is that they expect a lot of logic in the splitting and in deciding what to split, but most of the actual sorting logic happens in the merge. The recursion is simply there to do the dividing, which is the first part; the second part, the merging, is really just looping and copying things over.

    I see some answers that mention pivots but I would recommend not associating the word "pivot" with merge sort because that's an easy way to confuse merge sort with quicksort (which is heavily reliant on choosing a "pivot"). They are both "divide and conquer" algorithms. For merge sort the division always happens in the middle whereas for quicksort you can be clever with the division when choosing an optimal pivot.
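    To make the two parts concrete, here is a minimal sketch in Python. The function names `merge` and `merge_sort` are chosen for illustration and do not come from the question's code. The recursion does nothing but split in the middle, and all the comparing happens in the merge loop.

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    out = []
    i = j = 0
    # This loop is where the actual comparing (the sorting) happens.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # One side is exhausted; copy over whatever remains of the other.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_sort(items):
    # Base case: zero or one element, nothing left to divide.
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2            # always split in the middle, no pivot
    left = merge_sort(items[:mid])   # dividing: recurse on each half
    right = merge_sort(items[mid:])
    return merge(left, right)        # conquering: loop and copy


print(merge_sort([10, 12, 9, 13, 8, 7, 11, 5]))
# prints [5, 7, 8, 9, 10, 11, 12, 13]
```

    Slicing creates new lists at every level, which keeps the sketch short; an in-place version would pass index ranges instead.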

  • 2020-12-13 07:37

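    When you call the recursive method, the caller does not carry on right away: the new call is pushed onto the call stack, and the caller waits until that call returns. When the recursion condition is no longer satisfied, the call returns and execution moves on to the next line of the caller.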

    Consider that this is your array:

    int a[] = {10,12,9,13,8,7,11,5};
    

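    So, assuming the usual scheme (sort `low..mid`, sort `mid+1..high`, then merge), your mergeSort method makes its calls in this order:

    mergeSort(arr a, arr empty, 0, 7);
    mergeSort(arr a, arr empty, 0, 3);
    mergeSort(arr a, arr empty, 0, 1);
    mergeSort(arr a, arr empty, 0, 0);
    
    For the range 0..1, `(low + high) / 2 == 0`, so this last call covers a single element and returns at once. Control comes out of the first call and goes to the next line:
    
        mergeSort(arr a, arr empty, 0+1, 1);
    
    This range is also a single element, so the second call returns too, and the merges start while the remaining ranges are handled the same way:
    
        merger(arr a, arr empty, 0, 0, 1);
        mergeSort(arr a, arr empty, 2, 3);
        merger(arr a, arr empty, 2, 2, 3);
        merger(arr a, arr empty, 0, 1, 3);
        .
        .
        So on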
    

    So all the sorted values end up stored in the empty arr. This might help you understand how the recursive function works.
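    One way to see the order of the calls for yourself is to record them. The sketch below is an assumption about what the question's code looks like, since that code is not shown here: an index-based `merge_sort(a, tmp, low, high)` with a `merger` helper. The `calls` list is added purely for tracing. Unlike the description above, this version uses `tmp` as scratch space and copies each merged range back into `a`.

```python
def merge_sort(a, tmp, low, high, calls):
    calls.append(("mergeSort", low, high))
    if low >= high:                 # zero or one element: return at once
        return
    mid = (low + high) // 2
    merge_sort(a, tmp, low, mid, calls)        # caller is suspended here
    merge_sort(a, tmp, mid + 1, high, calls)   # the "next line"
    merger(a, tmp, low, mid, high, calls)


def merger(a, tmp, low, mid, high, calls):
    calls.append(("merger", low, mid, high))
    i, j, k = low, mid + 1, low
    # Merge a[low..mid] and a[mid+1..high] into tmp[low..high].
    while i <= mid and j <= high:
        if a[i] <= a[j]:
            tmp[k] = a[i]
            i += 1
        else:
            tmp[k] = a[j]
            j += 1
        k += 1
    while i <= mid:
        tmp[k] = a[i]
        i += 1
        k += 1
    while j <= high:
        tmp[k] = a[j]
        j += 1
        k += 1
    a[low:high + 1] = tmp[low:high + 1]   # copy the merged range back


a = [10, 12, 9, 13, 8, 7, 11, 5]
calls = []
merge_sort(a, [0] * len(a), 0, len(a) - 1, calls)
for call in calls[:11]:
    print(call)
print(a)
```

    The first recorded entries are `mergeSort` on (0, 7), (0, 3), (0, 1), (0, 0) and (1, 1), followed by `merger` on (0, 0, 1). No merge happens until a call has finished both of its recursive calls.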

  • 2020-12-13 07:50

    An obvious thing to do would be to try this merge sort on a small array, say size 8 (power of 2 is convenient here), on paper. Pretend you are a computer executing the code, and see if it starts to become a bit clearer.

    Your question is a bit ambiguous because you don't explain what you find confusing, but it sounds like you are trying to unroll the recursive calls in your head. Which may or may not be a good thing, but I think it can easily lead to having too much in your head at once. Instead of trying to trace the code from start to end, see if you can understand the concept abstractly. Merge sort:

    1. Splits the array in half
    2. Sorts the left half
    3. Sorts the right half
    4. Merges the two halves together

    (1) should be fairly obvious and intuitive to you. For step (2) the key insight is this, the left half of an array... is an array. Assuming your merge sort works, it should be able to sort the left half of the array. Right? Step (4) is actually a pretty intuitive part of the algorithm. An example should make it trivial:

    at the start
    left: [1, 3, 5], right: [2, 4, 6, 7], out: []
    
    after step 1
    left: [3, 5], right: [2, 4, 6, 7], out: [1]
    
    after step 2
    left: [3, 5], right: [4, 6, 7], out: [1, 2]
    
    after step 3
    left: [5], right: [4, 6, 7], out: [1, 2, 3]
    
    after step 4
    left: [5], right: [6, 7], out: [1, 2, 3, 4]
    
    after step 5
    left: [], right: [6, 7], out: [1, 2, 3, 4, 5]
    
    after step 6
    left: [], right: [7], out: [1, 2, 3, 4, 5, 6]
    
    at the end
    left: [], right: [], out: [1, 2, 3, 4, 5, 6, 7]
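    The same walk-through can be reproduced in code. This is only a sketch: `merge_with_trace` is a made-up name, and it uses `pop(0)` for readability rather than speed. It records a snapshot of `left`, `right` and `out` after every step.

```python
def merge_with_trace(left, right):
    left, right = list(left), list(right)            # work on copies
    out = []
    steps = [(list(left), list(right), list(out))]   # state "at the start"
    while left or right:
        # Take from left when right is empty or left's head is not larger.
        if left and (not right or left[0] <= right[0]):
            out.append(left.pop(0))
        else:
            out.append(right.pop(0))
        steps.append((list(left), list(right), list(out)))
    return out, steps


out, steps = merge_with_trace([1, 3, 5], [2, 4, 6, 7])
for left, right, partial in steps:
    print(f"left: {left}, right: {right}, out: {partial}")
```

    Each step only looks at the first element of each list, which is why merging two sorted lists needs a single pass.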
    

    So assuming that you understand (1) and (4), another way to think of merge sort would be this. Imagine someone else wrote mergesort() and you're confident that it works. Then you could use that implementation of mergesort() to write:

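    sort(myArray)
    {
        // subArray(start, end) copies indices start..end-1 (end is exclusive)
        leftHalf = myArray.subArray(0, myArray.Length/2);
        rightHalf = myArray.subArray(myArray.Length/2, myArray.Length);
    
        // trust the existing, known-good mergesort() for each half
        sortedLeftHalf = mergesort(leftHalf);
        sortedRightHalf = mergesort(rightHalf);
    
        // combine the two sorted halves into one sorted array
        sortedArray = merge(sortedLeftHalf, sortedRightHalf);
        return sortedArray;
    }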
    

    Note that sort doesn't use recursion. It just says "sort both halves and then merge them". If you understood the merge example above then hopefully you see intuitively that this sort function seems to do what it says... sort.

    Now, if you look at it more carefully... sort() looks pretty much exactly like mergesort()! That's because it is mergesort() (except it doesn't have base cases because it's not recursive!).

    But that's how I like thinking of recursive functions--assume that the function works when you call it. Treat it as a black box that does what you need it to. When you make that assumption, figuring out how to fill in that black box is often easy. For a given input, can you break it down into smaller inputs to feed to your black box? After you solve that, the only thing that's left is handling the base cases at the start of your function (which are the cases where you don't need to make any recursive calls. For example, mergesort([]) just returns an empty array; it doesn't make a recursive call to mergesort()).
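    The black-box idea can be tried out directly. In this sketch, `sort_using` is a hypothetical helper that hands each half to whatever sorter it is given. First the black box is Python's built-in `sorted`, which we already trust; then it is the function we are writing. The merge step borrows `heapq.merge` from the standard library to keep the example short.

```python
import heapq


def sort_using(black_box, items):
    """Sort items, delegating each half to black_box (assumed to work)."""
    if len(items) <= 1:              # base case: no call to the black box
        return list(items)
    mid = len(items) // 2
    sorted_left = black_box(items[:mid])
    sorted_right = black_box(items[mid:])
    # heapq.merge lazily merges already-sorted inputs into sorted order.
    return list(heapq.merge(sorted_left, sorted_right))


def mergesort(items):
    # Here the black box is the very function being defined.
    return sort_using(mergesort, items)


data = [10, 12, 9, 13, 8, 7, 11, 5]
print(sort_using(sorted, data))   # prints [5, 7, 8, 9, 10, 11, 12, 13]
print(mergesort(data))            # prints [5, 7, 8, 9, 10, 11, 12, 13]
```

    Nothing in `sort_using` changes between the two calls; only the box plugged into it does.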

    Finally, this is a bit abstract, but a good way to understand recursion is actually to write mathematical proofs using induction. The same strategy used to write a proof by induction is used to write a recursive function:

    Math proof:

    • Show the claim is true for the base cases
    • Assume it is true for inputs smaller than some n
    • Use that assumption to show that it is still true for an input of size n

    Recursive function:

    • Handle the base cases
    • Assume that your recursive function works on inputs smaller than some n
    • Use that assumption to handle an input of size n
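
    Applied to merge sort itself, the two lists line up as in the following sketch of the induction argument. It takes for granted that merging two sorted arrays yields a sorted array, and the name $P(n)$ for the claim is introduced here only for illustration.

```latex
\textbf{Claim } P(n)\textbf{:} \text{ mergesort returns a sorted permutation of every array of length } n.

\textbf{Base cases } (n \le 1)\textbf{:} \text{ such an array is already sorted and is returned unchanged.}

\textbf{Inductive step } (n \ge 2)\textbf{:} \text{ assume } P(k) \text{ holds for all } k < n.
\text{ The two halves have lengths } \lfloor n/2 \rfloor \text{ and } \lceil n/2 \rceil, \text{ and }
\lfloor n/2 \rfloor \le \lceil n/2 \rceil < n,
\text{ so by assumption both recursive calls return sorted halves.}
\text{ Merging two sorted arrays gives a sorted array containing all } n \text{ elements, hence } P(n).
```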