Why does Collections.sort call Comparator twice with the same arguments?

感情败类 2021-01-02 20:53

I'm running an example to understand the behavior of Comparator in Java.

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparat         


        
3 Answers
  • 2021-01-02 21:19

    Sorting algorithms are a complex topic. Consider this very simple (but inefficient) algorithm.

    Compare the first item to the second item. Keep track of the higher of the two and compare it to the next item. Keep tracking the highest item seen so far until you reach the end of the list; that item is the highest in the list. Place it in a new list and remove it from the original list.

    Then repeat the previous steps until the original list is empty.

    Because you go through the list multiple times, you can end up comparing an item to its neighbor more than once, perhaps even in consecutive comparisons on different passes.
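
    A minimal sketch of the procedure described above, to make the repeated comparisons visible. The class name RepeatedMaxSort, the log field, and the sample input are made up for illustration; nothing here is taken from the question's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RepeatedMaxSort {
    // Every comparator call made by the last sortDescending call, as "x vs y".
    static final List<String> log = new ArrayList<>();

    // Repeatedly finds the highest remaining item and moves it to a new list.
    // The result is ordered from highest to lowest.
    static <T> List<T> sortDescending(List<T> input, Comparator<? super T> cmp) {
        log.clear();
        List<T> remaining = new ArrayList<>(input); // leave the caller's list untouched
        List<T> result = new ArrayList<>();
        while (!remaining.isEmpty()) {
            int highest = 0;
            for (int i = 1; i < remaining.size(); i++) {
                T candidate = remaining.get(i);
                T best = remaining.get(highest);
                log.add(candidate + " vs " + best);
                if (cmp.compare(candidate, best) > 0) {
                    highest = i;
                }
            }
            result.add(remaining.remove(highest)); // remove(int index)
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> sorted =
                sortDescending(Arrays.asList(1, 2, 4, 3), Comparator.<Integer>naturalOrder());
        System.out.println(sorted);
        for (String line : log) {
            System.out.println(line);
        }
    }
}
```

    With the input 1, 2, 4, 3 this makes six comparisons in total, and the pair "2 vs 1" is compared on three separate passes.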

  • 2021-01-02 21:24

    How many times the compare method is called depends on the sorting algorithm. When you call Collections.sort(), the comparisons are made by whatever sorting implementation it uses internally.

    The Collections.sort() implementation uses a merge sort (since Java 7, the TimSort variant). According to the Javadoc, only primitive arrays are sorted with Quicksort; object arrays are sorted with a merge sort as well.

  • 2021-01-02 21:31

    If this is Java 7 or later, it's using TimSort. TimSort starts off by running through the input and detecting or gathering ascending runs of 32 or more elements (in this implementation). See countRunAndMakeAscending in the source code.

    Runs longer than 32 are left in place for now. Runs shorter than 32 are lengthened by doing a binary insertion sort of subsequent elements into the current run until it's at least 32 elements long. See binarySort in the source code.

    (The merge sorting approach is done only after runs of >= 32 are gathered. Since your input has only 3 elements, the entire sort is done using the binary insertion sort, and no merging is done.)

    countRunAndMakeAscending detects runs by comparing adjacent elements. First it compares Sony to Samsung and then Panasonic to Sony. The result is a run of length 2, [Samsung, Sony].

    Next, binarySort lengthens this run by taking the next element, Panasonic, and inserting it into the right place. A binary search is done to find that place. The midpoint of the run of 2 is location 1, which is Sony, so it compares Panasonic to Sony. (This is the repeated comparison.) Panasonic is less than Sony, so the next comparison is between Panasonic and Samsung, which determines the proper insertion point. We now have a run of length 3.
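
    The two phases can be sketched roughly as follows. This is a simplified illustration, not the JDK source: the class name TwoPhaseSketch and its log field are hypothetical, it works only on String arrays with natural ordering, and it skips the descending-run handling that the real countRunAndMakeAscending performs.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhaseSketch {
    // Every comparison made by the last sort call, as "x vs y".
    static final List<String> log = new ArrayList<>();

    static int compare(String x, String y) {
        log.add(x + " vs " + y);
        return x.compareTo(y);
    }

    static void sort(String[] a) {
        log.clear();
        if (a.length < 2) {
            return;
        }
        // Phase 1: measure the ascending run that starts at index 0
        // by comparing each element to its predecessor.
        int run = 1;
        while (run < a.length && compare(a[run], a[run - 1]) >= 0) {
            run++;
        }
        // Phase 2: binary insertion of every element after the run.
        for (int start = run; start < a.length; start++) {
            String pivot = a[start];
            int left = 0;
            int right = start;
            while (left < right) {
                int mid = (left + right) >>> 1;
                if (compare(pivot, a[mid]) < 0) {
                    right = mid;
                } else {
                    left = mid + 1;
                }
            }
            // Shift a[left..start-1] one slot to the right and drop the pivot in.
            System.arraycopy(a, left, a, left + 1, start - left);
            a[left] = pivot;
        }
    }

    public static void main(String[] args) {
        String[] brands = {"Samsung", "Sony", "Panasonic"};
        sort(brands);
        for (String line : log) {
            System.out.println(line);
        }
        System.out.println(String.join(", ", brands));
    }
}
```

    For the input Samsung, Sony, Panasonic this sketch logs "Sony vs Samsung", "Panasonic vs Sony", "Panasonic vs Sony", "Panasonic vs Samsung": the last comparison of phase 1 and the first comparison of phase 2 are the same pair.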

    Since the entire input is of length 3, the sort is finished after four comparisons.

    The duplicate comparisons occur because countRunAndMakeAscending and binarySort are separate sort phases, and it just so happens that the last comparison of the first phase is the same as the first comparison of the second phase.
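
    To observe this on your own JDK, a logging comparator is enough. The sketch below assumes the input order Samsung, Sony, Panasonic and natural String ordering, both inferred from the walk-through above since the question's code is cut off; the class and method names are made up. The exact sequence of comparisons is an implementation detail and may differ between Java versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LoggingComparatorDemo {
    // Sorts the given list in place with Collections.sort and returns
    // every comparison the sort made, in order, as "x vs y".
    static List<String> sortAndLog(List<String> items) {
        List<String> log = new ArrayList<>();
        Collections.sort(items, (x, y) -> {
            log.add(x + " vs " + y);
            return x.compareTo(y);
        });
        return log;
    }

    public static void main(String[] args) {
        List<String> brands =
                new ArrayList<>(Arrays.asList("Samsung", "Sony", "Panasonic"));
        for (String line : sortAndLog(brands)) {
            System.out.println(line);
        }
        System.out.println(brands);
    }
}
```

    If your runtime uses TimSort as described, the log should show the "Panasonic vs Sony" pair twice in a row.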
