How to calculate order (big O) for more complex algorithms (eg quicksort)

后端未结

关注

 6  633

春和景丽 2021-01-30 02:54

I know there are quite a bunch of questions about big O notation, I have already checked:

Plain english explanation of Big O
Big O, how do you calculate/a

6条回答

心在旅途 (楼主)

2021-01-30 03:00
The logarithm is the inverse operation of exponentiation. An example of exponentiation is when you double the number of items at each step. Thus, a logarithmic algorithm often halves the number of items at each step. For example, binary search falls into this category.

Many algorithms require a logarithmic number of big steps, but each big step requires O(n) units of work. Mergesort falls into this category.

Usually you can identify these kinds of problems by visualizing them as a balanced binary tree. For example, here's merge sort:
```
 6   2    0   4    1   3     7   5
  2 6      0 4      1 3       5 7
    0 2 4 6            1 3 5 7
         0 1 2 3 4 5 6 7
```
At the top is the input, as leaves of the tree. The algorithm creates a new node by sorting the two nodes above it. We know the height of a balanced binary tree is O(log n) so there are O(log n) big steps. However, creating each new row takes O(n) work. O(log n) big steps of O(n) work each means that mergesort is O(n log n) overall.

Generally, O(log n) algorithms look like the function below. They get to discard half of the data at each step.
```
def function(data, n):
    if n <= constant:
       return do_simple_case(data, n)
    if some_condition():
       function(data[:n/2], n / 2) # Recurse on first half of data
    else:
       function(data[n/2:], n - n / 2) # Recurse on second half of data
```
While O(n log n) algorithms look like the function below. They also split the data in half, but they need to consider both halves.
```
def function(data, n):
    if n <= constant:
       return do_simple_case(data, n)
    part1 = function(data[n/2:], n / 2)      # Recurse on first half of data
    part2 = function(data[:n/2], n - n / 2)  # Recurse on second half of data
    return combine(part1, part2)
```
Where do_simple_case() takes O(1) time and combine() takes no more than O(n) time.

The algorithms don't need to split the data exactly in half. They could split it into one-third and two-thirds, and that would be fine. For average-case performance, splitting it in half on average is sufficient (like QuickSort). As long as the recursion is done on pieces of (n/something) and (n - n/something), it's okay. If it's breaking it into (k) and (n-k) then the height of the tree will be O(n) and not O(log n).
0 讨论(0)

查看其它6个回答
发布评论:

提交评论
- 加载中...