Why do we ignore coefficients in Big O notation?

孤城傲影 2020-11-29 08:37

While searching for answers relating to "Big O" notation, I have seen many SO answers such as this, this, or this, but I still have not clearly understood some points.

7 Answers
  • 2020-11-29 08:57

    For practical applications the constants do matter: an algorithm taking 2*n^3 steps beats one taking 1000*n^2 steps for every input with n smaller than 500.

    There are two main ideas here: 1) if your algorithm should perform well on inputs of any size, it should have a low time complexity, and 2) n^3 grows so much faster than n^2 that preferring an n^3 algorithm over an n^2 one almost never makes sense.
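
    A quick numeric check of that crossover, using the step counts 2*n^3 and 1000*n^2 from the example above (a minimal sketch):

    # Find where the asymptotically better 1000*n^2 actually overtakes 2*n^3.
    def steps_cubic(n):
        return 2 * n**3

    def steps_quadratic(n):
        return 1000 * n**2

    crossover = next(n for n in range(1, 10_000) if steps_cubic(n) >= steps_quadratic(n))
    print(crossover)  # 500: below this, the "worse" 2*n^3 algorithm does fewer steps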

  • 2020-11-29 08:58

    From a (complexity) theory point of view, the coefficients represent hardware details that we can ignore. Specifically, the Linear Speedup Theorem says that for any problem we can always throw an exponentially increasing amount of hardware (money) at a computer to speed the computation up by any constant factor.

    Therefore, modulo expensive hardware purchases, two algorithms that solve the same problem, one at twice the speed of the other for all input sizes, are considered essentially the same.

    Big-O (Landau) notation has its origins independently in number theory, where one of its uses is to create a kind of equivalence between functions: if a given function is bounded above by another function, and simultaneously bounded below by a scaled version of that same function, then the two functions are essentially the same from an asymptotic point of view. The definition of Big-Theta (the two-sided version of Big-O) captures this situation: the two functions have exactly the same Big-Theta class.
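
    For reference, here is the standard two-sided (Big-Theta) definition hinted at above, written in the same plain style as the Big-O definition further down this page:

    f(n) = Theta(g(n))

    if and only if there are positive reals c_1, c_2 and an n_0 such that

    c_1*g(n) <= f(n) <= c_2*g(n) for all n > n_0.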

    The fact that Big-O notation allows us to disregard the leading constant when comparing the growth of functions makes Big-O an ideal vehicle to measure various qualities of algorithms while respecting (ignoring) the "freebie" optimizations offered by the Linear Speedup Theorem.

  • 2020-11-29 09:01

    Big O provides a good estimate of which algorithms are more efficient for larger inputs, all other things being equal; this is why, for an algorithm with both an n^3 term and an n^2 term, we ignore the n^2 term: even if the n^2 term has a large constant, it will eventually be dominated by the n^3 term.

    However, real algorithms incorporate more than simple Big O analysis. For example, a sorting routine will often start with an O(n * log(n)) divide-and-conquer algorithm like quicksort or mergesort, and when the partitions become small enough it will switch to a simpler O(n^2) algorithm like insertion sort, since for small inputs insertion sort is generally faster even though a basic Big O analysis doesn't reveal this.
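
    A minimal sketch of that hybrid idea (the cutoff value 16 is an illustrative guess rather than a tuned constant, and this is not the implementation used by any particular library):

    CUTOFF = 16  # below this size, insertion sort's small constant factors win

    def insertion_sort(a):
        # Plain O(n^2) insertion sort, fast in practice on tiny inputs.
        for i in range(1, len(a)):
            key = a[i]
            j = i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return a

    def hybrid_sort(a):
        # O(n log n) mergesort that falls back to insertion sort on small slices.
        if len(a) <= CUTOFF:
            return insertion_sort(list(a))
        mid = len(a) // 2
        left, right = hybrid_sort(a[:mid]), hybrid_sort(a[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    print(hybrid_sort([5, 3, 8, 1, 9, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]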

    The constant factors often aren't very interesting, so they're omitted. Certainly a difference in factors on the order of 1000 is interesting, but usually the difference is smaller, and there are many more constant factors to consider that may dominate the algorithms' own constants. Say I've got two algorithms, the first with running time 3*n and the second with running time 2*n, each with comparable space complexity. This analysis assumes uniform memory access; what if the first algorithm interacts better with the cache, and this more than makes up for the worse constant factor? What if more compiler optimizations can be applied to it, or it behaves better with the memory management subsystem, or requires less expensive IO (e.g. fewer disk seeks, fewer database joins, and so on)? The constant factor of the algorithm is relevant, but there are many other constants to consider. Often the easiest way to determine which algorithm is best is simply to run both on some sample inputs and time the results; over-relying on the algorithms' constant factors would obscure the need for that step.
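
    A minimal timing sketch of that last suggestion. The two functions algo_a and algo_b are placeholders standing in for the hypothetical 3*n and 2*n algorithms; substitute your real candidates:

    import random
    import timeit

    def algo_a(data):
        # Placeholder for the "3*n" candidate.
        return sorted(data)

    def algo_b(data):
        # Placeholder for the "2*n" candidate.
        return sorted(data, reverse=True)[::-1]

    sample = [random.random() for _ in range(100_000)]
    print("algo_a:", timeit.timeit(lambda: algo_a(sample), number=10))
    print("algo_b:", timeit.timeit(lambda: algo_b(sample), number=10))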

  • 2020-11-29 09:02

    The purpose of Big-O notation is to identify the dominant term in the asymptotic behavior of a function as its argument tends towards infinity.

    As we move through the function's domain, some terms become more important than others.

    Imagine f(n) = n^3 + n^2. As n goes to infinity, n^2 becomes less and less relevant compared with n^3.
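
    A quick numeric illustration of that intuition (a minimal sketch):

    # The share of the n^2 term in n^3 + n^2 shrinks towards zero as n grows.
    for n in (10, 100, 1_000, 10_000):
        print(n, n**2 / (n**3 + n**2))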

    But that's just the intuition behind the definition. In practice we ignore some portions of the function because of the formal definition:

    f(x) = O(g(x)) as x->infinity

    if and only if there is a positive real M and a real x_0 such that

    |f(x)| <= M|g(x)| for all x > x_0.

    That's the definition from Wikipedia. What it actually means is that there is a point (after x_0) beyond which some multiple of g(x) dominates f(x). The definition acts like a loose upper bound on the value of f(x).

    From that we can derive many other properties, like f(x) + K = O(f(x)), x^n + x^(n-1) = O(x^n), etc. It's just a matter of using the definition to prove them.
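
    For example, to see that K*f(x) = O(f(x)) for a constant K > 0, pick M = K in the definition above (any x_0 will do):

    |K*f(x)| = K*|f(x)| <= M*|f(x)| for all x > x_0.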

    In particular, the intuition behind removing the coefficient (K*f(x) = O(f(x))) lies in what we try to measure with computational complexity. Ultimately it's all about time (or any resource, really). But it's hard to know how much time each operation takes. One algorithm may perform 2n operations and the other n, but the latter may have a much larger constant cost per operation. So, for this purpose, it isn't easy to reason about the difference between n and 2n.
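
    A toy illustration of that last point; the per-operation costs here are made up for the example:

    # An algorithm doing 2n "cheap" operations can still beat one doing
    # only n "expensive" operations.
    n = 1_000_000
    cheap_cost, expensive_cost = 1, 5  # hypothetical cost units per operation

    print("2n cheap operations:", 2 * n * cheap_cost)     # 2000000 units
    print("n expensive operations:", n * expensive_cost)  # 5000000 units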

  • 2020-11-29 09:06

    Big O notation is not an absolute measure of complexity.

    Rather, it describes how the complexity changes as the input size changes: as N increases, the cost grows on the order of f(N).

    To explain why lower-order terms are not included, we look at how fast each term increases.

    So 2n + 2 has two terms, 2n and 2. The constant term 2 never increases, so it contributes nothing to the rate of growth and goes away. And since 2n keeps increasing while 2 stays fixed, the 2 turns into noise as n gets very large.

    Similarly, 2n^3 + 99n^2 compares the terms 2n^3 and 99n^2. For small values, say n < 50, the 99n^2 term is actually the larger of the two. However, if n gets very large, say 1,000,000, then 99n^2, although nominally huge, is insignificant compared to 2n^3 (less than one twenty-thousandth of it).
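
    A quick check of those two terms, using the coefficients from the paragraph above (a minimal sketch):

    # Ratio of the lower-order term 99*n^2 to the dominant term 2*n^3.
    for n in (10, 50, 1_000, 1_000_000):
        print(n, (99 * n**2) / (2 * n**3))  # 4.95, 0.99, 0.0495, 0.0000495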

    As a consequence Big O(n^i) < Big O(n^(i+1)).

    Coefficients are removed because of the mathematical definition of Big O.

    To simplify, the definition implies that Big O(c*f(n)) = Big O(f(n)) for any constant c > 0. The reason is purely mathematical: the formal definition already allows an arbitrary constant multiplier, so any coefficient c can simply be absorbed into that multiplier.

  • 2020-11-29 09:08

    The mathematical reason:

    The real reason why we do this is the way Big-O notation is defined: a sequence (or let's just say function) f(n) is in O(g(n)) when the quotient f(n)/g(n) stays bounded as n grows. Example:

    f(n)= 2*n^2
    g(n)= n^2

    f(n) is in O(g(n)) because (2*n^2)/(n^2) = 2 for every n. The quotient doesn't become infinitely large (it's always 2), so it is bounded and thus 2*n^2 is in O(n^2).

    Another one:

    f(n) = n^2
    g(n) = n

    The quotient n^2/n (= n) becomes infinitely large as n goes to infinity, so n^2 is not in O(n).

    The same principle applies when you have

    f(n) = n^2 + 2*n + 20
    g(n) = n^2

    (n^2 + 2*n + 20)/(n^2) is also bounded, because it tends to 1 as n goes to infinity.
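
    The same boundedness can be eyeballed numerically (a minimal sketch of the three quotients above):

    # Quotients f(n)/g(n): bounded, unbounded, bounded.
    for n in (10, 100, 10_000, 1_000_000):
        print(n,
              (2 * n**2) / n**2,            # always 2            -> in O(n^2)
              n**2 / n,                     # grows without bound -> not in O(n)
              (n**2 + 2*n + 20) / n**2)     # tends to 1          -> in O(n^2)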


    Big-O notation basically says that, from some value of n onward, your function f(n) is smaller than some other function g(n) multiplied by a constant. With the previous example:

    2*n^2 is in O(n^2) because we can find a value C such that 2*n^2 is smaller than C*n^2; in this example we can pick C to be 5 or 10, and the condition is satisfied.

    So what do you get out of this? If you know your algorithm has complexity O(10^n) and you input a list of 4 numbers, it may take only a short time. If you input 10 numbers, it will take about a million times longer (10^10 / 10^4 = 10^6)! Whether it's one million or five million times longer doesn't really matter here: you can always throw five more computers at it and have it run in the same amount of time. The real problem is that it scales incredibly badly with input size.
