Why is the constant always dropped from big O analysis?

Solution 1

There's a distinction between "are these constants meaningful or relevant?" and "does big-O notation care about them?" The answer to that second question is "no," while the answer to that first question is "absolutely!"

Big-O notation doesn't care about constants because big-O notation only describes the long-term growth rate of functions, rather than their absolute magnitudes. Multiplying a function by a constant only influences its growth rate by a constant amount, so linear functions still grow linearly, logarithmic functions still grow logarithmically, exponential functions still grow exponentially, etc. Since these categories aren't affected by constants, it doesn't matter that we drop the constants.

That said, those constants are absolutely significant! A function whose runtime is 10^100 · n will be way slower than a function whose runtime is just n. A function whose runtime is n^2 / 2 will be faster than a function whose runtime is just n^2. The fact that the first two functions are both O(n) and the second two are O(n^2) doesn't change the fact that they don't run in the same amount of time, since that's not what big-O notation is designed for. Big-O notation is good for determining whether, in the long term, one function will be bigger than another. Even though 10^100 · n is a colossally huge value for any n > 0, that function is O(n), and so for large enough n it will eventually beat the function whose runtime is n^2 / 2, because that function is O(n^2).
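For reference, the standard formal definition of big-O (added here, not part of the original answer) makes explicit why the constant disappears: any constant multiplier is simply absorbed into the witness constant c.

```latex
% Standard definition of big-O; the multiplicative constant is absorbed into c.
\[
  f(n) = O(g(n))
  \iff
  \exists\, c > 0,\ \exists\, n_0 \ge 0 :\quad
  f(n) \le c \cdot g(n) \quad \text{for all } n \ge n_0.
\]
% For example, f(n) = 10^{100} n is O(n) with witness c = 10^{100}, n_0 = 0,
% and f(n) = n^2/2 is O(n^2) with witness c = 1/2 (or simply c = 1).
```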

In summary - since big-O only talks about relative classes of growth rates, it ignores the constant factor. However, those constants are absolutely significant; they just aren't relevant to an asymptotic analysis.

Hope this helps!

Solution 2

Big-O notation only describes the growth rate of algorithms in terms of mathematical functions, rather than the actual running time of algorithms on some machine.

Mathematically, let f(x) and g(x) be positive for x sufficiently large. We say that f(x) and g(x) grow at the same rate as x tends to infinity if

lim(x->infinity) f(x)/g(x) = c, where c is a finite, positive constant.

Now let f(x) = x^2 and g(x) = x^2/2. Then lim(x->infinity) f(x)/g(x) = 2, so x^2 and x^2/2 have the same growth rate, and we can say O(x^2/2) = O(x^2).

As templatetypedef said, the constants hidden in asymptotic notation are absolutely significant. As an example: merge sort runs in O(n log n) worst-case time and insertion sort runs in O(n^2) worst-case time. But because the hidden constant factors in insertion sort are smaller than those of merge sort, in practice insertion sort can be faster than merge sort for small problem sizes on many machines.
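As a rough illustration of that last point (my own sketch, not from the answer; the absolute timings will vary by machine and Python version), the snippet below times a textbook insertion sort against a textbook merge sort for a few input sizes. For small n the O(n^2) algorithm often wins, precisely because its constant factors are lower.

```python
# Rough illustration only: textbook implementations, timed naively.
# Absolute numbers depend on the machine; the point is just that the
# O(n^2) algorithm can win for small n thanks to smaller constants.
import random
import timeit

def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:   # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

def merge_sort(a):
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge the two sorted halves
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

for n in (8, 64, 512):
    data = [random.random() for _ in range(n)]
    t_ins = timeit.timeit(lambda: insertion_sort(data), number=200)
    t_mrg = timeit.timeit(lambda: merge_sort(data), number=200)
    print(f"n={n:4d}  insertion sort: {t_ins:.4f}s  merge sort: {t_mrg:.4f}s")
```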

Solution 3

Big O notation is most commonly used to describe an algorithm's running time. In this context, I would argue that specific constant values are essentially meaningless. Imagine the following conversation:

Alice: What is the running time of your algorithm?

Bob: 7n^2

Alice: What do you mean by 7n^2?

  • What are the units? Microseconds? Milliseconds? Nanoseconds?
  • What CPU are you running it on? Intel i9-9900K? Qualcomm Snapdragon 845? (Or are you using a GPU, an FPGA, or other hardware?)
  • What type of RAM are you using?
  • What programming language did you implement the algorithm in? What is the source code?
  • What compiler / VM are you using? What flags are you passing to the compiler / VM?
  • What is the operating system?
  • etc.

So as you can see, any attempt to indicate a specific constant value is inherently problematic. But once we set aside constant factors, we are able to clearly describe an algorithm's running time. Big O notation gives us a robust and useful description of how long an algorithm takes, while abstracting away from the technical features of its implementation and execution.

Now it is possible to specify the constant factor when describing the number of operations (suitably defined) or CPU instructions an algorithm executes, the number of comparisons a sorting algorithm performs, and so forth. But typically, what we're really interested in is the running time.
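To make that concrete (a small sketch of my own, not part of the original answer): when we count a well-defined operation rather than measure time, the constant factor is exact and meaningful. Selection sort, for instance, always performs n(n-1)/2 element comparisons, i.e. roughly n^2/2, regardless of the input or the machine.

```python
# Counting comparisons instead of measuring time: here the constant factor
# is exact. Selection sort always makes n*(n-1)/2 element comparisons,
# whatever the input order and whatever hardware it runs on.
import random

def selection_sort_comparisons(a):
    a = list(a)
    comparisons = 0
    for i in range(len(a)):
        smallest = i
        for j in range(i + 1, len(a)):
            comparisons += 1          # one comparison per inner-loop step
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return comparisons

n = 100
print(selection_sort_comparisons([random.random() for _ in range(n)]))  # 4950
print(n * (n - 1) // 2)                                                 # 4950
```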

None of this is meant to suggest that the real-world performance characteristics of an algorithm are unimportant. For example, if you need an algorithm for matrix multiplication, the Coppersmith-Winograd algorithm is inadvisable. It's true that this algorithm takes O(n^2.376) time, whereas the Strassen algorithm, its strongest competitor, takes O(n^2.808) time. However, according to Wikipedia, Coppersmith-Winograd is slow in practice, and "it only provides an advantage for matrices so large that they cannot be processed by modern hardware." This is usually explained by saying that the constant factor for Coppersmith-Winograd is very large. But to reiterate, if we're talking about the running time of Coppersmith-Winograd, it doesn't make sense to give a specific number for the constant factor.

Despite its limitations, big O notation is a pretty good measure of running time. And in many cases, it tells us which algorithms are fastest for sufficiently large input sizes, before we even write a single line of code.

Solution 4

You are completely right that constants matter. In comparing many different algorithms for the same problem, the O numbers without constants give you an overview of how they compare to each other. If you then have two algorithms in the same O class, you would compare them using the constants involved.

But even for different O classes the constants are important. For instance, for multi-digit or big-integer multiplication, the naive algorithm is O(n^2), Karatsuba is O(n^log_2(3)), Toom-Cook is O(n^log_3(5)), and Schönhage-Strassen is O(n*log(n)*log(log(n))). However, each of the faster algorithms has an increasingly large overhead, reflected in large constants. So to get approximate cross-over points, one needs valid estimates of those constants. Thus one gets, as a rough estimate (SWAG), that up to about n=16 the naive multiplication is fastest, up to about n=50 Karatsuba is fastest, and the cross-over from Toom-Cook to Schönhage-Strassen happens around n=200.

In reality, the cross-over points not only depend on the constants, but also on processor-caching and other hardware-related issues.
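To illustrate how such cross-over points show up in code (a hypothetical sketch of mine, not from the answer; the cutoff value is an assumption, not a tuned number), here is a Karatsuba multiplication that falls back to the built-in multiply below a size threshold. Choosing that threshold well is exactly where the hidden constants and the hardware, not the asymptotic class, decide the winner.

```python
# Hypothetical sketch: Karatsuba multiplication of non-negative Python ints,
# falling back to the native multiply below CUTOFF_BITS. In real libraries
# this threshold is tuned empirically, because it depends on the hidden
# constants and the hardware, not on the big-O class.
import random

CUTOFF_BITS = 64  # assumed threshold, not a measured value

def karatsuba(x: int, y: int) -> int:
    if x.bit_length() <= CUTOFF_BITS or y.bit_length() <= CUTOFF_BITS:
        return x * y  # "naive" path: let the built-in multiplication handle it
    m = max(x.bit_length(), y.bit_length()) // 2
    x_hi, x_lo = x >> m, x & ((1 << m) - 1)
    y_hi, y_lo = y >> m, y & ((1 << m) - 1)
    z2 = karatsuba(x_hi, y_hi)                          # high parts
    z0 = karatsuba(x_lo, y_lo)                          # low parts
    z1 = karatsuba(x_hi + x_lo, y_hi + y_lo) - z2 - z0  # cross terms
    return (z2 << (2 * m)) + (z1 << m) + z0

# Sanity check against Python's own big-integer multiplication.
a, b = random.getrandbits(4096), random.getrandbits(4096)
assert karatsuba(a, b) == a * b
```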

Solution 5

Big O without the constant is usually enough for algorithm analysis.

First, the actual time depends not only on how many instructions execute but also on how long each instruction takes, which is closely tied to the platform the code runs on. That goes beyond theoretical analysis, so the constant is unnecessary in most cases.

Second, Big O is mainly used to measure how the running time grows as the problem gets larger, or how it shrinks as the hardware improves.

Third, in situations that call for aggressive performance optimization, the constant is also taken into consideration.

Author: driftwood (updated on July 09, 2022)

Comments

  • driftwood (almost 2 years)

    I'm trying to understand a particular aspect of Big O analysis in the context of running programs on a PC.

    Suppose I have an algorithm that has a performance of O(n + 2). Here if n gets really large the 2 becomes insignificant. In this case it's perfectly clear the real performance is O(n).

    However, say another algorithm has an average performance of O(n^2 / 2). The book where I saw this example says the real performance is O(n^2). I'm not sure I get why; I mean, the 2 in this case doesn't seem completely insignificant. So I was looking for a nice clear explanation from the book. The book explains it this way:

    "Consider though what the 1/2 means. The actual time to check each value is highly dependent on the machine instruction that the code translates to and then on the speed at which the CPU can execute the instructions. Therefore the 1/2 doesn't mean very much."

    And my reaction is... huh? I literally have no clue what that says or more precisely what that statement has to do with their conclusion. Can somebody spell it out for me please.

    Thanks for any help.

  • templatetypedef (about 10 years)
    I think it's a bit too strong to claim that it's okay to ignore constants entirely in algorithmic analysis. Many algorithms that look efficient (say, O(n)) are terribly inefficient in practice due to the huge constant factors that are hidden in the O notation. Determining which algorithm to use is often just as much about the constant factor as it is about the asymptotic runtime, especially if the inputs are known to be small.
  • driftwood (about 10 years)
    That's a great explanation and makes perfect sense. Thanks. But I'm still trying to understand what point the book was making. Any clue??
  • tanmoy (about 10 years)
    @user2722568: please clarify your problem.
  • driftwood (about 10 years)
    If you go back to my original post at the top, I cite a quote from a textbook that has its own explanation about the significance (or insignificance, in this case, according to the book) of the 1/2 factor in O(N^2/2). I'd ideally like to understand what point they are trying to make; right now it makes no sense to me.
  • tanmoy (about 10 years)
    @user2722568: do you want to understand the meaning of the constant factor hidden in asymptotic notations?
  • driftwood (about 10 years)
    No, I already get that. I don't understand this cryptic explanation: "Consider though what the 1/2 means. The actual time to check each value is highly dependent on the machine instruction that the code translates to and then on the speed at which the CPU can execute the instructions. Therefore the 1/2 doesn't mean very much."
  • tanmoy (about 10 years)
    @user2722568: I think it tries to explain the insignificance of the constant factor hidden in asymptotic notations when you are analyzing algorithms. This constant actually depends on statement costs, and statement costs may be different on different machines. But in some special contexts, as given in the example, this constant can be useful.
  • driftwood (about 10 years)
    In other words, compared to the differences between different machines, the constant factor is negligible. In that context the only thing that will give an idea of performance, regardless of what hardware the algorithm is run on, is Big-O analysis without constants. Got it. Thanks.
  • Novaterata (over 6 years)
    You don't know that an O(2n) function will be slower than an O(n) function. The O(n) function could take over twice as long to compute each iteration as the O(2n) function.
  • templatetypedef (over 6 years)
    Yes, absolutely. After all, O(2n) means literally the same thing as O(n). But if we're talking about total runtime from a non-big-O perspective, a true runtime of 2n versus a true runtime of n means the first algorithm will be slower than the second.