Before reviewing Big-Oh notation, let's start with something simpler that you might not have seen before: Big-Ell notation.
Big-Ell is an imprecise abstraction of natural numbers less than or equal to another number, hence the L. It's defined as follows:
where and are natural numbers. That is, represents all the natural numbers less than or equal to . For example, .
Could you do arithmetic with Big-Ell? For example, what would be? It's not a well-posed question, to be honest: addition is an operation we think of being defined on integers, not sets of integers. But a reasonable interpretation of could be doing the addition for each element in the set, yielding . Note that is a proper subset of , and the latter is . So we could say that . We could even say that , but it's not tight: the former subset relation included the fewest possible extra elements, whereas the latter was loose by needlessly including extra.
For more about Big Ell, see Concrete Mathematics, chapter 9, 1989, by Graham, Knuth, and Patashnik.
If you understand Big-Ell, and you understand functional programming, here's some good news: you can easily understand Big-Oh.
Let's build up the definition of Big-Oh in a few steps. We'll start with version 1, which we'll write as .
represents any natural number that is less than or equal to a natural number .
represents any natural function that is less than or equal to a natural function .
A natural function is just a function on natural numbers; that is, its type is .
All we do with is upgrade from natural numbers to natural functions. So Big-Oh version 1 is just the higher-order version of Big-Ell. How about that!
Of course, we need to work out what it means for a function to be less than another function. Here's a reasonable formalization:
Big-Oh Version 1:
For example, consider the function that doubles its input. In math
textbooks, that function might be written as . In OCaml
we would write
let g n = 2 * n or
let g = fun n -> 2 * n or just
fun n -> 2 * n. In math that same anonymous function
would be written with lambda notation as .
Proceeding with lambda notation, we have:
- , but
Next, recall that in defining algorithmic efficiency, we wanted to ignore constant factors. does not help us with that. We'd really like for all these functions:
to be in .
Toward that end, let's define to ignore constant factors:
Big-Oh Version 2:
That existentially-quantified positive constant lets us "bump up" the function to whatever constant factor we need. For example,
and therefore , because if we take , or , or any larger .
Finally, recall that we don't care about small inputs: we want to THINK BIG when we analyze algorithmic efficiency. It doesn't matter whether the running time of an algorithm happens to be a little faster or a little slower for small inputs. In fact, we could just hardcode a lookup table for those small inputs if the algorithm is too slow on them! What matters really is the performance on big-sized inputs.
Toward that end, let's define to ignore small inputs:
Big-Oh Version 3:
That existentially quantified positive constant lets us "ignore" all inputs of that size or smaller. For example,
and therefore , because if we take and . Note how we get to ignore the fact that is temporarily a little too big at by picking . That's the power of ignoring "small" inputs.
Version 3 is the right definition of Big-Oh. We repeat it here, for real:
That's the final, important version you should know. But don't just memorize it. If you understand the derivation we gave here, you'll be able to recreate it from scratch anytime you need it.
Big-Oh is called an asymptotic upper bound. If , then is at least as efficient as , and might be more efficient.
Big-Oh Notation Warnings
Because it's an upper bound, we can always inflate a Big-Oh statement: for example, if , then also , and , etc. But our goal is always to give tight upper bounds, whether we explicitly say that or not. So when asked what the running time of an algorithm is, you must always give the tightest bound you can with Big-Oh.
Instead of , most authors instead write . They don't really mean applied to . They mean a function parameterized on input but not yet applied. This is badly misleading and generally a result of not understanding anonymous functions. Moral of that story: more people need to study functional programming.
Instead of nearly all authors write . This is a hideous and inexcusable abuse of notation that should never have been allowed and yet has permanently infected the computer science consciousness. The standard defense is that here should be read as "is" not as "equals". That is patently ridiculous, and even those who make that defense usually have the good grace to admit it's nonsense. Sometimes we become stuck with the mistakes of our ancestors. This is one of those times. Be careful of this "one-directional equality" and, if you ever have a chance, teach your (intellectual) children to do better.