What's Up with O(1)?

既然无缘 2020-12-22 17:40

I have been noticing some very strange usage of O(1) in discussion of algorithms involving hashing and types of search, often in the context of using a dictionary type provided by the language.

13 Answers
  • 2020-12-22 18:13

    Hash table implementations are in practice not "exactly" O(1) in use; if you test one, you'll find that lookups average around 1.5 probes to find a given key across a large dataset

    (due to the fact that collisions DO occur, and upon colliding, a different location must be assigned).

    Also, in practice, HashMaps are backed by arrays with an initial size that is "grown" to double the size when the table reaches roughly 70% fullness, which gives a relatively good addressing space. Above 70% fullness, collision rates grow quickly.
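
    To make that concrete, here is a minimal sketch (not how any real dict/HashMap is actually implemented; the linear probing, the 0.70 threshold, and all names are assumptions chosen for illustration) of an open-addressing table that counts probes per lookup and doubles its backing array at ~70% fullness:

        import random

        class ProbeCountingTable:
            """Toy open-addressing hash table (linear probing) that counts probes.

            Illustration only -- real dict/HashMap implementations differ, and the
            0.70 resize threshold is simply the figure quoted above.
            """

            def __init__(self, capacity=8, max_load=0.70):
                self.capacity = capacity
                self.max_load = max_load
                self.slots = [None] * capacity      # each slot: (key, value) or None
                self.count = 0

            def _resize(self):
                old = [s for s in self.slots if s is not None]
                self.capacity *= 2                  # "grown" to double the size
                self.slots = [None] * self.capacity
                self.count = 0
                for k, v in old:
                    self.put(k, v)

            def put(self, key, value):
                if (self.count + 1) / self.capacity > self.max_load:
                    self._resize()
                i = hash(key) % self.capacity
                while self.slots[i] is not None and self.slots[i][0] != key:
                    i = (i + 1) % self.capacity     # collision: try the next location
                if self.slots[i] is None:
                    self.count += 1
                self.slots[i] = (key, value)

            def probes_for(self, key):
                """Return how many slots were inspected to find `key`."""
                i = hash(key) % self.capacity
                probes = 1
                while self.slots[i] is not None and self.slots[i][0] != key:
                    i = (i + 1) % self.capacity
                    probes += 1
                return probes

        # The average probe count across a large dataset settles around a small
        # constant (~1.x per lookup) rather than exactly 1.
        table = ProbeCountingTable()
        keys = [random.getrandbits(64) for _ in range(100_000)]
        for k in keys:
            table.put(k, None)
        print(sum(table.probes_for(k) for k in keys) / len(keys))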

    Big-O theory states that whether you have an O(1) algorithm, or even an O(2) algorithm, the critical factor is the relationship between the size of the input set and the number of steps to insert or fetch one of its items. O(2) is still constant time, so we just write it as O(1), because it means more or less the same thing.
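
    A rough way to see what "constant" means is to time lookups at several sizes. This quick sketch (exact numbers will vary by machine; it uses Python's built-in dict purely as an example of a hash-backed dictionary type) shows dict lookups staying roughly flat while a linear list scan grows with n:

        import timeit

        # Per-lookup cost of a hash-backed dict stays roughly flat as n grows
        # (that is the real meaning of O(1), whether a lookup is "1 step" or "2"),
        # while a linear scan through a list grows roughly in proportion to n: O(n).
        for n in (1_000, 10_000, 100_000, 1_000_000):
            d = {i: None for i in range(n)}
            lst = list(range(n))
            probe = n - 1                     # worst case for the list scan
            reps = 100
            dict_us = timeit.timeit(lambda: probe in d, number=reps) / reps * 1e6
            list_us = timeit.timeit(lambda: probe in lst, number=reps) / reps * 1e6
            print(f"n={n:>9,}  dict: {dict_us:7.3f} µs/lookup  list: {list_us:10.1f} µs/lookup")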

    In reality, there is only one way to have a "perfect" hash table with O(1), and that requires:

    1. A global perfect hash key generator.
    2. An unbounded addressing space.

    (Exception case: if you can compute in advance all the permutations of permitted keys for the system, and your target backing store's address space is defined to be exactly the size needed to hold every permitted key, then you can have a perfect hash, but it's a "domain-limited" perfection, sketched below.)
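
    Here is a sketch of that "domain-limited" case (the key set, the salted-hash scheme, and every name below are invented purely for illustration): because the full set of permitted keys is known up front, we can brute-force a salt whose hash is collision-free over exactly that domain, giving one hash computation and one slot per lookup:

        # A "domain-limited" perfect hash: the full set of permitted keys is known
        # up front, so we can brute-force a salt for which no two permitted keys
        # collide in a table exactly big enough to hold them.
        PERMITTED_KEYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]
        TABLE_SIZE = len(PERMITTED_KEYS)

        def find_perfect_salt(keys, size):
            """Search for a salt whose salted hash is collision-free over `keys`."""
            salt = 0
            while True:
                slots = {hash((salt, k)) % size for k in keys}
                if len(slots) == len(keys):   # injective over the domain: perfect
                    return salt
                salt += 1

        SALT = find_perfect_salt(PERMITTED_KEYS, TABLE_SIZE)

        def perfect_slot(key):
            """One hash computation, one slot, no probing: O(1) lookups."""
            return hash((SALT, key)) % TABLE_SIZE

        table = [None] * TABLE_SIZE
        for k in PERMITTED_KEYS:
            table[perfect_slot(k)] = k.upper()    # store a value for each key

        print(perfect_slot("wed"), table[perfect_slot("wed")])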

    Given a fixed memory allocation, it is not plausible in the least to have this, because it would assume that you have some magical way to pack an infinite amount of data into a fixed amount of space with no loss of data, and that's logistically impossible.

    So, in retrospect, getting O(1.5), which is still constant time, in a finite amount of memory with even a relatively naïve hash key generator, is pretty damn awesome.

    Suffixory note: I use O(1.5) and O(2) here. These don't actually exist in Big-O. They are merely what people who don't know Big-O assume is the rationale.

    If something takes 1.5 steps to find a key, or 2 steps to find that key, or 1 step to find that key, but the number of steps never exceeds 2 and whether it takes 1 step or 2 is completely random, then it is still Big-O O(1). This is because no matter how many items you add to the dataset, it still stays under 2 steps. If for all tables > 500 keys it takes 2 steps, then you can assume those 2 steps are in fact one step with 2 parts, ... which is still O(1).
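
    In Big-O notation, that argument is just the formal definition with the constant set to 2 (a sketch using the standard definition):

        f(n) \in O(g(n)) \iff \exists\, c > 0,\ n_0 \ge 0 \ \text{s.t.}\ f(n) \le c \cdot g(n) \ \text{for all}\ n \ge n_0

        \text{Take } g(n) = 1 \text{ and } f(n) = \mathrm{steps}(n) \le 2 \text{ for all } n:\ \text{then } c = 2,\ n_0 = 0
        \ \Rightarrow\ \mathrm{steps}(n) \in O(1).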

    If you can't make this assumption, then you're not thinking in Big-O terms at all, because then you must use a number that represents the total count of finite computational steps required to do everything, and "one step" is meaningless to you. Just get it into your head that there is NO direct correlation between Big-O and the number of execution cycles involved.
