Cuckoo hashing in C

傲寒 2021-02-19 01:15

Does anybody have an implementation of Cuckoo hashing in C? If there were an open-source, non-GPL version, that would be perfect!

Since Adam mentioned it in his comment, any

8 answers
  •  攒了一身酷
    2021-02-19 02:08

    Following a comment from "onebyone", I've implemented and tested a couple of versions of Cuckoo hashing to determine the real memory requirement.

    After some experiments, the claim that you don't have to rehash until the table is almost 50% full seems to hold, especially if the "stash" trick is implemented.
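
    For readers who have not seen the "stash" trick, here is a minimal sketch of the idea, not the code used in the experiments above. The names (`cuckoo_t`, `cuckoo_insert`, `hash_slot`), the fixed sizes, the two multiplicative hash functions, and the reserved `EMPTY` key are all assumptions made for illustration. Keys that cannot be placed after `MAX_KICKS` displacements go into a small overflow array instead of forcing an immediate rehash.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    #define SLOTS      16          /* slots per table (hypothetical size) */
    #define MAX_KICKS  32          /* displacement limit before using the stash */
    #define STASH_SIZE 4           /* small overflow area, the "stash" */
    #define EMPTY      UINT32_MAX  /* reserved value: keys must never equal it */

    typedef struct {
        uint32_t table[2][SLOTS];
        uint32_t stash[STASH_SIZE];
        size_t   stash_used;
    } cuckoo_t;

    /* Two illustrative multiplicative hashes; not tuned for quality. */
    static size_t hash_slot(int which, uint32_t key)
    {
        uint32_t h = key * (which ? 0x9E3779B1u : 0x85EBCA6Bu);
        return (h >> 16) % SLOTS;
    }

    static void cuckoo_init(cuckoo_t *c)
    {
        for (int t = 0; t < 2; t++)
            for (size_t i = 0; i < SLOTS; i++)
                c->table[t][i] = EMPTY;
        c->stash_used = 0;
    }

    /* At most two table probes plus a scan of the (small) stash. */
    static int cuckoo_contains(const cuckoo_t *c, uint32_t key)
    {
        if (c->table[0][hash_slot(0, key)] == key) return 1;
        if (c->table[1][hash_slot(1, key)] == key) return 1;
        for (size_t i = 0; i < c->stash_used; i++)
            if (c->stash[i] == key) return 1;
        return 0;
    }

    /* Returns 1 on success. Returns 0 when the table must be enlarged and
     * rehashed; *pending then holds the one key that is no longer stored. */
    static int cuckoo_insert(cuckoo_t *c, uint32_t key, uint32_t *pending)
    {
        if (cuckoo_contains(c, key)) return 1;   /* already present */
        int t = 0;
        for (int kick = 0; kick < MAX_KICKS; kick++) {
            size_t i = hash_slot(t, key);
            uint32_t evicted = c->table[t][i];
            c->table[t][i] = key;                /* claim the slot */
            if (evicted == EMPTY) return 1;
            key = evicted;                       /* re-home the displaced key */
            t = !t;                              /* in its other table */
        }
        if (c->stash_used < STASH_SIZE) {        /* stash instead of rehashing */
            c->stash[c->stash_used++] = key;
            return 1;
        }
        *pending = key;
        return 0;
    }
    ```

    A caller would treat a return value of 0 as the signal to allocate a larger table and reinsert every stored key plus the one returned in `pending`.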

    The problem is when you enlarge the table. The usual approach is to double its size but this leads to the new table being only 25% utilized!

    For example, assume the hash table has 16 slots. When I insert the 8th element, I run out of good slots and have to rehash. I double the table, and now it has 32 slots with only 8 of them occupied, which is a 75% waste!

    This is the price to pay for a "constant" retrieval time (in terms of an upper bound on the number of accesses/comparisons).

    I've devised a different scheme, though. Start from a power of 2 greater than 1; if the table has n slots and n is a power of two, add n/2 slots, otherwise add n/3 slots:

    +--+--+
    |  |  |                             2 slots
    +--+--+
    
    +--+--+--+
    |  |  |  |                          3 slots
    +--+--+--+ 
    
    +--+--+--+--+
    |  |  |  |  |                       4 slots
    +--+--+--+--+
    
    +--+--+--+--+--+--+
    |  |  |  |  |  |  |                 6 slots
    +--+--+--+--+--+--+
    
    +--+--+--+--+--+--+--+--+
    |  |  |  |  |  |  |  |  |           8 slots
    +--+--+--+--+--+--+--+--+
    

    etc.
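
    The growth rule pictured above could be written roughly as follows. This is a sketch; the function names `is_power_of_two` and `next_table_size` are made up for illustration.

    ```c
    #include <stddef.h>

    /* Returns 1 if n is a power of two, 0 otherwise (0 is not a power of two). */
    static int is_power_of_two(size_t n)
    {
        return n != 0 && (n & (n - 1)) == 0;
    }

    /* Next table size: a power of two grows by n/2, any other size by n/3.
     * Assumes the sequence started at a power of two >= 2, so every
     * non-power-of-two size has the form 3 * 2^k and divides evenly by 3. */
    static size_t next_table_size(size_t n)
    {
        return is_power_of_two(n) ? n + n / 2 : n + n / 3;
    }
    ```

    Starting from 2, repeated calls give 3, 4, 6, 8, 12, 16, 24, and so on, so the size alternates between a power of two and 1.5 times a power of two.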

    Combined with the assumption that rehashing only occurs when the table is 50% full, this means that in the worst case the table is about 66% empty (one third full) right after a rehash, rather than 75% empty (one quarter full) as with doubling.
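
    These figures can be checked with a few lines of arithmetic. The helper below is only an illustration (the name `load_after_growth` is hypothetical) and assumes that growth is triggered exactly at 50% load.

    ```c
    #include <stddef.h>

    /* Fraction of slots in use right after growing from old_size to new_size,
     * assuming the rehash was triggered with old_size / 2 items stored. */
    static double load_after_growth(size_t old_size, size_t new_size)
    {
        return (double)(old_size / 2) / (double)new_size;
    }
    ```

    Doubling from 16 to 32 slots gives 8/32, one quarter full. Growing from 16 to 24 gives 8/24, one third full, which is the worst case of the n/2, n/3 scheme. The other step, from 12 to 16, gives 6/16, or 37.5% full.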

    I've also figured out (but I still need to check the math) that when enlarging the table by sqrt(n) slots each time, the wasted space asymptotically approaches 50%.
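
    A rough numerical check of that claim, under the same assumption that growth is triggered at 50% load: after growing, the table holds n/2 items in n + sqrt(n) slots, and that ratio tends to 1/2 as n grows. The helper names `isqrt` and `load_after_sqrt_growth` are hypothetical, and an integer square root is used so no math library is needed.

    ```c
    #include <stddef.h>

    /* Integer square root: the largest r such that r * r <= n. */
    static size_t isqrt(size_t n)
    {
        size_t r = 0;
        while ((r + 1) * (r + 1) <= n)
            r++;
        return r;
    }

    /* Load right after growing an n-slot table by sqrt(n) slots,
     * assuming n / 2 items were stored when growth was triggered. */
    static double load_after_sqrt_growth(size_t n)
    {
        return (double)(n / 2) / (double)(n + isqrt(n));
    }
    ```

    For n = 16 this gives 8/20, or 40% full. For n = 1000000 it gives 500000/1001000, just under 50% full, so the wasted fraction does appear to approach one half from above.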

    Of course, the price to pay for lower memory consumption is an increase in the number of rehashes that will be needed in the end. Alas, nothing comes for free.

    I'm going to investigate further if anyone is interested.
