Question
I'm trying to create an efficient look-up table in C. I have an integer as a key and a variable length char* as the value.
I've looked at uthash, but this requires a fixed length char* value. If I make this a big number, then I'm using too much memory.
struct my_struct {
    int key;
    char value[10];
    UT_hash_handle hh;
};
Has anyone got any pointers? Any insight greatly appreciated.
Thanks everyone for the answers. I've gone with uthash and defined my own custom struct to accommodate my data.
Answer 1:
Declare the value field as void *value.
This way you can have any type of data as the value, but the responsibility for allocating and freeing it will be delegated to the client code.
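As a rough sketch of what that delegation looks like for string values, the entry below stores a pointer and allocates exactly as many bytes as each string needs. The names entry, entry_new and entry_free are hypothetical, and char * is used in place of void * only because the question stores strings.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type: the value is a pointer, so each entry
   only pays for the bytes its own string needs. */
struct entry {
    int key;
    char *value;   /* heap-allocated copy, owned by the entry */
};

/* Allocate an entry together with a private copy of the string.
   Returns NULL if either allocation fails. */
struct entry *entry_new(int key, const char *value) {
    struct entry *e = malloc(sizeof *e);
    if (e == NULL) {
        return NULL;
    }
    size_t len = strlen(value) + 1;   /* include the terminating NUL */
    e->value = malloc(len);
    if (e->value == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->value, value, len);
    e->key = key;
    return e;
}

/* Release the string first, then the entry itself. */
void entry_free(struct entry *e) {
    if (e != NULL) {
        free(e->value);
        free(e);
    }
}
```

With uthash, the same struct would additionally carry the UT_hash_handle hh member shown in the question; since only the key field is hashed, a pointer-valued payload should work there as well.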
Answer 2:
You first have to think of your collision strategy:
1. Will you have multiple hash functions?
2. Or will you have to use containers inside of the hash table?
We'll pick option 1.
Then you have to choose a nicely distributed hash function. For the example, we'll pick
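int hash_fun(int key, int try, int max) {
    // Reduce the key as unsigned first so that negative keys cannot
    // produce a negative index, then step forward by `try` cells.
    unsigned home = (unsigned)key % (unsigned)max;
    return (int)((home + (unsigned)try) % (unsigned)max);
}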
If you need something better, maybe have a look at the middle-squared method.
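For reference, one possible reading of the mid-square idea is sketched below: square the key and keep the middle bits of the product. The function name hash_mid_square, the choice of the middle 32 bits, and the assumption that max is positive are all illustrative, not something the answer specifies.

```c
#include <stdint.h>

/* Mid-square sketch: square the key, keep bits 16..47 of the 64-bit
   product, then reduce the result to a table index in [0, max).
   Assumes max > 0. */
int hash_mid_square(int key, int max) {
    uint64_t k = (uint32_t)key;        /* treat the key as unsigned */
    uint64_t squared = k * k;          /* fits in 64 bits for any 32-bit k */
    uint32_t middle = (uint32_t)(squared >> 16);
    return (int)(middle % (uint32_t)max);
}
```

This only replaces the key-mixing step; to use it with the probing scheme above, the try offset would still have to be added before the final reduction.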
Then you'll have to decide what a hash table is.
struct hash_table {
    int max;
    int number_of_elements;
    struct my_struct **elements;
};
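The struct above still has to be allocated with every cell empty before anything can be inserted. A minimal sketch of that setup follows; hash_table_create and hash_table_destroy are hypothetical helper names, and the table is assumed not to own the my_struct objects it points to.

```c
#include <stdlib.h>

struct my_struct;   /* defined elsewhere; the table only stores pointers */

struct hash_table {
    int max;
    int number_of_elements;
    struct my_struct **elements;
};

/* Hypothetical helper: allocate a table with `max` empty cells.
   Returns NULL on invalid size or allocation failure. */
struct hash_table *hash_table_create(int max) {
    if (max <= 0) {
        return NULL;
    }
    struct hash_table *t = malloc(sizeof *t);
    if (t == NULL) {
        return NULL;
    }
    t->elements = malloc((size_t)max * sizeof *t->elements);
    if (t->elements == NULL) {
        free(t);
        return NULL;
    }
    for (int i = 0; i < max; i++) {
        t->elements[i] = NULL;   /* a null pointer marks an empty cell */
    }
    t->max = max;
    t->number_of_elements = 0;
    return t;
}

/* Frees the table itself; the stored entries remain the caller's to free. */
void hash_table_destroy(struct hash_table *t) {
    if (t != NULL) {
        free(t->elements);
        free(t);
    }
}
```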
Then, we'll have to define how to insert and to retrieve.
int hash_insert(struct my_struct *data, struct hash_table *hash_table) {
    int try, hash;
    if (hash_table->number_of_elements >= hash_table->max) {
        return 0; // FULL
    }
    // Probe at most max cells; a free one must exist because the table is not full.
    for (try = 0; try < hash_table->max; try++) {
        hash = hash_fun(data->key, try, hash_table->max);
        if (hash_table->elements[hash] == 0) { // empty cell
            hash_table->elements[hash] = data;
            hash_table->number_of_elements++;
            return 1;
        }
    }
    return 0;
}
struct my_struct *hash_retrieve(int key, struct hash_table *hash_table) {
    int try, hash;
    // Bound the loop so that a full table without the key cannot spin forever.
    for (try = 0; try < hash_table->max; try++) {
        hash = hash_fun(key, try, hash_table->max);
        if (hash_table->elements[hash] == 0) {
            return 0; // Nothing found
        }
        if (hash_table->elements[hash]->key == key) {
            return hash_table->elements[hash];
        }
    }
    return 0; // Every cell probed, key not present
}
And lastly, a method to remove:
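int hash_delete(int key, struct hash_table *hash_table) {
    int try, hash, next;
    struct my_struct *moved;
    for (try = 0; try < hash_table->max; try++) {
        hash = hash_fun(key, try, hash_table->max);
        if (hash_table->elements[hash] == 0) {
            return 0; // Nothing found
        }
        if (hash_table->elements[hash]->key == key) {
            hash_table->elements[hash] = 0;
            hash_table->number_of_elements--;
            // Re-insert the cells that follow, up to the next empty one, so that
            // later lookups do not stop early at the cell we just emptied.
            next = (hash + 1) % hash_table->max;
            while (hash_table->elements[next] != 0) {
                moved = hash_table->elements[next];
                hash_table->elements[next] = 0;
                hash_table->number_of_elements--;
                hash_insert(moved, hash_table);
                next = (next + 1) % hash_table->max;
            }
            return 1; // Success
        }
    }
    return 0;
}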
Answer 3:
It really depends on the distribution of your key field. For example, if it's a unique value always between 0 and 255 inclusive, just use key % 256 to select the bucket and you have a perfect hash.
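To illustrate the 0 to 255 case, here is a small sketch of a directly indexed table where no two keys can collide. The names direct_table, direct_put and direct_get are made up for this example, and the values are assumed to outlive the table because only the pointers are stored.

```c
#include <stddef.h>

#define TABLE_SIZE 256

/* One slot per possible key; NULL means "no value stored".
   Static storage starts out as all null pointers. */
static const char *direct_table[TABLE_SIZE];

/* Store a value; returns 0 if the key is outside 0..255. */
int direct_put(int key, const char *value) {
    if (key < 0 || key >= TABLE_SIZE) {
        return 0;
    }
    direct_table[key % TABLE_SIZE] = value;   /* each key maps to its own slot */
    return 1;
}

/* Look a value up; returns NULL if the key is absent or out of range. */
const char *direct_get(int key) {
    if (key < 0 || key >= TABLE_SIZE) {
        return NULL;
    }
    return direct_table[key % TABLE_SIZE];
}
```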
If it's equally distributed across all possible int values, any function which gives you an equally distributed hash value will do (such as the aforementioned key % 256), albeit with multiple values in each bucket.
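When several keys share a bucket, each bucket needs to hold more than one entry. One common way to do that, sketched here under the assumption of singly linked chains, is shown below; node, chain_put and chain_get are hypothetical names. The key is cast to unsigned before the modulo because key % 256 is negative in C for negative keys.

```c
#include <stdlib.h>

#define BUCKETS 256

/* Hypothetical chain node; the value pointer is stored, not copied. */
struct node {
    int key;
    const char *value;
    struct node *next;
};

static struct node *buckets[BUCKETS];   /* all chains start out empty */

/* Map any int, including negative ones, to a bucket in 0..255. */
static unsigned bucket_of(int key) {
    return (unsigned)key % BUCKETS;
}

/* Insert or overwrite; returns 0 only if allocation fails. */
int chain_put(int key, const char *value) {
    unsigned b = bucket_of(key);
    for (struct node *n = buckets[b]; n != NULL; n = n->next) {
        if (n->key == key) {
            n->value = value;            /* key already present: update */
            return 1;
        }
    }
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return 0;
    }
    n->key = key;
    n->value = value;
    n->next = buckets[b];                /* push onto the front of the chain */
    buckets[b] = n;
    return 1;
}

/* Walk the chain for the key's bucket; NULL if the key is absent. */
const char *chain_get(int key) {
    for (struct node *n = buckets[bucket_of(key)]; n != NULL; n = n->next) {
        if (n->key == key) {
            return n->value;
        }
    }
    return NULL;
}
```

Here keys 1 and 257 land in the same bucket, and a lookup tells them apart by comparing the stored key while walking the chain.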
Without knowing the distribution, it's a little hard to talk about efficient hashes.
Source: https://stackoverflow.com/questions/6844739/implement-a-hash-table