问题
First, define two integers N
and K
, where N >= K
, both known at compile time. For example: N = 8
and K = 3
.
Next, define a set of integers [0, N)
(or [1, N]
if that makes the answer simpler) and call it S
. For example: {0, 1, 2, 3, 4, 5, 6, 7}
The number of subsets of S
with K
elements is given by the formula C(N, K)
. Example
My problem is this: Create a perfect minimal hash for those subsets. The size of the example hash table will be C(8, 3)
or 56
.
I don't care about ordering, only that there be 56 entries in the hash table, and that I can determine the hash quickly from a set of K
integers. I also don't care about reversibility.
Example hash: hash({5, 2, 3}) = 42
. (The number 42 isn't important, at least not here)
Is there a generic algorithm for this that will work with any values of N
and K
? I wasn't able to find one by searching Google, or my own naive efforts.
回答1:
There is an algorithm to code and decode a combination into its number in the lexicographical order of all combinations with a given fixed K
. The algorithm is linear to N
for both code and decode of the combination. What language are you interested in?
EDIT: here is example code in c++(it founds the lexicographical number of a combination in the sequence of all combinations of n elements as opposed to the ones with k
elements but is really good starting point):
typedef long long ll;
// Returns the number in the lexicographical order of all combinations of n numbers
// of the provided combination.
ll code(vector<int> a,int n)
{
sort(a.begin(),a.end());
int cur = 0;
int m = a.size();
ll res =0;
for(int i=0;i<a.size();i++)
{
if(a[i] == cur+1)
{
res++;
cur = a[i];
continue;
}
else
{
res++;
int number_of_greater_nums = n - a[i];
for(int j = a[i]-1,increment=1;j>cur;j--,increment++)
res += 1LL << (number_of_greater_nums+increment);
cur = a[i];
}
}
return res;
}
// Takes the lexicographical code of a combination of n numbers and returns the
// combination
vector<int> decode(ll kod, int n)
{
vector<int> res;
int cur = 0;
int left = n; // Out of how many numbers are we left to choose.
while(kod)
{
ll all = 1LL << left;// how many are the total combinations
for(int i=n;i>=0;i--)
{
if(all - (1LL << (n-i+1)) +1 <= kod)
{
res.push_back(i);
left = n-i;
kod -= all - (1LL << (n-i+1)) +1;
break;
}
}
}
return res;
}
I am sorry I have an algorithm for the problem you are asking for right now, but I believe it will be a good exercise to try to understand what I do above. Truth is this is one of the algorithms I teach in the course "Design and analysis of algorithms" and that is why I had it pre-written.
回答2:
This is what you (and I) need:
hash()
maps k-tuples
from [1..n]
onto the set 1..C(n,k)\subset N
.
The effort is k
subtractions (and O(k)
is a lower bound anyway, see Strandjev's remark above):
// bino[n][k] is (n "over" k) = C(n,k) = {n \choose k}
// these are assumed to be precomputed globals
int hash(V a,int n, int k) {// V is assumed to be ordered, a_k<...<a_1
// hash(a_k,..,a_2,a_1) = (n k) - sum_(i=1)^k (n-a_i i)
// ii is "inverse i", runs from left to right
int res = bino[n][k];
int i;
for(unsigned int ii = 0; ii < a.size(); ++ii) {
i = a.size() - ii;
res = res - bino[n-a[ii]][i];
}
return res;
}
来源:https://stackoverflow.com/questions/14160942/perfect-minimal-hash-for-mathematical-combinations