Assume we have a integer of bitsize n=4;
The problem I am describing is how you would go about indexing a number to an array position based on the Hamming weight and its value knowing the bitsize
. E.g.
An array with 16 elements for bitsize 4 would/could look like this:
|0|1|2|4|8|3|5|6|9|10|12|7|11|13|14|15|
Where elements are grouped by their Hamming weight(necessary) and sorted based on size(not necessary). Sorting is not necessary as long as you can take e.g. 3(0011) do some operations and get back index 5, 5(0101) -> 6 etc.
All combinations of n
bits will be present and there will be no duplication. E.g.
bitsize of 3
would have the array:
|0|1|2|4|3|5|6|7|
I would preferably have a solution without loops. Or any papers that discuss simillar solutions. Or finally just throw out any ides on how you could go about doing that.
Note that you can enumerate numbers (in counting order) with the same hamming weight using the following functions:
int next(int n) { // get the next one with same # of bits set
int lo = n & -n; // lowest one bit
int lz = (n + lo) & ~n; // lowest zero bit above lo
n |= lz; // add lz to the set
n &= ~(lz - 1); // reset bits below lz
n |= (lz / lo / 2) - 1; // put back right number of bits at end
return n;
}
int prev(int n) { // get the prev one with same # of bits set
int y = ~n;
y &= -y; // lowest zero bit
n &= ~(y-1); // reset all bits below y
int z = n & -n; // lowest set bit
n &= ~z; // clear z bit
n |= (z - z / (2*y)); // add requried number of bits below z
return n;
}
As an example, repititive application of prev() on x = 5678:
0: 00000001011000101110 (5678)
1: 00000001011000101101 (5677)
2: 00000001011000101011 (5675)
3: 00000001011000100111 (5671)
4: 00000001011000011110 (5662)
5: 00000001011000011101 (5661)
6: 00000001011000011011 (5659)
.....
Hence theoretically you can compute the index of a number by repititive application of this. However this can take very long. The better approach would be to "jump" over some combinations.
There are 2 rules:
1. if the number starts with: ..XXX10..01..1 we can replace it by ..XXX0..01..1
adding corresponding number of combinations
2. if the number starts with: ..XXX1..10..0 again replace it by XXX0..01..1 with corresponding number of combinations
The following algorithm computes the index of a number among the numbers with the same Hamming weight (i did not bother about fast implementation of binomial):
#define LOG2(x) (__builtin_ffs(x)-1)
int C(int n, int k) { // simple implementation of binomial
int c = n - k;
if(k < c)
std::swap(k,c);
if(c == 0)
return 1;
if(k == n-1)
return n;
int b = k+1;
for(int i = k+2; i <= n; i++)
b = b*i;
for(int i = 2; i <= c; i++)
b = b / i;
return b;
}
int position_jumping(unsigned x) {
int index = 0;
while(1) {
if(x & 1) { // rule 1: x is of the form: ..XXX10..01..1
unsigned y = ~x;
unsigned lo = y & -y; // lowest zero bit
unsigned xz = x & ~(lo-1); // reset all bits below lo
unsigned lz = xz & -xz; // lowest one bit after lo
if(lz == 0) // we are in the first position!
return index;
int nn = LOG2(lz), kk = LOG2(lo)+1;
index += C(nn, kk); // C(n-1,k) where n = log lz and k = log lo + 1
x &= ~lz; //! clear lz bit
x |= lo; //! add lo
} else { // rule 2: x is of the form: ..XXX1..10..0
int lo = x & -x; // lowest set bit
int lz = (x + lo) & ~x; // lowest zero bit above lo
x &= ~(lz-1); // clear all bits below lz
int sh = lz / lo;
if(lz == 0) // special case meaning that lo is in the last position
sh=((1<<31) / lo)*2;
x |= sh-1;
int nn = LOG2(lz), kk = LOG2(sh);
if(nn == 0)
nn = 32;
index += C(nn, kk);
}
std::cout << "x: " << std::bitset<20>(x).to_string() << "; pos: " << index << "\n";
}
}
For example, given the number x=5678 the algorithm will compute its index in just 4 iterations:
x: 00000001011000100111; pos: 4
x: 00000001011000001111; pos: 9
x: 00000001010000011111; pos: 135
x: 00000001000000111111; pos: 345
x: 00000000000001111111; pos: 1137
Note that 1137 is the position of 5678 within the group of numbers with the same Hamming weight. Hence you would have to shift this index accordingly to account for all the numbers with smaller Hamming weights
Here is a concept work, just to get the discussion started.
The step one is hardest - solved using approximation to calculate factorials.
Anymore bright ideas?
#include <stdio.h>
#include <math.h>
//gamma function using Lanczos approximation formula
//output result in log base e
//use exp() to convert back
//has a nice side effect: can store large values in small [power of e] form
double logGamma(double x)
{
double tmp = (x-0.5) * log(x+4.5) - (x+4.5);
double ser = 1.0 + 76.18009173 / (x+0) - 86.50532033 / (x+1)
+ 24.01409822 / (x+2) - 1.231739516 / (x+3)
+ 0.00120858003 / (x+4) - 0.00000536382 / (x+5);
return tmp + log(ser * sqrt(2*M_PI) );
}
//result from logGamma() are actually (n-1)!
double combination(int n, int r)
{
return exp(logGamma(n+1)-( logGamma(r+1) + logGamma(n-r+1) ));
}
//primitive hamming weight counter
int hWeight(int x)
{
int count, y;
for (count=0, y=x; y; count++)
y &= y-1;
return count;
}
//-------------------------------------------------------------------------------------
//recursively find the previous group's "hamming weight member count" and sum them
int rCummGroupCount(int bitsize, int hw)
{
if (hw <= 0 || hw == bitsize)
return 1;
else
return round(combination(bitsize, hw)) + rCummGroupCount(bitsize,hw-1);
}
//-------------------------------------------------------------------------------------
int main(int argc, char* argv[])
{
int bitsize = 4, integer = 14;
int hw = hWeight(integer);
int groupStartIndex = rCummGroupCount(bitsize,hw-1);
printf("bitsize: %d\n", bitsize);
printf("integer: %d hamming weight: %d\n", integer, hw);
printf("group start index: %d\n", groupStartIndex);
}
output:
bitsize: 4
integer: 14 hamming weight: 3
group start index: 11
来源:https://stackoverflow.com/questions/13542794/hamming-weight-based-indexing