Algorithm to calculate the number of 1s for a range of numbers in binary

前端 未结 6 1941
再見小時候
再見小時候 2021-01-30 00:39

So I just got back for the ACM Programing competition and did pretty well but there was one problem that not one team got.

The Problem.

相关标签:
6条回答
  • 2021-01-30 00:48

    Zobgib,

    The key to this problem is not to understand how rapidly the growth of K's pattern grows, but HOW it grows, itself. The first step in this is to understand (as your coach said) how binary numbers count, as this determines everything about how K is determined. Binary numbers follow a pattern that is distinct when counting the number of positive bits. Its a single progressive repetitive pattern. I am going to demonstrate in an unusual way...

    Assume i is an integer value. Assume b is the number of positive bits in i
    i = 1; 
    b = 1; 
    
    i = 2; 3;
    b = 1; 2;
    
    i = 4; 5; 6; 7;
    b = 1; 2; 2; 3;
    
    i = 8; 9; 10; 11; 12; 13; 14; 15;
    b = 1; 2;  2;  3;  2;  3;  3;  4;
    
    i = 16; 17; 18; 19; 20; 21; 22; 23; 24; 25; 26; 27; 28; 29; 30; 31;
    b =  1;  2;  2;  3;  2;  3;  3;  4;  2;  3;  3;  4;  3;  4;  4;  5; 
    
    I assure you, this pattern holds to infinity, but if needed you
    should be able to find or construct a proof easily.
    

    If you look at the data above, you'll notice a distinct pattern related to 2^n. Each time you have an integer exponent of 2, the pattern will reset by including the each term of previous pattern, and then each term of the previous pattern incremented by 1. As such, to get K, you just apply the new number to the pattern above. The key is to find a single expression (that is efficient) to receive your number of bits.

    For demonstration, yet again, you can further extrapolate a new pattern off of this, because it is static and follows the same progression. Below is the original data modified with its K value (based on the recursion).

    Assume i is an integer value. Assume b is the number of positive bits in i
    i = 1; 
    b = 1; 
    K = 1;
    
    i = 2; 3;
    b = 1; 2;
    K = 1; 2;
    
    i = 4; 5; 6; 7;
    b = 1; 2; 2; 3;
    K = 1; 2; 2; 3;
    
    i = 8; 9; 10; 11; 12; 13; 14; 15;
    b = 1; 2;  2;  3;  2;  3;  3;  4;
    K = 1; 2;  2;  3;  2;  3;  3;  2;
    
    i = 16; 17; 18; 19; 20; 21; 22; 23; 24; 25; 26; 27; 28; 29; 30; 31;
    b =  1;  2;  2;  3;  2;  3;  3;  4;  2;  3;  3;  4;  3;  4;  4;  5; 
    K =  1;  2;  2;  3;  2;  3;  3;  2;  2;  3;  3;  2;  3;  2;  2;  3;
    

    If you notice, K follows a similar patterning, with a special condition... Everytime b is a power of 2, it actually lowers the K value by 2. Soooo, if you follow a binary progression, you should be able to easily map your K values. Since this pattern is dependant on powers of 2, and the pattern is dependant upon finding the nearest power of 2 and starting there, I propose the following solution. Take your LOW value and find the nearest power of 2 (p) such that 2^p < LOW. This can be done by "counting the bits" for just the lowest number. Again, once you know which exponent it is, you don't have to count the bits for any other number. You just increment through the pattern and you will have your b and hence K (which is following the same pattern).

    Note: If you are particularly observant, you can use the previous b or K to determine the next. If the current i is odd, add 1 to the previous b. If the current i is divisible by 4, then you decrement b by either 1 or 2, dependent upon whether it's in the first 1/2 of the pattern or second half. And, of course, if i is a power of 2, start over at 1.

    Fuzzical Logic

    Pseudo-code Example (non-Optimized)

    {   var LOW, HIGH
        var power = 0

    //Get Nearest Power Of 2 for (var i = 0 to 60) { // Compare using bitwise AND if (LOW bitAND (2 ^ i) = (2 ^ i)) { if ((2 ^ i) <= LOW) { set power to i } else { // Found the Power: end the for loop set i to 61 } } }

    // Automatically 1 at a Power of 2 set numOfBits to 1 array numbersWithPositiveBits with 64 integers = 0 // Must create the pattern from Power of 2 set foundLOW to false for (var j = (2^power) to HIGH) { set lenOfPatten to (power + 1) // Don't record until we have found the LOW value if ((foundLOW is false) bitAND (j is equal to LOW)) { set foundLOW to true } // If j is odd, increment numOfBits if ((1 bitAND j) is equal to 1) { increment numOfBits } else if (j modulus 4 == 0) { decrement numOfBits accordingly //Figure this one out yourself, please } else if ((j - (2^power)) == (power + 1)) { // We are at the next power increment power // Start pattern over set numOfBits to 1 } // Record if appropriate if (foundLOW is equal to true) { increment element numOfBits in array numbersWithPositiveBits } }

    // From here, derive your K values.

    0 讨论(0)
  • 2021-01-30 00:51

    You can solve this efficiently as follows:

    ret = 0;
    for (i = 1; i <= 64; i++) {
      if (computeK(i) != desiredK) continue;
      ret += numBelow(HIGH, i) - numBelow(LO - 1, i);
    }
    return ret;
    

    The function numBelow(high, numSet) computes the number of integers less than or equal to high and greater than zero that have numSet bits set. To implement numBelow(high, numSet) efficiently, you can use something like the following:

    numBelow(high, numSet) {
      t = floor(lg(high));
      ret = 0;
      if (numBitsSet(high) == numSet) ret++;
      while (numSet > 0 && t > 0) {
        ret += nchoosek(t - 1, numSet);
        numSet--;
        while (--t > 0 && (((1 << t) & high) == 0));
      }
      return ret;
    }
    
    0 讨论(0)
  • 2021-01-30 00:52

    The idea behind this answer can help you develop very fast solution. Having ranges 0..2^N the complexity of a potential algorithm would be O(N) in the worst case (Assuming that complexity of a long arithmetic is O(1)) If programmed correctly it should easily handle N = 1000000 in a matter of milliseconds.

    Imagine we have the following values:

    LO   =          0; (0000000000000000000000000000000)
    HI   = 2147483647; (1111111111111111111111111111111)
    

    The lowest possible N1 in range LO..HI is 0 The highest possible N1 in range LO..HI is 31

    So the computation of N2..NN part is done only for one of 32 values (i.e. 0..31). Which can be done simply, even without a computer. Now lets compute the amount of N1=X for a range of values LO..HI

    When we have X = 0 we have count(N1=X) = 1 this is the following value:

    1   0000000000000000000000000000000
    

    When we have X = 1 we have count(N1=X) = 31 these are the following values:

    01  1000000000000000000000000000000
    02  0100000000000000000000000000000
    03  0010000000000000000000000000000
        ...
    30  0000000000000000000000000000010
    31  0000000000000000000000000000001
    

    When we have X = 2 we have the following pattern:

    1100000000000000000000000000000
    

    How many unique strings can be formed with 29 - '0' and 2 - '1'?

    Imagine the rightmost '1'(#1) is cycling from left to right, we get the following picture:

    01  1100000000000000000000000000000
    02  1010000000000000000000000000000
    03  1001000000000000000000000000000
        ...
    30  1000000000000000000000000000001 
    

    Now we've got 30 unique strings while moving the '1'(#1) from left to right, it is now impossible to create a unique string by moving the '1'(#1) in any direction. This means we should move '1'(#2) to the right, let's also reset the position of '1'(#1) as left as possible remaining uniqueness, we get:

    01  0110000000000000000000000000000
    

    now we do the cycling of '1'(#1) once again

    02  0101000000000000000000000000000
    03  0100100000000000000000000000000
        ...
    29  0100000000000000000000000000001
    

    Now we've got 29 unique strings, continuing this whole operation 28 times we get the following expression

    count(N1=2) = 30 + 29 + 28 + ... + 1 = 465
    

    When we have X = 3 the picture remains similar but we are moving '1'(#1), '1'(#2), '1'(#3)

    Moving the '1'(#1) creates 29 unique strings, when we start moving '1'(#2) we get

    29 + 28 + ... + 1 = 435 unique strings, after that we are left to process '1'(#3) so we have

    29 + 28 + ... + 1 = 435
         28 + ... + 1 = 406
              ...
                  + 1 = 1
    
    435 + 406 + 378 + 351 + 325 + 300 + 276 +
    253 + 231 + 210 + 190 + 171 + 153 + 136 +
    120 + 105 + 091 + 078 + 066 + 055 + 045 +
    036 + 028 + 021 + 015 + 010 + 006 + 003 + 001 = 4495
    

    Let's try to solve the general case i.e. when we have N zeros and M ones. Overall amount of permutations for the string of length (N + M) is equal to (N + M)!

    The amount of '0' duplicates in this string is equal to N! The amount of '1' duplicates in this string is equal to M!

    thus receiving overall amount of unique strings formed of N zeros and M ones is

                  (N + M)!          32!       263130836933693530167218012160000000
    F(N, M) =  ============= => ========== = ====================================== = 4495
                (N!) * (M!)      3! * 29!      6 * 304888344611713860501504000000
    

    Edit:

    F(N, M) = Binomial(N + M, M)
    

    Now let's consider a real life example:

    LO   =   43797207; (0000010100111000100101011010111)
    HI   = 1562866180; (1011101001001110111001000000100)
    

    So how do we apply our unique permutations formula to this example? Since we don't know how many '1' is located below LO and how many '1' is located above HI.

    So lets count these permutations below LO and above HI.

    Lets remember how we cycled '1'(#1), '1'(#2), ...

    1111100000000000000000000000000 => 2080374784
    1111010000000000000000000000000 => 2046820352
    1111001000000000000000000000000 => 2030043136
    1111000000000000000000000000001 => 2013265921
    
    1110110000000000000000000000000 => 1979711488
    1110101000000000000000000000000 => 1962934272
    1110100100000000000000000000000 => 1954545664
    1110100010000000000000000000001 => 1950351361
    

    As you see this cycling process decreases the decimal values smoothly. So we need to count amount of cycles until we reach HI value. But we shouldn't be counting these values by one because the worst case can generate up to 32!/(16!*16!) = 601080390 cycles, which we will be cycling very long :) So we need cycle chunks of '1' at once.

    Having our example we would want to count the amount of cycles of a transformation

    1111100000000000000000000000000 => 1011101000000000000000000000000
                                       1011101001001110111001000000100
    

    So how many cycles causes the transformation

    1111100000000000000000000000000 => 1011101000000000000000000000000
    

    ?

    Lets see, the transformation:

    1111100000000000000000000000000 => 1110110000000000000000000000000
    

    is equal to following set of cycles:

    01  1111100000000000000000000000000
    02  1111010000000000000000000000000
    ...
    27  1111000000000000000000000000001
    28  1110110000000000000000000000000
    

    So we need 28 cycles to transform

    1111100000000000000000000000000 => 1110110000000000000000000000000
    

    How many cycles do we need to transform

    1111100000000000000000000000000 => 1101110000000000000000000000000
    

    performing following moves we need:

    1110110000000000000000000000000 28 cycles
    1110011000000000000000000000000 27 cycles
    1110001100000000000000000000000 26 cycles
    ...
    1110000000000000000000000000011  1 cycle
    

    and 1 cycle for receiving:

    1101110000000000000000000000000  1 cycle
    

    thus receiving 28 + 27 + ... + 1 + 1 = 406 + 1

    but we have seen this value before and it was the result for the amount of unique permutations, which was computed for 2 '1' and 27 '0'. This means that amount of cycles while moving

    11100000000000000000000000000 => 01110000000000000000000000000
    

    is equal to moving

    _1100000000000000000000000000 => _0000000000000000000000000011 
    

    plus one additional cycle

    so this means if we have M zeros and N ones and want to move the chunk of U '1' to the right we will need to perform the following amount of cycles:

          (U - 1 + M)!
    1 + =============== = f(U, M)
         M! * (U - 1)!
    

    Edit:

    f(U, M) = 1 + Binomial(U - 1 + M, M)
    

    Now let's come back to our real life example:

        LO   =   43797207; (0000010100111000100101011010111)
        HI   = 1562866180; (1011101001001110111001000000100)
    

    so what we want to do is count the amount cycles needed to perform the following transformations (suppose N1 = 6)

    1111110000000000000000000000000 => 1011101001000000000000000000000
                                       1011101001001110111001000000100
    

    this is equal to:

    1011101001000000000000000000000    1011101001000000000000000000000
    -------------------------------    -------------------------------
    _111110000000000000000000000000 => _011111000000000000000000000000 f(5, 25) = 118756
    _____11000000000000000000000000 => _____01100000000000000000000000 f(2, 24) = 301
    _______100000000000000000000000 => _______010000000000000000000000 f(1, 23) = 24
    ________10000000000000000000000 => ________01000000000000000000000 f(1, 22) = 23
    

    thus resulting 119104 'lost' cycles which are located above HI

    Regarding LO, there is actually no difference in what direction we are cycling so for computing LO we can do reverse cycling:

    0000010100111000100101011010111    0000010100111000100101011010111
    -------------------------------    -------------------------------
    0000000000000000000000000111___ => 0000000000000000000000001110___ f(3, 25) = 2926
    00000000000000000000000011_____ => 00000000000000000000000110_____ f(2, 24) = 301
    

    Thus resulting 3227 'lost' cycles which are located below LO this means that

    overall amount of lost cycles = 119104 + 3227 = 122331
    
    overall amount of all possible cycles = F(6, 25) = 736281
    
    N1 in range 43797207..1562866180 is equal to 736281 - 122331 = 613950
    

    I wont provide the remaining part of the solution. It is not that hard to grasp the remaining part. Good luck!

    0 讨论(0)
  • 2021-01-30 00:54

    I think a key is first understanding the pattern of K values and how rapidly it grows. Basically, you have:

    K(1) = 0
    K(X) = K(bitcount(X))+1 for X > 1
    

    So finding the smallest X values for a given K we see

    K(1) = 0
    K(2) = 1
    K(3) = 2
    K(7) = 3
    K(127) = 4
    K(170141183460469231731687303715884105727) = 5
    

    So for an example like 48238 10^18 9 the answer is trivially 0. K=0 only for 1, and K=1 only for powers of 2, so in the range of interest, we'll pretty much only see K values of 2, 3 or 4, and never see K >= 5

    edit

    Ok, so we're looking for an algorithm to count the number of values with K=2,3,4 in a range of value LO..HI without iterating over the entire range. So the first step is to find the number of values in the range with bitcount(x)==i for i = 1..59 (since we only care about values up to 10^18 and 10^18 < 2^60). So break down the range lo..hi into subranges that are a power of 2 size and differ only in their lower n bits -- a range of the form x*(2^n)..(x+1)*(2^n)-1. We can break down the arbitray lo..hi range into such subranges easily. For each such subrange there will be choose(n, i) values with i+bitcount(x) set bits. So we just add all the subranges together to get a vector of counts for 1..59, which we then iterate over, adding together those elements with the same K value to get our answer.

    edit (fixed again to be be C89 compatible and work for lo=1/k=0)

    Here's a C program to do what I previously described:

    #include <stdio.h>
    #include <string.h>
    #include <assert.h>
    
    int bitcount(long long x) {
        int rv = 0;
        while(x) { rv++; x &= x-1; }
        return rv; }
    
    long long choose(long long m, long long n) {
        long long rv = 1;
        int i;
        for (i = 0; i < n; i++) {
            rv *= m-i;
            rv /= i+1; }
        return rv; }
    
    void bitcounts_p2range(long long *counts, long long base, int l2range) {
        int i;
        assert((base & ((1LL << l2range) - 1)) == 0);
        counts += bitcount(base);
        for (i = 0; i <= l2range; i++)
            counts[i] += choose(l2range, i); }
    
    void bitcounts_range(long long *counts, long long lo, long long hi) {
        int l2range = 0;
        while (lo + (1LL << l2range) - 1 <= hi) {
            if (lo & (1LL << l2range)) {
                bitcounts_p2range(counts, lo, l2range);
                lo += 1LL << l2range; }
            l2range++; }
        while (l2range >= 0) {
            if (lo + (1LL << l2range) - 1 <= hi) {
                bitcounts_p2range(counts, lo, l2range);
                lo += 1LL << l2range; }
            l2range--; }
        assert(lo == hi+1); }
    
    int K(int x) {
        int rv = 0;
        while(x > 1) {
            x = bitcount(x);
            rv++; }
        return rv; }
    
    int main() {
        long long counts[64];
        long long lo, hi, total;
        int i, k;
        while (scanf("%lld%lld%d", &lo, &hi, &k) == 3) {
            if (lo < 1 || lo > hi || k < 0) break;
            if (lo == 0 || hi == 0 || k == 0) break;
            total = 0;
            if (lo == 1) {
                lo++;
                if (k == 0) total++; }
            memset(counts, 0, sizeof(counts));
            bitcounts_range(counts, lo, hi);
            for (i = 1; i < 64; i++)
                if (K(i)+1 == k)
                    total += counts[i];
            printf("%lld\n", total); }
        return 0; }
    

    which runs just fine for values up to 2^63-1 (LONGLONG_MAX). For 48238 1000000000000000000 3 it gives 513162479025364957, which certainly seems plausible

    edit

    giving the inputs of

    48238 1000000000000000000 1
    48238 1000000000000000000 2
    48238 1000000000000000000 3
    48238 1000000000000000000 4
    

    gives outputs of

    44
    87878254941659920
    513162479025364957
    398959266032926842
    

    Those add up to 999999999999951763 which is correct. The value for k=1 is correct (there are 44 powers of two in that range 2^16 up to 2^59). So while I'm not sure the other 3 values are correct, they're certainly plausible.

    0 讨论(0)
  • 2021-01-30 00:59

    I think it's a problem in Discrete mathematics,
    assuming LOW is 0,
    otherwise we can insert a function for summing numbers below LOW,
    from numbers shown i understand the longest number will consist up to 60 binary digit at most

    alg(HIGH,k)
    
    l=len(HIGH)
    sum=0;
    
    for(i=0;i<l;i++)
    {
     count=(l choose i); 
     nwia=numbers_with_i_above(i,HIGH); 
     if canreach(i,k) sum+=(count-nwia);
    }
    


    all the numbers appear
    non is listed twice
    numbers_with_i_above is trivial
    canreach with numbers up to 60 is easy
    len is it length of a binary represention

    0 讨论(0)
  • 2021-01-30 01:02

    This is a full working example with c++17

    #include <bits/stdc++.h>
    
    using namespace std;
    
    #define BASE_MAX 61
    
    typedef unsigned long long ll;
    
    ll combination[BASE_MAX][BASE_MAX];
    
    vector<vector<ll>> NK(4);
    
    int count_bit(ll n) {
        int ret = 0;
        while (n) {
            if (n & 1) {
                ret++;
            }
            n >>= 1;
        }
    
        return ret;
    }
    
    int get_leftmost_bit_index(ll n) {
        int ret = 0;
        while (n > 1) {
            ret++;
            n >>= 1;
        }
        return ret;
    }
    
    void pre_calculate() {
        for (int i = 0; i < BASE_MAX; i++)
            combination[i][0] = 1;
        for (int i = 1; i < BASE_MAX; i++) {
            for (int j = 1; j < BASE_MAX; j++) {
                combination[i][j] = combination[i - 1][j] + combination[i - 1][j - 1];
            }
        }
    
        NK[0].push_back(1);
        for (int i = 2; i < BASE_MAX; i++) {
            int bitCount = count_bit(i);
            if (find(NK[0].begin(), NK[0].end(), bitCount) != NK[0].end()) {
                NK[1].push_back(i);
            }
        }
        for (int i = 1; i < BASE_MAX; i++) {
            int bitCount = count_bit(i);
            if (find(NK[1].begin(), NK[1].end(), bitCount) != NK[1].end()) {
                NK[2].push_back(i);
            }
        }
        for (int i = 1; i < BASE_MAX; i++) {
            int bitCount = count_bit(i);
            if (find(NK[2].begin(), NK[2].end(), bitCount) != NK[2].end()) {
                NK[3].push_back(i);
            }
        }
    }
    
    
    ll how_many_numbers_have_n_bit_in_range(ll lo, ll hi, int bit_count) {
        if (bit_count == 0) {
            if (lo == 0) return 1;
            else return 0;
        }
    
        if (lo == hi) {
            return count_bit(lo) == bit_count;
        }
    
        int lo_leftmost = get_leftmost_bit_index(lo); // 100 -> 2
        int hi_leftmost = get_leftmost_bit_index(hi); // 1101 -> 3
    
        if (lo_leftmost == hi_leftmost) {
            return how_many_numbers_have_n_bit_in_range(lo & ~(1LL << lo_leftmost), hi & ~(1LL << hi_leftmost),
                                                        bit_count - 1);
        }
    
        if (lo != 0) {
            return how_many_numbers_have_n_bit_in_range(0, hi, bit_count) -
                   how_many_numbers_have_n_bit_in_range(0, lo - 1, bit_count);
        }
    
        ll ret = combination[hi_leftmost][bit_count];
    
        ret += how_many_numbers_have_n_bit_in_range(1LL << hi_leftmost, hi, bit_count);
    
        return ret;
    }
    
    int main(void) {
        pre_calculate();
    
        while (true) {
            ll LO, HI;
            int X;
            scanf("%lld%lld%d", &LO, &HI, &X);
            if (LO == 0 && HI == 0 && X == 0)
                break;
    
            switch (X) {
                case 0:
                    cout << (LO == 1) << endl;
                    break;
                case 1: {
                    int ret = 0;
                    ll power2 = 1;
                    for (int i = 0; i < BASE_MAX; i++) {
                        power2 *= 2;
    
                        if (power2 > HI)
                            break;
                        if (power2 >= LO)
                            ret++;
                    }
                    cout << ret << endl;
                    break;
                }
                case 2:
                case 3:
                case 4: {
                    vector<ll> &addedBitsSizes = NK[X - 1];
    
                    ll ret = 0;
    
                    for (auto bit_count_to_added: addedBitsSizes) {
                        ll result = how_many_numbers_have_n_bit_in_range(LO, HI, bit_count_to_added);
                        ret += result;
                    }
    
                    cout << ret << endl;
                    break;
                }
                default:
                    cout << 0 << endl;
                    break;
            }
        }
    
        return 0;
    }
    
    0 讨论(0)
提交回复
热议问题