Radix Sort Base 16 (Hexadecimals)

予麋鹿 2020-12-21 14:13

I have spent 10+ hours trying to sort the following (hexadecimals) with an LSD radix sort, but to no avail. There is very little material on this subject on the web.

4 Answers
  • 2020-12-21 14:42

    There's a simpler way to implement a radix sort. After finding max, find the lowest power of 16 that is greater than the max value. This can be done by repeating max >>= 4 in a loop and incrementing a counter x on each iteration; when max reaches zero, 16 to the power x is greater than the original max value, and x is the number of passes needed. For example, a max of 0xffff needs 4 radix sort passes, while a max of 0xffffffff needs 8.
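
    The pass count described above could be computed roughly like this. It is only a sketch: the function name hex_passes is made up for illustration, and the input is assumed to be an unsigned 32-bit maximum.

```c
#include <stdint.h>

/* Hypothetical helper: number of 4-bit (hex) digits needed to represent max.
   Returns 0 when max is 0, in which case no sorting pass is needed. */
unsigned hex_passes(uint32_t max)
{
    unsigned x = 0;
    while (max != 0) {
        max >>= 4;   /* drop the lowest hex digit */
        x++;         /* one more pass is required */
    }
    return x;
}
```

    With this helper, hex_passes(0xffff) gives 4 and hex_passes(0xffffffff) gives 8, matching the examples above.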

    If the range of values is most likely to take the full range available for an integer, there's no need to bother determining max value, just base the radix sort on integer size.

    The example code you have shows a radix sort that scans an array backwards because of the way the counts are converted into indices. This can be avoided by using an alternate method to convert counts into indices. Here is an example of a base 256 radix sort for 32 bit unsigned integers. It uses a matrix of counts / indices so that all 4 rows of counts are generated with just one read pass of the array, followed by 4 radix sort passes (an even number, so the sorted data ends up back in the original array). std::swap is a C++ function used here to swap the pointers. In a C program, it can be replaced by swapping the pointers inline: t = a; a = b; b = t, where t is of type uint32_t * (ptr to unsigned 32 bit integer). For a base 16 radix sort, the matrix size would be [8][16].

    #include <cstddef>   // size_t
    #include <cstdint>   // uint32_t
    #include <utility>   // std::swap

    //  a is input array, b is working array
    uint32_t * RadixSort(uint32_t * a, uint32_t *b, size_t count)
    {
        size_t mIndex[4][256] = {0};        // count / index matrix
        size_t i,j,m,n;
        uint32_t u;
        for(i = 0; i < count; i++){         // generate histograms
            u = a[i];
            for(j = 0; j < 4; j++){
                mIndex[j][(size_t)(u & 0xff)]++;
                u >>= 8;
            }       
        }
        for(j = 0; j < 4; j++){             // convert to indices
            m = 0;
            for(i = 0; i < 256; i++){
                n = mIndex[j][i];
                mIndex[j][i] = m;
                m += n;
            }       
        }
        for(j = 0; j < 4; j++){             // radix sort
            for(i = 0; i < count; i++){     //  sort by current lsb
                u = a[i];
                m = (size_t)(u>>(j<<3))&0xff;
                b[mIndex[j][m]++] = u;
            }
            std::swap(a, b);                //  swap ptrs
        }
        return(a);
    }
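
    For the base 16 case mentioned above, a C variant might look like the following. This is a sketch under assumptions rather than a tested drop-in: the name radix_sort16 is invented here, the matrix is [8][16] as stated, and the pointers are swapped inline in place of std::swap.

```c
#include <stddef.h>
#include <stdint.h>

/* a is the input array, b is a working array of the same length.
   Returns the pointer holding the sorted data, which is the original a
   because the number of passes (8) is even. */
uint32_t *radix_sort16(uint32_t *a, uint32_t *b, size_t count)
{
    size_t index[8][16] = {{0}};        /* count / index matrix */
    size_t i, j, m, n;
    uint32_t u;
    uint32_t *t;

    for (i = 0; i < count; i++) {       /* generate all 8 histograms in one read pass */
        u = a[i];
        for (j = 0; j < 8; j++) {
            index[j][u & 0xf]++;
            u >>= 4;
        }
    }
    for (j = 0; j < 8; j++) {           /* convert counts to starting indices */
        m = 0;
        for (i = 0; i < 16; i++) {
            n = index[j][i];
            index[j][i] = m;
            m += n;
        }
    }
    for (j = 0; j < 8; j++) {           /* one stable pass per hex digit, lowest first */
        for (i = 0; i < count; i++) {
            u = a[i];
            m = (u >> (j * 4)) & 0xf;
            b[index[j][m]++] = u;
        }
        t = a; a = b; b = t;            /* swap ptrs inline */
    }
    return a;
}
```

    Compared with base 256, this uses a smaller matrix but needs twice as many passes over the data.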
    
  • 2020-12-21 14:49
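    // Assumes file-scope variables defined elsewhere: the array lst, a scratch
    // array tmp of the same length, and the element count n.
    void int_radix_sort(void) {
        int group;                 // bit offset of the byte sorted in this round
        int buckets = 1 << 8;      // 256 buckets, one per byte value
        int map[buckets];          // start index of each bucket in the output
        int mask = buckets - 1;    // 0xff, isolates one byte
        int i;                     // bucket index of the current element
        int cnt[buckets];          // number of elements falling in each bucket
    
        for (group = 0; group < 32; group += 8) {
            // each round sorts on 8 bits, so 4 rounds cover a 32-bit value
            // cnt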
            for (int i = 0; i < buckets; i++) {
                cnt[i] = 0;
            }
            for (int j = 0; j < n; j++) {
                i = (lst[j] >> group) & mask;
                cnt[i]++; 
                tmp[j] = lst[j];
            }
    
            //map
            map[0] = 0;
            for (int i = 1; i < buckets; i++) {
                map[i] = map[i - 1] + cnt[i - 1];
            }
    
            //move
            for (int j = 0; j < n; j++) {   
                i = (tmp[j] >> group) & mask;
                lst[map[i]] = tmp[j];
                map[i]++;
            }
        }
    }
    

    After hours of researching I came across the answer. I still do not understand what is going on in this code/answer. I cannot wrap my head around the concept. Hopefully, someone can explain.
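
    One way to read the code above is as two small steps repeated per round: pick out one byte of each value, then turn the per-byte counts into starting positions. The sketch below isolates both steps. The helper names byte_at and counts_to_offsets are hypothetical and do not appear in the original code.

```c
#include <stdint.h>

/* Hypothetical helper: the 8-bit digit of value that starts at bit offset
   group (0, 8, 16 or 24), i.e. what (lst[j] >> group) & mask computes. */
unsigned byte_at(uint32_t value, int group)
{
    return (value >> group) & 0xffu;
}

/* Hypothetical helper: the "map" step. map[i] becomes the position where the
   first element with digit i must go, which is the number of elements whose
   digit is smaller than i. */
void counts_to_offsets(const int *cnt, int *map, int buckets)
{
    map[0] = 0;
    for (int i = 1; i < buckets; i++) {
        map[i] = map[i - 1] + cnt[i - 1];
    }
}
```

    For example, counts of {2, 0, 3, 1} become offsets {0, 2, 2, 5}: the two elements with digit 0 fill positions 0 and 1, the three with digit 2 fill positions 2 to 4, and the one with digit 3 lands at position 5. The "move" loop then copies each element to map[digit] and increments that entry.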

  • 2020-12-21 14:50

    I see your points. I think negative numbers are easy to sort after the list has been sorted with something like loop, flag, and swap. What about unsigned floating point values? – itproxti Nov 1 '16 at 16:02

    As for handling floating point values, there might be a way. For example, if the number is 345.768, it needs to be converted to an integer, i.e. made into 345768 by multiplying it by 1000. Just as the offset moves the negative numbers into the positive domain, multiplying by 1000, 10000, etc. turns the floats into numbers whose fractional part is all zeros. Then they can be cast to int or long. However, with large values, the scaled number may not fit within the int or long range.

    The multiplier has to be constant, just like the offset, so that the relationship among the magnitudes is preserved. It is better to use powers of 2 such as 8 or 16, since the bit-shift operator can then be used. However, just as calculating the offset takes some time, so does calculating the multiplier: the whole array has to be scanned to find the smallest multiplier that leaves every number with a zero fractional part.

    This may not be fast, but it can do the job if required.
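
    A rough sketch of the scaling idea follows. The function name floats_to_keys is made up, the inputs are assumed to be non-negative doubles, and the keys are assumed to be 32-bit unsigned integers. It rounds to the nearest integer rather than truncating, because a product such as 345.768 * 1000 is not guaranteed to be exact in binary floating point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply every value by the same constant scale and
   round to an integer key that an integer radix sort can process.
   Returns 0 on success, or -1 if a value is negative, not a number, or too
   large to fit in 32 bits after scaling. */
int floats_to_keys(const double *in, uint32_t *out, size_t count, double scale)
{
    for (size_t i = 0; i < count; i++) {
        double scaled = in[i] * scale;
        /* !(scaled >= 0.0) also rejects NaN */
        if (!(scaled >= 0.0) || scaled >= 4294967295.5) {
            return -1;
        }
        out[i] = (uint32_t)(scaled + 0.5);   /* round to nearest */
    }
    return 0;
}
```

    Values that differ by less than 1/scale can map to the same key, so the scale has to be large enough for the data, which is the array scan described above.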

  • 2020-12-21 15:01

    Your implementation of radix sort is slightly incorrect:

    • it cannot handle negative numbers
    • the array count[] in function ccsort() should have a size of 10 instead of n. If n is smaller than 10, the function does not work.
    • the loop for accumulating counts goes one step too far: for (i = 1; i <= n; i++). Once again the <= operator causes a bug.
    • you say you sort by hex digits but the code uses decimal digits.

    Here is a (slightly) improved version with explanations:

    void ccsort(int a[], int n, int exp) {
    
        int count[10] = { 0 };
        int output[n];
        int i, last;
    
        for (i = 0; i < n; i++) {
            // compute the number of entries with any given digit at level exp
            ++count[(a[i] / exp) % 10];
        }
        for (i = last = 0; i < 10; i++) {
            // update the counts to have the index of the place to dispatch the next
            // number with a given digit at level exp
            last += count[i];
            count[i] = last - count[i];
        }
        for (i = 0; i < n; i++) {
            // dispatch entries at the right index for its digit at level exp
            output[count[(a[i] / exp) % 10]++] = a[i];
        }
        for (i = 0; i < n; i++) {
            // copy entries back to original array
            a[i] = output[i];
        }
    }
    
    int getMax(int a[], int n) {
        // find the largest number in the array
        int max = a[0];
        for (int i = 1; i < n; i++) {
            if (a[i] > max) {
                max = a[i];
            }
        }
        return max;
    }
    
    void rsort(int a[], int n) {
        int max = getMax(a, n);
        // for all digits required to express the maximum value
        for (int exp = 1; max / exp > 0; exp *= 10) {
            // sort the array on one digit at a time
            ccsort(a, n, exp);
            if (max / exp < 10) {
                break;  // last digit done; also keeps exp *= 10 from overflowing
            }
        }
    }
    

    The above version is quite inefficient because of all the divisions and modulo operations. Sorting on hex digits can be done with shifts and masks:

    void ccsort16(int a[], int n, int shift) {
    
        int count[16] = { 0 };
        int output[n];
        int i, last;
    
        for (i = 0; i < n; i++) {
            ++count[(a[i] >> shift) & 15];
        }
        for (i = last = 0; i < 16; i++) {
            last += count[i];
            count[i] = last - count[i];
        }
        for (i = 0; i < n; i++) {
            output[count[(a[i] >> shift) & 15]++] = a[i];
        }
        for (i = 0; i < n; i++) {
            a[i] = output[i];
        }
    }
    
    void rsort16(int a[], int n) {
        int max = a[0];
        for (int i = 1; i < n; i++) {
            if (a[i] > max) {
                max = a[i];
            }
        }
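        // stop before shift reaches the width of int (assumed 32 bits):
        // shifting by the full width is undefined behavior
        for (int shift = 0; shift < 32 && (max >> shift) > 0; shift += 4) {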
            ccsort16(a, n, shift);
        }
    }
    

    It would be approximately twice as fast to sort one byte at a time with a count array of 256 entries. It would also be faster to compute the counts for all digits in one pass, as shown in rcgldr's answer.

    Note that this implementation still cannot handle negative numbers.
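
    One common way around this, which is not part of the code above, is to flip the sign bit so that signed order matches unsigned order, sort the resulting unsigned keys, and map them back afterwards. This is the same idea as adding a constant offset to move negative values into the positive domain. The sketch below assumes 32-bit values, and the names key_from_int and int_from_key are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: map a signed value to an unsigned key such that
   a < b (signed) implies key_from_int(a) < key_from_int(b) (unsigned). */
uint32_t key_from_int(int32_t v)
{
    return (uint32_t)v ^ 0x80000000u;   /* flip the sign bit */
}

/* Hypothetical helper: inverse of key_from_int. Written without converting
   an out-of-range unsigned value to a signed type. */
int32_t int_from_key(uint32_t k)
{
    if (k & 0x80000000u) {
        return (int32_t)(k & 0x7fffffffu);   /* original value was >= 0 */
    }
    return (int32_t)k + INT32_MIN;           /* original value was negative */
}
```

    The keys can then be sorted with any of the unsigned radix sorts in this thread, with all digits processed up to the top one, and converted back with int_from_key.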
