Fastest C/C++ code to select the median in a set of 27 floating-point values

梦谈多话 2020-12-07 09:24

This is the well-known selection algorithm; see http://en.wikipedia.org/wiki/Selection_algorithm.

I need it to find the median value of a set of 3x3x3 voxel values. Sinc

15 answers
  • 2020-12-07 09:33

    Since it sounds like you're performing a median filter on a large array of volume data, you might want to take a look at the Fast Median and Bilateral Filtering paper from SIGGRAPH 2006. That paper deals with 2D image processing, but you might be able to adapt the algorithm for 3D volumes. If nothing else, it might give you some ideas on how to step back and look at the problem from a slightly different perspective.

  • 2020-12-07 09:33

    You might want to have a look at Knuth's Exercise 5.3.3.13. It describes an algorithm due to Floyd that finds the median of n elements using (3/2)n+O(n^(2/3) log n) comparisons, and the constant hidden in the O(·) seems not to be too large in practice.

  • 2020-12-07 09:42

    A sorting network generated using the Bose-Nelson algorithm will find the median directly, with no loops or recursion, using 173 comparisons. If you can perform comparisons in parallel, for example with vector-arithmetic instructions, you may be able to group the comparisons into as few as 28 parallel operations.

    If you are sure that the floats are normalized and are not (quiet or signaling) NaNs, you can use integer operations to compare IEEE-754 floats, which can perform more favorably on some CPUs.
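
    As a minimal sketch of the integer-comparison idea, the helper below maps each float to an unsigned key whose integer order matches the floating-point order. The names `floatKey` and `lessAsInt` are made up for this illustration, and the code assumes a 32-bit IEEE-754 `float` and NaN-free input; it is not taken from the answer above.

    ```cpp
    #include <cstdint>
    #include <cstring>

    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");

    // Hypothetical helper: turn a float into a key that sorts like the float.
    // Negative values have all bits flipped (so larger magnitude sorts lower);
    // non-negative values only get the sign bit set (so they sort above negatives).
    static inline std::uint32_t floatKey(float f)
    {
        std::uint32_t u;
        std::memcpy(&u, &f, sizeof u);   // bit copy without aliasing problems
        return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
    }

    // Integer-only "less than". Note: -0.0f orders strictly below +0.0f here,
    // whereas a float comparison treats them as equal.
    static inline bool lessAsInt(float a, float b)
    {
        return floatKey(a) < floatKey(b);
    }
    ```

    If all 27 values are known to be non-negative, the raw bit patterns already compare correctly as integers and the key transformation can be skipped.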

    A direct conversion of this sorting network to C (gcc 4.2) yields a worst case of 388 clock cycles on my Core i7.
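
    To illustrate what a data-independent network looks like, here is a sketch that uses an odd-even transposition network rather than the Bose-Nelson network mentioned above. It is simpler to write down but needs 27 rounds of 13 compare-exchanges (351 in total, versus 173), so treat it as a demonstration of the technique, not as the measured code. The names `compareExchange` and `medianNetwork27` are assumptions of this sketch.

    ```cpp
    #include <algorithm>
    #include <cstddef>

    // Compare-exchange: afterwards lo <= hi. Written with min/max so the
    // compiler is free to emit branch-free or vector min/max instructions.
    static inline void compareExchange(float &lo, float &hi)
    {
        const float a = lo, b = hi;
        lo = std::min(a, b);
        hi = std::max(a, b);
    }

    // Median of 27 floats. The sequence of compare-exchanges depends only on
    // the indices, never on the data. Sorts v in place; assumes no NaNs.
    float medianNetwork27(float *v)
    {
        constexpr std::size_t N = 27;
        for (std::size_t round = 0; round < N; ++round)        // N rounds sort N items
            for (std::size_t i = round % 2; i + 1 < N; i += 2) // even/odd neighbor pairs
                compareExchange(v[i], v[i + 1]);
        return v[N / 2];                                       // 14th smallest value
    }
    ```

    The loops have fixed trip counts, so a compiler can unroll them completely; all compare-exchanges within one round touch disjoint pairs and could run in parallel.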

    Sorting Networks

  • 2020-12-07 09:42

    Alex Stepanov's new book Elements of Programming talks at some length about finding order statistics using the minimum average number of comparisons while minimizing runtime overhead. Unfortunately, a sizable amount of code is needed just to compute the median of 5 elements, and even then he gives as a project finding an alternate solution that uses a fraction of a comparison less on average, so I wouldn't dream of extending that framework to finding the median of 27 elements. And the book won't even be available until 15 June 2009. The point is that because this is a fixed-size problem, there is a direct comparison method that is provably optimal.

    Also, there is the fact that this algorithm is not being run once in isolation but rather many times, and between most runs only 9 of the 27 values will change. That means in theory some of the work is done already. However, I have not heard of any median filtering algorithms in image processing that take advantage of this fact.

  • 2020-12-07 09:42

    My super fast algorithm for calculation of median of a 1-D data set does the job in three passes and doesn't need to sort (!!!) the data set.

    A very generic description is as follows:

    • Pass 1: Scans the 1-D data set and collects some statistical information of the data set
    • Pass 2: Uses the statistical information of the data set and applies some data mining to create an intermediate ( helper ) array
    • Pass 3: Scans the intermediate ( helper ) array in order to find the median

    The algorithm is designed for finding medians of extremely large 1-D data sets, greater than 8 GE (giga elements) of single-precision floating-point values (on a desktop system with 32GB of physical memory and 128GB of virtual memory), or for finding medians of small data sets in a hard real-time environment.

    The algorithm:

    • is about 60 to 75 times faster than the classical approach based on a heap or merge sort
    • is implemented in pure C
    • doesn't use any Intel intrinsic functions
    • doesn't use any inline assembler instructions
    • is absolutely portable between C/C++ compilers, like MS, Intel, MinGW, Borland, Turbo and Watcom
    • is absolutely portable between platforms

    Best regards, Sergey Kostrov

  • 2020-12-07 09:43

    EDIT: I have to apologize. The code below was WRONG. I have the fixed code, but need to find an icc compiler to redo the measurements.

    The benchmark results of the algorithms considered so far

    For the protocol and short description of algorithms see below. First value is mean time (seconds) over 200 different sequences and second value is stdDev.

    HeapSort     : 2.287 0.2097
    QuickSort    : 2.297 0.2713
    QuickMedian1 : 0.967 0.3487
    HeapMedian1  : 0.858 0.0908
    NthElement   : 0.616 0.1866
    QuickMedian2 : 1.178 0.4067
    HeapMedian2  : 0.597 0.1050
    HeapMedian3  : 0.015 0.0049 <-- best
    

    Protocol: generate 27 random floats using random bits obtained from rand(). Apply each algorithm 5 million times in a row (including prior array copy) and compute average and stdDev over 200 random sequences. C++ code compiled with icc -S -O3 and run on Intel E8400 with 8GB DDR3.

    Algorithms:

    HeapSort : full sort of sequence using heap sort and pick middle value. Naive implementation using subscript access.

    QuickSort: full in place sort of sequence using quick sort and pick middle value. Naive implementation using subscript access.

    QuickMedian1: quick select algorithm with swapping. Naive implementation using subscript access.

    HeapMedian1: in place balanced heap method with prior swapping. Naive implementation using subscript access.

    NthElement : uses the nth_element STL algorithm. Data is copied into the vector using memcpy( vct.data(), rndVal, ... );
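
    A sketch of what the NthElement variant might look like, following the description above (copy into a vector with memcpy, then call std::nth_element). The function name `medianNthElement` is an assumption; the benchmarked code itself was not posted.

    ```cpp
    #include <algorithm>
    #include <cstring>
    #include <vector>

    // Median of 27 floats using the standard library's selection algorithm.
    // The input is copied first, so rndVal is left untouched.
    float medianNthElement(const float *rndVal)
    {
        std::vector<float> vct(27);
        std::memcpy(vct.data(), rndVal, 27 * sizeof(float));
        // Partially reorders vct so that vct[13] holds the 14th smallest value.
        std::nth_element(vct.begin(), vct.begin() + 13, vct.end());
        return vct[13];
    }
    ```

    Allocating the vector on every call adds overhead; a fixed-size local array would avoid it.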

    QuickMedian2: uses the quick select algorithm with pointers and copies into two buffers to avoid swapping. Based on a proposal by MSalters.

    HeapMedian2 : a variant of an algorithm I invented, using dual heaps with a shared head. The left heap has its biggest value as head; the right heap has its smallest value as head. Initialize with the first value as the common head and first median guess. Add each subsequent value to the left heap if it is smaller than the head, otherwise to the right heap, until one of the heaps is full. A heap is full when it contains 14 values. From then on, consider only the full heap. If it is the right heap, then for each remaining value bigger than the head, pop the head and insert the value; ignore all other values. If it is the left heap, then for each remaining value smaller than the head, pop the head and insert the value; ignore all other values. When all values have been processed, the common head is the median value. This version uses integer indexes into the array. The version using (64-bit) pointers appeared to be nearly twice as slow (~1 s).

    HeapMedian3 : same algorithm as HeapMedian2 but optimized. It uses unsigned char indexes, avoids value swapping, and includes various other small optimizations. The mean and stdDev values are computed over 1000 random sequences. For nth_element I measured 0.508s and a stdDev of 0.159537 with the same 1000 random sequences. HeapMedian3 is thus about 33 times faster than the nth_element STL function. Each returned median value is checked against the median value returned by HeapSort, and they all match. I doubt a hash-based method could be significantly faster.

    EDIT 1: This algorithm can be further optimized. The first phase, where elements are dispatched to the left or right heap based on the comparison result, doesn't need heaps. It is sufficient to simply append elements to two unordered sequences. Phase one stops as soon as one sequence is full, which means it contains 14 elements (including the median value). The second phase starts by heapifying the full sequence and then proceeds as described in the HeapMedian3 algorithm. I'll provide the new code and benchmark as soon as possible.

    EDIT 2: I implemented and benchmarked the optimized algorithm, but there is no significant performance difference compared to HeapMedian3. It is even slightly slower on average. Shown results are confirmed. There might be a difference with much larger sets. Note also that I simply pick the first value as the initial median guess. As suggested, one could benefit from the fact that we search for a median value in "overlapping" value sets. Using the median of medians algorithm would help to pick a much better initial median guess.
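
    The code for the two-phase variant from EDIT 1 was not posted, so the following is only a sketch of how it might look, using the standard heap functions instead of hand-written sifting. The function name `medianTwoPhase27` and the decision to store 13 values per side (keeping the median candidate in a separate variable) are assumptions of this sketch.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <functional>

    // Median of 27 floats: unordered dispatch first, heap only on the full side.
    float medianTwoPhase27(const float *a)
    {
        constexpr std::size_t N = 27, SIDE = 13;   // 13 values + candidate = 14
        float left[SIDE], right[SIDE];
        std::size_t nLeft = 0, nRight = 0, i = 1;
        float median = a[0];                       // first value is the initial guess

        // Phase 1: append to two unordered sequences until one side is full.
        while (nLeft < SIDE && nRight < SIDE) {
            const float val = a[i++];
            if (val < median) left[nLeft++] = val;
            else              right[nRight++] = val;
        }

        if (nLeft == SIDE) {
            // Phase 2, left side full: max-heap of the 13 values below the candidate.
            std::make_heap(left, left + SIDE);
            for (; i < N; ++i) {
                const float val = a[i];
                if (val >= median) continue;       // cannot lower the 14th smallest
                if (val >= left[0]) { median = val; continue; }
                median = left[0];                  // old heap top becomes the candidate
                std::pop_heap(left, left + SIDE);  // moves the top to left[SIDE - 1]
                left[SIDE - 1] = val;
                std::push_heap(left, left + SIDE);
            }
        } else {
            // Phase 2, right side full: min-heap of the 13 values at or above it.
            const std::greater<float> gt;
            std::make_heap(right, right + SIDE, gt);
            for (; i < N; ++i) {
                const float val = a[i];
                if (val <= median) continue;       // cannot raise the 14th largest
                if (val <= right[0]) { median = val; continue; }
                median = right[0];
                std::pop_heap(right, right + SIDE, gt);
                right[SIDE - 1] = val;
                std::push_heap(right, right + SIDE, gt);
            }
        }
        return median;
    }
    ```

    Phase 1 reads at most 25 values after the initial guess before one side reaches 13, so the index never runs past the array. The input is not modified.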


    Source code of HeapMedian3

    // return the median value in a vector of 27 floats pointed to by a
    float heapMedian3( float *a )
    {
       float left[14], right[14], median, *p;
       unsigned char nLeft, nRight;
    
       // pick first value as median candidate
       p = a;
       median = *p++;
       nLeft = nRight = 1;
    
       for(;;)
       {
           // get next value
           float val = *p++;
    
           // if value is smaller than median, append to left heap
           if( val < median )
           {
               // move biggest value to the heap top
               unsigned char child = nLeft++, parent = (child - 1) / 2;
               while( parent && val > left[parent] )
               {
                   left[child] = left[parent];
                   child = parent;
                   parent = (parent - 1) / 2;
               }
               left[child] = val;
    
               // if left heap is full
               if( nLeft == 14 )
               {
                   // for each remaining value
                   for( unsigned char nVal = 27 - (p - a); nVal; --nVal )
                   {
                       // get next value
                       val = *p++;
    
                       // if value is to be inserted in the left heap
                       if( val < median )
                       {
                           child = left[2] > left[1] ? 2 : 1;
                           if( val >= left[child] )
                               median = val;
                           else
                           {
                               median = left[child];
                               parent = child;
                               child = parent*2 + 1;
                               while( child < 14 )
                               {
                                   if( child < 13 && left[child+1] > left[child] )
                                       ++child;
                                   if( val >= left[child] )
                                       break;
                                   left[parent] = left[child];
                                   parent = child;
                                   child = parent*2 + 1;
                               }
                               left[parent] = val;
                           }
                       }
                   }
                   return median;
               }
           }
    
           // else append to right heap
           else
           {
               // move smallest value to the heap top
               unsigned char child = nRight++, parent = (child - 1) / 2;
               while( parent && val < right[parent] )
               {
                   right[child] = right[parent];
                   child = parent;
                   parent = (parent - 1) / 2;
               }
               right[child] = val;
    
               // if right heap is full
               if( nRight == 14 )
               {
                   // for each remaining value
                   for( unsigned char nVal = 27 - (p - a); nVal; --nVal )
                   {
                       // get next value
                       val = *p++;
    
                       // if value is to be inserted in the right heap
                       if( val > median )
                       {
                           child = right[2] < right[1] ? 2 : 1;
                           if( val <= right[child] )
                               median = val;
                           else
                           {
                               median = right[child];
                               parent = child;
                               child = parent*2 + 1;
                               while( child < 14 )
                               {
                                   if( child < 13 && right[child+1] < right[child] )
                                       ++child;
                                   if( val <= right[child] )
                                       break;
                                   right[parent] = right[child];
                                   parent = child;
                                   child = parent*2 + 1;
                               }
                               right[parent] = val;
                           }
                       }
                   }
                   return median;
               }
           }
       }
    } 
    