I made the following implementation of the median in C++ and used it in R via Rcpp:
// [[Rcpp::export]]
double median2
[This is more of an extended comment than an answer to the question you actually asked.]
Even your code may be open to significant improvement. In particular, you're sorting the entire input even though you only care about one or two elements.
You can change this from O(n log n) to O(n) by using std::nth_element instead of std::sort. With an even number of elements, you'd typically use std::nth_element to find the element just before the middle, then std::min_element to find the immediately succeeding element. Because std::nth_element also partitions the input items, std::min_element only has to run on the items above the middle after the nth_element, not the entire input array. That is, after nth_element, everything before the chosen position is no greater than the element there, and everything after it is no smaller.
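As a rough illustration of that partitioning guarantee, here is a small sketch that checks it directly. The helper name partitioned_around is made up for this example, and it works on a plain std::vector<double> rather than an Rcpp type so it stays self-contained.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the original code): runs nth_element at
// index k on a copy of v, then checks the ordering around that position.
bool partitioned_around(std::vector<double> v, std::size_t k) {
    assert(k < v.size());  // assumption: the caller passes a valid index
    auto pos = v.begin() + static_cast<std::ptrdiff_t>(k);
    std::nth_element(v.begin(), pos, v.end());

    // Everything before pos must be <= *pos ...
    bool left_ok = std::all_of(v.begin(), pos,
                               [&](double e) { return e <= *pos; });
    // ... and everything after pos must be >= *pos.
    bool right_ok = std::all_of(pos + 1, v.end(),
                                [&](double e) { return e >= *pos; });
    return left_ok && right_ok;
}
```

Neither side is sorted internally; only the element at pos is guaranteed to be in its final sorted position, which is why a separate std::min_element pass over the upper part is still needed in the even case.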
The complexity of std::nth_element is "linear on average", and (of course) std::min_element is linear as well, so the overall complexity is linear on average.
So, for the simple case (odd number of elements), you get something like:
auto pos = x.begin() + x.size()/2;
std::nth_element(x.begin(), pos, x.end());
return *pos;
...and for the more complex case (even number of elements):
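// pos is the element just before the middle (the lower of the two middle elements)
auto pos = x.begin() + x.size()/2 - 1;
std::nth_element(x.begin(), pos, x.end());
auto pos2 = std::min_element(pos+1, x.end());
return (*pos + *pos2) / 2.0;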
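Putting the two cases together, a minimal sketch of a complete function might look like the following. The name median_nth is hypothetical. It takes a std::vector<double> by value so the caller's data is not reordered, and it assumes a non-empty input with no NaN values. With Rcpp you would add the export attribute and could keep this signature, since Rcpp converts numeric vectors to std::vector<double>.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical name; a sketch of the nth_element approach, not a drop-in
// replacement for R's median (no NA/NaN handling).
double median_nth(std::vector<double> x) {
    assert(!x.empty());  // assumption: the caller guarantees non-empty input
    const auto n = x.size();

    if (n % 2 == 1) {
        // Odd count: the middle element is the median.
        auto pos = x.begin() + n / 2;
        std::nth_element(x.begin(), pos, x.end());
        return *pos;
    }

    // Even count: average the two middle elements.
    // pos is the lower middle; the upper middle is the smallest element
    // in the partition that nth_element leaves after pos.
    auto pos = x.begin() + n / 2 - 1;
    std::nth_element(x.begin(), pos, x.end());
    auto pos2 = std::min_element(pos + 1, x.end());
    return (*pos + *pos2) / 2.0;
}
```

For example, median_nth({3, 1, 2}) returns 2, and median_nth({4, 1, 3, 2}) returns 2.5.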