I need to write function that will accept array of decimals and it will find the median.
Is there a function in the .net Math library?
Thanks Rafe, this takes into account the issues your replyers posted.
public static double GetMedian(double[] sourceNumbers) {
//Framework 2.0 version of this method. there is an easier way in F4
if (sourceNumbers == null || sourceNumbers.Length == 0)
throw new System.Exception("Median of empty array not defined.");
//make sure the list is sorted, but use a new array
double[] sortedPNumbers = (double[])sourceNumbers.Clone();
Array.Sort(sortedPNumbers);
//get the median
int size = sortedPNumbers.Length;
int mid = size / 2;
double median = (size % 2 != 0) ? (double)sortedPNumbers[mid] : ((double)sortedPNumbers[mid] + (double)sortedPNumbers[mid - 1]) / 2;
return median;
}
I have an histogram with the variable : group
Here how I calculate my median :
int[] group = new int[nbr];
// -- Fill the group with values---
// sum all data in median
int median = 0;
for (int i =0;i<nbr;i++) median += group[i];
// then divide by 2
median = median / 2;
// find 50% first part
for (int i = 0; i < nbr; i++)
{
median -= group[i];
if (median <= 0)
{
median = i;
break;
}
}
median is the group index of median
Here is the fastest unsafe implementation, same algorithm before, taken from this source
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static unsafe void SwapElements(int* p, int* q)
{
int temp = *p;
*p = *q;
*q = temp;
}
public static unsafe int Median(int[] arr, int n)
{
int middle, ll, hh;
int low = 0; int high = n - 1; int median = (low + high) / 2;
fixed (int* arrptr = arr)
{
for (;;)
{
if (high <= low)
return arr[median];
if (high == low + 1)
{
if (arr[low] > arr[high])
SwapElements(arrptr + low, arrptr + high);
return arr[median];
}
middle = (low + high) / 2;
if (arr[middle] > arr[high])
SwapElements(arrptr + middle, arrptr + high);
if (arr[low] > arr[high])
SwapElements(arrptr + low, arrptr + high);
if (arr[middle] > arr[low])
SwapElements(arrptr + middle, arrptr + low);
SwapElements(arrptr + middle, arrptr + low + 1);
ll = low + 1;
hh = high;
for (;;)
{
do ll++; while (arr[low] > arr[ll]);
do hh--; while (arr[hh] > arr[low]);
if (hh < ll)
break;
SwapElements(arrptr + ll, arrptr + hh);
}
SwapElements(arrptr + low, arrptr + hh);
if (hh <= median)
low = ll;
if (hh >= median)
high = hh - 1;
}
}
}
Looks like other answers are using sorting. That's not optimal from performance point of view because it takes O(n logn)
time. It is possible to calculate median in O(n)
time instead. The generalized version of this problem is known as "n-order statistics" which means finding an element K in a set such that we have n elements smaller or equal to K and rest are larger or equal K. So 0th order statistic would be minimal element in the set (Note: Some literature use index from 1 to N instead of 0 to N-1). Median is simply (Count-1)/2
-order statistic.
Below is the code adopted from Introduction to Algorithms by Cormen et al, 3rd Edition.
/// <summary>
/// Partitions the given list around a pivot element such that all elements on left of pivot are <= pivot
/// and the ones at thr right are > pivot. This method can be used for sorting, N-order statistics such as
/// as median finding algorithms.
/// Pivot is selected ranodmly if random number generator is supplied else its selected as last element in the list.
/// Reference: Introduction to Algorithms 3rd Edition, Corman et al, pp 171
/// </summary>
private static int Partition<T>(this IList<T> list, int start, int end, Random rnd = null) where T : IComparable<T>
{
if (rnd != null)
list.Swap(end, rnd.Next(start, end+1));
var pivot = list[end];
var lastLow = start - 1;
for (var i = start; i < end; i++)
{
if (list[i].CompareTo(pivot) <= 0)
list.Swap(i, ++lastLow);
}
list.Swap(end, ++lastLow);
return lastLow;
}
/// <summary>
/// Returns Nth smallest element from the list. Here n starts from 0 so that n=0 returns minimum, n=1 returns 2nd smallest element etc.
/// Note: specified list would be mutated in the process.
/// Reference: Introduction to Algorithms 3rd Edition, Corman et al, pp 216
/// </summary>
public static T NthOrderStatistic<T>(this IList<T> list, int n, Random rnd = null) where T : IComparable<T>
{
return NthOrderStatistic(list, n, 0, list.Count - 1, rnd);
}
private static T NthOrderStatistic<T>(this IList<T> list, int n, int start, int end, Random rnd) where T : IComparable<T>
{
while (true)
{
var pivotIndex = list.Partition(start, end, rnd);
if (pivotIndex == n)
return list[pivotIndex];
if (n < pivotIndex)
end = pivotIndex - 1;
else
start = pivotIndex + 1;
}
}
public static void Swap<T>(this IList<T> list, int i, int j)
{
if (i==j) //This check is not required but Partition function may make many calls so its for perf reason
return;
var temp = list[i];
list[i] = list[j];
list[j] = temp;
}
/// <summary>
/// Note: specified list would be mutated in the process.
/// </summary>
public static T Median<T>(this IList<T> list) where T : IComparable<T>
{
return list.NthOrderStatistic((list.Count - 1)/2);
}
public static double Median<T>(this IEnumerable<T> sequence, Func<T, double> getValue)
{
var list = sequence.Select(getValue).ToList();
var mid = (list.Count - 1) / 2;
return list.NthOrderStatistic(mid);
}
Few notes:
O(n)
expected time. If you want O(n)
worse case time then there is technique to use median-of-median. While this would improve worse case performance, it degrades average case because constant in O(n)
is now larger. However if you would be calculating median mostly on very large data then its worth to look at.(Count-1)/2
in sorted array. But when you even number of element (Count-1)/2
is not an integer anymore and you have two medians: Lower median Math.Floor((Count-1)/2)
and Math.Ceiling((Count-1)/2)
. Some textbooks use lower median as "standard" while others propose to use average of two. This question becomes particularly critical for set of 2 elements. Above code returns lower median. If you wanted instead average of lower and upper then you need to call above code twice. In that case make sure to measure performance for your data to decide if you should use above code VS just straight sorting.MethodImplOptions.AggressiveInlining
attribute on Swap<T>
method for slightly improved performance.Sometime in the future. This is I think as simple as it can get.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Median
{
class Program
{
static void Main(string[] args)
{
var mediaValue = 0.0;
var items = new[] { 1, 2, 3, 4,5 };
var getLengthItems = items.Length;
Array.Sort(items);
if (getLengthItems % 2 == 0)
{
var firstValue = items[(items.Length / 2) - 1];
var secondValue = items[(items.Length / 2)];
mediaValue = (firstValue + secondValue) / 2.0;
}
if (getLengthItems % 2 == 1)
{
mediaValue = items[(items.Length / 2)];
}
Console.WriteLine(mediaValue);
Console.WriteLine("Enter to Exit!");
Console.ReadKey();
}
}
}
CenterSpace's NMath library provides a function:
double[] values = new double[arraySize];
double median = NMathFunctions.Median(values);
Optionally you can opt to use NaNMedian (if your array may contain null values) but you will need to convert the array to a vector:
double median = NMathFunctions.NaNMedian(new DoubleVector(values));
CenterSpace's NMath Library isn't free, but many universities have licenses