This stackoverflow question raised an interesting question about sorting double arrays with NaN values. The OP posted the following code:
static void Main(st
[This is my response shameless ripped from the other post. It would be nice is someone explored this further -- double.CompareTo
and double.CompareTo(double)
are well-defined, as indicated below so I suspect there is some Array.Sort
magic happening for the specific type.]
Array.Sort(double[])
: doesn't seem to be using CompareTo(double[])
as expected and this may very well be a bug -- note the difference in Array.Sort(object[]) and Array.Sort(double[]) below. I would love clarification/corrections on the following.
First, the double.CompareTo(T) method documentation -- this ordering is well-defined according to the documentation:
Less than zero: This instance is less than value. -or- This instance is not a number (NaN) and value is a number.
Zero: This instance is equal to value. -or- Both this instance and value are not a number (NaN), PositiveInfinity, or NegativeInfinity.
Greater than zero: This instance is greater than value. -or- This instance is a number and value is not a number (NaN).
In LINQPad (3.5 and 4, both have same results):
0d.CompareTo(0d).Dump(); // 0
double.NaN.CompareTo(0d).Dump(); // -1
double.NaN.CompareTo(double.NaN).Dump(); // 0
0d.CompareTo(double.NaN).Dump(); // 1
Using CompareTo(object)
has the same results:
0d.CompareTo((object)0d).Dump(); // 0
double.NaN.CompareTo((object)0d).Dump(); // -1
double.NaN.CompareTo((object)double.NaN).Dump(); // 0
0d.CompareTo((object)double.NaN).Dump(); // 1
So that's not the problem.
Now, from the Array.Sort(object[]) documentation -- there is no use of >
, <
or ==
(according to the documentation) -- just CompareTo(object)
.
Sorts the elements in an entire one-dimensional Array using the
IComparable
implementation of each element of the Array.
Likewise, Array.Sort(T[]) uses CompareTo(T)
.
Sorts the elements in an entire Array using the IComparable(Of T) generic interface implementation of each element of the Array.
Let's see:
LINQPad (4):
var ar = new double[] {double.NaN, 0, 1, double.NaN};
Array.Sort(ar);
ar.Dump();
// NaN, NaN, 0, 1
LINQPad (3.5):
var ar = new double[] {double.NaN, 0, 1, double.NaN};
Array.Sort(ar);
ar.Dump();
// NaN, 0, NaN, 1
LINQPad (3.5) -- NOTE THE ARRAY IS OF OBJECT and the behavior is "expected" per the CompareTo
contract.
var ar = new object[] {double.NaN, 0d, 1d, double.NaN};
Array.Sort(ar);
ar.Dump();
// NaN, NaN, 0, 1
Hmm. Really. In conclusion:
I HAVE NO IDEA -- but I suspect there is some "optimization" resulting in CompareTo(double)
not being invoked.
Happy coding.
I don't have the code for the .NET 3.5 runtime to verify this, but I'm thinking they fixed a bug in the default comparer for double
to bring it into line with the documentation.
According to that document, Double.Compare
treats NaN
as equal to PositiveInfinity
and NegativeInfinity
, and less than any other value.
The documentation is the same for .NET 3.5 and .NET 4.0, so I have to think that this was a bug fix to make the code work as documented.
EDIT:
After reading the comments in the linked question, I have to think that the problem wasn't in Double.Compare
, but rather in whatever method Array.Sort
(which is what List.Sort
depends on) uses to compare Double
values. Somehow, I don't think it's really Double.Compare
.
It's a bug fix. The feedback report with the bug details is here. Microsoft's response to the bug report:
Note that this bug affects the following:
- Array.Sort(), where the array contains Double.NaN
- Array.Sort(), where the array contains Single.NaN
- any callers of above, for example on List.Sort(), where list contains Double.NaN
This bug will be fixed in the next major version of the runtime; until then you can work around this by using a custom IComparer that does the correct sorting. As mentioned in the workaround comments, don't use Comparer.Default, because this is special-cased with a shortcut sort routine that doesn't handle NaN correctly. Instead, you can provide your own comparer that provides an equivalent comparision, but won't be special-cased.
Not really an answer, but perhaps a clue... You can reproduce the weird 3.5 behavior in 4.0 with this code:
void Main()
{
double[] someArray = { 4.0, 2.0, double.NaN, 1.0, 5.0, 3.0, double.NaN, 10.0, 9.0, 8.0 };
Array.Sort(someArray, CompareDouble);
someArray.Dump();
}
int CompareDouble(double a, double b)
{
if (a > b)
return 1;
if (a < b)
return -1;
return 0;
}
Here, both a > b
and a < b
return false if a or b is NaN
, so the CompareDouble
method returns 0, so NaN
is considered equal to everything... This gives the same result as in 3.5:
1,4,NaN,2,3,5,8,9,10,NaN