Can someone please explain this:
SELECT
CASE WHEN CAST(\'iX\' AS nvarchar(20))
> CAST(\'-X\' AS nvarchar(20)) THEN 1 ELSE 0 END,
CASE WHEN CAS
When comparing unicode text, hyphens are treated specially. Unicode comparison uses "dictionary order", which ignores hyphens. This is not the case with non-unicode text comparison.
Comparing -X
and iX
, is like comparing X
and iX
, so -X
, the left side, is greater.
When comparing "-" and "i", is like comparing "" and "i", so "i", the right side is greater.
From MSDN,
A SQL collation's rules for sorting non-Unicode data are incompatible with any sort routine that is provided by the Microsoft Windows operating system; however, the sorting of Unicode data is compatible with a particular version of the Windows sorting rules. Because the comparison rules for non-Unicode and Unicode data are different, when you use a SQL collation you might see different results for comparisons of the same characters, depending on the underlying data type. For example, if you are using the SQL collation "SQL_Latin1_General_CP1_CI_AS", the non-Unicode string 'a-c' is less than the string 'ab' because the hyphen ("-") is sorted as a separate character that comes before "b". However, if you convert these strings to Unicode and you perform the same comparison, the Unicode string N'a-c' is considered to be greater than N'ab' because the Unicode sorting rules use a "word sort" that ignores the hyphen.
SELCT body From MSDN_Articles WHERE url IN
("http://support.microsoft.com/kb/322112")
A nice question!
Digging around, I found that the issue is related to hyphens and apostrophes. Your example exhibits the same 'odd' behaviour with '''X' as with '-X'.
I can't take credit for finding the answer, because it's here: SQL Sorting and hyphens