Just a note on whatever method you finally choose, if that method happens to include the use of strcmp
that some answers suggest:
strcmp
doesn't work with Unicode data in general. In general, it doesn't even work with byte-based Unicode encodings, such as utf-8, since strcmp
only makes byte-per-byte comparisons and Unicode code points encoded in utf-8 can take more than 1 byte. The only specific Unicode case strcmp
properly handle is when a string encoded with a byte-based encoding contains only code points below U+00FF - then the byte-per-byte comparison is enough.