What are the consequences of storing a C# string (UTF-16) in a SQL Server nvarchar (UCS-2) column?

悲&欢浪女 2021-02-04 14:13

It seems that SQL Server uses Unicode UCS-2, a 2-byte fixed-length character encoding, for nchar/nvarchar fields. Meanwhile, C# uses UTF-16, a variable-width encoding, for its strings. What are the consequences, if any, of storing a C# string in such a column? Is any information lost?

2 Answers
  •  无人及你
    2021-02-04 14:42

    It's all a bit of a fudge really.

    First the similarities

    • The SQL Server nchar/nvarchar/ntext data types store text as a sequence of 2-byte code units. They don't really care what you put in them until you come to do searching and sorting (at which point they use the appropriate Unicode collation sequence).
    • The CLR String type likewise stores text as a sequence of 2-byte Chars. It doesn't really care what you put in it either, until you come to do searching and sorting (at which point it uses the appropriate culture-specific comparison methods). See the sketch after this list.
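
    To make that concrete, here is a minimal sketch (𠜎 is just an arbitrary character from outside the Basic Multilingual Plane; both systems count it as two 2-byte code units):

        // C#: a supplementary-plane character occupies two UTF-16 code units.
        string s = "\U0002070E";          // 𠜎 (U+2070E)
        Console.WriteLine(s.Length);      // prints 2

        // The SQL Server side of the same coin (T-SQL, shown as a comment):
        //   SELECT DATALENGTH(N'𠜎');    -- returns 4 bytes, i.e. two 2-byte units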

    Now the differences

    • .NET lets you get at the actual Unicode code points in a CLR string, via the StringInfo class (see the sketch after this list).
    • .NET has extensive support for encoding and decoding text in a wide variety of encodings. Whatever encoding the incoming bytes are in, the resulting String is always held internally as UTF-16, including full support for characters outside the Basic Multilingual Plane (stored as surrogate pairs).
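
    Here is the sketch promised above: enumerating the code points of a string that contains a surrogate pair, then round-tripping it through a byte encoding. Only the standard System.Globalization and System.Text types are used:

        using System;
        using System.Globalization;
        using System.Text;

        class CodePointDemo
        {
            static void Main()
            {
                // "a" plus a supplementary-plane character:
                // 3 UTF-16 code units, but only 2 code points.
                string s = "a\U0002070E";
                Console.WriteLine(s.Length);   // 3

                // StringInfo walks text elements rather than raw chars,
                // so the surrogate pair comes back as a single element.
                TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
                while (e.MoveNext())
                {
                    string element = (string)e.Current;
                    Console.WriteLine("U+{0:X}", char.ConvertToUtf32(element, 0));
                }   // prints U+61, then U+2070E

                // Decoding arbitrary bytes always lands back in UTF-16:
                byte[] utf8 = Encoding.UTF8.GetBytes(s);
                Console.WriteLine(Encoding.UTF8.GetString(utf8) == s);   // True
            }
        }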

    In short, as long as you treat both CLR and SQL Server string variables as whole blobs of text, you can freely assign one to the other with no loss of information. The underlying storage format is exactly the same, even though the abstractions layered on top are slightly different. The round-trip sketch below demonstrates this.
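
    A minimal round-trip sketch of that claim, assuming the Microsoft.Data.SqlClient package and a hypothetical dbo.Notes table (Id int identity, Body nvarchar(max)); the connection string is a placeholder:

        using System;
        using System.Data;
        using Microsoft.Data.SqlClient;   // assumed package; System.Data.SqlClient behaves the same

        class NvarcharRoundTrip
        {
            static void Main()
            {
                // Placeholder connection string; point it at your own server.
                const string connStr =
                    "Server=.;Database=Scratch;Integrated Security=true;TrustServerCertificate=true";

                // Includes a character outside the Basic Multilingual Plane.
                string original = "naïve \U0002070E text";

                using var conn = new SqlConnection(connStr);
                conn.Open();

                using (var insert = new SqlCommand(
                    "INSERT INTO dbo.Notes (Body) VALUES (@body)", conn))
                {
                    // An nvarchar parameter ships the raw UTF-16 code units
                    // to the server; nothing is re-encoded on the way in.
                    insert.Parameters.Add("@body", SqlDbType.NVarChar, -1).Value = original;
                    insert.ExecuteNonQuery();
                }

                using (var select = new SqlCommand(
                    "SELECT TOP 1 Body FROM dbo.Notes ORDER BY Id DESC", conn))
                {
                    // The same 2-byte units come straight back out.
                    Console.WriteLine((string)select.ExecuteScalar() == original);   // True
                }
            }
        }

    (With an older, non-supplementary-aware collation, built-in functions such as LEN will count the surrogate pair as two characters, but the stored bytes themselves are untouched, which is exactly the "doesn't care until searching and sorting" behaviour described above.)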
