Question
Currently the datatype of my column in a SQL Server table is VARCHAR.
I want to store both English and Chinese characters in that column.
What steps do I have to follow to use a collation, or do I have to change the datatype to NVARCHAR and insert with the N'...' prefix as Unicode?
If I have to use a collation, which collation should I use?
Please help me with this.
Answer 1:
You are mixing two concepts:
- Data type and encoding
VARCHAR stores your data in chunks of 8 bits. Basic characters take one chunk. Sometimes, though, a special lead chunk tells the engine that the character has to be interpreted as a two-byte character. Depending on the encoding, with character sets that differ a lot from the Latin one you will even find characters that need three bytes.
NVARCHAR stores each character as Unicode in 16-bit (2-byte) units. This allows one single encoding for all characters, no tricks needed. (Characters outside the Basic Multilingual Plane, such as many emoji, take two such units.)
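The difference in storage width can be made concrete with a small sketch. It is only an analogy: Python codecs stand in for SQL Server's storage, with cp1252 assumed as a typical Latin VARCHAR code page, gbk assumed as a rough equivalent of a Chinese VARCHAR code page, and utf-16-le standing in for NVARCHAR. The helper names are hypothetical.

```python
# Illustration only: Python codecs stand in for SQL Server storage encodings.
ENCODINGS = ("cp1252", "gbk", "utf-16-le")


def byte_width(char: str, encoding: str):
    """Return the number of bytes `char` needs, or None if the encoding cannot represent it."""
    try:
        return len(char.encode(encoding))
    except UnicodeEncodeError:
        return None


def width_table(chars):
    """Map each character to its byte width under every encoding in ENCODINGS."""
    return {c: {enc: byte_width(c, enc) for enc in ENCODINGS} for c in chars}


# A Latin letter, a Chinese character, and a character outside the BMP.
table = width_table(["A", "中", "😀"])
for char, widths in table.items():
    print(char, widths)
```

In this sketch the Latin code page cannot represent the Chinese character at all, the Chinese code page needs two bytes for it, and UTF-16 represents all three characters, using four bytes for the emoji.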
- Collation
The collation is used for string comparisons and for sorting. It comes into play whenever you deal with string values in WHERE clauses, in JOINs, or in indexes.
SQL Server has a default collation, which is used for new databases and, very important, within tempdb, where your temp tables live.
You are allowed to define a different default collation at database level, but this can lead to severe problems if you run queries that compare columns of a temp table with columns whose collation is not the same.
You are allowed to define the collation at column level, too.
And you are allowed to define the collation even within your statements, for each column separately, using the COLLATE clause. This is the highest degree of control, but it means a lot of typing and very hard-to-read code...
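To see why the collation matters for comparisons and sorting, here is a minimal sketch of the idea, not of SQL Server itself. Two Python comparison rules play the role of two collations: plain code-point order as a stand-in for a binary collation, and case-folded order as a stand-in for a case-insensitive one. The helper `equal_under` is hypothetical.

```python
# Illustration only: two Python comparison rules stand in for two collations.
words = ["apple", "Banana", "cherry"]

# "Binary" rule: compare raw code points, so uppercase letters sort first.
binary_order = sorted(words)

# "Case-insensitive" rule: compare case-folded text.
ci_order = sorted(words, key=str.casefold)


def equal_under(rule, left: str, right: str) -> bool:
    """Compare two strings after applying the given normalising rule."""
    return rule(left) == rule(right)


print(binary_order)
print(ci_order)
print(equal_under(str, "apple", "Apple"))
print(equal_under(str.casefold, "apple", "Apple"))
```

The same data gives a different sort order and a different equality result depending on the rule, which is what happens when two columns with different collations meet in a comparison.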
If you want to store English and Chinese in one column, you must use NVARCHAR. There is no one-size-fits-all collation; you will have to try out which one works for your data.
You might store your strings in a side table with a proper configuration and join it in your queries...
UPDATE, regarding "proper configuration":
You should use different columns for English and Chinese strings, or even one separate side table for each language... This allows you to set the best collation for each column or language separately, and it makes it easy to add new languages in a multi-language environment.
Answer 2:
If you want to store Far Eastern scripts (Chinese, Japanese, Korean, etc.), or Cyrillic, Arabic, or Hebrew, then yes, you must use NVARCHAR.
And if you define a string literal in a SQL script that contains Unicode characters, then yes, you must use the N'....' string prefix.
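The effect of leaving out the prefix can be sketched as follows. This is an analogy under stated assumptions, not SQL Server code: a literal without N is treated as non-Unicode text, and here cp1252 is assumed as the code page it gets squeezed through, with unmappable characters replaced by a question mark. Both function names are hypothetical.

```python
# Illustration only: cp1252 stands in for the code page of a Latin collation.
def through_code_page(text: str, code_page: str = "cp1252") -> str:
    """Round-trip text through a single-byte code page, replacing unmappable characters with '?'."""
    return text.encode(code_page, errors="replace").decode(code_page)


def through_utf16(text: str) -> str:
    """Round-trip text through UTF-16, which can represent every character."""
    return text.encode("utf-16-le").decode("utf-16-le")


literal = "Hello 中文"
print(through_code_page(literal))  # Chinese characters are lost
print(through_utf16(literal))      # text survives unchanged
```

The English part survives either way, while the Chinese characters only survive the Unicode path.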
Source: https://stackoverflow.com/questions/39100264/sql-server-when-to-use-collation-and-nvarchar