Is it just that nvarchar supports multibyte characters? If that is the case, is there really any point, other than storage concerns, to using varchar?
An nvarchar column can store any Unicode data. A varchar column is restricted to an 8-bit code page. Some people think that varchar should be used because it takes up less space. I believe this is not the correct answer. Code page incompatibilities are a pain, and Unicode is the cure for code page problems. With cheap disk and memory nowadays, there is really no reason to waste time mucking around with code pages anymore.
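As a quick illustration of the code page problem, here is a minimal sketch, assuming a database whose default collation uses the Latin1 code page: characters outside that code page are silently replaced when stored in a varchar.

    DECLARE @v varchar(20) = N'日本語';   -- characters outside the Latin1 code page
    DECLARE @n nvarchar(20) = N'日本語';
    SELECT @v AS varchar_value,    -- '???' : the data is silently lost
           @n AS nvarchar_value;   -- '日本語' : stored intact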
All modern operating systems and development platforms use Unicode internally. By using nvarchar rather than varchar, you can avoid doing encoding conversions every time you read from or write to the database. Conversions take time and are prone to errors, and recovery from conversion errors is a non-trivial problem.
If you are interfacing with an application that uses only ASCII, I would still recommend using Unicode in the database. The OS and database collation algorithms will work better with Unicode, you avoid conversion problems when interfacing with other systems, and you will be prepared for the future. And you can always validate that your data is restricted to 7-bit ASCII for whatever legacy system you're having to maintain, while still enjoying some of the benefits of full Unicode storage.
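One sketch of such a validation, with hypothetical table and column names: a LIKE pattern under a binary collation flags any value containing characters outside the printable ASCII range.

    -- dbo.Customers and Name are hypothetical; [^ -~] matches any character
    -- outside the printable ASCII range (0x20 to 0x7E)
    SELECT Name
    FROM dbo.Customers
    WHERE Name LIKE N'%[^ -~]%' COLLATE Latin1_General_BIN;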
Since SQL Server 2019, varchar columns support UTF-8 encoding. Thus, from now on, the difference is size.
In a database system, that translates to a difference in speed: less size = less I/O + less memory = more speed in general. Read the article above for the numbers.
Unless you have a very specific use case, go for varchar in UTF-8 from now on!
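A minimal sketch of what that looks like in practice, assuming SQL Server 2019 or later (the table and column names here are made up; Latin1_General_100_CI_AS_SC_UTF8 is one of the built-in UTF-8 collations):

    CREATE TABLE dbo.Article (
        -- note: under UTF-8, varchar(200) limits bytes, not characters
        title varchar(200) COLLATE Latin1_General_100_CI_AS_SC_UTF8
    );
    INSERT INTO dbo.Article (title) VALUES ('hello'), (N'日本語');
    SELECT title, DATALENGTH(title) AS bytes FROM dbo.Article;
    -- 'hello' -> 5 bytes (10 as nvarchar); '日本語' -> 9 bytes (3 per character in UTF-8)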
varchar: Variable-length, non-Unicode character data. The database collation determines which code page the data is stored using.
nvarchar: Variable-length Unicode character data. Comparisons depend on the database collation.
Armed with this knowledge, use whichever one matches your input data (ASCII v. Unicode).
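To illustrate that collation dependence, a small sketch using two of the standard built-in collations: the same comparison succeeds or fails depending on whether the collation is case-insensitive (CI) or case-sensitive (CS).

    SELECT CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CI_AS
                THEN 'equal' ELSE 'different' END AS case_insensitive,  -- 'equal'
           CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CS_AS
                THEN 'equal' ELSE 'different' END AS case_sensitive;    -- 'different'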
Mainly, nvarchar stores Unicode characters and varchar stores non-Unicode characters.
Unicode here means a 16-bit character encoding scheme that allows characters from many languages, such as Arabic, Hebrew, Chinese, and Japanese, to be encoded in a single character set.
That means Unicode uses 2 bytes per character and non-Unicode uses only 1 byte per character, so Unicode data needs double the storage capacity of non-Unicode data.
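That doubling is easy to verify; a minimal sketch using DATALENGTH, which returns the storage size in bytes:

    SELECT DATALENGTH(CAST('hello' AS varchar(50)))   AS varchar_bytes,   -- 5  (1 byte per character)
           DATALENGTH(CAST(N'hello' AS nvarchar(50))) AS nvarchar_bytes;  -- 10 (2 bytes per character)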
nvarchar will help you store Unicode characters. It is the way to go if you want to store localized data.
Jeffrey L Whitledge, with ~47,000 reputation, recommends usage of nvarchar.
Solomon Rutzky, with ~33,200 reputation, recommends: Do NOT always use NVARCHAR. That is a very dangerous, and often costly, attitude/approach.
What are the main performance differences between varchar and nvarchar SQL Server data types?
https://www.sqlservercentral.com/articles/disk-is-cheap-orly-4
With both persons at such a high reputation, what does a learning SQL Server database developer choose?
There are many warnings in answers and comments about performance issues if you are not consistent in choices.
There are comments both for and against nvarchar on performance grounds, and likewise for varchar.
I have a particular requirement for a table with many hundreds of columns, which in itself is probably unusual.
I'm choosing varchar to avoid going close to the 8,060-byte table record size limit of SQL Server 2012.
Using nvarchar, for me, goes over this 8,060-byte limit.
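To make that arithmetic concrete, a back-of-the-envelope sketch (the column count and average length here are hypothetical, and each variable-length column carries roughly 2 bytes of row overhead on top of its data):

    -- Hypothetical: 400 variable-length columns averaging 10 characters each
    -- varchar : 400 * (10 bytes + ~2 bytes overhead) = ~4,800 bytes -- fits under 8,060
    -- nvarchar: 400 * (20 bytes + ~2 bytes overhead) = ~8,800 bytes -- exceeds the 8,060-byte in-row limit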
I'm also thinking that I should match the data types of the related code tables to the data types of the primary central table.
I have also seen previous experienced database developers at this place of work, the South Australian Government, use varchar columns in tables whose row counts will reach several million or more (and very few nvarchar columns, if any, in these very large tables), so perhaps the expected data row volumes become part of this decision.