JavaScript strings - UTF-16 vs UCS-2?

一向 2020-12-01 05:53

I've read in some places that JavaScript strings are UTF-16, and in other places they're UCS-2. I did some searching around to try to figure out the difference and found t…

3 Answers
  • 2020-12-01 06:00

    It's just a 16-bit value, with no particular encoding specified by the ECMAScript standard.

    See section 7.8.4 String Literals in this document: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf
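
    For example, nothing stops you from building a string out of arbitrary 16-bit values; an unpaired surrogate, which is not valid UTF-16 text, is still a perfectly legal string (illustrative snippet, standard behaviour):

        // A lone high surrogate: not well-formed UTF-16, but a valid JS string,
        // because the language only deals in 16-bit code units.
        const lone = '\uD83D';
        console.log(lone.length);                      // 1
        console.log(lone.charCodeAt(0).toString(16));  // 'd83d'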

  • 2020-12-01 06:06

    JavaScript (strictly speaking, ECMAScript) pre-dates Unicode 2.0, so in some cases you may find references to UCS-2 simply because that was correct at the time the reference was written. Can you point us to specific citations of JavaScript being "UCS-2"?

    The specifications for ECMAScript versions 3 and 5, at least, both explicitly declare a String to be a collection of unsigned 16-bit integers, and state that if those integer values are meant to represent textual data, then they are UTF-16 code units. See section 8.4 of the ECMAScript Language Specification.
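
    A quick illustration of that definition (illustrative snippet, not from the spec): you can assemble a string directly from 16-bit code-unit values, and a character outside the BMP simply occupies two of them:

        // U+1D306 (TETRAGRAM FOR CENTRE) is encoded in UTF-16 as the surrogate
        // pair 0xD834 0xDF06, i.e. two unsigned 16-bit integers.
        const tetra = String.fromCharCode(0xD834, 0xDF06);
        console.log(tetra);                             // '𝌆'
        console.log(tetra.charCodeAt(0).toString(16));  // 'd834'
        console.log(tetra.charCodeAt(1).toString(16));  // 'df06'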


    EDIT: I'm no longer sure my answer is entirely correct. See the excellent article mentioned above, http://mathiasbynens.be/notes/javascript-encoding, which in essence says that while a JavaScript engine may use UTF-16 internally, and most do, the language itself effectively exposes those characters as if they were UCS-2.
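
    Concretely (a small illustration of that UCS-2-like view, not taken from the article): every length and indexing operation counts 16-bit code units, so a single astral character looks like two "characters":

        const poo = '💩';                    // U+1F4A9, outside the Basic Multilingual Plane
        console.log(poo.length);             // 2  (two code units, one code point)
        console.log(poo.charAt(0));          // '\uD83D'  (just the lone high surrogate)
        console.log(poo === '\uD83D\uDCA9'); // true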

  • 2020-12-01 06:12

    It's UTF-16/UCS-2. It can handle surrogate pairs, but charAt/charCodeAt return a 16-bit code unit rather than the Unicode code point. If you want to handle surrogate pairs yourself, I suggest a quick read through this.
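
    For example (an illustrative sketch using the ES2015 code-point-aware APIs, which post-date this answer):

        const s = 'a😀b';                            // '😀' is U+1F600, stored as a surrogate pair

        // Code-unit view: charCodeAt sees only half of the pair.
        console.log(s.length);                       // 4
        console.log(s.charCodeAt(1).toString(16));   // 'd83d'

        // Code-point-aware view: codePointAt and string iteration pair the surrogates.
        console.log(s.codePointAt(1).toString(16));  // '1f600'
        console.log([...s]);                         // ['a', '😀', 'b']
        console.log(String.fromCodePoint(0x1F600));  // '😀'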
