If I use Java 8's String.codePoints to get an array of int codePoints, is it true that the length of the array is the count of characters?

Posted by 十年热恋 on 2019-12-01 01:31:32

No.

For example:
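A few illustrative cases, as a minimal Java sketch. The class name, the helper `codePointCount`, and the sample strings are assumptions chosen for illustration; they are not taken from the original answer. Each case shows a string whose code point count differs from what a reader might count as "characters".

```java
public class CodePointCountExamples {
    // Number of Unicode code points, i.e. the length of s.codePoints().toArray().
    static int codePointCount(String s) {
        return s.codePoints().toArray().length;
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT: usually rendered as one glyph.
        String combining = "e\u0301";
        // Windows line break: carriage return followed by line feed.
        String crlf = "\r\n";
        // U+1F600 written as a surrogate pair: two Java chars, one code point.
        String emoji = "\uD83D\uDE00";
        // "a", U+200B ZERO WIDTH SPACE, "b": the middle code point is invisible.
        String zeroWidth = "a\u200Bb";

        System.out.println(codePointCount(combining)); // 2
        System.out.println(codePointCount(crlf));      // 2
        System.out.println(codePointCount(emoji));     // 1
        System.out.println(codePointCount(zeroWidth)); // 3
    }
}
```

In none of these cases does the array length obviously match the number of characters a person would report seeing.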


Now it is debatable whether some of these might be "actual characters that a human would find meaningful" ... but the overall answer is still No.


You clarified as follows:

By "human" I kind of meant "programmer" as I would imagine most programmers would see \r\n as two characters ...

It is more complicated than that. I am a programmer, and for me it depends on the context whether \r\n are meaningful or not. If I am reading a README file, my brain will treat differences in white space as having no semantic importance. But if I am writing a parser, my code would take whitespace into account ... depending on the language it is intended to parse.

Just check the Javadoc of CharSequence for the codePoints() method:

Returns a stream of code point values from this sequence. Any surrogate pairs encountered in the sequence are combined as if by Character.toCodePoint and the result is passed to the stream. Any other code units, including ordinary BMP characters, unpaired surrogates, and undefined code units, are zero-extended to int values which are then passed to the stream.

https://docs.oracle.com/javase/8/docs/api/java/lang/CharSequence.html#codePoints--
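The behaviour described in that Javadoc can be checked with a small sketch. The class name `SurrogateDemo` and the helper `toCodePoints` are hypothetical names used only for this illustration; the sample strings are likewise assumptions.

```java
public class SurrogateDemo {
    // Collects the code points of s into an array, as in the question.
    static int[] toCodePoints(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF written as a surrogate pair, followed by "A".
        String paired = "\uD834\uDD1EA";
        int[] cps = toCodePoints(paired);
        System.out.println(paired.length());              // 3 (UTF-16 code units)
        System.out.println(cps.length);                   // 2 (code points)
        System.out.println(Integer.toHexString(cps[0]));  // 1d11e

        // A lone high surrogate followed by "A": it is not combined with
        // anything and is passed through zero-extended to an int.
        String unpaired = "\uD834A";
        int[] loneCps = toCodePoints(unpaired);
        System.out.println(loneCps.length);                  // 2
        System.out.println(Integer.toHexString(loneCps[0])); // d834
    }
}
```

So the array length can be smaller than String.length(), and it can also contain values that are not valid characters at all, such as an unpaired surrogate.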

Then look at the String constructor that takes code points, to understand what a code point is:

String(int[] codePoints, int offset, int count) Allocates a new String that contains characters from a subarray of the Unicode code point array argument.

https://docs.oracle.com/javase/8/docs/api/java/lang/String.html

A code point is an int representing a Unicode code point (https://docs.oracle.com/javase/8/docs/api/java/lang/Character.html#unicode), so every code point in the string is included, even those that are not human-readable.
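As a rough sketch of how that constructor and codePoints() relate, the following round trip builds a String from an int array and reads the code points back. The class name `CodePointRoundTrip`, the helper `fromCodePoints`, and the sample values are assumptions made for this example.

```java
import java.util.Arrays;

public class CodePointRoundTrip {
    // Builds a String from a whole array of code points using
    // the String(int[] codePoints, int offset, int count) constructor.
    static String fromCodePoints(int[] codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // 'H', U+1F600 (a supplementary code point), and a line feed.
        int[] cps = {0x48, 0x1F600, 0x0A};
        String s = fromCodePoints(cps);

        System.out.println(s.length());                        // 4 (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length()));   // 3 (code points)
        System.out.println(Arrays.equals(cps, s.codePoints().toArray())); // true
    }
}
```

The line feed is counted as a code point just like the letter, which is the sense in which non-human-readable values are included.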

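In Java 8, String.codePoints() returns an IntStream of code points, not a stream of characters. Calling toArray() on it produces one int per code point, so the array length is the number of code points. That equals the number of characters only when every character in the string is a single code point.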
