Just for completeness (and this is not at all recommended):
int length;
try
{
    // Encode to big-endian UTF-16 (no byte order mark) and count 2-byte code units.
    length = str.getBytes("UTF-16BE").length / 2;
}
catch (UnsupportedEncodingException e)
{
    throw new AssertionError("Cannot happen: UTF-16BE is always a supported encoding");
}
This works because a char is a UTF-16 code unit, and str.length() returns the number of such code units. Each UTF-16 code unit takes up 2 bytes, so dividing the byte count by 2 gives the same number. Additionally, no byte order mark is written with UTF-16BE, so there are no extra bytes to throw off the count.
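As a rough sketch of the same idea, the version below uses the StandardCharsets.UTF_16BE constant (available since Java 7), which avoids the checked exception entirely. The class name Utf16LengthDemo and the helper lengthViaBytes are made up for illustration; the last line uses plain "UTF-16" only to show the byte order mark that UTF-16BE leaves out.

```java
import java.nio.charset.StandardCharsets;

public class Utf16LengthDemo {
    // Hypothetical helper: encode to UTF-16BE and count 2-byte code units.
    public static int lengthViaBytes(String str) {
        return str.getBytes(StandardCharsets.UTF_16BE).length / 2;
    }

    public static void main(String[] args) {
        // Plain ASCII, a Latin-1 letter, and a supplementary character
        // (U+1F600, stored as a surrogate pair, i.e. two chars).
        String[] samples = { "abc", "h\u00e9llo", "a\uD83D\uDE00b" };
        for (String s : samples) {
            // Both numbers on each line should match.
            System.out.println(lengthViaBytes(s) + " vs " + s.length());
        }

        // Plain "UTF-16" prepends a 2-byte byte order mark when encoding,
        // so "abc" takes 8 bytes rather than 6.
        System.out.println("abc".getBytes(StandardCharsets.UTF_16).length);
    }
}
```

Note that the supplementary character counts as 2 in both approaches, since both measure code units rather than code points. In real code, str.length() remains the right call.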