I need to count how many bytes the contents of a textarea take up when encoded as UTF-8, using JavaScript. Any idea how I would do this?
Thanks!
encodeURIComponent(text).replace(/%[A-F\d]{2}/g, 'U').length
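To unpack why this works: encodeURIComponent() writes every UTF-8 byte it has to escape as a three-character %XX sequence, so collapsing each %XX back to a single character leaves exactly one character per byte. A minimal sketch wrapping the one-liner (the function name is made up for illustration):

```javascript
// Hypothetical helper name; the body is the one-liner from above.
function byteLengthViaEncodeURIComponent(text) {
  // Every escaped UTF-8 byte becomes "%XX"; replacing each "%XX" with a
  // single placeholder character leaves one character per byte.
  return encodeURIComponent(text).replace(/%[A-F\d]{2}/g, 'U').length;
}

console.log(byteLengthViaEncodeURIComponent('abc'));          // 3
console.log(byteLengthViaEncodeURIComponent('\u00e9'));       // 2 (e with acute accent)
console.log(byteLengthViaEncodeURIComponent('\u20ac'));       // 3 (euro sign)
console.log(byteLengthViaEncodeURIComponent('\ud83d\ude00')); // 4 (emoji, one surrogate pair)
```

Keep in mind that encodeURIComponent() throws a URIError if the string contains a lone surrogate, so untrusted input may need a try/catch around this.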
Combining various answers, the following method should be fast and accurate, and avoids issues with invalid surrogate pairs that can cause errors in encodeURIComponent():
function getUTF8Length(s) {
    var len = 0;
    for (var i = 0; i < s.length; i++) {
        var code = s.charCodeAt(i);
        if (code <= 0x7f) {
            len += 1;
        } else if (code <= 0x7ff) {
            len += 2;
        } else if (code >= 0xd800 && code <= 0xdbff) {
            // High surrogate: with a following low surrogate it forms a pair,
            // which takes 4 bytes in UTF-8 and 2 code units in UTF-16.
            var next = s.charCodeAt(i + 1); // NaN past the end of the string
            if (next >= 0xdc00 && next <= 0xdfff) {
                len += 4; i++;
            } else {
                len += 3; // lone surrogate: count it as U+FFFD (3 bytes)
            }
        } else {
            // Rest of the BMP, including a stray low surrogate (counted as U+FFFD)
            len += 3;
        }
    }
    return len;
}
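For comparison, here is a sketch of the same idea that walks the string by code point rather than by UTF-16 code unit, so surrogate pairs need no special casing. It assumes an ES2015+ environment (for...of and codePointAt), the function name is hypothetical, and counting a lone surrogate as 3 bytes (the size of U+FFFD) is a design choice rather than the only option:

```javascript
// Hypothetical variant: iterate by code point instead of by code unit.
function utf8LengthByCodePoint(s) {
  let len = 0;
  for (const ch of s) {            // for...of yields whole code points
    const cp = ch.codePointAt(0);
    if (cp <= 0x7f) {
      len += 1;
    } else if (cp <= 0x7ff) {
      len += 2;
    } else if (cp <= 0xffff) {
      len += 3;                    // also covers lone surrogates (assumption: count as U+FFFD)
    } else {
      len += 4;                    // astral code points, stored as a surrogate pair
    }
  }
  return len;
}

console.log(utf8LengthByCodePoint('a\u00e9\u20ac'));  // 6 (1 + 2 + 3)
console.log(utf8LengthByCodePoint('\ud83d\ude00'));   // 4
console.log(utf8LengthByCodePoint('\uffff'));         // 3
```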
encodeURI(text).split(/%..|./).length - 1
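This one is terse, so a short explanation: the regex matches either a %XX escape or any single unescaped character, which means every byte of the encoded text acts as a separator. split() then returns one more (empty) piece than there are separators, hence the "- 1". A small sketch, with a made-up function name:

```javascript
// Hypothetical helper name; the body is the one-liner from above.
function byteLengthViaSplit(text) {
  // "%.." is tried first, so each escaped byte is consumed as one match;
  // anything encodeURI left unescaped is a single ASCII character.
  return encodeURI(text).split(/%..|./).length - 1;
}

console.log(byteLengthViaSplit('a\u00e9'));        // 3 ("a%C3%A9" has three separators)
console.log(byteLengthViaSplit('\ud83d\ude00'));   // 4
console.log(byteLengthViaSplit(''));               // 0
```

Like encodeURIComponent(), encodeURI() throws a URIError on a lone surrogate.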
Just set the meta charset to UTF-8 and it's OK:
<meta charset="UTF-8">
<meta http-equiv="content-type" content="text/html;charset=utf-8">
and JS:
if ($mytext.length > 10) {
    // it's OK :)
}
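One caveat worth illustrating: the charset declaration controls how the page is decoded, while a JavaScript string's .length counts UTF-16 code units, so the two numbers can diverge for non-ASCII text. A minimal sketch of a byte-based check, assuming TextEncoder is available (modern browsers and Node.js); exceedsByteLimit is a hypothetical name, and in a page you would pass the textarea's .value:

```javascript
// Hypothetical helper; in a page, pass the textarea's .value as `text`.
function exceedsByteLimit(text, maxBytes) {
  // TextEncoder always encodes to UTF-8, regardless of the page's charset.
  return new TextEncoder().encode(text).length > maxBytes;
}

const sample = '\u20ac\u20ac\u20ac\u20ac'; // four euro signs
console.log(sample.length);                // 4 UTF-16 code units
console.log(exceedsByteLimit(sample, 10)); // true: 12 bytes in UTF-8
```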