How to calculate the byte length of a string containing UTF-8 characters using JavaScript?

春和景丽 2021-02-13 21:35

I have a textbox in which the user can enter characters in ASCII/UTF-8 or a combination of both. Is there any API in JavaScript with which we can calculate the length of the string in

3 answers
  •  醉话见心
    2021-02-13 21:43

    Counting UTF-8 bytes comes up quite a bit in JavaScript. A bit of looking around turns up a number of libraries that can help (here's one example: https://github.com/mathiasbynens/utf8.js). I also found a thread (https://gist.github.com/mathiasbynens/1010324) full of solutions specifically for UTF-8 byte counts.

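    Here is the smallest and most accurate function out of that thread:

    function countUtf8Bytes(s) {
        var b = 0, i = 0, c;
        for (; i < s.length; i++) {
            c = s.charCodeAt(i);
            // 1 byte below U+0080, 2 below U+0800, otherwise 3; each surrogate half
            // counts 2 so that a pair (one character above U+FFFF) totals 4.
            b += (c & 0xF800) === 0xD800 ? 2 : c >> 11 ? 3 : c >> 7 ? 2 : 1;
        }
        return b;
    }

    Note: I rearranged it a bit so that the signature is easier to read. However, it's still a very compact function that might be hard to understand for some.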
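
    For anyone who finds the compact loop hard to follow, the same sizing rules can be written out one code point at a time. This is only a sketch: the name utf8ByteLength is made up for illustration and is not from the thread. It relies on for...of iterating a string by code point, so a character above U+FFFF is seen once and counted as 4 bytes.

```javascript
// Hypothetical helper (the name is an assumption, not from the linked thread).
function utf8ByteLength(s) {
    let bytes = 0;
    // for...of walks the string by code point, not by UTF-16 code unit.
    for (const ch of s) {
        const cp = ch.codePointAt(0);
        if (cp < 0x80) bytes += 1;          // U+0000..U+007F
        else if (cp < 0x800) bytes += 2;    // U+0080..U+07FF
        else if (cp < 0x10000) bytes += 3;  // U+0800..U+FFFF
        else bytes += 4;                    // U+10000..U+10FFFF
    }
    return bytes;
}
```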

    You can check its results with this tool: https://mothereff.in/byte-counter
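
    The same check can be done locally. As a sketch, assuming an environment that provides the standard TextEncoder (current browsers and Node.js do), encoding the string and reading the length of the resulting Uint8Array gives the UTF-8 byte count. The function name below is hypothetical.

```javascript
// Hypothetical name; TextEncoder itself is a standard API that always emits UTF-8.
function utf8BytesViaEncoder(s) {
    // encode() returns a Uint8Array, so its length is the byte count.
    return new TextEncoder().encode(s).length;
}

console.log(utf8BytesViaEncoder('hello')); // 5
```

    In Node.js, Buffer.byteLength(s, 'utf8') is another built-in way to get the same number.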

    One correction to your original post: the example string you provided, i ♥ u, is actually 7 bytes, and this function counts it correctly.
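
    To see where the 7 comes from, here is a small sketch that lists the UTF-8 bytes of each character in that string. It assumes TextEncoder is available, and the variable names are just for illustration.

```javascript
const sample = 'i ♥ u';
const encoder = new TextEncoder();

// One entry per character: the character and its UTF-8 bytes in hex.
const breakdown = Array.from(sample, (ch) => ({
    ch,
    bytes: Array.from(encoder.encode(ch), (b) => b.toString(16)),
}));

// The total is the sum of the per-character byte counts.
const total = breakdown.reduce((sum, entry) => sum + entry.bytes.length, 0);

console.log(breakdown, total);
```

    The heart (U+2665) accounts for three bytes (e2 99 a5), while the two letters and two spaces take one byte each, giving 1 + 1 + 3 + 1 + 1 = 7.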
