Detecting a utf8mb4 charset requirement

Submitted by 白昼怎懂夜的黑 on 2019-12-02 11:06:27

Question


We have a MySQL database that only supports the utf8 charset, but we are receiving some data feeds that contain characters which MySQL can only store as utf8mb4. How can we detect (in Java) whether a string will require the utf8mb4 charset?


Answer 1:


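MySQL's utf8 charset stores at most three bytes per character, which covers only the Basic Multilingual Plane (U+0000 to U+FFFF). Characters that require utf8mb4 are the supplementary code points above U+FFFF. Java represents each of them as a surrogate pair, which occupies 2 chars. A simple way to detect them is therefore to check whether the length of the string in chars differs from the number of code points: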

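// Returns true if s contains at least one surrogate pair, i.e. a code point
// above U+FFFF that MySQL can only store in utf8mb4.
boolean requiresMb4(String s) {
    int len = s.length();
    // Each surrogate pair is 2 chars but 1 code point, so the counts differ.
    return len != s.codePointCount(0, len);
}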
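
When a feed is rejected, it can also help to know where the offending character sits. The following is a minimal sketch of that idea, not part of the original answer: the class name `Utf8Mb4Check` and the method `firstMb4Index` are hypothetical names chosen for illustration. It walks the string code point by code point using only standard `java.lang.String` and `java.lang.Character` methods.

```java
// Hypothetical helper class; the names are illustrative, not from the original answer.
final class Utf8Mb4Check {
    private Utf8Mb4Check() {}

    // Returns the char index of the first supplementary code point (above U+FFFF),
    // or -1 if every code point fits in the Basic Multilingual Plane.
    static int firstMb4Index(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (Character.isSupplementaryCodePoint(cp)) {
                return i;
            }
            // Advance by 1 char for BMP code points, 2 for surrogate pairs.
            i += Character.charCount(cp);
        }
        return -1;
    }

    // Same yes/no answer as comparing length() with codePointCount().
    static boolean requiresMb4(String s) {
        return firstMb4Index(s) >= 0;
    }
}
```

For example, `Utf8Mb4Check.firstMb4Index("a\uD83D\uDE00")` returns 1, because the emoji U+1F600 starts at char index 1. A string of ordinary CJK text such as `"日本語"` returns -1, since those characters need only three bytes in UTF-8. Like the length comparison above, this sketch treats an unpaired surrogate as a single non-supplementary char and does not flag it.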


Source: https://stackoverflow.com/questions/21465439/detecting-a-utf8mb4-charset-requirement
