How to (url-)encode filenames containing special UTF-8 or CP1252 characters

Posted by 安稳与你 on 2020-01-06 08:27:54

Question


I have a server that hosts many files, including files whose names contain special characters such as "Ü" and "թ".

I am now facing a big problem: I can't build the correct URLs, because the special characters have to be encoded in a form the browser understands (e.g. %XX):

  • www..../.../SPRÜCHE.txt --> needs to encode to "SPR%DCCHE.txt" to be found (otherwise 404)
  • www..../.../SPRCHEթ.txt --> needs to encode to "SPRCHE%D5%A9.txt" to be found (otherwise 404)

As you see, the first one needs one %XX fragment for the "special char", while the second one needs two (%XX%XX) of them.
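The difference between one and two %XX fragments comes down to how many bytes the chosen charset uses for the character. A minimal sketch that prints the byte-level encodings involved (the class name PercentBytesDemo and the helper encode are made up for illustration; it assumes the JDK ships the CP1252 charset, which mainstream JDKs do):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class PercentBytesDemo {
    /** Percent-encodes text with the named charset, wrapping the checked exception. */
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String uUmlaut = "\u00DC";  // the letter U with diaeresis from SPR?CHE
        String armenian = "\u0569"; // the Armenian letter from the second filename

        System.out.println(encode(uUmlaut, "CP1252")); // %DC     (one byte)
        System.out.println(encode(uUmlaut, "UTF-8"));  // %C3%9C  (two bytes)
        System.out.println(encode(armenian, "UTF-8")); // %D5%A9  (two bytes)
    }
}
```

So "%DC" is the CP1252 form of "Ü", while "%D5%A9" is the UTF-8 form of "թ"; the two working URLs use two different charsets.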

Currently I am encoding the links with this function, but it only works for one of the two files, depending on the encoding I choose:

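import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Mimics JavaScript's encodeURIComponent on top of URLEncoder.
public static String encodeURIComponent(String filename) {
    String result;

    try {
        // result = URLEncoder.encode(filename, "CP1252") // works only for SPRÜCHE
        result = URLEncoder.encode(filename, "UTF-8") // works only for SPRCHEթ
                // URLEncoder produces form encoding; undo the parts that differ.
                .replace("+", "%20").replace("%21", "!")
                .replace("%27", "'").replace("%28", "(")
                .replace("%29", ")").replace("%7E", "~");
    } catch (UnsupportedEncodingException e) {
        // Fall back to the unencoded name if the charset is not available.
        result = filename;
    }

    return result;
}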

Is there an all-in-one, URL-safe encoding function in the Java world?

It is very important for me to solve this problem, so I am asking you for help (it is used only for direct HTTP-accessible file links, not for websites or other stuff).

THANK YOU

PS: The DB uses utf8_general_ci, and the names stored there (correctly displayed as SPRCHEթ and SPRÜCHE in the DB) are also used as the filenames (the file is uploaded from C:...\SPRCHEթ.txt etc.). The FTP viewer displays the uploaded files as SPRÜCHE.txt and SPRCHEÕ©.txt (which could be a hint?). I am asking myself why SPRÜCHE.txt works with CP1252 while SPRCHEթ.txt needs UTF-8.
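The FTP listing fits one possible reading: "Õ©" is what the UTF-8 bytes D5 A9 look like when displayed as CP1252, so the name containing "թ" may have been written to disk as UTF-8 and the name containing "Ü" as CP1252. If that assumption holds (it is not confirmed anywhere in this post), a workaround could be to encode with CP1252 whenever the whole name fits into that charset and fall back to UTF-8 otherwise. A rough sketch, where FallbackFilenameEncoder and encodeWithFallback are hypothetical names:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;

final class FallbackFilenameEncoder {
    // Assumption: the server stored CP1252 bytes for names that fit into CP1252.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Percent-encodes a filename with CP1252 if possible, otherwise with UTF-8. */
    static String encodeWithFallback(String filename) {
        // A fresh encoder per call, because CharsetEncoder is not thread-safe.
        boolean fitsCp1252 = CP1252.newEncoder().canEncode(filename);
        String charsetName = fitsCp1252 ? "windows-1252" : "UTF-8";
        try {
            // URLEncoder emits '+' for spaces; URL paths need %20 instead.
            return URLEncoder.encode(filename, charsetName).replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("Charset not available: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeWithFallback("SPR\u00DCCHE.txt")); // SPR%DCCHE.txt
        System.out.println(encodeWithFallback("SPRCHE\u0569.txt")); // SPRCHE%D5%A9.txt
    }
}
```

This only guesses the on-disk encoding from the characters in the name, so it would break for any file whose name fits CP1252 but was actually stored as UTF-8. Storing all filenames in a single encoding would be the more robust fix.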

Source: https://stackoverflow.com/questions/23067713/how-to-url-encode-filenames-containing-special-utf-8-or-cp1252-characters
