How to set UTF-8 for file upload in Java?

Submitted by 送分小仙女 on 2021-01-29 13:02:11

Question


I have the function below to read uploaded files:

public static Map<Integer, Map<String, byte[]>> getFiles(IMultipartBody bimp) {
        List<IAttachment> parts = bimp.getAllAttachments();
        Iterator<IAttachment> it = parts.iterator();
        ByteArrayOutputStream baos = null;
        InputStream inputStream = null;
        String fileName = null;
        byte[] bytes = null;

        Map<Integer, Map<String, byte[]>> files = new HashMap<Integer, Map<String, byte[]>>();
        Map<String, String> duplicateFileMap = new HashMap<String, String>();
        int counter = 0;

        while (it.hasNext()) {
            try {
                IAttachment name = (IAttachment) it.next();
                MultivaluedMap<String, String> headers = name.getHeaders();

                if (headers.get("Content-Disposition") != null
                        && !headers.get("Content-Disposition").isEmpty()) {
                    String header = headers.get("Content-Disposition").get(0);
                    String[] dispositions = header.split(";");
                    for (String disposition : dispositions) {
                        if (disposition.indexOf("filename") != -1) {
                            String tmpStr = disposition.substring(
                                    disposition.indexOf("=") + 1,
                                    disposition.length()).replaceAll("\"",
                                    Constant.EMPTY);
                            ByteBuffer byteBuffs = StandardCharsets.UTF_8.encode(tmpStr);
                            fileName = StandardCharsets.UTF_8.decode(byteBuffs).toString();
//                          fileName = new String(tmpStr.getBytes(), Charset.forName("UTF-8"));

                        }
                    }
                }

                inputStream = name.getDataHandler().getInputStream();
                baos = new ByteArrayOutputStream();
                int reads = inputStream.read();
                while (reads != -1) {
                    baos.write(reads);
                    reads = inputStream.read();
                }
                bytes = baos.toByteArray();
                if (bytes == null || bytes.length < 1) {
                    continue;
                }

                Map<String, byte[]> file = new HashMap<String, byte[]>();
                if (fileName != null ){
                    // Fix for firefox, remove '/'
                    if (fileName.startsWith("/")){
                        fileName = fileName.substring(1);
                    }

                    // Fix for IE, remove physical address, only get file name
                    if (fileName.lastIndexOf("\\") != -1 ){
                        fileName = fileName.substring(fileName.lastIndexOf("\\") + 1);
                    }
                }

                String md5 = generateMD5CheckSum(bytes);
                if (duplicateFileMap.containsKey(md5)
                        && duplicateFileMap.get(md5).equalsIgnoreCase(fileName)){
                    continue;
                }
                counter++;
                file.put(fileName, bytes);
                duplicateFileMap.put(md5,fileName);
                files.put(Integer.valueOf(counter), file);

            } catch (IOException e) {
                e.printStackTrace();
                LOGGER.error(e.getMessage());
            } finally {
                try {
                    if (inputStream != null) {
                        inputStream.close();
                    }

                    if (baos != null) {
                        baos.close();
                    }

                } catch (IOException e) {
                    e.printStackTrace();
                    LOGGER.error(e.getMessage());
                }
            }
        }
        return files;
    }

But when I debug an upload whose file name is ALMS_ขั้นตอนลงทะเบียน.pdf (the name is in Thai), the attachment headers look like this:

{Content-Disposition=[form-data; name="file"; filename="ALMS_ขั้นตอนลงทะเบียน.pdf"], Content-Type=[application/pdf], Content-ID=[root.message@cxf.apache.org]}

I think the IMultipartBody is not set to UTF-8 before the upload. Can anyone help me resolve this problem? Thanks.


Answer 1:


Use of the Content-Disposition header is covered by RFC 6266.

The filename attribute must be encoded in ISO-8859-1. Other charsets can be supported using the same attribute name followed by an asterisk, filename*, with a percent-encoded (URL-encoded) filename as its value.
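
For the code in the question, this suggests one possible explanation, sketched below under an assumption that the question does not confirm: the browser sent the file name as raw UTF-8 bytes and the framework decoded the header as ISO-8859-1. In that case, encoding the string to UTF-8 and decoding it again (as the question does) is an identity operation and cannot help, while re-encoding the string as ISO-8859-1 recovers the original bytes. The class and method names here are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class FilenameRecovery {

    // Hypothetical helper: undoes a wrong ISO-8859-1 decode of UTF-8 bytes.
    // Only valid if every char in the input is <= U+00FF; other chars would
    // be replaced by '?' when encoding to ISO-8859-1.
    static String latin1ToUtf8(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // The same round trip the question performs: it returns an equal string.
    static String utf8RoundTrip(String s) {
        return StandardCharsets.UTF_8.decode(StandardCharsets.UTF_8.encode(s)).toString();
    }

    public static void main(String[] args) {
        // A short Thai file name, written with escapes so the source stays ASCII.
        String original = "\u0e02\u0e31\u0e49\u0e19.pdf";

        // Simulate the assumed failure: UTF-8 bytes read as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        System.out.println(utf8RoundTrip(misdecoded).equals(misdecoded));
        System.out.println(latin1ToUtf8(misdecoded).equals(original));
    }
}
```

If the file name already looks correct in the debugger, this recovery step should not be applied, because it would corrupt a correctly decoded name.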

See the example in section 5 of the RFC for the filename "€ rates" (euro rates) encoded in UTF-8:

filename*=UTF-8''%e2%82%ac%20rates

Yes, that's a weird notation, not a typo: the original attribute name is followed by an asterisk, and the value starts with the encoding (UTF-8), then two single quotes (an optional language tag may sit between them), then the percent-encoded filename. Note that this is path-style encoding, not form-parameter encoding: spaces are replaced by %20, not +.
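
As a minimal sketch of that notation, the following Java class decodes and produces filename* values using only java.net.URLDecoder and java.net.URLEncoder. The class and method names are hypothetical, and the code assumes the value has already been isolated from the header. Because the JDK classes implement form encoding, the sketch converts between + and %20 by hand.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class ExtFilename {

    // Decodes a value such as UTF-8''%e2%82%ac%20rates
    // (charset, single quote, optional language tag, single quote, data).
    static String decode(String extValue) {
        int first = extValue.indexOf('\'');
        int second = extValue.indexOf('\'', first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("Not a charset'lang'value string: " + extValue);
        }
        String charset = extValue.substring(0, first);
        String encoded = extValue.substring(second + 1);
        try {
            // URLDecoder would turn a literal '+' into a space, so protect it first.
            return URLDecoder.decode(encoded.replace("+", "%2B"), charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Builds a UTF-8 value suitable for use after "filename*=".
    static String encode(String fileName) {
        try {
            // URLEncoder emits '+' for spaces and leaves '*' as is;
            // rewrite both so the result uses %20 and %2A instead.
            String encoded = URLEncoder.encode(fileName, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A");
            return "UTF-8''" + encoded;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        // The example value from the RFC, then a value built from a plain name.
        System.out.println(decode("UTF-8''%e2%82%ac%20rates"));
        System.out.println("filename*=" + encode("my report.pdf"));
    }
}
```

On the server side, a parser along these lines would be applied to the filename* parameter when it is present, falling back to the plain filename parameter otherwise.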



Source: https://stackoverflow.com/questions/61242637/how-to-set-utf-8-for-file-upload-in-java
